Abstract
Dryland ecosystems are dominant influences on both the trend and interannual variability of the terrestrial carbon sink. Despite their importance, dryland carbon dynamics are not well-characterized by current models. Here, we present DryFlux, an upscaled product built on a dense network of eddy covariance sites in the North American Southwest. To estimate dryland gross primary productivity, we fuse in situ fluxes with remote sensing and meteorological observations using machine learning. DryFlux explicitly accounts for intra-annual variation in water availability, and accurately predicts interannual and seasonal variability in carbon uptake. Applying DryFlux globally indicates existing products may underestimate impacts of large-scale climate patterns on the interannual variability of dryland carbon uptake. We anticipate DryFlux will be an improved benchmark for earth system models in drylands, and prompt a more sensitive accounting of water limitation on the carbon cycle.
Introduction
Dryland (arid and semi-arid) systems have a dominant influence on both the trend and interannual variability in the global terrestrial carbon sink1,2, yet land surface and remote sensing models of primary production perform poorly in these regions3,4,5. Dryland ecosystems are distributed throughout the world and occupy 40% of global land area6. During the early development of Land Surface Models, model tests were more common in temperate and tropical ecosystems than in arid regions (e.g.,7,8), and many early Land Surface Models did not distinguish critical dryland plant functional types9,10,11,12,13,14,15. Additionally, dryland ecosystems are often poorly represented in datasets used to drive and calibrate remote sensing-based models of primary production16. Accurate representation of dryland carbon dynamics in global-scale process- and remote sensing-based models could improve the accuracy of terrestrial carbon sink estimates and advance our understanding of the global carbon cycle.
Several key features of dryland ecohydrology, phenology, and biogeography make it particularly challenging to predict carbon dynamics in these systems. Drylands are highly sensitive to variations in water availability17, and persistent water limitation in these systems has resulted in physiological adaptations that lead to tight coupling of biogeochemical and water cycles6,18,19,20. Interannual variability in dryland precipitation can exceed 50% of mean annual precipitation, compared to 5–10% in more mesic systems, resulting in high interannual variability in dryland carbon uptake21. Strong sensitivity to highly variable hydroclimatic conditions can also manifest in ‘flashy’ ecosystem responses: rapid carbon uptake and growth in response to precipitation pulses22,23. Flashy responses to moisture inputs propagate at longer timescales in drylands, relative to more mesic systems, and strongly influence annual carbon uptake23,24,25. Drought, defined here as moisture stress that impacts ecosystem functioning26, is a particularly impactful and increasingly prevalent water availability condition in drylands27. Drylands also have high degrees of spatial heterogeneity, with diverse ecosystem types and moisture regimes contained within a single model pixel, which further complicates modeling efforts13,14. Although existing models are designed to represent patterns of the global carbon cycle, they are not attuned to these key features of drylands. Dryland carbon uptake strongly influences the variability of the global terrestrial carbon sink1 and the magnitude of interannual variability in dryland carbon uptake is likely underestimated by existing models of productivity19.
Here we used satellite and gridded climate observations, measurements from eddy covariance towers, and machine learning to predict spatial and temporal patterns in plant carbon uptake via photosynthesis (gross primary productivity; GPP). We upscaled carbon flux observations using remotely sensed and gridded meteorological inputs to produce spatially and temporally continuous high-resolution estimates of GPP28,29,30. Existing continental and global upscaled products, including the FLUXCOM31 and FluxSat32 products, were designed for continental-scale and global analysis and are generally based on remotely sensed estimates of photosynthetically active vegetation, such as the fraction of absorbed photosynthetically active radiation (fAPAR) and vegetation indices like the normalized difference vegetation index (NDVI). The global Moderate Resolution Imaging Spectroradiometer (MODIS) GPP products are based on satellite data and include biome-specific light use efficiency values and atmospheric demand scalars, but still rely heavily on measures of satellite greenness. These models do not include important ecohydrological controls over carbon dynamics in drylands, such as explicit representation of water limitation or antecedent moisture conditions. To accurately project changes in the global carbon cycle, we need models that sufficiently represent seasonal and interannual variability in dryland carbon dynamics.
The southwest region of North America is an exemplary location to develop a model tuned to drought-carbon dynamics because of its high spatial heterogeneity, including broad biogeographic regions (California Mediterranean, Intermountain, and Mojave, Sonoran, and Chihuahuan deserts); complex topography and associated ecological transitions (grasslands, shrublands, and forests); and a large degree of variation in precipitation and climate regimes (i.e., a gradient from winter- to summer-dominated precipitation regimes and a mean annual precipitation range of 100 to 700 mm). In this study, we evaluated the heterogeneous, region-wide sensitivity of carbon uptake to climate in Southwestern dryland systems. Since drought is a prevalent mode of water availability in drylands that is projected to increase in frequency and intensity in the future, we explicitly accounted for drought and other moisture anomalies in our approach. We then used this sensitivity to estimate spatial and temporal responses of carbon uptake in dryland ecosystems to global patterns of moisture anomalies. We used a network of 24 dryland eddy covariance sites in the Southwest (southwest United States and northwest Mexico) representing diverse dryland ecosystems and climate spaces19 (Supplementary Fig. 1) to develop a machine learning model that predicted GPP.
Our product, DryFlux, is specifically tuned to key ecohydrological characteristics of dryland systems33. Our approach focused on accounting for the tight ecohydrological coupling between water and carbon cycles in drylands to represent the seasonal and interannual variability in carbon uptake in these systems. A crucial difference between our upscaled product and other remote sensing-driven upscaled flux products is the explicit consideration of the impact of antecedent moisture conditions on GPP, through the inclusion of previous months’ precipitation and the Standardized Precipitation Evapotranspiration Index (SPEI) at various temporal windows as predictors. Our approach consisted of two main components: first, we derived relationships between climate and vegetation predictors and GPP from flux towers using the random forest machine learning algorithm, and second, we applied the trained model to remotely sensed data inputs to generate spatially and temporally continuous carbon uptake estimates from 2000 to 2016 at 0.5° spatial resolution. We compared our DryFlux product to a machine learning upscaled product, FLUXCOM31, and the MODIS GPP data product (MOD17A2H v006), which are widely used to evaluate terrestrial primary production in Land Surface Models34,35.
Results and discussion
DryFlux more accurately characterized inter- and intra-annual variation in dryland GPP than both FLUXCOM and MODIS GPP (Fig. 1). Representation of interannual variability in DryFlux GPP across Southwest sites exceeded an R2 of 0.9 in 18 out of 24 sites, compared to 1 out of 24 sites in the FLUXCOM GPP estimates (Fig. 1a, b and Supplementary Table 1). The relationship between modeled (DryFlux) and observed (tower) monthly GPP for 2000–2015 had an R2 of 0.88 across all sites, compared to R2 = 0.43 for FLUXCOM and R2 = 0.41 for MODIS GPP (Fig. 1c). DryFlux predicted large-magnitude GPP months better than MODIS GPP or FLUXCOM, maintaining accuracy at high levels of GPP based on the slope of the linear regression between modeled and observed GPP (Fig. 1c; m = 0.88). The FLUXCOM and MODIS GPP products consistently underestimated months with high GPP values (Fig. 1c; m = 0.48 and m = 0.45, respectively). DryFlux, which explicitly incorporates ecohydrological water–carbon coupling, showed improved estimates of interannual variability and seasonal cycling. DryFlux reproduced the dynamic seasonal cycle of GPP at sites representing the dominant vegetation cover types typical of dryland regions (semi-arid forests, shrublands, and grasslands) more accurately than either FLUXCOM or MODIS GPP (Fig. 1d–f). DryFlux better captured seasonal variation in dryland fluxes than FLUXCOM or MODIS at 23 of 24 Southwest sites based on the correlation coefficient between modeled and observed values (Fig. 1 and Supplementary Fig. 2). DryFlux also better captured within-season variation in dryland fluxes than FLUXCOM or MODIS (at 22 of 24 and 24 of 24 Southwest sites, respectively; Supplementary Fig. 2).
Characterization of the seasonal cycle of GPP captured the flashy seasonal dynamics of GPP in grasslands, shrublands, and forests related to intra-annual changes in water supply, with both the preceding month’s precipitation and the current month’s potential evapotranspiration (PET) emerging as important predictors in the random forest models (Supplementary Fig. 3). Accurate representation of land cover, particularly in highly heterogeneous dryland regions like the Southwest, is important for generating estimates of carbon fluxes. Due to challenges in spatial resolution and classification accuracy, land cover maps can be a major source of uncertainty in regional-scale upscaled flux estimates36. In DryFlux, we chose to include mean annual precipitation (MAP), mean annual temperature (MAT), elevation, and vegetation indices in lieu of a land cover classification. Both MAP and vegetation indices like NDVI were important predictors (Supplementary Fig. 3), implying these variables captured spatial heterogeneity in vegetation in the Southwest. DryFlux captures the typical bimodality in Southwest ecosystems: the summer peak driven by monsoon rains in grasslands (Fig. 1d), shrublands (Fig. 1e), and forests (Fig. 1f), and the springtime GPP peak driven by snowmelt in high-elevation forests (Fig. 1f and Supplementary Fig. 4). While snowmelt was not a predictor in our model, the model’s strong performance at high-elevation forest sites implies other predictor variables provided information related to the onset of springtime GPP.
a Coefficient of determination (R2) between annual FLUXCOM GPP and annual tower GPP (gCm−2) for all flux sites. Sites are colored by vegetation class, with diagonal hash mark indicating sites left out of model building (testing sites). The dashed horizontal line indicates an R2 of 0.9 to facilitate visual comparison of model performance. Sites are arranged according to MAP, with MX-Lpa having the lowest MAP (182 mm) and US-Vcm having the highest (724 mm). b Same as (a) but R2 between annual DryFlux GPP and tower GPP. c Relationships between modeled GPP and observed monthly GPP. Monthly (in gCm−2 day−1) FLUXCOM GPP (teal), MODIS GPP (green), and DryFlux GPP (dark gray) and the associated linear equations are displayed, with the 1:1 line between modeled and observed GPP in black. d–f Accuracy of DryFlux GPP seasonal cycle compared to observed, MODIS, and FLUXCOM GPP for a representative grassland (d), shrubland (e), and forest (f) site. The seasonal cycle is represented as the average monthly GPP (in gCm−2 day−1) for all available site years. Lines represent: observed (brown), DryFlux (dark gray), MODIS (green), and FLUXCOM GPP (teal). Error bars represent one standard deviation above and below the monthly mean.
Although improved model performance in the North American Southwest is expected (since our model was specifically trained in this region), investigation of model performance metrics revealed the importance of ecohydrological coupling in drylands. DryFlux accurately captured the large degree of interannual GPP variability in Southwestern drylands (Fig. 2). There was a large degree of variation in observed GPP values across all sites with the largest amount of variation observed in forests during the summer months (σ = 1.19) and savanna/shrublands having the least amount of variation (σ = 0.94; Fig. 2a). DryFlux represented more variability than the MODIS or FLUXCOM GPP data products in the summer months for sites included in the present analysis (Fig. 2a). DryFlux also captured interannual variability in GPP more accurately than MODIS or FLUXCOM data products (Fig. 2b and Supplementary Table 1). Since our model captured interannual variability well, when we evaluated the differences between GPP predictions from the years following strong El Niño (2015) and La Niña (2011) events, total DryFlux GPP predictions encompassed a larger range of values than FLUXCOM predictions (Fig. 2c, d).
a Summer interannual variability (IAV), represented by the standard deviation around the long-term (2000–2016) GPP mean (gCm−2) pooled for forest, grassland, and savanna/shrubland sites. Bars represent: observed (brown), DryFlux (dark gray), MODIS (green), and FLUXCOM GPP (teal). b Differences in interannual GPP (in gCm−2 day−1) across all sites. Lines represent: observed (brown), DryFlux (dark gray), MODIS (green), and FLUXCOM GPP (teal). Differences in estimated annual GPP (gCm−2 year−1) at 0.5° resolution between 2011 and 2015 in (c) the DryFlux and (d) FLUXCOM models. Southwest flux tower sites used in DryFlux model building and testing are symbolized as crosses on (c).
To assess the implications of our more realistic climatic control of dryland carbon uptake beyond the Southwest, we applied DryFlux to dryland regions globally at a 0.5° resolution (Fig. 3a, b). The El Niño Southern Oscillation triggers global climate teleconnections that result in both positive and negative departures from normal rainfall patterns37. To assess the impacts of these variable moisture conditions on carbon uptake variability in dryland systems beyond the Southwest, we selected the strongest El Niño year (2015–2016) and strongest La Niña year (2010–2011) in the MODIS data record based on the Oceanic Niño Index38. We compared annual carbon uptake per-pixel between 2011 and 2015 to assess the impacts of drought on carbon uptake (Fig. 3). The 2010–2011 La Niña, which was the strongest in the past eight decades39, led to strong carbon uptake in Australian semi-arid systems (Fig. 3a) that explained most of the exceptionally large global carbon sink in 20112. Carbon uptake in the North American Southwest was spatially variable: portions of Texas and northeastern Mexico had abnormally low GPP and western regions (i.e., Nevada, Oregon) had anomalously high GPP in 2011 (Fig. 3a). During the strong El Niño year in 2015, these trends were reversed (but weaker) in Southwestern North America (Fig. 3b), as much of the West had low GPP and the portions of Texas and northeastern Mexico had high GPP (based on per-pixel Z-scores). The El Niño event in 2015–2016 led to severe drought in much of Australia40. To test the ability of DryFlux to capture strong interannual variability related to global weather-producing phenomena, we calculated the per-pixel difference in GPP between 2011 and 2015 (Fig. 3c, d). Overall, both FLUXCOM and DryFlux showed an increase in carbon uptake over Australia in the La Niña year compared to the El Niño year (Fig. 3c, d), but, importantly, DryFlux showed a larger (38.6%) reduction in carbon uptake in the El Niño year relative to the La Niña year, whereas FLUXCOM estimated only an 11.8% reduction (Fig. 3c, d). Together, these results suggest that models not tuned to dryland dynamics could underestimate interannual variability in dryland carbon fluxes. Supporting this idea, we find that in the Southwest, FLUXCOM and MODIS GPP uniformly overestimate carbon uptake in semi-arid grasslands and shrublands, and underestimate carbon uptake in semi-arid forests (Supplementary Fig. 4 and Supplementary Fig. 5). We chose to compare spatial patterns in DryFlux to FLUXCOM instead of MODIS GPP in the Southwest and Australia (Figs. 2c, d and 3) for two reasons: first, we anticipate the two upscaled products will have similar applications and user bases and thus the comparison is more relevant to the research community, and second, FLUXCOM generally outperformed MODIS GPP in drylands (Supplementary Tables 1 and 2), so the comparison is more informative of DryFlux performance.
Our validation of globally upscaled DryFlux using eddy covariance sites with contrasting phenology in Africa, Australia, Europe, and South America (Supplementary Table 3) showed comparable or better performance than MODIS and FLUXCOM at the majority of global dryland sites (Supplementary Fig. 6 and Supplementary Table 2). The recent FluxSat32 product was not compared with DryFlux, but will be included in future analyses. Like FLUXCOM, FluxSat was designed for global analyses, included few sites from the US Southwest in model building (four), and had poor performance at some dryland sites32. While eddy covariance data is not widely available throughout global drylands, our preliminary validation shows that DryFlux is likely broadly applicable beyond the Southwest. Sites where DryFlux performed particularly poorly were Australian sites with high measurement variability (AU-Lox) and savanna sites with phenology novel to the model (AU-Dry, AU-GWW, AU-DaS; Supplementary Fig. 6 and Supplementary Table 5). Future iterations of DryFlux should: incorporate sites with vegetation types poorly represented in the Southwest training dataset (savanna, deciduous broadleaf forest) in model building; pinpoint dryland regions where additional flux data is needed; and comprehensively validate DryFlux performance at global dryland sites. Furthermore, the potential for additional ecohydrological variables to inform GPP estimates, including soil moisture and actual evapotranspiration, should be explored. We anticipate the framework we developed will be readily extensible beyond Southwestern North America as dryland flux measurements become more available across the globe.
To establish that ecohydrological coupling was important for the improvements we saw in DryFlux, we built and validated a version of the DryFlux model without ecohydrological variables. The ecohydrological variables had the largest impact on predictions of interannual variation in fluxes driven by year-to-year variation in weather patterns in both the Southwest and Australia (Supplementary Figs. 7 and 8). Without ecohydrological variables, the model underestimated interannual variability in summer GPP at Southwest sites compared to the full DryFlux model (Supplementary Fig. 7a, b). The model without ecohydrological variables applied to the full Southwest region showed muted differences in carbon uptake between strong La Niña (2011) and strong El Niño (2015) years (Supplementary Fig. 7c, d). This trend was also shown in Australian dryland regions: when ecohydrological variables were excluded from DryFlux, the difference in GPP between 2011 and 2015 decreased (Supplementary Fig. 8). These results suggest that DryFlux’s sensitivity to moisture conditions results in GPP estimates that are more responsive to interannual variability in weather patterns than existing models. Overall, water–carbon coupling appears particularly important for capturing interannual variability in dryland fluxes.
DryFlux represented other key features of dryland ecosystem dynamics more accurately than the alternate models analyzed41,42, including the characteristic dual peak in Southwest forests driven by springtime snowmelt and summer rains (Fig. 1f) and rapid carbon uptake in flashy response to moisture inputs in grasslands (as reflected in the root mean squared error (RMSE) and the root mean square of successive differences (RMSSD), a measure of variability in a time series, in Fig. 1d–f). Furthermore, we consistently saw that months with abnormally high precipitation were often followed by months with abnormally high GPP (Supplementary Fig. 9). When applied to the global scale, the DryFlux model appeared more sensitive to the impact of moisture conditions on regional carbon cycling. Accounting for these dryland dynamics could improve global predictions of carbon uptake in earth system models. Our work underscores the tight ecohydrological coupling between water and carbon dynamics in water-limited ecosystems, and highlights the need to accurately represent these processes in models of terrestrial carbon uptake.
Drought impacts on the carbon cycle extend beyond dryland ecosystems29,43. Future climate change is likely to increase drought frequency and intensity in many regions globally44. Temperature rises and associated increases in atmospheric vapor-pressure deficit45,46 are likely to cause decreases in carbon uptake in ecosystems like forests that are not currently water limited47, potentially reducing the strength of the terrestrial carbon sink48. Drought duration, intensity, and frequency are expected to increase in dryland regions, making these systems especially vulnerable to climate change49,50. Already, the early 21st century has brought prolonged drought, warm temperatures, and extreme rainfall events to drylands including the Southwest51,52 and Australia53. Models that account for drought impacts on the carbon cycle are crucial for predicting and understanding the global carbon cycle. Furthermore, accounting for drought impacts on carbon dynamics will become more important as temperatures increase and precipitation patterns change.
Overall, this study highlights the crucial need to better represent coupled water and carbon dynamics in dryland ecosystem models. Based on the comparison of our GPP product with a global model similar to those routinely used to benchmark Earth System models54, we suggest that dryland-driven interannual variability in the global carbon cycle may be underestimated by existing models that represent mainly vegetation greenness and therefore do not adequately account for the ecohydrological effects of annual and sub-annual moisture dynamics on vegetation productivity. Our DryFlux model indicated greater interannual variability than FLUXCOM or MODIS GPP, and was more highly correlated to observed interannual variability in eddy covariance data. Our product improves on comparable products tested because it is informed by a dense network of flux sites across varied dryland ecosystems, accounts for flashy ecosystem dynamics, and accounts for tight ecohydrological coupling between carbon and water cycling dynamics in drylands.
Methods
Carbon fluxes were upscaled from 24 eddy covariance sites across the North American Southwest and Northwestern Mexico (Supplementary Table 4) using remote sensing and gridded meteorological inputs with a machine learning (random forest) approach55. We used all available sites in each year, which are detailed in Supplementary Table 5. There were two steps to the upscaling process. First, relationships between predictor variables and monthly eddy covariance fluxes from flux towers were derived using random forest models. Second, the random forest models were applied to per-pixel gridded inputs to create spatially and temporally continuous GPP estimates from 2000 to 2016. All analyses were conducted in the R language (R version 4.0.2) and environment for statistical computing56.
Data acquisition
Relationships between predictor variables and monthly fluxes from eddy covariance towers were quantified with random forest models built using the R ‘caret’ package57. Data inputs included 0.05° 16-day Enhanced Vegetation Index (EVI) and NDVI data products from MODIS (MOD13C1v006) downloaded from the date closest to the 15th of each month for 2000–201658. Elevation was acquired from the Shuttle Radar Topography Mission using the ‘getData’ function in the ‘raster’ package59. Precipitation, PET, vapor pressure, daily mean temperature, and monthly average daily maximum and minimum temperature (Tmax, Tmin) at 0.5° spatial resolution were downloaded from the Climatic Research Unit (CRU)60,61. For a given month, the previous month’s precipitation from CRU was also included as a predictor. Day length was determined for the 15th or 16th of each month (corresponding to CRU dates) using the ‘daylength’ function in the ‘geosphere’ package and site latitude coordinates62. MAT and MAP were downloaded at 30 arc-second resolution from WorldClim using the ‘getData’ function in the ‘raster’ package59,63. All data products were aggregated to 0.5° spatial resolution using bilinear interpolation, and aligned to the same projection and extent using the ‘projectRaster’ function from the ‘raster’ package in R.
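As an illustration of the "date closest to the 15th of each month" selection, the following minimal Python sketch picks the nearest 16-day composite for a given month. It is not the authors' code (the analysis was done in R). The function names are hypothetical, and the sketch assumes composites are identified by their start dates, beginning on day-of-year 1 and repeating every 16 days within each calendar year.

```python
from datetime import date, timedelta


def composite_start_dates(year):
    """Start dates of 16-day composites, assumed to begin on day-of-year 1, 17, 33, ..."""
    first = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - first).days
    return [first + timedelta(days=offset) for offset in range(0, days_in_year, 16)]


def closest_to_mid_month(year, month, candidates=None):
    """Return the candidate date nearest to the 15th of the given month."""
    if candidates is None:
        candidates = composite_start_dates(year)
    target = date(year, month, 15)
    # min() keeps the earliest candidate if two dates are equally close
    return min(candidates, key=lambda d: abs((d - target).days))
```

For January 2010, for example, the candidates around mid-month start on 1 January and 17 January, and the function returns 17 January.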
To evaluate changing water regime effects on dryland system productivity, we used the SPEI. This multiscalar drought index is a good predictor of change in ecological responses to drought in drylands64,65. It accounts for the impacts of both supply- and demand-side limitations to carbon uptake (i.e., soil moisture and atmospheric vapor-pressure deficit) and also can be calculated to assess both intra- and interannual water deficits66. SPEI was calculated from data using the ‘SPEI’ package67 and included as a predictor from 1-month to 12-month timescales68,69. Time series of SPEI were calculated from time series of monthly precipitation and PET values using the function ‘spei’.
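The antecedent moisture predictors can be pictured with a short Python sketch. It builds the previous month's precipitation and the k-month accumulated climatic water balance (precipitation minus PET), which is the quantity that SPEI standardizes. This is a simplified illustration under stated assumptions: it does not perform the distribution fitting and standardization done by the ‘spei’ function, and the function names are hypothetical.

```python
def previous_month(values):
    """Shift a monthly series by one step; the first month has no antecedent value."""
    return [None] + list(values[:-1])


def accumulated_water_balance(precip, pet, window):
    """Sum of (precipitation - PET) over the trailing `window` months.

    Months without a full window of history are returned as None.
    This is the raw input to SPEI, not the standardized index itself.
    """
    balance = [p - e for p, e in zip(precip, pet)]
    out = []
    for i in range(len(balance)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(balance[i + 1 - window:i + 1]))
    return out
```

Calling `accumulated_water_balance` with `window` values from 1 to 12 gives one series per timescale, mirroring the 1-month to 12-month SPEI predictors described above.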
Gap-filled eddy covariance data for the North American Southwest sites were acquired from site PIs (see Supplementary Table 4 and Table 1 in Biederman et al. 2017 for site details). GPP was calculated from the Net Ecosystem Exchange values using the relationship between nighttime Net Ecosystem Exchange and temperature as described in Biederman et al. 201721. Global validation sites were obtained from the FLUXNET2015 dataset70; sites were selected based on MAP, MAT, geographic location, and data policy. Mean daily GPP values using the nighttime partitioning method with variable USTAR threshold71 were aggregated to monthly resolution for all selected dryland sites. The specific FLUXNET2015 column used for GPP values was ‘GPP_NT_VUT_MEAN’. A mask for global drylands was created based on an updated global drylands map from United Nations Environment World Monitoring Centre and is in accordance with the United Nations Convention to Combat Desertification definition of drylands72,73.
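The daily-to-monthly aggregation of the ‘GPP_NT_VUT_MEAN’ column can be sketched in Python as follows. This is an illustrative sketch, not the authors' R code. It assumes daily records with a ‘TIMESTAMP’ field formatted as YYYYMMDD and missing values flagged as -9999, and the function name is hypothetical.

```python
from collections import defaultdict

MISSING = -9999  # assumed flag for missing values in the daily files


def monthly_mean_gpp(rows, column="GPP_NT_VUT_MEAN"):
    """Average daily GPP by (year, month).

    `rows` is an iterable of dicts with a 'TIMESTAMP' string (YYYYMMDD)
    and a GPP column, for example as produced by csv.DictReader.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, day count]
    for row in rows:
        value = float(row[column])
        if value == MISSING:
            continue  # skip days without a valid GPP estimate
        stamp = row["TIMESTAMP"]
        key = (int(stamp[:4]), int(stamp[4:6]))
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in sorted(totals.items())}
```

Months in which every day is missing simply do not appear in the result, which keeps gaps explicit when the monthly series is compared with model output.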
Site-based random forest analysis, validation, and upscaling
The random forest model was trained using a random subset of 80% of the sites (19 sites; n = 1540 monthly observations), with 20% of the sites (n = 5; 366 monthly observations) held out for model testing74. The training method was repeated three times with 5-fold cross-validation. The minimum RMSE was used to select the optimal number of variables selected as candidates at each split (mtry) such that mtry = 10. The default number of trees (Ntree = 500) was used in model training. Variable selection was based on several factors intended to maximize both parsimony and model precision and accuracy. To avoid bias in importance metrics when there are highly correlated predictor variables, we assessed variable importance with conditional permutation importance metrics with the ‘varimp (conditional = TRUE)’ function in the ‘party’ package75,76,77. Importance metrics are measured as a drop in model accuracy when a specific variable is excluded from the model (the more model accuracy drops by excluding a variable, the more important that variable is in the prediction). With nonconditional importance metrics, if two predictor variables are highly correlated, removal of one variable would not result in a large decrease in model accuracy, and importance metrics for these variables could be underestimated. In contrast, conditional permutation metrics consider the correlation between variables when assessing importance and provide more accurate importance metrics76. Variables and variable importance for the final random forest model are shown in Supplementary Fig. 3. Z-scores were calculated to evaluate anomalous GPP estimates and precipitation values. Z-scores were calculated for each year according to the following equation:
$$Z_i = \frac{\bar{x}_i - \mu}{\sigma}$$
such that the mean across all months and years (2000–2015), $\mu$, is subtracted from the mean of all months for year ‘i’, $\bar{x}_i$, for precipitation (Supplementary Fig. 9) or GPP estimates (Fig. 3), and the difference is then divided by the standard deviation, $\sigma$, of all precipitation or GPP estimates included in the analysis.
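Two steps from this subsection lend themselves to a short Python sketch: holding out whole sites for testing, and computing annual Z-scores. The code is illustrative only and is not the authors' R implementation. The function names are hypothetical, the Z-score assumes the conventional sign (year mean minus long-term mean), and the use of the population standard deviation is an assumption.

```python
import random
from statistics import mean, pstdev


def split_sites(site_ids, test_fraction=0.2, seed=0):
    """Randomly hold out whole sites so no month from a test site is used in training."""
    sites = sorted(set(site_ids))
    n_test = round(len(sites) * test_fraction)
    rng = random.Random(seed)
    test = set(rng.sample(sites, n_test))
    train = [s for s in sites if s not in test]
    return train, sorted(test)


def annual_z_scores(monthly_by_year):
    """Map each year to (year mean - overall mean) / overall standard deviation.

    `monthly_by_year` maps year -> list of monthly values (GPP or precipitation).
    """
    all_values = [v for values in monthly_by_year.values() for v in values]
    mu = mean(all_values)
    sigma = pstdev(all_values)  # population SD; the sample SD is an equally plausible choice
    return {year: (mean(values) - mu) / sigma for year, values in monthly_by_year.items()}
```

With 24 site identifiers and the default 20% test fraction, `split_sites` returns 19 training sites and 5 testing sites, matching the split sizes reported above. Splitting by site rather than by observation keeps the test set spatially independent of the training data.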
For upscaling, random forest models were applied to global gridded satellite and meteorological inputs for masked dryland areas at 0.5° scale using the ‘predict’ function in the ‘raster’ package59. The ‘predict’ function applies a fitted model to each grid cell over a given spatial extent, using a stack of raster layers as inputs. In our case, the fitted model was the DryFlux random forest model trained in the Southwest and was applied over global drylands. The model had a training accuracy of R2 = 0.815; RMSE = 0.521 with mtry = 10. The testing accuracy of the model was R2 = 0.610 and RMSE = 0.876.
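The per-cell prediction step can be sketched in plain Python. This is a conceptual illustration of applying a fitted model to every grid cell inside a mask, not a reimplementation of the ‘predict’ function in the ‘raster’ package. The function name and the representation of rasters as nested lists are assumptions made for the example, and `model` stands in for any fitted model exposed as a callable.

```python
def predict_grid(model, layers, mask):
    """Apply `model` to every unmasked cell of a stack of equally sized 2-D grids.

    layers: dict of predictor name -> 2-D list of values
    mask:   2-D list of booleans, True where the cell is dryland
    Returns a 2-D list with None outside the mask.
    """
    names = list(layers)
    n_rows, n_cols = len(mask), len(mask[0])
    out = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if mask[r][c]:
                # Gather this cell's predictor values across all layers
                cell = {name: layers[name][r][c] for name in names}
                out[r][c] = model(cell)
    return out
```

Running this once per month, with that month's predictor layers, would yield a time series of gridded estimates restricted to the dryland mask.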
To evaluate DryFlux model predictions, we extracted the time series of GPP estimates from 2000 to 2016 of the DryFlux model for each of 24 flux sites and compared it to the eddy covariance tower GPP, MODIS GPP, and FLUXCOM GPP estimates. Then, 8-day 500 m MODIS GPP (MOD17A2H v006) values were extracted for all sites using Google Earth Engine and subset to dates included in the NDVI/EVI estimates78. Daily 0.5° resolution FLUXCOM GPP estimates using MODIS remote sensing and CRUJRA_v1 meteorological forcing data inputs (for consistency with the CRU datasets) were downloaded from the Data Portal of the Max Planck Institute for Biogeochemistry31,79,80. Daily FLUXCOM GPP estimates were used to calculate mid-month average values. Root mean square error (RMSE) values, correlation coefficients (r), coefficient of determination (R2), and RMSSD were used to evaluate model performance. RMSSD, a measure of variability in a time series41,81, is obtained by taking the square root of the average squared successive differences. RMSSD values were calculated with the ‘rmssd’ function in the ‘psych’ package82, and r and R2 values were calculated with the ‘stats’ package56. RMSE values were calculated with the ‘RMSE’ function in the ‘caret’ package57.
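The performance metrics named above have simple definitions, sketched here in Python for reference. These are textbook formulas written for illustration and are not the R functions used in the analysis. Treating R2 as the squared Pearson correlation is an assumption, and RMSSD is averaged over the n − 1 successive differences.

```python
from math import sqrt


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared(x, y):
    """Coefficient of determination, taken here as the squared correlation."""
    return pearson_r(x, y) ** 2


def rmssd(series):
    """Root mean square of successive differences of a time series."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))
```

RMSE and r compare a modeled series against observations, whereas RMSSD is computed on a single series. Comparing the RMSSD of modeled and observed GPP indicates whether a model reproduces the month-to-month "flashiness" of the observations.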
Data availability
MODIS Vegetation Indices were downloaded from NASA’s Land Processes Distributed Active Archive Center (LP DAAC) located at the USGS Earth Resources Observation and Science (EROS) Center and are available at https://lpdaac.usgs.gov/products/mod13c1v006/. Climatological data were downloaded from the Climatic Research Unit (University of East Anglia) and Met Office at https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.04/ and were further used to calculate SPEI and vapor-pressure deficit variables. MAT and MAP data from WorldClim (https://www.worldclim.org/) were downloaded using the ‘raster’ package in R. Elevation data from the Shuttle Radar Topography Mission (https://srtm.csi.cgiar.org/) were downloaded using the ‘raster’ package in R. The MODIS GPP data product was downloaded from NASA’s LP DAAC at the USGS EROS Center using Google Earth Engine and is available at https://lpdaac.usgs.gov/products/mod17a2hv006/. Daily FLUXCOM GPP estimates were downloaded from the Data Portal of the Max Planck Institute for Biogeochemistry and are available at http://www.fluxcom.org/. Daily flux tower GPP for global dryland testing sites is available at https://fluxnet.org/data/. Daily flux tower GPP for DryFlux training sites in the North American Southwest was acquired from site PIs, and these data are available at https://github.com/marthageb/DryFlux.
Code availability
Analysis was conducted using R version 4.0.2, and all code for the analysis and production of figures is available at https://doi.org/10.5281/zenodo.5540015 (ref. 83).
References
Ahlström, A. et al. The dominant role of semi-arid ecosystems in the trend and variability of the land CO2 sink. Science 348, 895–899 (2015).
Poulter, B. et al. Contribution of semi-arid ecosystems to interannual variability of the global carbon cycle. Nature 509, 600–603 (2014).
Smith, W. K. et al. Remote sensing of dryland ecosystem structure and function: progress, challenges, and opportunities. Remote Sens. Environ. 233, 111401 (2019).
Verma, M. et al. Remote sensing of annual terrestrial gross primary productivity from MODIS: an assessment using the FLUXNET La Thuile data set. Biogeosciences 11, 2185–2200 (2014).
MacBean, N. et al. Dynamic global vegetation models underestimate net CO2 flux mean and inter-annual variability in dryland ecosystems. Environ. Res. Lett. 16, 094023 (2021).
Wang, L., Manzoni, S., Ravi, S., Riveros-Iregui, D. & Caylor, K. Dynamic interactions of ecohydrological and biogeochemical processes in water-limited systems. Ecosphere 6, 1–27 (2015).
Oleson, K. W. et al. Technical Description of the Community Land Model (CLM). Technical Note NCAR/TN-461+ STR (NCAR, 2004).
Bonan, G. B. & Levis, S. Evaluating aspects of the community land and atmosphere models (CLM3 and CAM3) using a dynamic global vegetation model. J. Clim. 19, 2290–2301 (2006).
Brovkin, V., Ganopolski, A. & Svirezhev, Y. A continuous climate-vegetation classification for use in climate-biosphere studies. Ecol. Modell. 101, 251–261 (1997).
Foley, J. A. et al. An integrated biosphere model of land surface processes, terrestrial carbon balance, and vegetation dynamics. Global Biogeochem. Cycles 10, 603–628 (1996).
Haxeltine, A. & Prentice, I. C. BIOME3: an equilibrium terrestrial biosphere model based on ecophysiological constraints, resource availability, and competition among plant functional types. Global Biogeochem. Cycles 10, 693–709 (1996).
Sitch, S. The Role of Vegetation Dynamics in the Control of Atmospheric CO2 Content. Dissertation, Lund Univ. (2000).
Levis, S., Bonan, G. B., Vertenstein, M. & Oleson, K. W. The Community Land Model’s Dynamic Global Vegetation Model (CLM-DGVM): Technical Description and User’s Guide. NCAR Technical Note TN-459+ IA 50 (NCAR, 2004).
Woodward, F. I., Lomas, M. R. & Betts, R. A. Vegetation-climate feedbacks in a greenhouse world. Philos. Trans. R. Soc. Lond. B Biol. Sci. 353, 29–39 (1998).
Hickler, T., Prentice, I. C., Smith, B., Sykes, M. T. & Zaehle, S. Implementing plant hydraulic architecture within the LPJ Dynamic Global Vegetation Model. Glob. Ecol. Biogeogr. 15, 567–577 (2006).
Turner, D. P. et al. Evaluation of MODIS NPP and GPP products across multiple biomes. Remote Sens. Environ. 102, 282–292 (2006).
Loik, M. E., Breshears, D. D., Lauenroth, W. K. & Belnap, J. A multi-scale perspective of water pulses in dryland ecosystems: climatology and ecohydrology of the western USA. Oecologia 141, 269–281 (2004).
Austin, A. T. et al. Water pulses and biogeochemical cycles in arid and semiarid ecosystems. Oecologia 141, 221–235 (2004).
Biederman, J. A. et al. Terrestrial carbon balance in a drier world: the effects of water availability in southwestern North America. Glob. Change Biol. 22, 1867–1879 (2016).
Wilcox, B. P., Sorice, M. G. & Young, M. H. Dryland ecohydrology in the Anthropocene: taking stock of human–ecological interactions. Geogr. Compass 5, 112–127 (2011).
Biederman, J. A. et al. CO2 exchange and evapotranspiration across dryland ecosystems of southwestern North America. Glob. Change Biol. 23, 4204–4221 (2017).
Lauenroth, W. K. & Bradford, J. B. Ecohydrology of dry regions of the United States: precipitation pulses and intraseasonal drought. Ecohydrology 2, 173–181 (2009).
Schwinning, S. & Sala, O. E. Hierarchy of responses to resource pulses in arid and semi-arid ecosystems. Oecologia 141, 211–220 (2004).
Huxman, T. E. et al. Convergence across biomes to a common rain-use efficiency. Nature 429, 651–654 (2004).
Liu, Y., Kumar, M., Katul, G. G. & Porporato, A. Reduced resilience as an early warning signal of forest mortality. Nat. Clim. Change 9, 880–885 (2019).
Bradford, J. B., Schlaepfer, D. R., Lauenroth, W. K. & Palmquist, K. A. Robust ecological drought projections for drylands in the 21st century. Glob. Change Biol. 26, 3906–3919 (2020).
Dai, A. Increasing drought under global warming in observations and models. Nat. Clim. Change 3, 52–58 (2013).
Jung, M. et al. Global patterns of land-atmosphere fluxes of carbon dioxide, latent heat, and sensible heat derived from eddy covariance, satellite, and meteorological observations. J. Geophys. Res. 116, G00J07 (2011).
Jung, M. et al. Compensatory water effects link yearly global land CO2 sink changes to temperature. Nature 541, 516–520 (2017).
Xiao, J. et al. A continuous measure of gross primary production for the conterminous United States derived from MODIS and AmeriFlux data. Remote Sens. Environ. 114, 576–591 (2010).
Tramontana, G. et al. Predicting carbon dioxide and energy fluxes across global FLUXNET sites with regression algorithms. Biogeosciences 13, 4291–4313 (2016).
Joiner, J. & Yoshida, Y. Satellite-based reflectances capture large fraction of variability in global gross primary production (GPP) at weekly time scales. Agric. For. Meteorol. 291, 108092 (2020).
Aguiar, M. R. & Sala, O. E. Patch structure, dynamics and implications for the functioning of arid ecosystems. Trends Ecol. Evol. 14, 273–277 (1999).
Bacour, C. et al. Improving estimates of gross primary productivity by assimilating solar-induced fluorescence satellite retrievals in a terrestrial biosphere model using a process-based SIF model. J. Geophys. Res. Biogeosci. 124, 3281–3306 (2019).
MacBean, N. et al. Strong constraint on modelled global carbon uptake using solar-induced chlorophyll fluorescence data. Sci. Rep. 8, 1973 (2018).
Xiao, J. et al. Assessing net ecosystem carbon exchange of U.S. terrestrial ecosystems by integrating eddy covariance flux measurements and satellite observations. Agric. For. Meteorol. 151, 60–69 (2011).
Ropelewski, C. F. & Halpert, M. S. Global and regional scale precipitation patterns associated with the El Niño/Southern Oscillation. Mon. Wea. Rev. 115, 1606–1626 (1987).
Trenberth, K. E. The definition of El Niño. Bull. Amer. Meteor. Soc. 78, 2771–2778 (1997).
Boening, C., Willis, J. K., Landerer, F. W., Nerem, R. S. & Fasullo, J. The 2011 La Niña: so strong, the oceans fell. Geophys. Res. Lett. 39, L19602 (2012).
Kogan, F. & Guo, W. Strong 2015–2016 El Niño and implication to global ecosystems from space data. Int. J. Remote Sens. 38, 161–178 (2017).
Berntson, G. G., Lozano, D. L. & Chen, Y. J. Filter properties of root mean square successive difference (RMSSD) for heart rate. Psychophysiology 42, 246–252 (2005).
von Neumann, J., Kent, R. H., Bellinson, H. R. & Hart, B. I. The mean square successive difference. Ann. Math. Stat. 12, 153–162 (1941).
Jenerette, G. D., Barron-Gafford, G. A., Guswa, A. J., McDonnell, J. J. & Villegas, J. C. Organization of complexity in water limited ecohydrology. Ecohydrology 5, 184–199 (2012).
IPCC 2013. Climate Change 2013 - The Physical Science Basis. Working Group I Contribution to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (Cambridge University Press, 2014).
Breshears, D. D. et al. The critical amplifying role of increasing atmospheric moisture demand on tree mortality and associated regional die-off. Front. Plant Sci. 4, 266 (2013).
Novick, K. A. et al. The increasing importance of atmospheric demand for ecosystem water and carbon fluxes. Nat. Clim. Change 6, 1023–1027 (2016).
Allen, C. D., Breshears, D. D. & McDowell, N. G. On underestimation of global vulnerability to tree mortality and forest die-off from hotter drought in the Anthropocene. Ecosphere 6, art129 (2015).
Bonan, G. B. Forests and climate change: forcings, feedbacks, and the climate benefits of forests. Science 320, 1444–1449 (2008).
Cook, B. I., Ault, T. R. & Smerdon, J. E. Unprecedented 21st century drought risk in the American Southwest and Central Plains. Sci. Adv. 1, e1400082 (2015).
Huang, J., Yu, H., Dai, A., Wei, Y. & Kang, L. Drylands face potential threat under 2 °C global warming target. Nat. Clim. Change 7, 417–422 (2017).
Easterling, D. R. et al. Climate extremes: observations, modeling, and impacts. Science 289, 2068–2074 (2000).
MacDonald, G. M. Water, climate change, and sustainability in the Southwest. Proc. Natl Acad. Sci. USA 107, 21256–21262 (2010).
van Dijk, A. I. J. M. et al. The Millennium Drought in southeast Australia (2001–2009): natural and human causes and implications for water resources, ecosystems, economy, and society. Water Resour. Res. 49, 1040–1057 (2013).
Collier, N. et al. The International Land Model Benchmarking (ILAMB) System: design, theory, and implementation. J. Adv. Model. Earth Syst. 10, 2731–2754 (2018).
Liaw, A. & Wiener, M. Classification and regression by randomForest. R News 2, 18–22 (2002).
R Core Team. R: A language and environment for statistical computing R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/ (2021).
Kuhn, M. caret: Classification and regression training. R package version 6.0-88. https://CRAN.R-project.org/package=caret (2021).
Didan, K. MOD13C1 MODIS/Terra Vegetation Indices 16-Day L3 Global 0.05Deg CMG V006. NASA EOSDIS Land Processes DAAC. https://doi.org/10.5067/MODIS/MOD13C1.006 (NASA EOSDIS Land Processes DAAC, 2015).
Hijmans, R. J. raster: Geographic data analysis and modeling. R package version 3.5-2. https://CRAN.R-project.org/package=raster (2021).
Harris, I., Osborn, T. J., Jones, P. & Lister, D. Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset. Sci. Data 7, 109 (2020).
Climatic Research Unit (University of East Anglia) & Met Office. CRU TS Version 4.04. http://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.04/ (CRU, 2020).
Hijmans, R. J. geosphere: Spherical trigonometry. Package version 1.5-10. https://CRAN.R-project.org/package=geosphere (2019).
Fick, S. E. & Hijmans, R. J. WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 37, 4302–4315 (2017).
Barnes, M. L. et al. Vegetation productivity responds to sub-annual climate conditions across semiarid biomes. Ecosphere 7, n/a–n/a (2016).
Vicente-Serrano, S. M. et al. Performance of drought indices for ecological, agricultural, and hydrological applications. Earth Interact. 16, 1–27 (2012).
Vicente-Serrano, S. M., Beguería, S. & López-Moreno, J. I. A multiscalar drought index sensitive to global warming: the Standardized Precipitation Evapotranspiration Index. J. Climate 23, 1696–1718 (2010).
Beguería, S. & Vicente-Serrano, S. M. SPEI: Calculation of the Standardised Precipitation-Evapotranspiration Index. R package version 1.7. https://CRAN.R-project.org/package=SPEI (2017).
Beguería, S., Vicente-Serrano, S. M. & Angulo-Martínez, M. A Multiscalar Global Drought Dataset: the SPEI base: a new gridded product for the analysis of drought variability and impacts. Bull. Am. Meteorol. Soc. 91, 1351–1356 (2010).
Vicente-Serrano, S. M., Beguería, S., López-Moreno, J. I., Angulo, M. & El Kenawy, A. A New Global 0.5° Gridded Dataset (1901–2006) of a Multiscalar Drought Index: comparison with current drought index datasets based on the Palmer Drought Severity Index. J. Hydrometeorol. 11, 1033–1043 (2010).
Pastorello, G. et al. The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data. Sci. Data 7, 225 (2020).
Reichstein, M. et al. On the separation of net ecosystem exchange into assimilation and ecosystem respiration: review and improved algorithm. Glob. Change Biol. 11, 1424–1439 (2005).
Sörensen, L. A spatial analysis approach to the global delineation of dryland areas of relevance to the CBD Programme of Work on Dry and Subhumid Lands. Dataset based on spatial analysis between WWF terrestrial ecoregions (WWF-US, 2004) and aridity zones https://www.unep-wcmc.org/resources-and-data/a-spatial-analysis-approach-to-the-global-delineation-of-dryland-areas-of-relevance-to-the-cbd-programme-of-work-on-dry-and-subhumid-lands (CRU/UEA; UNE, 2007). Data accessed: 6/27/2021.
Miles, L. et al. A global overview of the conservation status of tropical dry forests. J. Biogeogr. 33, 491–505 (2006).
Freitag, D. Information Extraction from HTML: Application of a General Machine Learning Approach, 517–523 (AAAI/IAAI, 1998).
Hothorn, T., Bühlmann, P., Dudoit, S., Molinaro, A. & Van Der Laan, M. J. Survival ensembles. Biostatistics 7, 355–373 (2006).
Strobl, C., Boulesteix, A.-L., Kneib, T., Augustin, T. & Zeileis, A. Conditional variable importance for random forests. BMC Bioinformatics 9, 307 (2008).
Strobl, C., Boulesteix, A.-L., Zeileis, A. & Hothorn, T. Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 8, 25 (2007).
Running, S., Mu, Q. & Zhao, M. MOD17A2H MODIS/Terra Gross Primary Productivity 8-Day L4 Global 500m SIN Grid V006. https://doi.org/10.5067/MODIS/MOD17A2H.006 (NASA EOSDIS Land Processes DAAC, 2015).
Jung, M. et al. Scaling carbon fluxes from eddy covariance sites to globe: synthesis and evaluation of the FLUXCOM approach. Biogeosciences 17, 1343–1365 (2020).
Jung, M. et al. The FLUXCOM ensemble of global land-atmosphere energy fluxes. Sci. Data 6, 74 (2019).
von Neumann, J., Kent, R. H., Bellinson, H. R. & Hart, B. I. The mean square successive difference. Ann. Math. Stat. 12, 153–162 (1941).
Revelle, W. R. psych: Procedures for personality and psychological research. R package version 2.1.6. https://CRAN.R-project.org/package=psych (2021).
Farella, M. Code and data for ‘Improved dryland carbon flux predictions with explicit consideration of water–carbon coupling’. zenodo https://doi.org/10.5281/ZENODO.5540015 (2021).
Acknowledgements
Funding for some of the flux data collection and analysis in this study comes from The Office of Science, U.S. Department of Energy and the U.S. Department of Agriculture. DDB was supported by USDA National Institute of Food and Agriculture McIntire Stennis project 1016938 (ARZT-1390130-M12-222). The authors would like to acknowledge three anonymous reviewers whose suggestions greatly improved this manuscript.
Author information
Contributions
M.B. designed the model, computational framework, and prepared the manuscript. M.F. analyzed the data and produced the figures. D.B., D.M., and R.S. contributed early intellectual input to the project’s development and direction. G.P.C. contributed data processing and machine learning code. J.B. contributed processing and synthesis of eddy covariance data. N.M. contributed modeling text. M.F., J.B., G.P.C., N.M., D.B., D.D., and M.L. contributed intellectual input and edited the manuscript. All authors provided critical feedback and helped shape the research, analysis, and manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Communications Earth & Environment thanks the anonymous reviewers for their contribution to the peer review of this work. Primary handling editor: Clare Davis.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Barnes, M.L., Farella, M.M., Scott, R.L. et al. Improved dryland carbon flux predictions with explicit consideration of water-carbon coupling. Commun Earth Environ 2, 248 (2021). https://doi.org/10.1038/s43247-021-00308-2
DOI: https://doi.org/10.1038/s43247-021-00308-2