Robust observations of land-to-atmosphere feedbacks using the information flows of FLUXNET

Feedbacks between atmospheric processes like precipitation and land surface fluxes including evapotranspiration are difficult to observe, but critical for understanding the role of the land surface in the Earth System. To quantify global surface-atmosphere feedbacks we use results of a process network (PN) applied to 251 eddy covariance sites from the LaThuile database to train a neural network across the global terrestrial surface. There is a strong land–atmosphere coupling between latent (LE) and sensible heat flux (H) and precipitation (P) during summer months in temperate regions, and between H and P during winter, whereas tropical rainforests show little coupling seasonality. Savanna, shrubland, and other semi-arid ecosystems exhibit strong responses in their coupling behavior based on water availability. Feedback couplings from surface fluxes to P peaks at aridity (P/potential evapotranspiration ETp) values near unity, whereas coupling with respect to clouds, inferred from reduced global radiation, increases as P/ETp approaches zero. Spatial patterns in feedback coupling strength are related to climatic zone and biome type. Information flow statistics highlight hotspots of (1) persistent land–atmosphere coupling in sub-Saharan Africa, (2) boreal summer coupling in the central and southwestern US, Brazil, and the Congo basin and (3) in the southern Andes, South Africa and Australia during austral summer. Our data-driven approach to quantifying land atmosphere coupling strength that leverages the global FLUXNET database and information flow statistics provides a basis for verification of feedback interactions in general circulation models and for predicting locations where land cover change will feedback to climate or weather. “Big data” methods reveal robust hotspots of land–atmosphere coupling. Sparse observations and inadequate analytical tools have hindered our understanding of land–atmosphere feedbacks, including the exchange of energy, water, and CO2. Emergent methods such as machine learning, however, offer new opportunities, as Tobias Gerken from Montana State University, USA, and colleagues, demonstrate. A “big data” approach is adopted to characterise the spatial and temporal variability of land–atmosphere coupling without a priori assumptions: information flows are computed from 251 FLUXNET sites which are subsequently used to train a neural network. Distinct regional differences in the magnitude of land–atmosphere feedbacks are found, related to climatic zone and biome type; coupling in semi-arid ecosystems, for example, are strongly related to seasonal water availability. Complementing model studies with such empirical approaches may assist in quantifying climate change impacts on ecosystem services.


INTRODUCTION
The terrestrial land surface and atmosphere are coupled through a complex set of interactions and feedbacks that determine the fluxes of mass and energy between the two systems. Weather and climate are well known to determine the productivity of terrestrial ecosystems, but the functioning of the land surface can likewise modify weather and climate patterns. [1][2][3][4] In some cases, the influence of one of these systems can propagate through the other to influence itself, establishing a feedback. An understanding of land-atmosphere feedbacks is essential for determining the regional impacts of climate variability and change on the ecosystem services humanity has come to depend, but remains a major challenge as analytical tools to quantify feedbacks have only recently been developed. [4][5][6][7][8][9][10] Feedback processes in nature are difficult to directly observe and to infer, as cause and effect relationships may become obscured or break down when a process influences itself through an intermediary, 11 as is often the case in the Earth System. Feedback processes amplify or buffer inputs, resulting in exaggerated or muted responses to perturbations, the latter of which can be difficult to identify. The most severe uncertainties in our climate models are believed to feature feedback. [12][13][14] These challenges necessitate the development and application of novel methods to quantify feedback processes in the Earth System. This study presents direct observations of global land to atmosphere information flow through the use of a global network of surface energy flux and meteorological observations, introducing a statistical approach to characterize temporal and spatial variability in land-atmosphere coupling strength.
Energy exchange between the land surface and atmosphere provides a primary method of interactionand thereby feedbackbetween the two systems. Downwelling solar energy absorbed by the land surface warms the soil and vegetation and drives fluxes of sensible and latent heat between the land surface and atmosphere. These energy fluxes modify the composition of the atmospheric boundary-layer (ABL) and drive convective processes that deepen the ABL and result in entrainment of air from the free troposphere. [15][16][17][18][19] These changes then impact nearsurface temperature and humidity as well as precipitation processes, resulting in potential feedbacks through ecosystem physiological response to favor subsequent latent or sensible heat fluxes that in turn impact ABL processes. 20 Such feedback processes may intensify with future climate changes, 21 with the potential to impact critical functions such as water availability 22 and ecosystem resilience, 23 and to intensify phenomena such as heat waves, [24][25][26] drought, [27][28][29] and local convective precipitation. 30,31 Likewise, the impact of canopy photosynthesis and evapotranspiration on cloud developmentand the potential for future climate change to further these effects 32-34make understanding land-atmosphere feedback processes central to predictions of the future availability of ecosystem services.
These examples illustrate the complexity of the coupled land-atmosphere system. Disentangling the presence and strength of positive and negative feedbacks is an ongoing challenge in understanding how ecosystems and their management impact Earth System processes. Feedback is inherently nonlinear, and its study therefore calls for methods free from assumptions of linear proportionality, simple correlation, or isolated causes and effects. 4 Likewise, and despite the sophistication of global climate models, our current model-based assessments almost certainly miss key feedback processes or scales of interaction, because they represent hypotheses about poorly understood processes, and not real-world observations of those processes. To robustly measure feedback and critique these hypotheses, we therefore require a method for direct, in situ, and relatively assumption-free (nonlinear and empirical) observation of directional functional couplings.
Process Networks (PNs) characterize the state of a system as a pattern of flows of mass, energy, and/or information that correspond to key system functions. 11 Information flow statistics are a robust and mature method for delineating PNs, and have been previously applied to the direct and explicit measurement of feedback between the land surface and atmosphere using flux tower observations. [35][36][37][38][39] PNs have been shown to accurately diagnose interactions between turbulent fluxes and the atmosphere in ecohydrological systems, [35][36][37]40 and have accurately described functional differences between starkly diverse land surface ecosystems at continental scales. 35,41,42 This paper's choice of Transfer Entropy to delineate PNs 43 is ideal to measure directional, scale-specific, and nonlinear couplings that characterize land-to-atmosphere feedbacks.
To investigate feedback between land and atmosphere, we focus on relationships among land surface turbulent energy fluxes of sensible (H) and latent heat (LE) and three atmospheric variables: downward global shortwave radiation (R g ) as an indicator of cloud cover, air temperature (T a ), and precipitation (P). Analysis of these terms provides a core set of surface flux and atmospheric variables that link land and atmosphere through turbulent flux exchange. The strength of the process coupling is quantified through the information flow as given by the normalized transfer entropy (T′). T′(X → Y, τ) is a measure of the predictability of the time series Y from time series X at a time lag of τ. While characteristic τ for significant T′(X → Y, τ) depend on the variable pair, sub-daily timescales are believed to be the primary timescales for flux-based land-atmosphere feedbacks and the average T′ for τ from 0.5 to 18 h (T′ Avg ) is used here to capture the functional relationships for land-atmosphere interactions.
Statistically significant values of information flow and feedback (p < 0.05) are established using the method of shuffled surrogates where surrogate T′ values are calculated using randomly shuffled time series of X t and Y t to remove any correlation between variables. These surrogates are then compared to the observed T′. 11,44 The fraction of instances during which significant process coupling is observed at any τ (F Sig ) is calculated as where N Sig represents the total number of observations during which T′(X → Y, τ) is significant at any τ while N Tot is the total number of observations taken into account. When F Sig approaches 1, a coupling process is robustly significant, but when it approaches zero, the process is weak or absent. The recent development of regional networks of co-located meteorological and carbon dioxide, water, and energy flux measurements provides a new opportunity to assess land-atmosphere coupling across terrestrial biomes and climate space globally. Here we leverage the LaThuile FLUXNET database, 45 which provides globally distributed, to a degree standardized observations of land-atmosphere fluxes of energy and water using the eddy covariance technique spanning 251 sites ( Supplementary Fig. 1) and representing 11 major IGBP (International Geosphere-Biosphere Programme) vegetation classes (Supplementary Table 1) across a large range of aridities, and over 10000 site months. We use these data to calculate information flows from land surface fluxes to atmosphere (i.e. coupling strength) and then train an artificial neural network (ANN) of land-atmosphere coupling strength across the terrestrial surface. The use of observational "big data" represents a unique approach to characterizing temporal and spatial variability in land-atmosphere coupling and feedback without a priori assumptions about underlying processes that is capable of directly observing and resolving critical processes at the interface between surface, vegetation, and convective ABL. Our approach therefore complements large-scale climate models and reanalysis products, for which these processes and feedbacks currently remain parameterized due to their comparatively coarse resolution. Given that our information flow PN methodology quantifies the presence, strength, direction, and significance of land-toatmosphere coupling using a large sample of in situ flux tower observations, it can be used as in independent control for existing models and theory, in addition to providing unique insights into these difficult-to-observe processes. In this work, we focus on the land-to-atmosphere portion of land-atmosphere coupling, by investigating the directional information flow from H and LE to future states of atmospheric variables.
Several studies have identified global land-atmosphere coupling strength and associated "coupling hotspots" using climate models 5,21,46 and reanalysis data, 47,48 primarily focused on soil moisture-precipitation feedbacks. Here we examine these hypothesized feedback hotspots in the context of an empirical data-driven analysis, broadening coupling mechanisms to the specific surface fluxes (latent and sensible heat) that are directly measured through FLUXNET, and which are directly responsible for convection and changes in the near-surface atmosphere that impact ecosystem function. Given that PNs are a methodological distinct tool for the analysis of environmental data, they can serve as a validation tool for process-based models and more conventional observational analysis.

RESULTS
There is pronounced seasonality in the magnitude of land-toatmosphere coupling (Figs 1, 2) and seasonal patterns differ between the six coupling pairs considered in this work. Pairs (LE → T a ), (LE → P), and (H → T a ) exhibit low F sig during winter months in temperate regions compared to high coupling strength during summer, whereas the seasonality is reversed for (LE → R g ) and (H → R g ). The coupling process (H → P) shows significant coupling during both winter and summer. Tropical regions exhibit little seasonality, except for (H → R g ) and (LE → R g ). These are much stronger during June through August, which also broadly coincides with the dry season in Amazonia. As expected, the increase in feedback coupling between winter and summer within temperate regions also coincides with a strong increase in vegetation density and greenness, represented here as the Normalized Difference Vegetation Index (NDVI).
Spatial patterns in feedback coupling strength are related to IGBP biome type (Fig. 3). The lack of a clear annual cycle for (H → P), visible in Fig. 2, stems mainly from forest ecosystems (mixed forest, MF; deciduous broad leaf forest, DBF; evergreen needle leaf forest, ENF; evergreen broad leaf forest, EBF), which are abundant in FLUXNET, and closed shrubland (Fig. 3 increased (LE → R g ) and reduced (LE → P) during summer, and strong (LE → T a ) throughout the year for these generally semi-arid to arid ecosystems, highlight the importance of interactions between biome and prevailing climate in governing land to atmosphere coupling behavior. The amplitude in coupling strength for annual cycles approaches 0.8-1.0 for all couplings except (LE → R g ) for which the amplitude tends to be less than 0.5.
There is a great deal of variability in land-to-atmosphere feedback coupling strength between biomes, climate zones, and seasons. To aid interpretation, we plot the feedback coupling strengths described above against monthly values of T a and monthly aridity (P/ET p ) (Figs 4, 5). There is an increasing increasing trend of increasing feedback strength (T′ Avg ) with increasing monthly T a across all biomes. Land-atmosphere coupling is largely absent at T a < 0°C, which can be expected given that energy inputs and ecosystem activity are generally minimized under these conditions. Savanna and shrubland type ecosystems exhibit the lowest coupling originating from LE at high monthly temperatures (T a > 20°C), whereas the situation is reversed with high coupling originating from H when T a > 20°C.
The behavior of T′ Avg is more complex with respect to P/ET p . There is little relationship between T′ Avg (LE → T a ) and aridity, but a clear relationship emerges for T′ Avg (H → T a ), in which feedback originating from H increases with aridity for all vegetation types. The feedback coupling from surface fluxes to P peakes at P/ET p values near unity and WSA, OSH, and CSH exhibit the strongest feedbacks of all vegetation types in that range of P/ET p . For the coupling between surface fluxes and cloud cover as indicated by R g , we find that there is little feedback for P/ET p > 1. Savanna (SAV and WSA) and shrub (CSH and OSH) vegetation classes generally exhibit the highest feedback for T a and R g for low P/ET p . While we chose to present T′ Avg as the coupling metric in this work, there is considerable variation in coupling timescales between variable pairs ( Supplementary Figs 4-6), which in itself shows dependence on T and P/ET p and may be related to the timescales needed to effectively connect land-surface and atmospheric processes. For example, the dominant coupling timescales in the order of 6-12 h between surface fluxes and P or R g show substantial time-lags in the atmosphere's response to surface fluxes, which are consistent with timescales typically found in convective boundary layers.
The extrapolation of observed feedback strength (T′ Avg ) from FLUXNET sites to the global map reveals several hotspots of land-atmosphere coupling that stand out from global average feedback strength ( Fig. 6; see Supplementary Figs 7-12 for monthly data). The ANN models had R 2 values of 0.69 to 0.92 for (H → P) and (H → R g ), respectively (Supplementary Table 2), with no evidence of overfitting, so the model's extrapolation is robust when tested against the 251 FLUXNET sites and over 10,000 observed site-months. We find strong land-to-atmosphere feedback in sub-Saharan Africa (LE → T a and LE → P), the central and southwestern US during summer (H → R g and H → T a ), the southern Andes, South Africa and Australia during DJF (H → R g ), Fig. 1 The fraction of significant surface to atmosphere interaction as determined by a process network (PN) with respect to latent heat flux (LE), F Sig (blue colorscale), for 251 sites of the FLUXNET LaThuile dataset and calculated as average across DJF (northern hemisphere winter) (a, c, e) and JJA (northern hemisphere summer) (b, d, f). Larger circles indicate more years of tower data considered for calculation of F Sig . The background image is DJF and JJA NDVI from MODIS for 2016 (dark colorscale). When F Sig approaches 1, a significant coupling is present, and when it approaches 0 the coupling is weak or absent Amazonia (LE → P and LE → R g ), agricultural areas in eastern Brazil (H → T a and H → R g ), the African Rift Valley (LE → T a and H → R g ) as well the Congo, where strong coupling persists throughout the year, but switches from (LE → R g ) in DJF to (H → R g and LE → T a ) in JJA. This plot is thematically similar to the soil moisture based results from Koster et al. 5

DISCUSSION
PNs and other empirical methods based on information theory applied to environmental "big data" provide a wealth of information about land-atmosphere coupling. Specifically, PNs provide information about functional relationships between ecosystem variables that can be used to investigate processes such as land-atmosphere coupling and feedbacks as well as their response to environmental change. Using an ANN to extrapolate these couplings to the global scale, we identified several hotspots of land-atmosphere coupling (Fig. 6). Monthly data are presented in Supplementary Figs 7-12. Unlike previous studies e.g. 5,46 which used process-based models, the ANN is based on empirical extrapolation of observations and does not include a priori assumptions about functional relationships to demonstrate the existence of feedbacks. It can therefore be used to complement global models, which require (i) process relationships to be known and (ii) may require parameterizations to include processes that are under-resolved due to their global nature.
We investigated six couplings between turbulent fluxes and atmospheric/near surface properties by taking advantage of databases that incorporate observations of a wide range of surface meteorology and fluxes. Couplings of H and LE to P and R g are directly related to the hydrologic cycle, in contrast to the coupling with temperature, which is more related to near surface conditions and cover type. The ANN trained on PN results identifies feedback hotspots in the southwestern and central US similar to, 5 but does not reproduce the hotspot on the Indian subcontinent. However, for the southern African hotspot we find that the coupling signal is strongest for H, LE, and T a rather than precipitation, and is more pronounced in DJF. For the US hotspot, we find a stronger signal for H, LE, and R g rather than P. The ANN also detects the hotspots in the Congo Basin, South Africa, Australia and to some extent Brazil (for H to R g and T a ), in agreement with Notaro and Zeng et al. 46,48 Similarly, several regional studies highlighted the strong coupling between surface and air temperatures for semi-arid regions in the US and Europe, 6,49 which is reflected in the PN results for the southwest US and to some extent for the Iberian peninsula. Compared to previous studies, we find a stronger coupling of LE to R g and P in Amazonia, further highlighting the importance of tropical rainforest function for cloud development and regional precipitation. 50 We find R g to exhibit much clearer land to atmosphere coupling than P, which can be expected given that not all clouds produce precipitation. The reduced coupling could also indicate that models are overly sensitive with respect to their precipitation response or that the PN has problems detecting feedback in P due to the sparseness of precipitation events. While the latter cannot be excluded, global and even regional models rely on cumulus parameterizations for precipitation generation, which have wellknown difficulties in producing realistic precipitation. 51,52  Extrapolation of empirical PN results to the global scale shows two distinct advantages compared to global scale modeling approaches. As a statistical method, global results at a high resolution (e.g. 0.25°) are computationally cheaper than running an Earth System Model, while also providing detailed information on land-atmosphere coupling on spatial and seasonal scales. Also, through considering multiple land-atmosphere feedback pathways, PNs are capable of providing information that can be used to improve process-level understanding of feedbacks not accessible in more complex models. At the same time, data-driven approaches such as PNs and ANNs are not constrained by physically realistic limitations and cannot prove cause-effect relationships. This should not be considered a limitation but as a feature. Combined with domain expertise, data-driven methods can be very useful in guiding research toward regions and processes that merit further scientific attention.
The PN and ANN reveal that dryland ecosystems exhibited the strongest ecosystem-atmosphere feedback due to variability in available water (Figs 3-6). We find the highest couplings between surface fluxes and precipitation at P/ET p~1 , highlighting the importance of sufficient water supply and soil moisture in controlling land-atmosphere interactions. 53,54 Interestingly, for savannas, high monthly mean temperatures (T a > 20°C) are associated with low T′ Avg (LE → P), indicating the water limited state of these systems during the dry season and the associated absence of coupling. Similarly transition periods between wet and dry seasons and monsoon circulations are important for soil moisture-precipitation coupling. 47,49 Vegetation response to water limitations occurs on a continuum from isohydric (plants closely regulate transpiration through stomatal conductance in response to atmospheric vapor pressure deficit) to anisohydric (plants have little regulation of stomatal conductance). From these species-level traits, ecosystem-level drought responses emerge. 28,55 Grasses, which were thought to be mostly anisohydric, often exhibit isohydric behavior in semi-arid environments, [56][57][58] supporting the notion that semi-arid grasslands can exhibit substantial feedbacks with the atmosphere. The resulting interplay between vegetation, surface-energy flux partitioning and atmospheric control also influences the development of local convection, which can be an important ecosystem moisture source. 31,[59][60][61][62] Substantial feedbacks between biosphere and precipitation were recently reported for semi-arid and monsoonal regions, 63 highlighting the need of an accurate representation of the biosphere's response to temperature, radiation, and water availability for predicting hydrometeorological and climatological feedbacks.
The strong coupling between turbulent fluxes and P for semiarid systems (i.e. savannas, Figs 5, 6) is particularly interesting in the light of their pronounced seasonality. Given the fact that the analysis covers monthly system state, and precipitation inputs are highly pulsed, intermediate P/ET p might correspond to rapidly changing moisture supplies at the surface that elicit responses in the land-atmosphere system. The increase in coupling between LE and H to T a and R g for small P/ET p further highlights the importance of convective processes that impact ABL growth and present multiple avenues for feedbacks mediated by the surface-ABL system. [15][16][17][18][19] PNs applied across aridity gradients can be used to better understand potential changes to land-atmosphere interactions and ecosystem functioning across temporal and spatial scales. Given that semi-arid ecosystems are critical to the carbon cycle and climate, [64][65][66] and are likely to expand 67 and deteriorate 68 under climate change, the ability of PNs to quantify their coupling to the atmosphere is of particular importance. Additionally, projected changes in aridity are expected to exhibit complex changes across the globe, 69,70 increasing the uncertainty for land-atmosphere interactions and feebacks.
This study is not without limitations related to data availability and uncertainty. This study relies on near surface observations as a proxy for land-atmosphere coupling rather than direct observations of boundary-layer processes that mediate these couplings and feedbacks due to a lack of continuous and spatially distributed ABL observations, which needs to be addressed by H and LE flux dynamics are closely coupled through the surface energy balance but observed couplings of H and LE diverge (e.g. there is significant coupling between H and T a , in a given region but not for LE and T a or vice versa). This behavior of the PN is likely due to the fact that despite their correlation, H and LE are rarely of the same magnitude. The PN is implicitly sensitive to the absolute magnitude and the time rates of change in the time series and thus acts as a low-pass filter on information flow from flux variations (see Supplemental Note 1 for additional details).
Despite its limitations, given the PNs good agreement with previous studies in diagnosing feedback hotspots and its coupling response with respect to T a and P/ET p (Figs 4, 5), which is in line with ecohydrological expectations, 53 we have confidence that observed information flow from the growing body of environmental big datathrough networks such as FLUXNET can be used to provide unique insights on land-atmosphere feedbacks from an empirical perspective and can serve as independent empirical verification for process-based climate models, potentially driving progress in the improvement of climate models toward the representation of critical processes for projecting land-atmosphere interactions and feedbacks.
In conclusion, we demonstrate that PN results can be used as independent validation for process-based models based on observed information flows between the land surface and atmosphere. As hypothesized by prior models and research, savanna, shrubland, and other semi-arid ecosystems exhibit a strong response in their atmospheric feedback behavior based on seasonal water availability and aridity. Information flow from surface to atmosphere for other variables exhibited seasonal variability with the exception of tropical rainforests and was a strong function of air temperature. In the light of dryland expansion their vulnerability to climate change, this might strongly impact land-atmosphere coupling including important precipitation processes.

METHODS Observations
Observed variables were obtained from the FLUXNET LaThuile synthesis dataset, which encompasses data from 251 sites representing nearly 1000 site years at a temporal resolution of 0.5 h. To ensure data quality, we (i) used only original data and gap-filled data of high quality, as indicated by the data-quality flag provided by FLUXNET; (ii) excluded site years with less than 50% available data; (iii) excluded outliers that were identified by exceeding six standard deviations compared to detrended data which had the diurnal cycle removed using a periodic anomaly (except for P); and (iv) excluded site-months with <500 observations (out of~1400 possible per month). This resulted 10398 site-months being used for this analysis. Monthly T′ was then calculated across sites that represent 11 major IGBP vegetation classes (Supplementary Table 1 75 Transfer entropy (T) was calculated as 11,43 T X t ! Y t ; τ ð Þ¼ X yt ; ytÀ1;xtÀτ p y t ; y tÀ1; ; x tÀτ À Á log p y t j y tÀ1 ; x tÀτ ð Þ ð Þ p y t jy tÀ1 ð Þ ; where the predictability of time series Y t based on knowledge of time series X t at time lag τ is calculated using y t-1 as the immediate history of Y t and x t-τ as the history of X t at τ. The p denotes the corresponding probability density functions. T is bounded between 0 and log(m), where m is the number of discrete microstates y taken by variable Y t . We normalize T to a unit-less fraction by division with its upper limit [log(m)], yielding the normalized transfer entropy (T′). T′(X → Y, τ) was calculated for 0.5 h increments of τ from 0.5 to 18 h and then averaged across all 36 increments to yield T′ Avg (Supplementary Figs 4, 5 present additional information on the underlying significant timescales of T′ Avg ).
In order to achieve a balance between entropy estimation accuracy and limited observations in the numerical estimation of p(y), m = 20 was used dividing H and LE into 20 bins (referred to as microstates in information theory nomenclature) of equal width. Note that these microstates do not have a physical significance, but serve as the basis for determining the underlying relationship between X and Y. T(X → Y, τ) measures additional information that is provided by knowledge of X at time-lag τ in addition to information provided by the history of Y itself. It is a statistical index for physically causal and directional coupling (not correlation), albeit with limitations. The reader is also referred to previous works 11,44 for details on PN calculation methodology.

Artificial neural network (ANN)
To extrapolate results from site level to the entire land surface, artificial neural networks (three-layer feed forward) were trained for each flux coupling. The ANN was trained to extrapolate T′ Avg values from FLUXNET using gridded data at 0.25°resolution. ANN training inputs were monthly T a , R g , P, ET p , the enhanced vegetation index (EVI), IGBP class, elevation, and absolute latitude. ANN outputs were monthly T′ Avg values for each coupling at 0.25°resolution using the auxiliary datasets described at the end of the methods section.
ANNs have been widely used for climate change and ecosystem research and, given their skill in dealing with noisy and unbalanced datasets, 76 they are well suited for PN research as they do not require geographically well distributed training sites and are robust to uneven distribution in IGBP classes or climates in the training dataset. To minimize overfitting, which ANNs are sensitive to, we employed Bayesian regularization backpropagation as the training function in a three-layer feed-forward ANN. This improves generalization for small and noisy datasets. 77 We divided the training dataset randomly into training (70%) and test (30%) datasets and the performance of the ANN was evaluated on the test set using the Pearson's R coefficient (Supplementary Table 2). The ANN analysis was performed with the Neural Network Toolbox in Matlab2014b.
Auxiliary data for ANN extrapolation In addition to FLUXNET site level data (T a , R g , P, and ET p ), the ANN was trained using IGBP (provided by FLUXNET), elevation above sea level, and the EVI. For elevation, we used data provided by FLUXNET, if present. In the absence of site provided data or if only an approximate height was provided, United States Geological Survey (USGS) Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010) was substituted using a nearest neighbor approach. EVI from the Moderate Resolution Imaging Spectrometer (MODIS) Monthly L3 Global V006 (Terra: MOD13C2; Aqua: MYD13C2) was assigned to the sites using the following procedure: 1. If both MODIS Terra and Aqua 0.05-degree data was available for the given month and year, their mean was used applying a nearest neighbor approach. 2. If either MODIS Terra or Aqua were available, the available data was used. 3. If neither were present (e.g. for site months before the Terra and Aqua launch dates) the long-term mean for Terra and Aqua for that location and month were used. 4. The value was set to missing, if no acceptable EVI values could be calculated (e.g. snow and ice cover during winter for all years). Information about the gridded datasets used for the ANN can be found in Supplementary Table 3. The IGBP landcover was assigned using the dominant land cover class (as percent cover within the 0.25°× 0.25°grid cell) from the MODIS MCD12C1 product. Data availability for EVI is shown in Supplementary Fig. 13. Global monthly mean meteorological data (T a , R g , P, and ET p ) for ANN extrapolation are obtained from GLDAS (Global Land Surface Data Assimilation System) V2.1 at 0.25°resolution. We use the 30-year climatological average (1981-2010) for ANN extrapolation. On  Figure 5, but for T′ Avg of H to T a , P, and R g . The normalized standard deviation for each bin is presented in Supplementary Fig. 3 overview about total data availability for training of the ANN is given in Supplementary Fig. 14.

DATA AVAILABILITY
PN and ANN results are available from the corresponding author upon reasonable request.

CODE AVAILABILITY
The Process Network code is published as: ProcessNetwork/ProcessNetwork_Software -File Exchange -MATLAB Central Available at: http://www.mathworks.com/ matlabcentral/fileexchange/41515-processnetwork-processnetwork-software Received: 15 January 2019; Accepted: 19 September 2019; Fig. 6 Average estimated transfer entropy (T′ Avg , red colorscale) of latent (LE) and sensible heat (H) to air temperature (T a ), precipitation (P), and global radiation (R g ) for northern hemisphere winter (DJF) (a, c, e, g, j, k) and summer (JJA) (b, d, f, h, j, l) extrapolated from FLUXNET sites to the global map using an artificial neural network (ANN). Gray and black areas indicate lack of input data and negative (non-physical) T′ Avgvalues, respectively. Dark and red areas indicate strong land to atmosphere feedback associated with the indicated land surface flux process