Hierarchical climate-driven dynamics of the active channel length in temporary streams

Looking across a landscape, river networks appear deceptively static. However, flowing streams expand and contract following ever-changing hydrological conditions of the surrounding environment. Despite the ecological and biogeochemical value of rivers with discontinuous flow, deciphering the temporary nature of streams and quantifying their extent remains challenging. Using a unique observational dataset spanning diverse geomorphoclimatic settings, we demonstrate the existence of a general hierarchical structuring of river network dynamics. Specifically, temporary stream activation follows a fixed and repeatable sequence, in which the least persistent sections activate only when the most persistent ones are already flowing. This hierarchical phenomenon not only facilitates monitoring activities, but enables the development of a general mathematical framework that elucidates how climate drives temporal variations in the active stream length. As the climate gets drier, the average fraction of the flowing network decreases while its relative variability increases. Our study provides a novel conceptual basis for characterizing temporary streams and quantifying their ecological and biogeochemical impacts.

data has been performed during the whole survey period (V 1 ) and then focussing only on the summer and fall seasons, that represent the period during which the network is more dynamic (July-Nov 2018, V 2 ). For more detailed information about the Valfredda catchment the reader is referred to [28].
Turbolo -The Turbolo creek is a small river located in southern Italy, where the climate is Mediterranean with hot summers; its drainage area is approximately 7 km 2 at the outlet of Fitterizzi, with elevations ranging between 183 and 1005 ma.s.l.. The empirical data used in this study refers to two subbasins in the northern part of the catchment, with contributing areas of 0.67 and 0.48 km 2 (catchments T 1 and T 3 , respectively). A third catchment (T 2 ), nested into T 1 , has also been identified to capture the effects of the internal heterogeneity of geology and land cover in the study area. The T 2 catchment drains an area of 0.23 km 2 only. The Mediterranean climate of the area consists in hot and dry summers with wet, mild winters. The annual precipitation is approximately 1200 mm, but due to high evapotranspiration rates a complete dry-down of the network is observed during the summer. The T catchments are dominated by silty marly clays with low permeability, that generate (jointly with the aridity of climate) streams with very low persistency. The geomorphic drainage densities in this site range between 2.8 and 5.6 km 2 . The upper part of T 1 is composed by sandy-conglomerate formations, in which the higher permeability produce a relative increase in the persistency of the stream network. The catchments T 1 and T 2 have been monitored for 35 times from April 2019 to January 2020, while T 3 was monitored for 42 times from May 2019 to January 2020.
Hubbard Brook -Three US catchments considered in this study pertain to the Hubbard Brook Experimental Forest of New Hampshire (New England). In the paper, they have been identified as H 1 , H 2 and H 3 , and their contributing areas are 0.14, 0.25 and 0.42 km 2 , respectively. The mean altitude of these catchments ranges between 630 and 740 ma.s.l., with average slopes of 28%. The stream networks of these catchments are the densest of our dataset, with geomorphic drainage densities of 12.5, 14.5 and 8.9 km −1 , respectively. The geology of the sites is mainly composed of shists and granulites, covered by a mantle of sandy loams. The humid continental climate of the area provides a mean annual precipitation of 1400 mm, with cold snowy winters and mild summers. For each catchment, 7 visual monitoring of the active network have been carried out in June and July 2015, spanning a wide range of hydroclimatic conditions. More information on this catchments is available in [27].
Fernow -The Fernow Experimental Forest is located in the Appalachian Plateau of West Virginia (USA). In this site, three study catchments (F 1 , F 2 , F 3 ) have been selected, with drainage areas of 0.14, 0.16 and 0.37 km 2 , respectively. These catchments have a mean altitude of approximately 800 ma.s.l., with slopes around 30%. Field observations revealed geomorphic drainage densities of 2.9 km −1 with local persistencies varying in the full range between 0 and 1. The bedrock is composed of shales and sandstones, with a soil cover consisting mainly of silt loams. The average annual precipitation is approximately 1450 mm, with rainy summers and mite winters. Snowfalls are common but usually limited and isolated. From May to December 2016 a total of 7 field surveys per catchment were carried out. More information on this study site can be found in [27].
Potts and Poverty -Three study catchments are located in the Valley and Ridge physiographic provinces of Virginia (USA). Two of them (P 1 and P 2 ) are part of the Poverty Creek, while the third (P 3 ) belongs to the South Fork Potts Creek, 25 km northern of P 1 and P 2 . These catchments have a contributing area of 0.25, 0.35 and 0.73 km 2 , respectively. P 1 and P 2 have and average elevation of 750 ma.s.l., a mean slope of 33% and a geomorphic drainage density of 7 km −1 . The Potts Creek catchment, instead, has a higher elevation (1030 ma.s.l.) and gentler slopes (27%). Its geomorphic drainage density is 2.9 km −1 . Bedrocks are mainly composed by standstones in all the catchments, whereas shales and siltstones can be found in the P 1 and P 2 sites. The average annual precipitation in the area is close to 1000 mm, remarkably lower than all the other USA catchments considered in this study (and comparable with the T sites in southern Italy). Unlike the Turbolo Creek, however, the drainage networks do not experience summer dry down due to the lower evapotranspiration rates. The active network in these catchment has been monitored by visual inspection 7 times between September 2015 and March 2016. More information about these study sites can be found in [27].
Coweeta -The C 1 , C 2 and C 3 catchments are located in the Coweeta Hydrologic Laboratory, North Carolina (USA). They are the highest and steepest USA sites considered in this study, with mean elevations between 800 and 1100 ma.s.l. and slopes of about 50%. These sites have areas of 0.12, 0.33 and 0.40 km 2 , respectively, and drainage densities ranging between 3.6 and 6.2 km −1 . The catchments lie on the Southern Blue Ridge Mountains, which mainly consist of exposed metamorphic rock (biotite gneiss and amphibolite). The climate is generally warm and humid, with temperatures above 0°C all year round. The precipitation is greatest in winter and early spring, with an average annual precipitation of about 1800 mm. The field campaign for detecting the active stream network in these sites were carried out from November 2015 to December 2016, and consisted in 7 field surveys for each catchment. More information about these study sites can be found in [27].
Attert -The Attert catchment (cathcment code A) covers an area of 247 km 2 between Luxemburg and Belgium. The altitude ranges between 245 and 550 ma.s.l. and the geology is dominated by slate, marls and sandstone, while lithology is driven by land use (mainly forest and agricolture). The climate is characterized by wet winters and mild summers, with an average annual precipitation of 900 mm uniformly distributed troughout the year (58). Different types of sensors were employed in this catchment (time-lapse photography, conductivity sensors and water level gauging stations) to monitor flow presence in a set of 182 nodes along the network. These sensors recorded water presence at a 30-minute time interval from 2013 to 2017. The sparse arrangement of the sensors along the network didn't allow a proper calculation of the active length. Therefore this data was only used to validate the hypothesis of hierarchical activation of the stretches. Detailed information about this catchment can be found in [56].
Seugne -The Seugne river is located in the Charente-Maritime département in west France. The catchment (hereafter coded as S) has a contributing area of 920 km 2 , with elevations that span between 8 and 160 ma.s.l.. The geology is composed of limestones, carbonates, sedimentary rocks and sand. The climate is Marine West Coast. The active network was monitored in the catchment as part of the Onde campaign. Visual inspection observations were employed determine water presence in a set of 40 predetermined nodes, at a monthly interval from 2012 to 2020. As for the Attert catchment, this data was only used for validating the hierarchical hypothesis because the sparse monitoring points did not allow the calculation of the active length. More information about the field campaign can be found in (59).   Figure S1 shows the frequency distribution of the local persistency of the nodes within each study catchment (histograms). The plots also show as a solid line the fitted Beta distribution (see equation (S11)). Furthermore, the corresponding spatial maps of local persistency are represented in Figure S2.  Figure S1: Probability Density Function of the local persistency P i of the study catchments.  Figure S3 summarizes the main climatic and hydrological characteristics of the study catchments reported in Table S2. The figures indicate that the study catchments cover a wide range of climatic conditions in terms of mean precipitation, evapotranspiration, aridity, temperature, catchment area and elevation. Therefore, the dataset can be considered as representative of a broad range of climatic and morphologic settings. Figure 2c in the main text

Methods underlying
The effective precipitation P t − ET is used in this paper to quantify the specific (per unit catchment area) catchment water availability. P t − ET is calculated as the difference between the total precipitation and the potential evapotranspiration temporally averaged over a suitable reference period. Specifically, the reference period consists of the months during which the field surveys for the network monitoring have been carried out. If a field survey was performed in the first 8 days of the month, the month before the first survey of the campaign has been included in the computation of P t − ET . For the Italian catchments, climatic data necessary for the calculation of P t − ET were retrieved from the nearest Regional Environmental Protection Agency weather station (the Falcade station of ARPAV for the V catchment and the Fitterizzi station of ARPACAL fot the T catchment). Precipitation and streamflow data in the H, F and C catchments, instead, were provided by the Hubbard Brook Experimental Forest (60), the Fernow Experimental Forest (61) and the Coweeta Hydrologic Laboratory (62) respectively. Data from the NOAA meterologic station US1VAMN0025 were also used for the P 1 and P 2 catchments. No precipitation data were available for P 3 instead. Monthly potential evapotranspiration data for all the US catchments were provided by USGS (63). This data were estimated at a 1 km spatial scale using the SSEBop model, starting from climatic and morphological data (64). Instead, the Penman-Monteith equation was used to estimate the potential evapotranspiration in the T and V catchments, based on available climatic data (65).
Calculation of τ m The interarrival between any pair of field surveys is here defined as the time elapsed between the dates of the corresponding surveys. For each study catchment, the interarrival between each couple of survey dates has been calculated (including all the couples of non-consecutive surveys). Then, the median interarrival, τ m , has been computed to quantify to what extent a field campaign was biased by the presence of several surveys concentrated in a relatively short sub-period of the whole survey period (shown in Table S1). The median interarrival was chosen instead of the average interarrival to reduce the effect of field survey that were carried out many months later than all the other surveys. It was found that if τ m is too small, the surveys reflect the hydrological conditions observed during the months in which most of the surveys were performed and not the average climatic conditions over the whole study period. Considering that the mean frequency of the surveys is of the order of one survey per month, in this study it was assumed that τ m must be greater than > 15 days to guarantee that the active lengths observed in correspondence of the field surveys can be properly linked to the average climatic conditions observed within all the months during which at least one survey was performed. The calculated values of τ m for all the study catchments are reported in Table S1.

Performance of the hierarchical model
The accuracy of the hierarchical model on each of the study networks is calculated by means of equation (11) of the methods. Figure S5 shows the maps of the local accuracy of the hierarchical model for each study catchment. The Figures indicate that the local accuracy is close to 100% in most cases. The average accuracy within each study catchment is reported in Table 1. In the framework presented in this paper, the local persistency is calculated from empirical observations. Given that node persistency dictates the fraction of time for which a node is active, one could expect that a model in which each node is randomly activated/deactivated through time in a way that is consistent with the local persistency could have a positive accuracy. In fact, the accuracy of such a random model for a node with persistency P i is P 2 i + (1 − P i ) 2 , leading to a minimum accuracy of 50%. The accuracy of the hierarchical model was thus compared to the accuracy of the random model for all the study catchments. Figure S6 show the maps of local accuracy that would be obtained from the random model. Figure S4 shows that the mean accuracy of the hierarchical model (99±1.7%) is substantially higher than the accurcay of the random model (79 ± 7.5%).  Figure S4: Accuracy of the hierarchical and random models as a function of the mean network persistency.

Drainage density statistics
The data underlying Figures 2 and 3 of the main text are reported in Table S4, which reports the relevant drainage density statistics for the study catchments.
The table also provides information on how the catchments have been grouped in the box and whiskers plots of Figure S3.

Correlation between mean network persistency and catchment area
Our analyses indicated that mean network persistency and catchment area are poorly correlated, with a correlation coefficient of 0.06. The scatter plot ofP vs. A is shown in Figure S7.

Analytical derivations
Average drainage density A dynamical stream network is here described by a set of N nodes. Each node identifies the hydrological status (active vs. dry) of a suitably identified portion of stream network with length ∆l i . At any given time, the node i takes a binary status X i (0 = dry / 1 = active). Therefore, at any time the active steam length can be calculated as the sum of the lengths associated to each active node, The local persistency of node i, P i , defined as the probability for the node i to be observed as active, can be calculated as the temporal average of the observed state of the node exploiting a series of field surveys of the active network (i.e. the average of X i , sampled at different times).
Likewise, the average stream length can also be calculated as: whereP is the mean network persistency, L g = N i=1 ∆l i is the geomorphic length (corresponding to the length of the network when all the nodes are active), and the overline denotes time averaging. Equation (S2) shows that the definition of the maximum length of the network is a crucial step for the calculation of the mean persistency: in fact, for a given mean network length, the mean persistency is inversely proportional to L max . Equation (1) of the main text is obtained by dividing both sides of equation (S2) by the catchment area.
General expression for the coefficient of variation of the active drainage density The local status of each node, X i , can be seen as a Bernoulli binary random variable. Accordingly, the local persistency is the expected value of X i , while the variance of X i can be expressed as: V ar(X i ) = P i (1 − P i ). Let us now consider a generic couple of nodes i and j with local persistencies P i and P j , respectively. One can identify four joint probabilities of the states of these nodes as follows (66): The above joint probabilities obey to three constraints: P ij 11 + P ij 10 = P i , P ij 01 + P ij 11 = P j and P ij 11 + P ij 10 + P ij 01 + P ij 00 = 1. Incorporating these constraints into equation (S3) allows for the calculation of all the joint probabilities based on the local persistencies (P i and P j ) and one joint probability (e.g. P ij 11 ). In the general case, the covariance of the states of two nodes can be written as: Due to the symmetry of the covariance (Cov(X i , X j ) = Cov(X j , X i )), there is no loss of generality in assuming P i ≥ P j . Introducing the hypothesis of hierarchical activation of the nodes (66) is equivalent to imposing P ij 01 = 0 (i.e. the less persistent node j cannot be active if the more persistent node i is dry), or P ij 11 = P j . As a consequence, the covariance between the states of two nodes reads: This expression is only dependent on the persistencies of the two nodes. Let us now consider a stream network with N nodes numbered in a decreasing order of persistency (i.e. P i ≥ P j for i < j). Note that node numbering does not necessarily reflect the actual position of the nodes in the physical space. The states of the nodes can be represented in a vector of random variables X = (X 1 , X 2 , ..., X N ). The covariance between each possible couple of nodes can be summarized in the covariance matrix, the elements of which can be expressed as: where i and j represent the row and column indexes, respectively. At any given time, the active stream length L is a linear combination of the states of all the nodes, as per equation (S1). The stream length variance is therefore expressed as the sum of all the elements of the covariance matrix, properly weighted by the ∆l i associated to each node: Without loss of generality, to simplify the notation a constant unit length ∆l i = ∆l = 1 is assumed for each node. All the relevant equations, however, can be easily generalized to the case ∆l i = const by properly weighting each term associated to the node i (i.e. the status X i or the local persistency P i ) with the corresponding ∆l i . First, let us express each local persistency as the sum of the mean network persitency,P , and a local deviation from the mean, ∆P i (i.e. P i =P + ∆P i ). Note that N i=1 ∆P i = 0 by definition. Combining equations (S6) and S7, and using the above definitions, the stream length variance can then be written as: where the term ∆P = 2/N 2 · N i=1 i∆P i represents a weighted spatial average of the persistency deviations around the mean. The weights associated to each ∆P i increases as i increases (i.e., as P i decreases). Therefore, a bigger weight is attributed to the negative values of ∆P i , and ∆P ≤ 0. It is important to note that the last row of equation (S8) represents an approximation of the actual value of V ar(L), which is obtained from the full expression of V ar(L) by neglecting the following two terms: i) the term N i=1 ∆P 2 i , that proves to be negligible when N is large (because it is of order N , while the other terms are of order N 2 ); ii) the term 2 N i=2 i−1 j=1 ∆P i ∆P j , that is negligible because of the compensating effects originated by the interplay between the factors ∆P i and ∆P j when the summation indices vary. A detailed set of numerical simulations based on synthetic networks and available empirical data confirmed that the neglected terms are at least four orders of magnitude smaller then all the other terms of the equation. Equation (S8) was then combined with equation (S2) to derive equation (2) in the main paper.
Upper limits for the coefficient of variation and the standard deviation of the active drainage density For a given stream network (i.e. a set of nodes with known local persistencies), the length variance is maximized if the activation/deactivation of the nodes follows a hierarchical order. In fact, the hierarchical activation scheme (for which P ij 01 = 0 if i < j) maximizes the joint probability that two arbitrary nodes are concurrently active (P ij 11 = P j ), as per equation (S3). This circumstance in turn maximizes the covariance of the states of each couple of nodes in the network (equation (S5)) and the active length variance (as per equation (S7)).
Furthermore, a stream network in which all the nodes share the same persistency originate an higher active length variance with respect to any other possible network with the same mean network persistencyP . Intuitively, this property is directly implied by the hierarchical activation scheme, according to which all the nodes are activated or deactivated at the same time as long as they share the same persistency. In such a case, the active length can only assume the limiting values of 0 and L g , thereby ensuring the maximization of the temporal variance of the active length. The same conclusion can be obtained analyzing the term ∆P ≤ 0 in equation (S8). ∆P , in fact, is a weighted spatial average of the fluctuations of persistency around the mean. In the case where all nodes share the same persistency, ∆P i = 0 for each i, and ∆P = 0, allowing the maximization of the length variance (as per equation (S8)). Based on equation (S8), the upper limit for the variance of the active stream length can be written as: This equation can be easily combined with equation (S2) to obtain equation (3) in the main paper. Equation (S9) shows that the universal maximum variance of the active length is obtained whenP = 0.5, because under these circumstances the termP (1 −P ) is maximized. In such a case, V ar max (L) = 0.25N 2 , where N is the geomorphic length due to the assumption that ∆l i = ∆l = 1.
The universal upper bound of the length standard deviation is then: as reported in the main text.
Coefficient of variation of the drainage density for beta-distributed persistencies The available empirical data show that the frequency distribution of the local persistency of the different nodes in the network, here denoted as pdf (P ), can be reasonably described by a Beta distribution with the first shape parameter set to 1: (S11) In the above equation P is the local persistency, β is the second shape parameter of the distribution and B(1, β) = 1 β . For each case study, the parameter β was calculated using the method of moments based on the mean network persistency,P . In particular, the analytical expression of the first moment of the Beta distribution was exploited to calculate the free shape parameter as a function of the mean network persistency: The analogous expression for the second-order moment of the Beta distribution can be used to evaluate the spatial variance of the local persistency as: 1 +P (S13) Likewise, ∆P can be expressed as follows: Equations (S8) and (S14) were then combined to obtain the following expression of the variance of the active length when the local persistencies are beta-distributed: Equation (4) of the main paper is then derived combining equations (S2) and (S15).

Monitoring hierarchical networks
If the activation and deactivation of the nodes of a dynamic stream network obeys to a purely hierarchical scheme, the monitoring of active stream length dynamics can be significantly facilitated. In fact, the hierarchical assumption can be used to deduce the status of some nodes in the network from the observed status of other nodes. Specifically, if a node is observed as wet then all the nodes in the network with a larger persistency should be wet. Likewise, if a node is dry, then all the nodes with a smaller persistency should be dry as well. One might argue that prior to the field campaign for the monitoring of active stream network dynamics the persistencies of the nodes are not known. However, as long as the surveys take place, this hierarchy is progressively revealed and, crucially, the hierarchies emerging during any survey will remain unchanged for the entire campaign. Thus, as the number of available active stream network maps increases, the ability to extrapolate in space information about the status of the nodes increases as well. To quantify the number of nodes that do not need to be visited during a field campaign (because their status can be deduced from the status of other nodes), a series of numerical simulations was performed in which we simulated the dynamics of the stream network by simulating a hierarchical chain of nodes and the position of the transition between wet and dry nodes in the chain. Likewise, the number of nodes that has to be visited using the information available prior to the survey is computed. These numerical simulations consisted in the following steps: i) a sequence of N nodes with a dichotomic status and a fixed distribution of the local persistencies is considered; ii) the nodes are first re-ordered in a chain following a decreasing order of local persistency (the first node is the most persistent, the last node the least persistent); iii) an auxiliary counting variable, k, is set equal to 1; iii) the position of the unique transition between wet and dry nodes in the chain is identified by selecting randomly one of the existing groups of nodes in the chain; if k = 1 the position of the transition is set within the unique group containing all the N nodes of the chain; iv) the selected group is then randomly split into two sub groups (namely, the group of wet nodes and the group of dry nodes) and a new group is thus created; v) a new hierarchy is created between these two node subgroups by assigning an apparent persistency P a to the existing k + 1 groups as follows: P a = s/k, where s is the number of times that a group of nodes was observed as active during the campaign up to the k th survey; vi) the number of nodes to be visited during the k th survey to characterize the whole active network is then computed as the sum of the number of nodes that need to be surveyed to identify the groups within which the transition takes place (i.e., the number of existing groups k) plus the number of nodes that need to be visited to identify the precise location of that transition within the nodes of the relevant groups (all the nodes that are included in the two groups where the transition takes place); vii) the counting variable k is increased to the subsequent integer number and the procedure starts again from step iii). The procedure stops when a pre-determined number of field surveys (say, K) is performed. The percentage of nodes that has to be surveyed is then calculated as the number of surveyed nodes during the whole campaign scaled to the number of nodes that potentially have to be visited (i.e., N K). The simulations were performed with different combinations of K, N andP , and the results are shown in Figure S8. The plot indicates that, for hierarchical networks, the number of nodes to be visited is between 60 to 30 % of the overall number of nodes that should be surveyed in non-hierarchal networks, with smaller percentages associated to field campaigns made up of a larger number of surveys.