Influence of watershed characteristics on streambed hydraulic conductivity across multiple stream orders

Streambeds are critical hydrological interfaces: their physical properties regulate the rate, timing, and location of fluxes between aquifers and streams. Streambed vertical hydraulic conductivity (Kv) is a key parameter in watershed models, so understanding its spatial variability and uncertainty is essential to accurately predicting how stresses and environmental signals propagate through the hydrologic system. Most distributed modeling studies use generalized Kv estimates from column experiments or grain-size distribution, but Kv may include a wide range of orders of magnitude for a given particle size group. Thus, precisely predicting Kv spatially has remained conceptual, experimental, and/or poorly constrained. This usually leads to increased uncertainty in modeling results. There is a need to shift focus from scaling up pore-scale column experiments to watershed dimensions by proposing a new kind of approach that can apply to a whole watershed while incorporating spatial variability of complex hydrological processes. Here we present a new approach, Multi-Stemmed Nested Funnel (MSNF), to develop pedo-transfer functions (PTFs) capable of simulating the effects of complex sediment routing on Kv variability across multiple stream orders in Frenchman Creek watershed, USA. We find that using the product of Kv and drainage area as a response variable reduces the fuzziness in selecting the “best” PTF. We propose that the PTF can be used in predicting the ranges of Kv values across multiple stream orders.

There are many laboratory and in-situ tests to determine streambed K v 18,[21][22][23] . While some studies focused on advantages and limitations of different measurement techniques 24,25 , others have focused on only the spatial variability of streambed K v along transects across a channel 26 , both the spatial and temporal variability 18 , statistical description (means, ranges, variances) for hydraulic conductivity data 22,27 , or spatial interpolation of streambed K v 22,28 . These studies have all grappled with challenges of estimating K v because of the difficulty of determining representative samples and comparing results, considering the heterogeneity and anisotropy of streambed materials and geological conditions 5 .
Similar to the issue of spatial variability is the effect of scaling. Based on some unsolved problems in hydrology that were recently published 29 , one of the unanswered questions posed was, "What are the hydrologic laws at the watershed scale and how do they change with scale?" They also questioned why dominant hydrological processes emerge and disappear across scales, and why hydrology seems to be simple at the watershed scale despite being complex at smaller scales 29 . Since the pore-scale approach to flow in porous media may be inherently inadequate at the watershed scale, a more fruitful path forward is to consider a watershed as a single ecosystem, and to build a new kind of theory or propose a new kind of approach that can apply to the whole watershed.
Saturated hydraulic conductivity (K sat ) is a quantitative measure of the ability of a saturated soil to transmit water when subjected to a hydraulic gradient 30 . Although K v and K sat are similar in definition, the processes that govern their distribution are different. While K sat is essential in modeling surface and subsurface flow as well as solute transport in soils and sediments 31 , K v is an important variable controlling water and solute exchange between streams and surrounding groundwater systems [17][18][19][20] . Many pedo-transfer functions (PTFs) have been developed in the past five decades to estimate K sat from easily measureable parameters, such as textural properties, bulk density, and sample dimensions [31][32][33][34][35][36][37] . In comparison to K sat , PTFs have not been as widely applied to streambed K v estimation from other soil properties 38,39 . While most of the K v studies have focused on analyzing the spatial and temporal variations of point measurements, very few empirical studies have focused on predicting K v using only reach-scale attributes 8 . There is still a knowledge gap in the spatial prediction of streambed K v using PTFs based on soil textural distribution and watershed characteristics. Although PTFs may not be applicable beyond the regions for which they were developed 39 , we attempted to develop PTFs for estimating K v within a few orders of magnitude of the measurements using a new approach which hinges on the premise that the sediment composing the streambed is originated from the eroded rocks and sediment within its enclosing drainage area.

Results and Discussion
The geometric mean K v value varies between 7.57 × 10 −3 and 1.81 m/day, about four orders of magnitude variation, which indicates different types of soils with various structures across different stream orders. The summary statistics of streambed K v values at each of the ten test sites (stream channels) are shown in Table 1.
Spatial distributions of soil texture show that, compared to the upland areas, there is about twice more silt than sand in the downstream areas (Sites 4 and 10) of the watershed (Fig. 1). Conversely, in situ permeameter tests and sieve analysis show that the streambed is predominantly sandy (>95%) in the downstream areas with higher K v values (Fig. 2). In general, we observe that geometric mean K v values tend to increase in the downstream direction in the study area. This seems counterintuitive because K v is expected to decrease going downstream since the grain size of streambed sediments typically decreases with distance downstream due to abrasion, sorting, and selective transport 40 . However, a downstream transition occurs in stream channels due to the assortment of sediments coming from all points in a watershed and the spatial variation of soil textural properties of the sediment. In addition, the sediment source of the tributaries plays a major role in controlling the grain-size distribution for streambed sediments 41 . A downstream decrease in K v would be true only if all the sediments enter a stream only from the upper headwaters and there are no sediment contributions from tributaries, streambank erosion and runoff from adjacent fields that enters the stream by flowing over the streambanks. In reality, storms deliver large amounts of eroded sediment from the surrounding landscape. Therefore, we introduce a new approach called Multi-Stemmed Nested Funnel (MSNF) to capture the effect of the spatial variability of soil, reach and watershed properties on K v prediction (Fig. 3).
The multi-stemmed nested funnel (MSNF) approach. MSNF approach is based on the concept of nested hierarchy of lower-ordered sub-watersheds and also involves the understanding of soil erosion and sediment mobilization processes. A large watershed is typically made up of many sub-watersheds that are drained by www.nature.com/scientificreports www.nature.com/scientificreports/ tributary streams and rivers. These sub-watersheds, in turn, are also made up of smaller watersheds of streams draining into their main channels. Headwater watersheds are known to be major contributors of sediments to downstream reaches. This approach suggests that the order of magnitude of K v (m/day) depends on the textural composition of the sediments coming from headwater watersheds as well as reach and watershed attributes. Thus we hypothesize that a point estimate of K v should be a function of the sediments coming from the entire area it drains, the reach properties as well as the contributing watershed attributes. Also, point estimates should vary across multiple stream orders due to the heterogeneity and anisotropy of streambed materials.
We choose seven frequently available watershed and soil characteristics (SSURGO datasets) as predictor variables for this study. These are drainage area (DA), reach slope (Rch_Slp), percent organic matter (OM), percent sand, percent silt, percent clay and soil erodibility factor (K_Erod). Although other watershed-scale and reach-scale attributes derived from digital elevation models (DEMs) such as the watershed elevations, reach elevations, reach length, width at top of bank, depth, width-depth ratio, and average width of tributary channels would also affect K v , we do not use them in this study. This is partly because of their high sensitivity to spatial resolutions of DEMs used in most studies. These attributes are major inputs to distributed parameter watershed models that are used in simulating the hydrologic response of a watershed 42,43 . Moreover, with 93 permeameter tests carried out across 10 sites (n = 10), such sample size with many predictor variables will lead to overfitting which reduces the accuracy of the estimates and the power of the PTFs.
The correlation matrix shows that K v is more highly correlated with DA than with other selected predictor variables (Fig. 4). There is also a negative correlation between DA and Rch_Slp. Several studies related to streamflow and sediment yield have also shown a negative relationship between drainage area and average watershed www.nature.com/scientificreports www.nature.com/scientificreports/ slope [44][45][46] . Although the correlation coefficients of streambed K v with OM and clay are relatively small, both OM and clay are still suitable for developing PTFs because of the relatively high inter-correlation among soil textural properties.
Developing PTFs for predicting K v . With the MSNF approach and seven predictor variables, all 127 possible PTFs based on all possible combinations of these variables were developed and analyzed. The performances of all PTFs were assessed by the values of thirteen selection criteria (see Methods). The distributions of all the possible k-variable (k = 1, 2, …, 7) models for six criteria are shown (Fig. 5a). The best subset regression models were selected for each variable-number category using these criteria. That is, of all the possible k-variable models, the best performing model was selected. Since there is only one possible model with seven variables (k = 7), this implies that this is the best 7-variable model.
Except for the best Model 4 (k = 4), all the best subset PTFs consist of DA as a predictor and explains 65% of the variance in LogKv. The next most important variable is OM. Results indicate that there is no consistency in the selection of the overall best PTF based on these thirteen criteria, although Model 5 appears to be the strongest candidate since five out of the thirteen selection criteria choose Model 5 as the overall best PTF.
K v DA as a better response variable. In order to better predict K v spatially, there is a need to avoid seeking exactness where only an approximation is possible, and accept the degree of imprecision that the nature of complex erosion and sedimentation processes allow. Therefore, we introduce a new response variable (K v DA) for predicting www.nature.com/scientificreports www.nature.com/scientificreports/ the order of magnitude of variation of K v so as to help balance the accuracy we seek with the uncertainty that exists. While K v is an intrinsic property of streambed materials, K v DA is an extrinsic property which behaves like hydraulic conductance and is a property of a reach which varies depending upon its shape and size as well as watershed attributes. It is important to note that whereas hydraulic conductance is the product of hydraulic conductivity, stream width and length of stream reach divided by the thickness of streambed, K v DA is the product of streambed vertical hydraulic conductivity at a location and the watershed area drained by that location.
Coupling K v and DA can be important for reducing the fuzziness in the selection of the overall best PTF. It can also help to capture the hydrological processes which control magnitude of variation of K v across scale, soil texture domains, as well as across different stream orders. Based on the MSNF approach, K v DA is used as a response variable that is dependent on the long-term sediment transport and deposition processes by overland flow from the surrounding landscape. Figure 5a shows all the 63 possible PTFs for predicting log-transformed KvDA (that is LogKvDA) from percent OM, sand, silt and clay content as well as Rch_Slp and K_Erod.
In contrast to PTFs for predicting LogKv, different model categories select LogRch_Slp and LogOM as the most significant predictor variables when K v is coupled with DA. Models 1, 5, and 6 select LogRch_Slp as the best predictor variable while Models 2, 3 and 4 select LogOM. Comparison of the performance of the best subset PTFs and the rank totals of the selection criteria (Fig. 6) shows that Model 4 is the overall best PTF due to a better consistency of all the selection criteria. This implies that OM, sand, silt and clay contents are the best predictor variables for predicting LogKvDA.  www.nature.com/scientificreports www.nature.com/scientificreports/

Conclusion
Beyond demonstrating that using K v DA as the response variable in the MSNF approach enables prediction of streambed K v with lower prediction error (when compared to using K v as the response variable), it is also important to note that predicted K v values are only rough estimates and are probably best considered as order-of-magnitude estimates. In many applications of distributed models, we believe the MSNF approach we propose can be useful in hydrological modeling using tools such as Soil and Water Assessment Tool (SWAT) and MODFLOW. Prediction errors for the "best" PTF can be used to determine the minimum and maximum K v values for calibration purposes. However, it is important to note that the use of different prediction error equations in predicting calibration ranges should be expected to lead to different results in terms of the predicted uncertainty limits. We hope that the MSNF approach might provide the basis for further studies which include more soils from the soil texture triangle, specifically clay. Our results, which shall be corroborated by further studies, support the importance of using spatial K v in modeling to understand the interactions between streams and aquifers. Data collection. In this study, ninety three in-situ falling head permeameter tests were carried out at 10 test sites within the Frenchman Creek watershed (Fig. 1). Seven sites (Sites 4-10) are on Frenchman Creek, two sites (Sites 2 and 3) are on Stinking Water Creek and one site (Site 1) on Spring Creek. The number of sites was limited to landowner access. Other tributaries within the watershed were dry at the time of the study. Each test site comprised of at least three transects and each transect comprised of at least three K v measurements. Transparent tubes (76 cm long and 8 cm inside diameter or 183 cm long and 6.8 cm inside diameter) were pressed vertically into the channel sediments. The thickness of the tube wall is about 3 mm, typical of many previous studies 18, 28 . The locations of permeameter stations were mapped with a global positioning system (GPS). For each K v measurement, the tube was inserted into the stream bed to a depth of 30 cm. Before beginning the falling head test, it was assumed the stream water level was equal to the groundwater level. This assumption increased measurement uncertainty, but likely by a small amount compared to the variability within and between sites. The surface water-level at the streambed surface was considered as the initial hydraulic head at the measurement point. Water was added slowly to fill up the tube from the top so as not to disturb the sediments inside the tube. As the hydraulic head in the tube began to fall, a series of hydraulic heads at given times were recorded for the derivation of the vertical hydraulic conductivity of the sediment column. K v (m/day) was calculated using Eq. (1) derived from Hvorslev's 23 equation.
where H is the water level inside the permeameter relative to the ambient pre-test water level, D is the inside diameter of the tube, L v is the length of the sediment in the tube, t is time, and m is the isotropic transformation ratio K K / h v where K h is the horizontal hydraulic conductivity of the sediment around the base of the tube. This study used the average of K v values using m = 1 and m = ∞. Note that the m = ∞ scenario simplifies to the standard Darcy equation 24 .

Data analysis.
To check whether the distribution of K v are normal for the sites, formal tests of normality were carried out using six normality tests 47 . Anderson-Darling (AD), Cramer-von Mises (CVM), Lilliefors (LL), Pearson chi-square (CSQ), Shapiro-Francia (SF), and Shapiro-Wilk (SW) tests were applied at 0.05 significance level. Owing to the fact that there are contradicting results as to which test is the optimal or best test 48 , these six normality tests were compared in order to see how they performed for both non-transformed and log-transformed K v values.
For more general heterogeneous systems, the effective hydraulic conductivity is known to be the geometric mean since samples of hydraulic conductivity in most cases follow lognormal distribution. Since K v values may vary by orders of magnitude within a short distance along a river reach and across a section, the geometric mean for each site was used in this study to capture the spatial variation.
The database used in this study includes soil datasets from the SSURGO database that consists of information about soils as collected by the National Cooperative Soil Survey 49 in the Unites States. Spatial soil properties (% sand, silt, clay and organic matter) at 0-50 cm depth, K_Erod (fraction), Rch_Slp (fraction) and DA (ha) were extracted using Soil Map Viewer in ArcMap 10.3.
PTFs for predicting the order of magnitude of K v were developed using a new approach called Multi-Stemmed Nested Funnel (Fig. 3). A new response variable (KvDA) was also introduced to help in the selection of the most significant predictor variables (Fig. 5b). Multiple linear regression (MLR) method was used to develop the PTFs because it has been used extensively due to its simplicity, good application and accuracy. The PTFs were developed by log-transforming all the variables in order to avoid heteroscedasticity and non-normality of the residuals of the regressions. We evaluated and compared the predictive performance of all possible 127 PTFs using thirteen performance indicators including R-squared (R 2 ), Adjusted R-squared (Adj.R 2 ), Mallow's C(p), Akaike Information Criteria (AIC), corrected Akaike Information Criteria (AICc), Sawa's Bayesian Information Criteria www.nature.com/scientificreports www.nature.com/scientificreports/ (SBIC), Schwarz Bayesian Criteria (SBC), Mean Squared Error of Prediction (MSEP), Final Prediction Error (FPE), Hocking's Sp (HSP), Amemiya Prediction Criteria (APC), Leave-One-Out Cross-Validation (LOOCV), and Predicted Residual Error Sum of Squares (PRESS). The major difference between these criteria is how severely each penalizes increases in number of predictor variables (Eqs. (2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13)(14)). For good prediction, the overall "best" PTF is the one that maximizes R 2 , Adj. R 2 and APC and minimizes the other selection criteria. All statistical analyses and calculations were done using R version 3.4.4 50 .
where x i is the observed value, x î is the predicted value, x is the observed mean value, p is the number of explanatory variables, x ip is the predicted value of the ith observation of x from the p explanatory variables, p ⁎ is the number of explanatory variables including the intercept, x ii is the predicted value of the ith observation of x using all data except ith, S 2 is the residual mean square after regression on the complete set of K explanatory variables, N is the sample size, L is the maximized value of the likelihood function for a model, σ 2 is the pure error variance fitting the full model. In order to select the overall "best" PTF for predicting LogKvDA, all the best subset PTFs (i.e. best n-predictor models) are selected for each number of predictor variables. For each criterion, the models are ranked from 1 to n, and the rank total for each best n-predictor model is calculated by adding all the ranks for all the criteria. All rank totals exclude R 2 because it increases every time a predictor variable is added to a model, even if due to chance