Have GRACE satellites overestimated groundwater depletion in the Northwest India Aquifer?

The Northwest India Aquifer (NWIA) has been shown to have the highest groundwater depletion (GWD) rate globally, threatening crop production and sustainability of groundwater resources. Gravity Recovery and Climate Experiment (GRACE) satellites have been emerging as a powerful tool to evaluate GWD with ancillary data. Accurate GWD estimation is, however, challenging because of uncertainties in GRACE data processing. We evaluated GWD rates over the NWIA using a variety of approaches, including newly developed constrained forward modeling resulting in a GWD rate of 3.1 ± 0.1 cm/a (or 14 ± 0.4 km3/a) for Jan 2005–Dec 2010, consistent with the GWD rate (2.8 cm/a or 12.3 km3/a) from groundwater-level monitoring data. Published studies (e.g., 4 ± 1 cm/a or 18 ± 4.4 km3/a) may overestimate GWD over this region. This study highlights uncertainties in GWD estimates and the importance of incorporating a priori information to refine spatial patterns of GRACE signals that could be more useful in groundwater resource management and need to be paid more attention in future studies.


Results
We use the three-state region (i.e., Punjab, Haryana & Delhi, and Rajasthan with a total area of 438,296 km 2 , the same as Rodell et al. 16 ) in the NWIA as a test-bed, with ~30% of the study region equipped for irrigation ( Fig. 1(a)) and 74% of the equipped area irrigated with groundwater ( Fig. 1(b) and Supporting Information 2). Note that only 16% of Rajasthan state was irrigated. The GWD rate from groundwater-level data for the six-year period (Jan 2005-Dec 2010, Fig. 1c) and specific yield (Fig. S1) for the three-state region was estimated to be 2.8 cm/a ( Fig. 1d and Supporting Information 2, GWD shown in negative in all figure legends). Variations in groundwater levels for most of 20 selected sites across the three states show rapidly declining trends (Fig. S2), particularly in the east but much less in the west, with even some recovery of groundwater levels in the west (Fig. 1c,d). Specific yield values to convert groundwater level changes to storage for unconfined aquifers for Punjab and Haryana were estimated to be ~0. 15 38 , which is dominated by sandy alluvium (a specific yield of 0.15) that accounts for ~98% over the two states 39 . The specific yield in Rajasthan was estimated to be ~0.1, which is dominated by sandy alluvium (60%) and sandstone (11%) with a lower specific yield of 0.045 (Supporting Information 2).
Evaluation of forward modeling using hypothetical data. A range of numerical simulations using hypothetical data was performed to evaluate the utility of unconstrained and constrained forward modeling to recover signal loss due to low-pass filtering. Constrained forward modeling can completely recover the signal loss during low-pass filtering, regardless of signal configuration, whereas unconstrained forward modeling cannot (Table S1). Expanding the study region to include the surrounding areas may allow one to obtain the total volume of GWD from the simulated GWD rates irrespective of signal configuration. In terms of spatial pattern, unconstrained forward modeling is unable to recover detailed spatial patterns of GWD, showing correlation coefficients between simulated and original GWD rates ranging from 0.04 for the completely random distribution to 0.22 for the uniform distribution (Table S1, Figs S3-S7). However, constrained forward modeling can recover the spatial pattern of GWD to varying degrees, showing correlation coefficients between the simulated and original GWD rates ranging from 0.36 for the completely random distribution to 0.99 for the uniform distribution. For the other three signal configurations, the correlation coefficients are slightly less than 0.5 (Table S1 and Figs S8-S12).  Fig. 1(c) represent 20 selected groundwater monitoring sites in the three-state region, with groundwater-level time series for the 20 sites shown in Fig. S2. Map was created using ArcGIS (http://www.esri.com/software/arcgis/arcgis-for-desktop).
We conclude that the capability of constrained forward modeling to recover the detailed spatial distribution of signal depends highly on the spatial pattern of original data. The greater the heterogeneity in the signal, the lower the spatial correspondence between the simulated and original signals. The study by Chen et al. 37 examined constrained forward modeling only with uniform distribution for mass change rates, i.e., a uniform mass change rate of − 33 cm/a and − 31 mm/a assigned to the Amundsen Sea Embayment and the Northern Peninsula, respectively, over the Antarctic region. While this approach may be suitable for this case because of relatively uniform distribution of glacier and ice melting over the Antarctic in cryosphere studies 37,40 , constrained forward modeling could not completely recover the spatial pattern of signals in land hydrology due to marked spatial heterogeneity in water storage changes over land. Results here using hypothetical data demonstrate that it is challenging to completely recover the spatial pattern of GWD rates if GWD is spatially heterogeneous.
Evaluation of forward modeling using synthetic data from PCR-GLOBWB. Both unconstrained and constrained forward modeling can recover over 95% of signal variations in GWD from PCR-GLOBWB over the three-state region for the period 2003-2010 (Table S1). Use of constrained forward modeling to recover filtered signals in GWD was confirmed again by using synthetic data from PCR-GLOBWB. Unlike the inability of unconstrained forward modeling to completely recover filtered signals using hypothetical data, it is interesting to see that unconstrained forward modeling can be effective (96%) in restoring the signal loss for the three-state region. For instance, GWD from PCR-GLOBWB for the period 2003 through 2010 was mainly distributed over the north and east of the three-state region, with an annual depletion rate of 6.7 cm/a (Fig. 2a). By performing low-pass filtering as applied to GRACE spherical harmonics, e.g., truncation at the maximum degree and order of 60 and 300 km Gaussian filtering, we obtained filtered GWD rates (Fig. 2d), with a significantly dampened GWD magnitude of 4.8 cm/a. In practice, we normally only have filtered GWD rates, e.g., from GRACE satellites and ancillary data, with the aim of obtaining the recovered GWD rates using the scaling factor, additive, or multiplicative approaches. Through unconstrained forward modeling, we can recover most of the dampened GWD rates, denoted as GWD s in Fig. 2b, with a spatial mean of 6.4 cm/a. Therefore, unconstrained forward modeling can recover over 96% of the original GWD rates. By performing low-pass filtering for GWD s (Fig. 2b), we can obtain filtered GWD rates that are the same as the filtered GWD rates obtained directly from PCR-GLOBWB (Fig. 2a).
The ability of unconstrained forward modeling to recover signal loss using PCR-GLOBWB can be attributed to signal amplification from the surrounding region that essentially reduces signal dampening within the three-state region. Though unconstrained forward modeling is unable to completely recover signal loss based on numerical simulations using hypothetical data where all signal variations are assumed to be completely restricted to the study region, there are some cases in practice that unconstrained forward modeling can generally recover signal forward modeled GWD s rate distribution after 500 iterations using unconstrained forward modeling, (c) forward modeled GWD s rate distribution after 500 iterations using constrained forward modeling, and (d) filtered GWD rates from (a-c) after low-pass filtering. Map was created using ArcGIS (http://www.esri.com/software/arcgis/ arcgis-for-desktop).
Scientific RepoRts | 6:24398 | DOI: 10.1038/srep24398 loss when the study region is impacted by signal variations from its surroundings. Apparently, unconstrained forward modeling cannot recover the spatial pattern of GWD relative to the original pattern from synthetic data ( Fig. 2b compared with Fig. 2a), showing a correlation coefficient between the simulated and PCR-GLOBWB GWD of 0.64 across the study region. Frequency distributions show that the simulated GWD rates with unconstrained forward modeling have a more even distribution (Fig. S13b) compared to the frequency distribution of GWD rates from PCR-GLOBWB that shows a sharp increase in the interval 0 to −10 mm/a dominated by areas without GWD (Fig. S13a). In particular, grid cells without GWD (e.g., surrounding those with GWD in Fig. 2a) do show GWD in Fig. 2b. This artifact resulted from signal leakage from the area with GWD to its surroundings due to low-pass filtering. Overall, without a priori information on the distribution of GWD, it is impossible to obtain a 'realistic' GWD distribution through unconstrained forward modeling.
The distribution of PCR-GLOBWB GWD is highly consistent with the GWD distribution from another GHM, WGHM (Fig. S14), the distribution of areas equipped for irrigation (Fig. 1a,b) and land cover (Fig. S15). It is apparent that the southwest of the three-state region is dominated by natural vegetation that is not equipped for irrigation, and therefore should not exhibit marked GWD. Globally constrained forward modeling (referring to the Methods section) can recover both the magnitude and spatial pattern of GWD relative to the original GWD from PCR-GLOBWB, showing a spatially averaged GWD rate of 6.8 cm/a and a correlation coefficient of 0.77 (Table S1 and Fig. 2c). This means that given the filtered GWD (e.g., Fig. 2d) and its spatial pattern (e.g., GWD rate < 0 in Fig. 2a), it is possible to completely recover the magnitude and generally recover the spatial pattern of GWD from globally constrained forward modeling. However, if one does not have the critical information on the spatial pattern of GWD, use of constrained forward modeling to recover GWD in both magnitude and spatial pattern may fail. This is demonstrated by the case of locally constrained forward modeling (referring to the Methods section) that assumes GWD to be restricted to the three-state region, showing a resulting magnitude of GWD of 12.2 cm/a and a correlation coefficient of 0.28 (Table S1 and Fig. S16). The assumption of GWD restricted to the three-state region used in previous studies is unreasonable because of large GWD across the surrounding areas, i.e., Pakistan and west Uttar Pradesh (Figs 1d and 2a).
Groundwater storage depletion from GRACE using forward modeling. Filtered GWD rates (GWD a ) were calculated using constrained forward modeling based on filtered GRACE TWS changes minus filtered SMS changes from GLDAS-1 Noah. GWD a rates with a mean of 1.9 cm/a represent what GRACE can see after low-pass filtering for removing high-frequency noise (Fig. 3c). Therefore, directly using GWD a rates from GRACE would result in significantly reduced GWD rates compared with the 'realistic' GWD rates (Fig. 3b). When the GWD a rates from GRACE and the filtered GWD s rates are the same (Fig. 3c), the forward modeling solution has converged and is terminated with the last updated GWD s rates taken as the resulting GWD rates using this approach.
Interannual variability in surface water is assumed to be negligible in the calculation of GWD a , which has also been assumed in previous studies for this region 16,21 and demonstrated by WGHM in our study (see Supporting Information 3 and Fig. S17). In addition, both globally ( Fig. 2c) and regionally ( Fig. S18b) constrained forward modeling resulted in a reasonable distribution of GWD compared with the GWD pattern from PCR-GLOBWB. However, constrained forward modeling generated lower regionally averaged GWD rates (3.1 cm/a for globally constrained and 3.1 cm/a for regionally constrained) than the GWD rate of 6.7 cm/a from PCR-GLOBWB. Locally constrained forward modeling, again, resulted in an unrealistic spatial pattern and magnitude of spatially averaged GWD rates (not shown) due to the assumption of GWD restricted to the three-state region.
Interestingly, unconstrained forward modeling can also provide a similar magnitude of GWD of ~3.0 cm/a compared with the result of ~3.1 cm/a from globally and regionally constrained forward modeling. However, it is apparent that the GWD spatial distribution from unconstrained forward modeling does not seem reasonable relative to the GWD pattern from PCR-GLOBWB and groundwater-level data. This is determined by the intrinsic characteristic of unconstrained forward modeling that no a priori information is used to constrain the recovered signal distribution. However, it could recover the magnitude of filtered signals as long as the boundary covers all areas with signal leakage. This is also the reason in Chen et al. 21 why a broader area than the three-state region was prescribed to account for all recovered GWD using unconstrained forward modeling. From the numerical simulations using hypothetical data and the simulations using synthetic data from PCR-GLOBWB, it has been shown that signal interference from the region surrounding the three-state region could result in a marked reduction in signal leakage from the three-state region. Therefore, the overall effectiveness of unconstrained forward modeling to recover signal loss for the three-state region is a particular case and may not be generalized to other regions. Recovering signal loss using unconstrained forward modeling depends on the relative difference in signal variation within and outside a study region of interest.
Comparison of GRACE-based estimates with groundwater-level data. Filtered and original (unfiltered) SMS anomalies from three LSMs (i.e., GLDAS-1 Noah, Mosaic, and VIC) are consistent (see shading areas representing one standard deviation of the three SMS anomaly time series) (Fig. 3d). The mean of the three time series shows a slightly decreasing trend from Jan 2003 to the summer in 2010 and an appreciable recovery from drought in Southeast and South Asia since then 41 . Furthermore, filtered SMS anomalies from these LSMs show slightly higher amplitudes than the original time series, indicating signal amplification from the area surrounding the three-state region during low-pass filtering.
It is apparent that the filtered TWS anomalies have larger seasonal amplitudes and long-term trends than SMS and GWS anomalies (Fig. 3e), which is reasonable given that TWS changes integrate surface and subsurface water storage changes. The filtered GWS anomalies show a steady decreasing trend from 2003 to the summer of 2010, but the seasonal amplitudes of filtered GWS anomalies are small, resulting from different phases and uncertainties from TWS and SMS anomalies (more noises). In addition, TWS and SMS anomalies have a similar Scientific RepoRts | 6:24398 | DOI: 10.1038/srep24398 phase of variations, but the filtered GWS anomalies show 1-2 month time lag compared with the TWS and SMS anomalies. This is reasonable because GWS variations should lag climatic variations relative to TWS and SMS and should also include impacts (phase) from human activities (e.g., intensive irrigation with groundwater). Therefore, both the seasonal amplitude and long-term trend in GWS anomalies are quite different from TWS and SMS anomalies. Note that our purpose is to recover long-term trends in GWS time series using different approaches. The filtered GWS anomalies are just an intermediate variable that essentially reveals temporal variability in GWS. For obtaining a precise magnitude of changes in GWS, signal losses in filtered GWS anomalies should be restored especially for long-term trends first before interpreting GWS changes over this region.
In general, GWS anomaly time series from four approaches show steady decreasing trends from Jan 2003 to Jun 2010. Since then, all GWS time series tend to recover by the end of 2011. GWS started to decline from 2012 onwards by the end of the study period Jul 2013 (Fig. 3f). However, different approaches for generating GWS anomalies can differ in seasonal amplitude. The GWS anomalies from the multiplicative approach show relatively higher seasonal amplitudes than the other estimates, with a multiplicative factor of 2.1 derived from the amplitude ratio of the basin function and filtered basin function (Fig. S3a,c), similar to ~1.95 derived by the previous study 16 . The multiplicative factor that assumes a uniform distribution within a study region but no variations outside the region resulted in both the highest seasonal variability and GWD rate. In contrast, the scaling factor of 0.67 derived from CLM4.0 provided by the JPL website resulted in the lowest seasonal amplitudes in GWS anomaly. The highest differences between the two time series occurred in wet and dry seasons during the study period, e.g., ~150 mm in Sep 2003 (the wet case) and ~130 mm in Jun 2010 (the dry case) at the end of the extreme drought in Southeast Asia in spring 2010.
GWS anomalies from the forward modeling approach show appreciably larger seasonal amplitudes than the additive correction approach. Scaling factors derived from three GLDAS-1 LSMs for the three-state region are all less than 1 (0.92, 0.79, and 0.75 for Noah, Mosaic, and VIC, respectively, Fig. 3d), indicating that signals for seasonal variability in water storage from GRACE tend to be amplified due to stronger signals from the surrounding areas during low-pass filtering. Therefore, a scaling factor less than 1 needs to be applied to reduce the leakage error. CLM4.0 provided a markedly lower scaling factor (0.67) than other LSMs examined, thereby reducing seasonal signals in TWS change significantly. rates from filtered GRACE GWD rates and globally constrained forward modeling after 500 iterations using the spatial pattern of GWD rates from PCR-GLOBWB for the same period, (c) filtered GRACE GWD rates, i.e., GWD a , (d) time series of filtered and original SMS changes from GLDAS-1 Noah, Mosaic, and VIC models, (e) filtered TWS changes from GRACE, filtered SMS changes from the mean of filtered SMS changes, and filtered GWS changes from filtered TWS changes minus filtered SMS, and (f) time series of restored GWS anomaly time series from the additive, multiplicative, gridded, and (globally constrained) forward modeling approaches. Also shown are the filtered GWS changes and filtered GWS changes whose trends were removed. Map was created using ArcGIS (http://www.esri.com/software/arcgis/arcgis-fordesktop) and SigmaPlot (http://www.sigmaplot.com/).  (Table 1 and Methods section). For the period Jan 2003 through Dec 2010, the multiplicative and the scaling factor approaches generated the highest (3.8 ± 0.2 cm/a) and lowest (0.9 ± 0.2 cm/a) GWD rates among the four approaches. The multiplicative approach amplified both seasonal amplitudes (see Jun 2010) and long-term trends. The lower estimates of GWS changes from the scaling factor approach resulted mainly from the low scaling factor of 0.67 from CLM4.0, which dampened both seasonal amplitudes and secular changes (Fig. 3f). The forward modeling approach provided estimates of GWS change rates less (3.1 ± 0.1 cm/a) than the multiplicative approach but greater than both the scaling factor and additive (1.9 ± 0.3 cm/a) approaches. Similar results were also found for the period Jan 2005-Dec 2012 and Jan 2003-Dec 2012 (Supporting Information 4). The forward modeled GWD (3.1 ± 0.1 cm/a) in this study agrees well with the GWD rate (2.8 cm/a) from the in situ groundwater-level measurements and specific yield. The multiplicative approach and PCR-GLOBWB appeared to overestimate GWD in this region. The scaling factor and additive approaches, however, appeared to underestimate GWD rates.

Discussion
Our estimate of GWD from constrained forward modeling for the three-state region during the five-year period Aug 2003 through Oct 2008 is 2.5 ± 0.1 cm/a, which is less than the estimate of 4 ± 1 cm/a for the period Aug 2002 through Oct 2008 by Rodell et al. 16 . One of the possible reasons for a higher estimate for the GWD rate from Rodell et al. 16 lies in the assumption inherent in the multiplicative approach that all GWD was restricted to the three-state region, i.e., a binary averaging kernel was assumed to be 1 in the three-state region but 0 outside the region. This assumption does not seem reasonable considering the spatial patterns of GWD from PCR-GLOBWB and groundwater-level measurements examined in this study. The surrounding areas of the three-state region, e.g., Pakistan and western Uttar Pradesh, exhibit significant GWD. Assuming no GWS amplitude outside the averaging kernel would result in an overestimate of the multiplicative factor and consequently overestimation of both the seasonal cycle and secular change in GWD. In addition, our estimate of the GWD rate (3.1 ± 0.1 cm/a) for the three-state region during the period Jan 2003-Dec 2010 is appreciably higher than the mass loss estimates of Jacob et al. 29 of 1.9 cm/a (35 km 3 /a) over North India for the period Jan 2003-Dec 2010 using 350 km Gaussian filtering that was attributed primarily to GWD in the region. The reason could be the use of the mascon solution in Jacob et al. 29 that covers a broader area than the three-state region, reducing the mass loss rate.
The constrained forward modeling-based GWD rate during the period Jan 2003-Dec 2012 in our study was 2.1 ± 0.1 cm/a for the three-state region, which is consistent with what Chen et al. 21 obtained using unconstrained forward modeling. This has been demonstrated by numerical simulation using synthetic data from PCR-GLOBWB that constrained and unconstrained forward modeling can generate a similar magnitude of GWD due to the spatial distribution of GWD within and outside the three-state region. Furthermore, our GWD estimates using constrained forward modeling make GRACE estimates more meaningful by incorporating the a priori information on groundwater-based irrigation. The limitations of unconstrained forward modeling without exact spatial pattern of GWD and requiring post-processing of manually defining the extent of GWD were greatly reduced. The footprint for GRACE data is large (e.g., 200,000 km 2 ). Compared with traditional processing that can only provide basin-scale GWD, constrained forward modeling makes use of the GWD pattern from a priori knowledge, which is extremely important to assess the local vulnerability of water supply especially during drought 5 . As the quality of GRACE data and related processing techniques improve, GRACE satellites will be more valuable in assessing GWD over aquifers globally, which should be important for evaluating and improving the sustainability of agriculture and water resource management. GRACE signal restoration for aquifers is challenging because GWS changes may have different phases of seasonal and interannual variability resulting from climate variability and human activities. Scaling factors for correcting for seasonal cycles and secular changes of GWS should be applied separately 16,33 . Note that SMS may not show marked interannual variability due to long-term intensive irrigation that results in moist soils most of time (referring to Fig. S17a). LSMs/GHMs are powerful tools for depicting seasonal cycles and therefore generating scaling factors for correcting for seasonal variability in SMS or GWS. Surface water storage over this region does not seem to have significant interannual variability, either. Therefore, the key issue to estimate GWS changes from GRACE satellites is to compute scaling factors for correcting for secular changes in TWS or directly compute secular changes in GWS as forward modeling does. Use of a single scaling factor from LSMs to correct for both seasonal cycles and secular changes may be highly uncertain. Our study showed the lowest estimate of the GWD rate of ~1 cm/a for the period 2003-2010 due to a lower scaling factor of 0.67 from CLM4.0. This seems to be unreasonable and much lower than those estimated by previous studies and groundwater-level monitoring data. GWD rates from the additive approach were higher than those for the scaling factor approach, but lower than those from the forward modeling and multiplicative approaches. This can be explained by the way in which bias and leakage effects were corrected depending on GLDAS-1 Noah output. Because Noah does not have a groundwater component, the bias computed using Noah-based SMS output may be less than what it should be, resulting in relatively lower estimates of seasonal amplitudes and secular changes in TWS. Therefore, one of the most important limitations in scaling factor and additive approaches rests with the dependence on LSMs that do not account for the impact of anthropogenic GWS changes, though some LSMs accommodate the effect of natural variability in GWS (e.g., CLM4.0). Both constrained and unconstrained forward modeling presented in this study can provide reasonable estimates of GWD rates relative to groundwater-level data. The limitation of constrained forward modeling is that it may not completely recover the spatial pattern of signals over regions with highly heterogeneous mass changes. We suggest that each approach has advantages and limitations. The range of GWD estimates is valuable in understanding uncertainties in GWD estimates for aquifers, especially when GRACE-based monitoring is taken as an independent approach to evaluate model output. This study highlights the importance of selecting appropriate processing strategies for restoring GRACE signals and uncertainties in GWD estimation for aquifers from different approaches. Constrained forward modeling with a priori spatial distribution of GWD from PCR-GLOBWB or ground-based measurements is useful in recovering both the magnitude and distribution of GWD rates from GRACE satellites for the Northwest India Aquifer.

Methods
Constrained and unconstrained forward modeling. Constrained forward modeling makes use of a priori information regarding the spatial distribution of signals whereas unconstrained forward modeling does not. The basic idea of forward modeling is to iteratively adjust the filtered signal (and distribution for constrained forward modeling) to numerically (and spatially) approach a reference. The reference is often from filtered GRACE mass change rates (and synthetic data from GHMs or ground-based observations). An illustration of unconstrained and constrained forward modeling of GWD rates from GRACE is given below (also referring to Fig. S19a).
First, SH solutions of CSR RL05 and GLDAS Noah soil moisture were destriped 17,31 (not for GLDAS data), truncated at the maximum degree and order of 60, and filtered using a 300 km Gaussian filter 17 . The first-order estimate of GWS change rates can be derived by the least square fit for the filtered GWS anomalies from filtered GRACE TWS minus filtered GLDAS SMS anomalies. The first-order estimate of GWS change rates is termed the apparent GWD rate (GWD a ), which is the incomplete estimate of GWD rates whose signals have been dampened during the low-pass filtering process. Second, a dummy GWD distribution is created, with its initial values taken as GWD a or random for each grid cells, termed the simulated GWD rate (GWD s ). Third, the difference between the filtered GWD s and GWD a is added back to GWD s until the filtered GWD s numerically converges to GWD a . Under the condition that the filtered GWD s and GWD a are equal or less than a threshold, the last update of GWD s is then taken as the final estimate of GWD through unconstrained forward modeling. For constrained forward modeling, only the differences between filtered GWD s and GWD a for constrained areas (often from synthetic data used as a priori knowledge) are added back to GWD s . Note that GWD s needs to be continually updated during the iterative process, but GWD a is determined by GRACE data that does not change.
We hypothesize that the spatial distribution of original signals may affect the use of constrained and unconstrained forward modeling. In this study, both constrained and unconstrained forward modeling were first evaluated using: (1) hypothetical data with varying configurations of uniform and heterogeneous signal distributions, and (2) synthetic data of GWD from PCR-GLOBWB. A flowchart for the two tasks is shown in Fig. S19b. Based on this, GWD for the three-state region and surrounding areas from GRACE satellites was estimated using both unconstrained and constrained forward modeling.

Reconstruction of groundwater storage change time series. Reconstruction of GWS time series
is performed in two steps. The first step is to derive long-term trends from forward modeling illustrated in the above section. The second step is to reduce the bias and leakage effects for seasonal variations using scaling factors derived from LSMs/GHMs 16,33 . The assumptions involved in this reconstruction approach are: (1) LSMs reliably depict seasonal variations in GWS changes; and (2) seasonal and long-term changes can be recovered separately: where GWS(t) is the corrected GWS anomaly time series for aquifers as a function of time (t); GWS detrend (t) is the seasonal cycle of the filtered GWS anomaly time series from GRACE data in which the incomplete long-term trends are removed; k is the scaling factor derived from filtered and unfiltered LSM/GHM output. In this study, we used PCR-GLOBWB output 12,42,43 to create scaling factors for correcting for seasonal cycles of GWS changes; and a is the long-term change rate of GWS through forward modeling illustrated in the above section. Use of LSMs to quantify long-term trends in water storage components may have large uncertainties because LSMs do not simulate all the components of the water budget (e.g., GWS) 33 . However, LSMs may still be useful in quantifying relative patterns of water storage changes within and outside an area (aquifer) at the seasonal scale. The forward modeled GWS rates will subsequently be added back to the seasonal cycle of GWS anomalies Scientific RepoRts | 6:24398 | DOI: 10.1038/srep24398 (equation (1)) from filtered GRACE GWS anomalies with long-term trends removed first and bias and leakage effects corrected using scaling factors from LSMs. The reconstruction approach can also be applied in a piecewise manner. This means that forward modeling is performed for each sub-period with a statistically significant trend due to climate variability and/or anthropogenic activities. The resulting GWS rates will be added back to the seasonal cycle of GWS anomalies for each sub-period to reconstruct the monthly GWS anomaly time series that retains both the seasonal cycle with bias and leakage effects corrected by a scaling factor from LSMs and the secular change derived from forward modeling. Study strategies. First, a range of numerical simulations using hypothetical data was performed to evaluate the utility of unconstrained and constrained forward modeling to recover signal loss due to low-pass filtering. Five configurations of signal patterns were tested, with the aim of obtaining a comprehensive understanding of how different patterns of original data impact the ability of these forward modeling techniques to recover signal loss, including: (1) assigning a uniform GWD rate of − 50 mm/a across the study region, (2) assigning random GWD rates ranging from 0 to − 100 mm/a across the study region, (3) assigning a uniform GWD rate of − 50 mm/a for the right half of the study region but random GWD rates ranging from 0 to − 100 mm/a for the left half; (4) assigning a uniform GWD rate of − 50 mm/a for the lower half but random GWD rates ranging from 0 to − 100 mm/a for the upper half, and (5) assigning a uniform GWD rate of − 50 mm/a for a circular area in the central half of the study region but random GWD rates ranging from 0 to − 100 mm/a for its surroundings within the study region.
Second, synthetic data were used to demonstrate how signals vary during low-pass filtering and how the filtered signal is recovered using forward modeling in a more realistic environment. The modeled data are often obtained from GHMs/LSMs. Here, the utility of both unconstrained and constrained forward modeling for the three-state region was tested using GWD estimates from PCR-GLOBWB. We suggest that in practice there are three types of constrained forward modeling. The first type makes use of the global distribution of GWD simulations from PCR-GLOBWB, termed globally constrained forward modeling. Note that globally constrained forward modeling may suffer from uncertainties due to noise in GRACE signals. The second type only uses a broader area of the spatial pattern of synthetic data that includes the surroundings of a study region where significant signal variations outside the study region may occur, termed regionally constrained forward modeling. The third type makes use of spatial patterns of signal variations from models or observations completely restricted to a study region, termed locally constrained forward modeling.
Third, GWD rates for the three-state region during the periods 2003-2010, 2005-2010 and 2013-2012 were estimated using GRACE and GLDAS-1 LSM SMS changes within the framework of constrained and unconstrained forward modeling. The GWD distribution from PCR-GLOBWB was taken as a priori information for implementing constrained forward modeling. Following the three parts of work mentioned above, GWS anomaly time series from forward modeling were reconstructed and compared with those from traditional approaches, i.e., the single scaling factor approach, additive approach, and multiplicative approach. Our estimates of GWD for the three-state region were further compared with groundwater-level monitoring data and published studies, with advantages and limitations of different approaches for aquifers being systematically examined.
Error analysis. We derived error time series for GWS anomalies and subsequently performed weighted least square regression. Errors in GWS time series were computed by root mean square of errors of TWS changes and SMS changes 17 at the monthly scale. Errors in TWS changes include GRACE L1 measurement errors and leakage errors. The measurement errors were quantified by looking at the standard deviation of TWS changes over ocean for the same latitude of a study basin 44 , and the leakage errors were quantified by different LSMs to approximate TWS changes for a study region. Errors of SMS changes were quantified by the standard deviation of SMS from four GLDAS-1 LSMs and two GHMs (see Table S2) 17 .