
SMAP-HydroBlocks, a 30-m satellite-based soil moisture dataset for the conterminous US


Soil moisture plays a key role in controlling land-atmosphere interactions, with implications for water resources, agriculture, climate, and ecosystem dynamics. Although soil moisture varies strongly across the landscape, current monitoring capabilities are limited to coarse-scale satellite retrievals and a few regional in-situ networks. Here, we introduce SMAP-HydroBlocks (SMAP-HB), a high-resolution satellite-based surface soil moisture dataset at an unprecedented 30-m resolution (2015–2019) across the conterminous United States. SMAP-HB was produced by using a scalable cluster-based merging scheme that combines high-resolution land surface modeling, radiative transfer modeling, machine learning, SMAP satellite microwave data, and in-situ observations. We evaluated the resulting dataset over 1,191 observational sites. SMAP-HB performed substantially better than the current state-of-the-art SMAP products, showing a median temporal correlation of 0.73 ± 0.13 and a median Kling-Gupta Efficiency of 0.52 ± 0.20. The largest benefit of SMAP-HB is, however, the high spatial detail and improved representation of the soil moisture spatial variability and spatial accuracy with respect to SMAP products. The SMAP-HB dataset is available via Zenodo.

Measurement(s): wetness of soil
Technology Type(s): computational modeling technique
Factor Type(s): geographic location • temporal interval
Sample Characteristic - Environment: land • surface soil
Sample Characteristic - Location: contiguous United States of America

Machine-accessible metadata file describing the reported data:

Background & Summary

Detailed and accurate information on the spatiotemporal distribution of soil moisture is important for numerous applications, such as monitoring of drought1,2,3 and crop irrigation demands4,5,6; mapping antecedent conditions that trigger wildfires7,8, landslides9,10, and flooding11,12; and quantifying water, energy, and carbon fluxes between the land and atmosphere13,14,15. Depending on the landscape heterogeneity, such physical processes can occur at the 1–100 m spatial scale, at which in-situ sensors could provide detailed information. However, the representativeness of in-situ observations can be limited to only a few meters from the sensors, and the sensors are costly to deploy and maintain; in-situ observations are therefore not widely available at continental extents.

With satellite observations increasingly available16, optical and near-infrared satellite sensors (e.g., MODIS, Landsat, and Sentinel-2) can provide proxies for estimating soil moisture at high spatial resolution (10–250 m)17,18,19. However, estimates from these sensors can suffer from atmospheric attenuation, high cloud coverage, dense vegetation, and infrequent revisit times (~1–2 weeks). Alternatively, passive microwave sensors were designed to penetrate through clouds and dense vegetation to retrieve surface soil moisture with a 25–50-km spatial resolution and a 2–3-day revisit time20,21,22,23,24,25. NASA’s Soil Moisture Active-Passive mission22 (SMAP), for example, has a 36-km spatial resolution (or 9 km via the resampled SMAP L3 Enhanced23,26 product). Combining such passive sensors with an active sensor (e.g., Sentinel-1) and/or assimilating them into physical models can provide estimates with a 1–3-km27,28 and 9–25-km29,30,31,32 spatial resolution, respectively. These capabilities have been critical for regional- to global-scale water resources applications33,34. However, they still lack the spatial detail and accuracy necessary for local-scale (1–100 m) applications3,35,36. Thus, despite the increased demand, obtaining high-resolution data at continental extents remains a challenge.

To address the need for high-resolution satellite-based soil moisture estimates, Vergopolan et al.37 developed an approach that combines HydroBlocks, a cluster-based high-resolution land surface model, with a Tau-Omega Radiative Transfer Model (RTM). This approach fuses HydroBlocks-RTM outputs (30-m resolution) and SMAP L3 brightness temperature observations (36-km resolution) using a cluster-based merging scheme. The uniqueness of this approach resides in leveraging HydroBlocks’ complex tiling for merging satellite observations in the cluster space. In this way, satellite-based soil moisture at an effective 30-m resolution can be achieved in a computationally efficient manner that would otherwise be challenging to scale using traditional regular-grid approaches. Here, we introduce a new parameterization for this cluster-based merging scheme, which uses machine learning to regionalize the relationships between landscape characteristics and data from satellites, models, and in-situ observations. We apply this new approach to fuse brightness temperature from HydroBlocks-RTM (30-m resolution) and the SMAP L3 Enhanced (9-km resolution) product, and we demonstrate its scalability by developing SMAP-HydroBlocks (SMAP-HB), the first hyper-resolution38 satellite-based surface soil moisture dataset over a continental extent (Fig. 1). SMAP-HB is available at a 6-h temporal and 30-m spatial resolution (2015–2019) over the conterminous United States (CONUS).

SMAP-HB revealed substantial spatial variability (Fig. 1), reflecting the complex interactions between hydroclimate and topography across CONUS, but also the impact of soil properties and land use evident at the local scales (insets). SMAP-HB captures the imprint of river reaches and wet riparian corridors in both wet and dry hydroclimates, such as over the wetlands of the Okefenokee National Wildlife Refuge (inset 6) and the perennial tributaries replenished by snowmelt in California’s Sierra Nevada (inset 1). We evaluate the accuracy of SMAP-HB using in-situ observations and compare its performance against the HydroBlocks and SMAP L3E products (representing the baseline products) and NASA’s SMAP L4 data assimilation product29,39 at their respective spatial resolutions (Fig. 2). Overall, SMAP-HB has the best temporal statistics, with a Root Mean Square Error (RMSE) of 0.07 m3/m3 for the training and testing sites (Table S1), and Kling-Gupta Efficiency (KGE) scores of 0.53 and 0.48 for the training and testing sites, respectively. SMAP-HB showed temporal correlations of 0.71 and 0.77 at the training and testing sites, respectively, compared to 0.73 and 0.74 for SMAP L4. SMAP-HB performed substantially better than the baseline products (Fig. 2b). The largest gains are in the KGE score, with a 0.12 improvement compared to SMAP L3E. SMAP-HB showed the highest spatial accuracy (Fig. 3), evaluated through the spatial correlation across the CONUS (0.66), the New York Mesonet (0.42), and the Oklahoma Mesonet (0.54). As such, we anticipate this dataset can transform efforts to monitor water resources and natural hazards by enabling better representation and understanding of water, energy, and carbon cycle processes at spatial scales that have so far been unresolved.

Fig. 1

Climatology of SMAP-HB surface soil moisture, representing the top 5-cm of the soil column at 30-m spatial resolution over the CONUS (2015–2019). Insets show the soil moisture spatial detail at locations with different hydroclimatic and topographic conditions. For each location, two boxes of 100-km and 20-km size are shown along with illustrative Landsat satellite imagery. Water bodies are shown in blue. Inset labels show the scale bar. An interactive visualization of the 30-m data is available online.

Fig. 2

Temporal evaluation of daily SMAP L3E, SMAP L4, HydroBlocks, and SMAP-HB soil moisture products against in-situ observations. Evaluation statistics are the Pearson correlation, RMSE, and the KGE score. Panel (a) shows the temporal evaluation analysis split between in-situ observations used in the merging scheme random forest model (training sites, Table S1) and independent observations (the SMAP core calibration/validation sites). The in-situ observations were compared with the respective soil moisture product when data was simultaneously available for all four products, where n is the number of observational sites evaluated. To remove the influence of frozen soils, observations are masked when the HydroBlocks soil temperature is below 4 °C. Panel (b) shows the temporal statistics of the SMAP-HB product distributed in space and their respective improvement over those for the base products. The first row shows the correlation, RMSE, and KGE for SMAP-HB for all the sites. The following rows show the difference in the evaluation statistics between SMAP-HB and the base products. Blue colors indicate higher SMAP-HB performance. Inset histograms show the median and median absolute deviation values.

Fig. 3

Soil moisture spatial correlation. The spatial correlation was calculated for each day by comparing the soil moisture products at collocated grid cells with the in-situ observations over the CONUS, New York Mesonet, and Oklahoma Mesonet. This was done when at least 60 in-situ observations were available simultaneously at each time step. Panel (a) shows the time series of the CONUS spatial correlation. For clarity, we used a seven-day moving average. Panel (b) shows the summary statistics of the daily spatial correlation for each region, with n as the number of days evaluated in each comparison.


Satellite brightness temperature and soil moisture retrievals

We used data from NASA’s Soil Moisture Active Passive (SMAP) Mission, in particular version 3 of the L3 Enhanced Global 9-km product26 (SMAP L3E). Relative to other satellites, the SMAP L-Band microwave sensor tends to offer the best sensitivity to soil moisture in the top 5 cm of the soil40,41. The SMAP L3E provides morning and afternoon composites of brightness temperature, ancillary data for the Tau-Omega Radiative Transfer Model, retrieved soil moisture, time of measurement, and quality control flags. This product spans from 31 March 2015 to the present, with a 2–3-day revisit time. We used the vertically polarized brightness temperature corrected and flagged for the presence of frozen ground, snow cover, transient water, and active precipitation at the time of the satellite overpass. SMAP L3E soil moisture retrievals were only used for evaluation purposes.

To expand the soil moisture dataset evaluation, we included the SMAP L4 Global 3-h 9-km EASE-Grid Surface Soil Moisture Analysis Update, version 5 (ref. 39). This product was computed via dynamic assimilation of SMAP brightness temperatures into the NASA Catchment land surface model42 using a customized version of the Goddard Earth Observing System (GEOS) land data assimilation system.

HydroBlocks land surface model

Satellite Earth observation and physiographic data are increasingly available at higher spatial resolutions. However, traditional land surface models struggle to harness the opportunities afforded by these data due to their complex representation of physical processes, and they are unable to computationally scale with the massive data volumes across large domains. To address this challenge, the HydroBlocks land surface model was designed to leverage the repeating spatial patterns that exist over the landscape by implementing a hierarchical clustering algorithm to define its computational mesh43,44. This approach groups the fine-scale drivers of the landscape spatial heterogeneity (for example, 30-m land cover, soil properties, and topography data) into complex tiles/clusters of similar hydrologic behavior, herein called Hydrologic Response Units (HRUs)44,45. In this way, HydroBlocks simulates hydrological processes within the HRUs instead of regular grids, yielding an effective 30-m spatial resolution. This allows HydroBlocks to leverage the complex physics of land surface models while efficiently reducing the system’s dimensionality and computational requirements. For example, a 9-km grid box containing 90,000 30-m grid cells can be represented with ~300–500 clusters (a 180–300 times reduction) depending on the landscape complexity.
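The reduction quoted above can be checked with a few lines of arithmetic. The sketch below also illustrates, under simplifying assumptions, how one value per HRU can be mapped back onto the 30-m grid. The array `hru_map`, the cluster count of 400, and the random cluster assignment are hypothetical stand-ins and do not reproduce the actual HydroBlocks clustering.

```python
import numpy as np

# A 9-km box tiled by 30-m cells: 300 x 300 = 90,000 cells.
cells_per_side = 9_000 // 30
n_cells = cells_per_side ** 2

# Reduction factor for the two cluster counts quoted in the text.
reduction = {n_clusters: n_cells // n_clusters for n_clusters in (300, 500)}

# Hypothetical HRU map: each 30-m cell stores the index of its cluster.
rng = np.random.default_rng(0)
n_hru = 400
hru_map = rng.integers(0, n_hru, size=(cells_per_side, cells_per_side))

# One simulated value per HRU (e.g., soil moisture in m3/m3).
sm_hru = rng.uniform(0.05, 0.45, size=n_hru)

# Integer-array indexing copies each HRU value to all of its 30-m cells.
sm_grid = sm_hru[hru_map]
```

Because the model state is stored once per HRU, the gridded field is only materialized when a map is needed, which is what keeps the storage and computation proportional to the number of clusters rather than the number of grid cells.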

Here, HydroBlocks was set up to simulate soil moisture and soil temperature at a 3-h 30-m resolution for 2015–2019 (with model spin-up over 2010–2014). We used the 1-h 3-km Princeton CONUS Forcing45 (PCF) dataset as meteorological input. PCF downscales the North American Land Data Assimilation System 2 (NLDAS-2) data with several higher resolution datasets. PCF precipitation combines the Stage IV and Stage II radar/gauge products with NLDAS-2, and the shortwave radiation combines GOES Surface and Insolation Product (GSIP) with NLDAS-2. PCF also uses an elevation-based downscaling/fusion procedure to ensure physical consistency and mass/energy balance. To parameterize the land surface model, we used a 30-m SRTM-based elevation dataset46, which we post-processed to remove pits and to derive slope, aspect, topographic index, flow direction, flow accumulation values, and height above the nearest drainage. We used the 2016 30-m land cover classification from the National Land Cover Database47 (NLCD). The soil-water hydraulic parameters were from the 30-m Probabilistic Remapping of SSURGO48 (POLARIS) dataset. No model calibration was performed, to allow the in-situ soil moisture observations to be used for independent validation.

To obtain HRU-level 30-m brightness temperature estimates, HydroBlocks was combined with a Tau-Omega Radiative Transfer Model (HydroBlocks-RTM). Using the 30-m HydroBlocks soil moisture (of the top 5-cm of the soil column), soil temperature, 30-m POLARIS clay content, and the 9-km SMAP L3E ancillary data (albedo, vegetation optical depth, surface roughness), we computed the 30-m brightness temperature with HydroBlocks-RTM. Further details of the HydroBlocks-RTM implementation are presented in Vergopolan et al.37.
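For orientation, the forward step can be illustrated with the widely used zeroth-order tau-omega formulation. This is a minimal sketch, not the HydroBlocks-RTM implementation: it takes the soil reflectivity as an input, whereas a full model would derive it from soil moisture, clay content, and surface roughness. The function name, the 40° incidence angle default, and the example values are assumptions made for illustration.

```python
import math

def tau_omega_tb(soil_temp_k, canopy_temp_k, reflectivity, tau, omega,
                 incidence_deg=40.0):
    """Zeroth-order tau-omega brightness temperature (K), one polarization.

    reflectivity : rough-soil reflectivity (0-1), assumed given here
    tau          : vegetation optical depth at nadir
    omega        : single-scattering albedo of the canopy
    """
    # One-way canopy transmissivity along the slant path.
    gamma = math.exp(-tau / math.cos(math.radians(incidence_deg)))
    # Soil emission attenuated by the canopy.
    soil_term = soil_temp_k * (1.0 - reflectivity) * gamma
    # Canopy emission: upward part plus the part reflected by the soil.
    canopy_term = (canopy_temp_k * (1.0 - omega) * (1.0 - gamma)
                   * (1.0 + reflectivity * gamma))
    return soil_term + canopy_term

# Wetter soil has a higher reflectivity, hence a lower brightness temperature.
tb_dry = tau_omega_tb(290.0, 290.0, reflectivity=0.2, tau=0.1, omega=0.05)
tb_wet = tau_omega_tb(290.0, 290.0, reflectivity=0.4, tau=0.1, omega=0.05)
```

The inverse problem used later for retrieval amounts to finding the soil moisture whose reflectivity reproduces a given brightness temperature, with the vegetation and roughness parameters held fixed.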

Merging brightness temperature via spatial cluster-based Bayesian merging

We merged the 30-m resolution brightness temperature from HydroBlocks-RTM with the 9-km resolution SMAP L3E observed brightness temperature. To do this, we used a spatial cluster-based merging scheme, introduced in Vergopolan et al.37. This merging scheme is implemented such that, in a given time step, the fine-scale merged brightness temperature \({T}_{HB}^{+}\) can be derived according to the state update equation:

$${T}_{HB}^{+}={T}_{HB}^{-}+K({T}_{SMA{P}_{anom}}-H{T}_{H{B}_{anom}}^{-})\ast {w}_{short}+bias\ast {w}_{long}$$

where \({T}_{SMAP}\) is the SMAP brightness temperature observation resampled to 9-km (SMAP L3E product), \({T}_{HB}^{-}\) is the cluster-space HydroBlocks-RTM brightness temperature, and the anom subscript refers to the anomalies of each product. \({T}_{HB}^{+}\), \({T}_{HB}^{-}\), and \({T}_{H{B}_{anom}}^{-}\) have dimensions \({n}_{c}\times 1\), where \({n}_{c}\) is the total number of clusters in the domain. \({T}_{SMA{P}_{anom}}\) has dimensions \({n}_{s}\times 1\), where \({n}_{s}\) is the total number of SMAP grids in the domain. H is the observation operator that maps HydroBlocks-RTM brightness temperature anomalies (\({T}_{H{B}_{anom}}^{-}\)) from the cluster space to the SMAP grid space. H has dimensions \({n}_{s}\times {n}_{c}\), and it uses a Gaussian-shaped weighted area to account for the relative contribution of each cluster to each SMAP grid (Fig. 4).

Fig. 4

Spatial cluster-based Bayesian merging scheme. According to Eq. 1, this scheme merges the 30-m HydroBlocks-RTM brightness temperature estimates (\({T}_{HB}^{-}\), in the cluster space) with the 9-km SMAP L3E observed brightness temperature (\({T}_{SMAP}\), in the grid space) to obtain a fused 30-m brightness temperature estimate (\({T}_{HB}^{+}\), in the cluster space). Figure adapted from Vergopolan et al.37.

The difference between \({T}_{SMA{P}_{anom}}\) and \(H{T}_{H{B}_{anom}}^{-}\) accounts for the short-term (instantaneous) SMAP increments. The bias term accounts for the systematic seasonal differences between \({T}_{SMAP}\) and \(H{T}_{HB}^{-}\), and it was calculated using a 4-month moving window average. \({w}_{short}\) and \({w}_{long}\) are static parameters ranging between 0 and 1, and they are applied to control the contribution of SMAP anomalies and bias depending on how SMAP adds value to the merging scheme. As described in the section "Quantifying the added value of SMAP", to identify the added value of SMAP brightness temperature, we used a machine learning data-driven approach to extract relationships from in-situ observations, landscape characteristics, and SMAP ancillary data. The contribution of SMAP anomalies is also weighted by K, which represents the relative magnitude of the model and observation uncertainties:

$$K=P{H}^{T}{(HP{H}^{T}+R)}^{-1}$$


K also operates in the cluster space, and it has dimensions \({n}_{c}\times {n}_{s}\). R is the observation error covariance matrix, and P is the model error covariance matrix. R has dimensions \({n}_{s}\times {n}_{s}\), with the diagonal elements set to the SMAP radiometer uncertainty of 1.3² K² (ref. 49) and the off-diagonal elements set to zero, assuming the SMAP observation errors are uncorrelated with each other. For the model error covariance, we assume cluster pairs belonging to the same SMAP grid have correlated errors; otherwise, the errors are assumed to be uncorrelated. Thus, P has dimensions \({n}_{c}\times {n}_{c}\), with the entries of correlated cluster pairs set to the HydroBlocks brightness temperature uncertainty of 5² K² (ref. 37) and the entries of uncorrelated cluster pairs set to zero.
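The matrix shapes described above can be made concrete with a small numerical sketch. It assumes the standard Kalman-gain form \(K=P{H}^{T}{(HP{H}^{T}+R)}^{-1}\), which is consistent with the stated dimensions, and it simplifies two things: H uses equal weights per cluster rather than the Gaussian-shaped area weights, and the bias is assumed to be already expressed in the cluster space. The toy domain, array values, and function names are hypothetical.

```python
import numpy as np

def kalman_gain(P, H, R):
    """K = P H^T (H P H^T + R)^-1, with shape (nc, ns)."""
    S = H @ P @ H.T + R                      # (ns, ns)
    return P @ H.T @ np.linalg.inv(S)

def merge_step(T_hb, T_hb_anom, T_smap_anom, bias, H, K, w_short, w_long):
    """State update for one time step; cluster-space arrays have length nc."""
    innovation = T_smap_anom - H @ T_hb_anom  # (ns,)
    return T_hb + (K @ innovation) * w_short + bias * w_long

# Toy domain: 5 clusters, the first 3 in SMAP grid 0 and the last 2 in grid 1.
grid_of_cluster = np.array([0, 0, 0, 1, 1])
ns, nc = 2, grid_of_cluster.size

# Observation operator (ns x nc): equal weights here (a simplification).
H = np.zeros((ns, nc))
H[grid_of_cluster, np.arange(nc)] = 1.0
H /= H.sum(axis=1, keepdims=True)

# R: uncorrelated SMAP errors. P: errors correlated only within a SMAP grid.
R = np.eye(ns) * 1.3**2
P = (grid_of_cluster[:, None] == grid_of_cluster[None, :]) * 5.0**2

K = kalman_gain(P, H, R)

T_hb = np.array([250.0, 255.0, 260.0, 270.0, 272.0])
T_hb_anom = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
T_smap_anom = np.array([2.0, -1.0])
bias = np.full(nc, -1.5)
w_short = np.full(nc, 0.8)
w_long = np.full(nc, 0.4)

T_plus = merge_step(T_hb, T_hb_anom, T_smap_anom, bias, H, K, w_short, w_long)
```

With these settings, each cluster receives a gain of 25 / (25 + 1.69) from its own SMAP grid and zero from the other, so the model uncertainty dominates and the SMAP increment is passed through almost entirely before the weights are applied.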

With the merged brightness temperature estimates, we deployed the inverse HydroBlocks-RTM model to retrieve the 30-m (merged) satellite-based soil moisture estimates at each time step independently. This spatial cluster-based merging scheme allows for efficiently combining regular-grid observational data into the cluster space using matrices with an \({n}_{c}\) dimension of ~300–500 elements, instead of a fully distributed setup that would require ~90,000 elements (one per 30-m grid cell) for merging data over the same 9-km grid.

Quantifying the added value of SMAP

Models and satellites have variable accuracy across the landscape, and these differences are reflected in the accuracy of merged products. Thus, identifying where the satellite data adds value and by how much is critical to improving the estimates. Here, we map the added value of SMAP brightness temperatures that would result in merged soil moisture with the highest accuracy. To this aim, we quantified the added value of SMAP based on how SMAP seasonal mean (4-month moving window) and anomalies (instantaneous differences with respect to the seasonal mean) improve soil moisture estimates with respect to the HydroBlocks model.

This approach relies on identifying the \({w}_{short}\) and \({w}_{long}\) parameters in Eq. 1 that result in merged soil moisture with the highest KGE score (defined in the Technical Validation section). Since these parameters control the contribution of SMAP to the merged brightness temperature, the higher the parameter values, the more SMAP contributes. When the parameters are close to zero, SMAP adds limited value with respect to the model. To quantify the \({w}_{short}\) and \({w}_{long}\) parameters, we used in-situ soil moisture observations from 958 sites distributed across the CONUS (training sites in Table S1). We identified the added value at each site by testing all possible combinations of \({w}_{short}\) and \({w}_{long}\) (each parameter varied in 0.01 increments between 0 and 1), and we selected the pair that resulted in merged soil moisture with the highest KGE score. In this way, we compiled an observation-based training sample of \({w}_{short}\) and \({w}_{long}\), shown in Fig. 5a. Subsequently, we used this sample to train a random forest model (RF) to predict the added value of SMAP based on the relationship learned from physiographic and SMAP ancillary data predictors, listed in Table S2. For model training, the value of each RF predictor was taken from the predictor grid cell collocated with each observation site. All the predictors were normalized based on their maximum and minimum values. For model prediction, the value of each RF predictor was defined as the predictor spatial mean at each cluster, with each predictor normalized based on the training set maximum and minimum values. In this way, after the RF is trained, it enables the prediction of the added value of SMAP seasonal mean and anomalies at each cluster, instead of every 30-m grid cell, while still yielding an effective 30-m spatial resolution.
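The exhaustive search over weight pairs at a single site can be sketched as follows. In the actual workflow the merged soil moisture comes from Eq. 1 followed by the inverse radiative transfer model; here a toy linear blend, `toy_merge`, stands in for that step so the example stays self-contained. All names and the synthetic series are hypothetical.

```python
import numpy as np

def kge(prod, obs):
    """Kling-Gupta Efficiency of a product against observations."""
    rho = np.corrcoef(prod, obs)[0, 1]
    beta = prod.mean() / obs.mean()
    gamma = (prod.std() / prod.mean()) / (obs.std() / obs.mean())
    return 1.0 - np.sqrt((rho - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

def best_weights(merge_fn, obs, step=0.01):
    """Test every (w_short, w_long) pair on a regular grid in [0, 1]."""
    weights = np.round(np.arange(0.0, 1.0 + step / 2, step), 2)
    best = (-np.inf, None, None)
    for w_short in weights:
        for w_long in weights:
            score = kge(merge_fn(w_short, w_long), obs)
            if score > best[0]:
                best = (score, w_short, w_long)
    return best

# Synthetic site: observations equal the model plus 60% of the anomaly
# increment and 30% of the bias correction (values chosen for illustration).
rng = np.random.default_rng(42)
model = 0.25 + 0.05 * rng.standard_normal(500)
anomaly = 0.03 * rng.standard_normal(500)
bias = 0.05
obs = model + 0.6 * anomaly + 0.3 * bias

def toy_merge(w_short, w_long):
    # Stand-in for Eq. 1 plus the inverse RTM retrieval.
    return model + w_short * anomaly + w_long * bias

score, w_short_best, w_long_best = best_weights(toy_merge, obs)
```

The search evaluates 101 × 101 pairs per site, which is inexpensive for a single time series and is repeated independently at each training site to build the sample used for the random forest.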

Fig. 5

The added value of SMAP L3 Enhanced brightness temperature. The top row (a) shows the SMAP added value identified at 958 in-situ sites. The added value represents how much SMAP contributed to obtaining merged soil moisture with the highest KGE score. Values close to one indicate that SMAP fully contributed to improving soil moisture accuracy, while values close to zero show that the soil moisture accuracy was not impacted by merging SMAP, and thus the added value is minimal. The bottom row (b) shows the spatial distribution of SMAP added value predicted using a random forest model. This model was trained on the added value of the 958 in-situ sites (a), SMAP ancillary data, and landscape characteristics (Table S2). The added value of SMAP seasonal means and anomalies were quantified jointly, but their contributions are shown separately. The SMAP added value was applied to parameterize the SMAP-HB merging scheme in Eq. 1 via the \({w}_{short}\) and \({w}_{long}\) parameters.

This approach was applied to predict the added value of SMAP seasonal mean and anomalies across the CONUS (Fig. 5b). The seasonal mean represents the overall wet and dry biases of soil moisture, and the anomalies represent instantaneous contributions, such as from rainfall, irrigation, and flooding. Fig. 5b shows how the SMAP seasonal mean adds more value in the Northern Great Plains, in the dry and heavily irrigated Southwest and California Central Valley, and in the wet and sandy soils of the Mississippi floodplains, correcting for the model bias. Short-term contributions (anomalies) tend to be more relevant across the irrigated Great Plains and in the sandier soil conditions of the West Coast and the Atlantic Coastal Plain, where SMAP can capture the timing of wetting events better than model-only estimates. This implies that at these locations SMAP helps correct deficiencies in the precipitation data or in the way the model translates precipitation into soil moisture. However, SMAP anomalies provide a limited contribution in the northeast US, the Rocky Mountains, and the Appalachian Mountains, which could be attributable to the confounding effects of complex terrain and dense vegetation on the satellite retrievals, as well as to SMAP’s limited quality control in snow-dominated regions50. The added value of SMAP seasonal mean and anomalies was applied to parameterize the SMAP-HB merging scheme in Eq. 1 via the \({w}_{short}\) and \({w}_{long}\) parameters. This observation-driven parameterization enabled the merging scheme to benefit from the information contained in in-situ observations and physical landscape characteristics without solely relying on error covariances.
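The regionalization step can be sketched with scikit-learn's RandomForestRegressor, the class named in the Code availability section. The sketch below uses synthetic predictors and targets, and it makes two assumptions that the text does not confirm: both weights are predicted by a single multi-output forest, and the hyperparameters are the illustrative values shown. Predictor counts and array names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_sites, n_clusters, n_predictors = 958, 2000, 6

# Synthetic predictors: sampled at the in-situ sites (training) and
# averaged over each cluster (prediction).
X_sites = rng.normal(size=(n_sites, n_predictors))
X_clusters = rng.normal(size=(n_clusters, n_predictors))

# Synthetic targets: the (w_short, w_long) pair selected at each site.
y_sites = rng.uniform(0.0, 1.0, size=(n_sites, 2))

# Min-max normalization; the training-set range is reused for prediction.
x_min = X_sites.min(axis=0)
x_max = X_sites.max(axis=0)

def normalize(X):
    return (X - x_min) / (x_max - x_min)

# Assumption: one forest predicting both weights jointly.
rf = RandomForestRegressor(n_estimators=50, random_state=0)
rf.fit(normalize(X_sites), y_sites)

w_pred = rf.predict(normalize(X_clusters))      # shape (n_clusters, 2)
w_short_pred, w_long_pred = w_pred[:, 0], w_pred[:, 1]
```

Because a random forest averages training targets within its leaves, the predicted weights stay within the range of the training weights, so no additional clipping to the 0–1 interval is needed.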

Data Records

The SMAP-HydroBlocks surface soil moisture dataset at 30-m 6-h resolution (2015–2019) comprises 22 TB of data (with maximum compression). Due to the storage limitations of online repositories, we provide the raw data at the HRU level (time, hru) compressed to 33.8 GB. Python code and instructions for post-processing the data into geographic coordinates (time, latitude, longitude) are provided on GitHub. An aggregated version at 1-km 6-h resolution, already post-processed into geographic coordinates (time, latitude, longitude), is also available and comprises 31.5 GB of data. Data are available for download from the Zenodo repository51. Different subsets of the data can also be made available upon request from the primary author. Please provide details on the intended use and the desired spatial and temporal resolution, domain, and period of interest in your request. Data will be provided via a Google Drive shared link. The data are provided in self-describing netCDF-4 format and referenced to the World Geodetic System 1984 (WGS 84) ellipsoid. The netCDF-4 files can be viewed, edited, and analyzed using most Geographic Information Systems (GIS) software packages, including ArcGIS, QGIS, and GRASS. As an illustration, a 30-m map of the SMAP-HB annual and long-term climatology can be viewed through an interactive web interface.

Technical Validation

We quantified the spatial and temporal accuracy of the SMAP-HB 30-m soil moisture using observations from in-situ sensors at 1,191 sites. We compared it with the performance of the HydroBlocks and the SMAP L3E products (representing the baseline products), and the state-of-the-art SMAP L4 data assimilation product. Our evaluation used mean daily in-situ observations at the soil moisture products’ collocated grid cell only at the time steps in which all soil moisture products were simultaneously available. To remove the impact of frozen soils in the evaluation, we masked the soil moisture estimates when the HydroBlocks soil temperature was below 4 degrees Celsius.
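The two screening rules described above (simultaneous availability of all products and the 4 °C soil temperature threshold) can be expressed as a single boolean mask. This is a minimal sketch with hypothetical product names and values; missing data are assumed to be encoded as NaN.

```python
import numpy as np

def common_valid_mask(products, soil_temp_c, min_temp_c=4.0):
    """True where every product has data and the soil is not colder than
    the threshold; time steps below `min_temp_c` are masked out."""
    mask = soil_temp_c >= min_temp_c
    for series in products.values():
        mask &= np.isfinite(series)
    return mask

# Four daily time steps at one site (illustrative values, m3/m3 and deg C).
products = {
    "smap_hb": np.array([0.20, np.nan, 0.25, 0.30]),
    "smap_l4": np.array([0.21, 0.22, np.nan, 0.28]),
}
soil_temp_c = np.array([10.0, 12.0, 8.0, 2.0])

mask = common_valid_mask(products, soil_temp_c)
```

Only the first time step survives here: the second and third are missing in one product each, and the fourth falls below the temperature threshold.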

The temporal evaluation was split between 958 training sites (used to parameterize our merging scheme via machine learning) and 233 independent testing sites (SMAP core calibration/validation sites; see Table S1 and refs. 52–63). Training sites were selected such that none was within a 25-km radius of a testing site. We evaluate the soil moisture performance in terms of the temporal Pearson correlation, the Root Mean Squared Error (RMSE), and the Kling-Gupta Efficiency (KGE) score. The KGE score combines the linear Pearson correlation (ρ), the bias ratio (β), and the variability ratio (γ):

$$KGE=1-\sqrt{{(\rho -1)}^{2}+{(\beta -1)}^{2}+{(\gamma -1)}^{2}}\quad \quad \beta =\frac{{\mu }_{prod}}{{\mu }_{obs}}\quad \quad \gamma =\frac{{\sigma }_{prod}/{\mu }_{prod}}{{\sigma }_{obs}/{\mu }_{obs}}$$

where μ and σ are the temporal mean and standard deviation of the soil moisture products (prod) and the observations (obs).
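The KGE definition above translates directly into code. The sketch below returns the three components alongside the score, which helps diagnose whether a low KGE stems from correlation, bias, or variability. The function name and the short example series are illustrative.

```python
import numpy as np

def kge_components(prod, obs):
    """Return (KGE, rho, beta, gamma) for a product against observations."""
    prod = np.asarray(prod, dtype=float)
    obs = np.asarray(obs, dtype=float)
    rho = np.corrcoef(prod, obs)[0, 1]                  # linear correlation
    beta = prod.mean() / obs.mean()                     # bias ratio
    gamma = (prod.std() / prod.mean()) / (obs.std() / obs.mean())  # CV ratio
    kge = 1.0 - np.sqrt((rho - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    return kge, rho, beta, gamma

# A product that tracks the observations perfectly but is wet-biased.
obs = np.array([0.10, 0.15, 0.22, 0.30, 0.26, 0.18])
prod = obs + 0.05
kge, rho, beta, gamma = kge_components(prod, obs)
```

In this example the correlation is perfect, yet the constant offset raises the bias ratio above one and lowers the variability ratio below one (the same standard deviation is divided by a larger mean), so the KGE falls well short of its optimum of 1.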

Fig. 2a presents the temporal evaluation results. Overall, SMAP-HB has the best temporal statistics, with RMSE values of 0.07 m3/m3 for both the training and testing sites, and Kling-Gupta Efficiency (KGE) scores of 0.53 and 0.48 for the training and testing sites, respectively. While SMAP-HB median temporal correlations were 0.71 and 0.77 at the training and testing sites, respectively, the values for SMAP L4 were 0.73 and 0.74. SMAP L4 generally performed better than SMAP-HB in terms of temporal correlation at mountainous and snow-dominated sites (e.g., at SNOTEL sites; see Fig. S1). The higher skill of SMAP L4 at these sites could be associated with the benefit of assimilating in-situ precipitation observations into the meteorological forcings of the Catchment land surface model64.

To also quantify the added value of our merging scheme at the point level, we evaluated the temporal statistics spatially (Fig. 2b). SMAP-HB correlation, bias, and RMSE values across the CONUS are spatially homogeneous, with an overall improvement with respect to the baseline products (SMAP L3E and HydroBlocks). SMAP-HB showed a median improvement of 0.03 in temporal correlation with respect to the SMAP L3E product. However, the largest gains are observed in the KGE score, with a median improvement of 0.12 in comparison to SMAP L3E. This KGE improvement consolidates overall improvements in temporal correlation, bias ratio, and variability ratio. Figs. S1 and S2 present additional temporal evaluation statistics stratified by soil moisture network, soil type, elevation, and vegetation type, among others.

To assess the soil moisture products’ performance in representing spatial dynamics, the spatial correlation was calculated for each day by comparing the daily soil moisture products at the collocated grid cells with the daily in-situ observations over CONUS, the New York Mesonet, and the Oklahoma Mesonet. Aiming for statistical significance, the spatial correlation was only calculated when at least 60 in-situ observations and soil moisture products were available simultaneously at a given time step. As such, the spatial correlation quantifies, at each time step, the extent to which the soil moisture products represent the soil moisture spatial variability. Fig. 3 shows that HydroBlocks and SMAP-HB presented the highest spatial correlation across the CONUS, the New York Mesonet, and the Oklahoma Mesonet. SMAP-HB spatial correlation was 0.66 over CONUS, 0.42 over the New York Mesonet, and 0.54 over the Oklahoma Mesonet. The largest SMAP-HB improvement is observed at the New York Mesonet, where the HydroBlocks spatial correlation was 0.32 and the SMAP L4 spatial correlation was 0.23. However, a caveat of this spatial correlation analysis is that it includes the training in-situ observations (also used to parameterize the merging scheme).
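The daily spatial correlation with the 60-site minimum can be sketched as follows. Inputs are assumed to be arrays of shape (days, sites) holding the product value at each site's collocated grid cell and the in-situ value, with NaN for missing data. The function name and synthetic data are hypothetical.

```python
import numpy as np

def daily_spatial_correlation(prod, obs, min_sites=60):
    """Pearson correlation across sites for each day.

    prod, obs : arrays of shape (n_days, n_sites), NaN where missing.
    Days with fewer than `min_sites` valid pairs are returned as NaN.
    """
    n_days = prod.shape[0]
    corr = np.full(n_days, np.nan)
    for t in range(n_days):
        valid = np.isfinite(prod[t]) & np.isfinite(obs[t])
        if valid.sum() >= min_sites:
            corr[t] = np.corrcoef(prod[t, valid], obs[t, valid])[0, 1]
    return corr

# Synthetic example: 10 days, 100 sites, product = observations + noise.
rng = np.random.default_rng(1)
obs = 0.25 + 0.08 * rng.standard_normal((10, 100))
prod = obs + 0.04 * rng.standard_normal((10, 100))

# On the first day only 50 sites report, so no correlation is computed.
obs[0, 50:] = np.nan

corr = daily_spatial_correlation(prod, obs)
```

The resulting daily series can then be summarized per region or smoothed with a moving average for display.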

Usage Notes

Given its spatial detail, the SMAP-HB dataset will be useful for resolving many physical processes and applications at spatial scales that so far have been unresolved. These applications include mapping and understanding crop irrigation demands4,6, farmer decision making and planting dates65, and drought impacts1,2,3, as well as mapping antecedent soil moisture conditions to help estimate the susceptibility to wildfires7,8, landslides9,10, flooding, and waterlogging conditions11,12. Detailed soil moisture information can aid and improve the quantification of biogeochemical cycles in wetlands and riparian zones66, as well as better inform the environmental conditions that facilitate epidemic outbreaks of, for example, West Nile virus67, malaria68, and locust69. SMAP-HB’s improved characterization of soil moisture spatial variability can inform the parameterization of atmospheric convection models70, directly supporting climate and weather predictions71. However, uncertainties remain and some caveats should be considered:

  • SMAP-HB estimates the volumetric surface soil moisture content of the top 5-cm of the soil based on SMAP-observed brightness temperature. As such, SMAP-HB retrievals are only available when and where SMAP has non-flagged brightness temperature observations.

  • SMAP-HB showed lower temporal correlation at sites of high elevation (Fig. S2b), such as sites belonging to the SNOTEL network (Fig. S1). This could be due to (i) the confounding effects of topographic relief on the upwelling microwave brightness temperature observed by the radiometer; (ii) the likely more frequent presence of frozen or snow-covered soils that were not captured by quality control, but can affect both the in-situ measurements and the satellite retrievals; and (iii) the lower quality of the precipitation data (due to terrain blockage of radar beams, a lower rain gauge density, and a relatively high spatial heterogeneity in precipitation). In fact, Beck et al.72 demonstrated that the precipitation forcing can play a large role in driving the temporal correlation accuracy of the soil moisture products that were derived from merging approaches that include physically-based modeling.

  • Although not quantified due to limited in-situ observation coverage, we expect high uncertainties near urban areas, given limitations in characterizing hydrological processes in urban and human-managed settings, as well as limited model capability in representing drainage networks. High uncertainties and NoData values are expected in coastal areas and near large water bodies due to microwave signal contamination.

  • With respect to irrigation, due to the large footprint of the SMAP sensor, SMAP-HB is limited to only capturing large-scale irrigation signals. To capture the impact of local-scale patchy irrigation, future work will include the assimilation of thermal sensors and an irrigation module into the HydroBlocks model. Such improvements on data and methods would benefit not only the spatial and temporal accuracy but may also enhance capabilities for local-scale applications.

Code availability

Source code for the HydroBlocks land surface model is available at The Random Forest model used to parameterize the merging scheme was implemented using the RandomForestRegressor class of the scikit-learn Python module. While not written as a portable library or toolset, code is available upon request.
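For readers unfamiliar with the scikit-learn class named above, the following is a minimal sketch of fitting a RandomForestRegressor. It does not reproduce the merging-scheme parameterization: the three predictors, the target, the sample sizes, and the hyperparameters are synthetic placeholders chosen purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic placeholder data (assumption): three generic predictors in [0, 1]
# and a target that depends on the first two plus a small amount of noise.
rng = np.random.default_rng(0)
n_samples = 500
X = rng.uniform(0.0, 1.0, size=(n_samples, 3))
y = 0.6 * X[:, 0] - 0.3 * X[:, 1] + 0.05 * rng.normal(size=n_samples)

# Fit on the first 400 samples and hold out the remaining 100 for a quick check.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:400], y[:400])

r2_holdout = model.score(X[400:], y[400:])   # coefficient of determination
importances = model.feature_importances_     # one value per predictor, sums to 1

print(f"hold-out R^2: {r2_holdout:.2f}")
print("feature importances:", np.round(importances, 2))
```

Fixing random_state makes the fit repeatable, and the hold-out score gives a quick check against overfitting before the model is applied elsewhere.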


References

  1. Bolten, J. D., Crow, W. T., Zhan, X., Jackson, T. J. & Reynolds, C. A. Evaluating the utility of remotely sensed soil moisture retrievals for operational agricultural drought monitoring. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 3, 57–66 (2010).
  2. Champagne, C., White, J., Berg, A., Belair, S. & Carrera, M. Impact of soil moisture data characteristics on the sensitivity to crop yields under drought and excess moisture conditions. Remote Sensing 11, 372 (2019).
  3. Vergopolan, N. et al. Field-scale soil moisture bridges the spatial-scale gap between drought monitoring and agricultural yields. Hydrology and Earth System Sciences (2021).
  4. Lawston, P. M., Santanello, J. A. & Kumar, S. V. Irrigation signals detected from SMAP soil moisture retrievals. Geophysical Research Letters 44, 11,860–11,867 (2017).
  5. Karthikeyan, L., Chawla, I. & Mishra, A. K. A review of remote sensing applications in agriculture for food security: Crop growth and yield, irrigation, and crop losses. Journal of Hydrology 586, 124905 (2020).
  6. Abolafia-Rosenzweig, R., Livneh, B., Small, E. & Kumar, S. Soil moisture data assimilation to estimate irrigation water use. Journal of Advances in Modeling Earth Systems 11, 3670–3690 (2019).
  7. Taufik, M. et al. Amplification of wildfire area burnt by hydrological drought in the humid tropics. Nature Climate Change 7, 428–431 (2017).
  8. O, S., Hou, X. & Orth, R. Observational evidence of wildfire-promoting soil moisture anomalies. Scientific Reports 10 (2020).
  9. Brocca, L. et al. Use of satellite soil moisture products for the operational mitigation of landslides risk in central Italy. Satellite Soil Moisture Retrieval 231–247 (2016).
  10. Wang, S., Zhang, K., van Beek, L. P., Tian, X. & Bogaard, T. A. Physically-based landslide prediction over a large region: Scaling low-resolution hydrological model results for high-resolution slope stability assessment. Environmental Modelling & Software 124, 104607 (2020).
  11. Berghuijs, W. R., Woods, R. A., Hutton, C. J. & Sivapalan, M. Dominant flood generating mechanisms across the United States. Geophysical Research Letters 43, 4382–4390 (2016).
  12. Zhu, Z., Wright, D. B. & Yu, G. The impact of rainfall space-time structure in flood frequency analysis. Water Resources Research 54, 8983–8998 (2018).
  13. Zheng, Y., Brunsell, N. A., Alfieri, J. G. & Niyogi, D. Impacts of land cover heterogeneity and land surface parameterizations on turbulent characteristics and mesoscale simulations. Meteorology and Atmospheric Physics (2021).
  14. Rouholahnejad Freund, E., Fan, Y. & Kirchner, J. W. Global assessment of how averaging over spatial heterogeneity in precipitation and potential evapotranspiration affects modeled evapotranspiration rates. Hydrology and Earth System Sciences 24, 1927–1938 (2020).
  15. Trugman, A. T., Medvigy, D., Mankin, J. S. & Anderegg, W. R. L. Soil moisture stress as a major driver of carbon cycle uncertainty. Geophysical Research Letters 45, 6495–6503 (2018).
  16. McCabe, M. F. et al. The future of Earth observation in hydrology. Hydrology and Earth System Sciences 21, 3879–3914 (2017).
  17. Sadeghi, M., Babaeian, E., Tuller, M. & Jones, S. B. The optical trapezoid model: A novel approach to remote sensing of soil moisture applied to Sentinel-2 and Landsat-8 observations. Remote Sensing of Environment 198, 52–68 (2017).
  18. Ojha, N. et al. Stepwise disaggregation of SMAP soil moisture at 100 m resolution using Landsat-7/8 data and a varying intermediate resolution. Remote Sensing 11, 1863 (2019).
  19. Sabaghy, S. et al. Comprehensive analysis of alternative downscaled soil moisture products. Remote Sensing of Environment 239, 111586 (2020).
  20. Parinussa, R. M., Holmes, T. R. H., Wanders, N., Dorigo, W. A. & de Jeu, R. A. M. A preliminary study toward consistent soil moisture from AMSR2. Journal of Hydrometeorology 16, 932–947 (2015).
  21. Wagner, W. et al. The ASCAT soil moisture product: A review of its specifications, validation results, and emerging applications. Meteorologische Zeitschrift 22, 5–33 (2013).
  22. Entekhabi, D. et al. The Soil Moisture Active Passive (SMAP) mission. Proceedings of the IEEE 98, 704–716 (2010).
  23. Chan, S. et al. Development and assessment of the SMAP enhanced passive soil moisture product. Remote Sensing of Environment 204, 931–941 (2018).
  24. Kerr, Y. H. et al. The SMOS soil moisture retrieval algorithm. IEEE Transactions on Geoscience and Remote Sensing 50, 1384–1403 (2012).
  25. Gruber, A., Scanlon, T., van der Schalie, R., Wagner, W. & Dorigo, W. Evolution of the ESA CCI soil moisture climate data records and their underlying merging methodology. Earth System Science Data 11, 717–739 (2019).
  26. O’Neill, P. et al. SMAP enhanced L3 radiometer global daily 9 km EASE-Grid soil moisture, version 3 (2019).
  27. Das, N. N. et al. The SMAP and Copernicus Sentinel 1A/B microwave active-passive high resolution surface soil moisture product. Remote Sensing of Environment 233, 111380 (2019).
  28. Bauer-Marschallinger, B. et al. Toward global soil moisture monitoring with Sentinel-1: Harnessing assets and overcoming obstacles. IEEE Transactions on Geoscience and Remote Sensing 57, 520–539 (2019).
  29. Reichle, R. H. et al. Version 4 of the SMAP Level-4 soil moisture algorithm and data product. Journal of Advances in Modeling Earth Systems 11, 3106–3130 (2019).
  30. Hersbach, H. et al. The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society (2020).
  31. Martens, B. et al. GLEAM v3: satellite-based land evaporation and root-zone soil moisture. Geoscientific Model Development 10, 1903–1925 (2017).
  32. Lievens, H. et al. Joint Sentinel-1 and SMAP data assimilation to improve soil moisture estimates. Geophysical Research Letters 44, 6145–6153 (2017).
  33. Brocca, L., Ciabatta, L., Massari, C., Camici, S. & Tarpanelli, A. Soil moisture for hydrological applications: Open questions and new opportunities. Water 9, 140 (2017).
  34. Sadri, S. et al. A global near-real-time soil moisture index monitor for food security using integrated SMOS and SMAP. Remote Sensing of Environment 246, 111864 (2020).
  35. Foster, T., Mieno, T. & Brozović, N. Satellite-based monitoring of irrigation water use: Assessing measurement errors and their implications for agricultural water management policy. Water Resources Research 56 (2020).
  36. Peng, J. et al. A roadmap for high-resolution satellite soil moisture applications–confronting product characteristics with user requirements. Remote Sensing of Environment 252, 112162 (2021).
  37. Vergopolan, N. et al. Combining hyper-resolution land surface modeling with SMAP brightness temperatures to obtain 30-m soil moisture estimates. Remote Sensing of Environment 242, 111740 (2020).
  38. Wood, E. F. et al. Hyperresolution global land surface modeling: Meeting a grand challenge for monitoring Earth’s terrestrial water. Water Resources Research 47 (2011).
  39. Reichle, R. et al. SMAP L4 global 3-hourly 9 km EASE-Grid surface and root zone soil moisture analysis update, version 5 (2020).
  40. O’Neill, P., Bindlish, R., Chan, S., Njoku, E. & Jackson, T. Algorithm theoretical basis document. Level 2 & 3 soil moisture (passive) data products (2018).
  41. Kumar, S. V., Dirmeyer, P. A., Peters-Lidard, C. D., Bindlish, R. & Bolten, J. Information theoretic evaluation of satellite soil moisture retrievals. Remote Sensing of Environment 204, 392–400 (2018).
  42. Koster, R. D., Suarez, M. J., Ducharne, A., Stieglitz, M. & Kumar, P. A catchment-based approach to modeling land surface processes in a general circulation model: 1. Model structure. Journal of Geophysical Research: Atmospheres 105, 24809–24822 (2000).
  43. Chaney, N. W., Metcalfe, P. & Wood, E. F. HydroBlocks: a field-scale resolving land surface model for application over continental extents. Hydrological Processes 30, 3543–3559 (2016).
  44. Chaney, N. W., Torres-Rojas, L., Vergopolan, N. & Fisher, C. K. Two-way coupling between the sub-grid land surface and river networks in Earth system models. Geoscientific Model Development (2020).
  45. Chaney, N. W. et al. Harnessing big data to rethink land heterogeneity in Earth system models. Hydrology and Earth System Sciences 22, 3311–3330 (2018).
  46. Danielson, J. J. & Gesch, D. B. Global multi-resolution terrain elevation data 2010 (GMTED2010) (US Department of the Interior, US Geological Survey, 2011).
  47. Homer, C. G. et al. Completion of the 2011 National Land Cover Database for the conterminous United States–representing a decade of land cover change information. Photogrammetric Engineering and Remote Sensing 81, 345–354 (2011).
  48. Chaney, N. W. et al. POLARIS soil properties: 30-m probabilistic maps of soil properties over the contiguous United States. Water Resources Research 55, 2916–2938 (2019).
  49. Piepmeier, J. R. et al. SMAP L-band microwave radiometer: Instrument design and first year on orbit. IEEE Transactions on Geoscience and Remote Sensing 55, 1954–1966 (2017).
  50. Kraatz, S. et al. Evaluation of SMAP freeze/thaw retrieval accuracy at core validation sites in the contiguous United States. Remote Sensing 10, 1483 (2018).
  51. Vergopolan, N. et al. SMAP-HydroBlocks: Hyper-resolution satellite-based soil moisture over the continental United States. Zenodo (2021).
  52. Bell, J. E. et al. U.S. Climate Reference Network soil moisture and temperature observations. Journal of Hydrometeorology 14, 977–988 (2013).
  53. Brotzge, J. A. et al. A technical overview of the New York State Mesonet standard network. Journal of Atmospheric and Oceanic Technology 37, 1827–1845 (2020).
  54. McPherson, R. A. et al. Statewide monitoring of the mesoscale environment: A technical update on the Oklahoma Mesonet. Journal of Atmospheric and Oceanic Technology 24, 301–321 (2007).
  55. Larson, K. M. et al. Use of GPS receivers as a soil moisture network for water cycle studies. Geophysical Research Letters 35 (2008).
  56. Keefer, T. O., Moran, M. S. & Paige, G. B. Long-term meteorological and soil hydrology database, Walnut Gulch Experimental Watershed, Arizona, United States. Water Resources Research 44 (2008).
  57. Bosch, D. D. et al. Little River Experimental Watershed database. Water Resources Research 43 (2007).
  58. Cosh, M. H., Jackson, T. J., Starks, P. & Heathman, G. Temporal stability of surface soil moisture in the Little Washita River watershed and its applications in satellite soil moisture product validation. Journal of Hydrology 323, 168–177 (2006).
  59. Seyfried, M. S., Murdock, M. D., Hanson, C. L., Flerchinger, G. N. & Van Vactor, S. Long-term soil water content database, Reynolds Creek Experimental Watershed, Idaho, United States. Water Resources Research 37, 2847–2851 (2001).
  60. Coopersmith, E. J., Cosh, M. H., Petersen, W. A., Prueger, J. & Niemeier, J. J. Soil moisture model calibration and validation: An ARS watershed on the South Fork Iowa River. Journal of Hydrometeorology 16, 1087–1101 (2015).
  61. Colliander, A. et al. Validation of SMAP surface soil moisture products with core validation sites. Remote Sensing of Environment 191, 215–231 (2017).
  62. Ma, S., Baldocchi, D., Wolf, S. & Verfaillie, J. Slow ecosystem responses conditionally regulate annual carbon balance over 15 years in Californian oak-grass savanna. Agricultural and Forest Meteorology 228–229, 252–264 (2016).
  63. Dorigo, W. A. et al. The International Soil Moisture Network: a data hosting facility for global in situ soil moisture measurements. Hydrology and Earth System Sciences 15, 1675–1698 (2011).
  64. Reichle, R. H. et al. The contributions of gauge-based precipitation and SMAP brightness temperature observations to the skill of the SMAP Level-4 soil moisture product. Journal of Hydrometeorology 22, 405–424 (2021).
  65. Waldman, K. B. et al. Cognitive biases about climate variability in smallholder farming systems in Zambia. Weather, Climate, and Society 11, 369–383 (2019).
  66. Dabrowska-Zielinska, K. et al. Assessment of carbon flux and soil moisture in wetlands applying Sentinel-1 data. Remote Sensing 8, 756 (2016).
  67. Keyel, A. C. et al. Seasonal temperatures and hydrological conditions improve the prediction of West Nile virus infection rates in Culex mosquitoes and human case counts in New York and Connecticut. PLOS ONE 14, e0217854 (2019).
  68. Bomblies, A., Duchemin, J.-B. & Eltahir, E. A. A mechanistic approach for accurate simulation of village scale malaria transmission. Malaria Journal 8 (2009).
  69. Gómez, D., Salvador, P., Sanz, J. & Casanova, J. L. Modelling desert locust presences using 32-year soil moisture data on a large-scale. Ecological Indicators 117, 106655 (2020).
  70. Tawfik, A. B., Lawrence, D. M. & Dirmeyer, P. A. Representing subgrid convective initiation in the Community Earth System Model. Journal of Advances in Modeling Earth Systems 9, 1740–1758 (2017).
  71. Dirmeyer, P. A. & Halder, S. Sensitivity of numerical weather forecasts to initial soil moisture variations in CFSv2. Weather and Forecasting 31, 1973–1983 (2016).
  72. Beck, H. E. et al. Evaluation of 18 satellite- and model-based soil moisture products using in situ measurements from 826 sensors. Hydrology and Earth System Sciences 25, 17–40 (2021).



We thank the researchers and funding agencies that maintain the open-source in-situ soil moisture networks, and the researchers from the SMAP Science Team that shared data from the SMAP core calibration/validation sites. New York State Mesonet data access was made possible through funding from NOAA grant NA19OAR4310368. This work was supported by NASA Soil Moisture Cal/Val Activities as a SMAP Mission Science Team Member (grant number NNX14AH92G); by the “Modernizing Observation Operator and Error Assessment for Assimilating In-situ and Remotely Sensed Snow/Soil Moisture Measurements into NWM” project from NOAA (grant number NA19OAR4590199); and the High Meadows Environmental Institute at Princeton University through the Mary and Randall Hack ‘69 Research Fund Award.

Author information




N.V. and E.F.W. conceived and designed the research. N.V. performed the analysis and led the paper writing. N.C., M.P. and J.S. provided insights on the methodology development. All co-authors provided critical feedback and contributed to the writing.

Corresponding author

Correspondence to Noemi Vergopolan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

The Creative Commons Public Domain Dedication waiver applies to the metadata files associated with this article.


About this article


Cite this article

Vergopolan, N., Chaney, N.W., Pan, M. et al. SMAP-HydroBlocks, a 30-m satellite-based soil moisture dataset for the conterminous US. Sci Data 8, 264 (2021).
