Introduction

Anthropogenic climate change is predicted to increase global average air temperature (TA) 1.4–5.8 °C by 2100, with substantially higher increases in winter TA in northern high latitudes1 and concomitant effects on the timing, form, and amount of precipitation1,2. In northern high latitude ecosystems (boreal forests and tundra) that occupy ~22% land area and store ~40% soil carbon globally3, soil temperature (TS) may respond differently than TA due to the decoupling effects of snow cover4,5. Snow manipulation experiments have indicated large impacts of snow cover on TS regimes and responses of soils and vegetation2,6,7. Changes in freeze-thaw events (FTEs) are of particular concern2, posing an “agent of surprise”8 in the functioning of northern ecosystems, with large potential effects on root mortality9, post-winter sapling survival7, soil nutrient losses10, soil microbial activities11,12, and the stability of stored carbon13,14.

TS data, particularly under field conditions, are scarce compared to TA4, and climate-model-prediction scenarios have typically only been developed for TA5. Several snow manipulation studies2,15,16 have suggested that in a warmer world soils during winter months may be colder as a result of decreased and delayed snow insulation. In contrast, most simulation models4,5,17 have predicted a rise in TS in warm climates, as a synergistic effect of increased TA and reduced snowfall, though a few models18,19 do suggest that climate warming could reduce TS under some conditions. TS measurements in high latitude ecosystems have commonly been made at a depth of ≥10 cm, but the soil is most responsive and biologically active above 10 cm soil depth10,20. Contradictory model predictions indicate that TS sensitivity to climate change is not well understood18, and a lack of data to test alternative models has been noted10.

Here we provide a first test of the ‘cold soil in a warm world’15 hypothesis in the boreal forest using a proxy year approach, making use of recent climate variability to compare TS patterns between a proxy baseline year (YB) and a warm future year (YW) (Fig. 1). We used soil and micrometeorological tower sensor data from study plots distributed in open, partially forested, and forested sites in a mixedwood boreal forest in northwestern Ontario, Canada. Snow cover durations were inferred from diurnal TS patterns and confirmed using synthetic high-resolution imagery (fusing MODIS and Landsat 8 data).

Figure 1

Local TA anomalies compared against the HadCRUT4 northern hemisphere (NH) anomalies to establish proxy years. A polynomial regression curve (solid black line) (with standard errors in grey shading) is fitted to the annual HadCRUT4 data (baseline 1961–1990) (blue circles) to show a general anomaly trend for NH over the 21st century. YW (December 2015–November 2016) local annual and seasonal anomalies (orange circles) (with standard deviations) are significantly higher (p < 0.01) than those for NH and YB (December 2013–November 2014). YB local anomalies are not significantly different from NH anomalies and are therefore not shown.

Results

Field measurements (Fig. 2) and secondary data (Fig. 1) indicate that YW successfully represented a warm future year in the northern high latitude ecosystems. Consistent with climate model predictions, differences in winter TA between YW and YB were particularly large: average winter TA in open, partially forested, and forested sites were 6.58 °C, 9.17 °C, and 9.46 °C higher (p < 0.05), respectively, in YW than those in YB (Fig. 2).

Figure 2

Daily mean TS, TA, and RH under different site conditions in YB and YW. Daily mean TS and TA/RH values are calculated from hourly data of 5 plots each with 8–9 sensors and one sensor, respectively. (a–f) show how TA/RH is related to TS with respect to snow start/end dates estimated from both satellite (synthetic Landsat images produced by fusing MODIS and Landsat 8 data) and TS data. For simplicity, error estimates of the daily mean values are not shown (see Supplementary Fig. S7 for daily mean TS error estimates).

Sensor data indicate that average winter TS were significantly lower in YW compared to YB (Fig. 3a). In open, partially forested, and forested sites, average TS in YW was 0.43 °C, 1.22 °C, and 1.13 °C lower (p < 0.01), respectively, than in YB. Average minimum and maximum winter TS also showed similar patterns under all site conditions (Supplementary Results and Fig. S5a). The differences in average spring TS between YB and YW were not as consistent (Fig. 3b). In YW they were 1.54 °C lower (p < 0.01) in partially forested sites, but marginally higher (0.12 °C, p = 0.38) in open sites, and lower (0.34 °C, p = 0.2) in forested sites than those in YB. In YW, average spring minimum TS were consistently lower than in YB, and maximum TS showed patterns similar to those of mean TS (Supplementary Results and Fig. S5b). Overall seasonal patterns in mean and average minimum/maximum TS in different site conditions throughout YB and YW are presented in Supplementary Figs S4 and S6, respectively.

Figure 3

TS, snow cover duration, and number of freeze-thaw events in different site conditions in YB (open circles) and YW (closed circles). (a,b), average winter (December–February) and spring (March–May) TS, respectively, in YB and YW. Average summer and fall TS, along with overall seasonal TS trends for each year, are presented in the Supplementary Fig. S4. (c), mean snow cover duration for each year estimated from daily TS ranges and maximum values. (d), average number of freeze-thaw events for each year. Each data point (with standard error) in (a,b) is calculated from monthly means over the season, and in (c,d) is calculated from daily TS data of 5 plots each with 8–9 sensors.

Snow cover started earlier and lasted longer in YB than in YW. TS-based snow cover estimates show that in YB snow started on average 18, 22, and 23 days earlier (p < 0.01) in open, partially forested, and forested sites, respectively, than in YW. Likewise, in YB snow ended generally on average 3, 16, and 9 days later in open, partially forested, and forested sites, respectively, than in YW. TS and satellite-based estimates of snow start/end dates agreed well with each other with some year- and site-specific variations (Supplementary Table S3). Satellite-based snow start dates for YB were 2–14 days later than TS-based estimates, which were 3, 10, and 9 days earlier in open, partially forested, and forested sites, respectively, for YW. Snow end dates had the least disagreement of only ±1 day for both years across all site conditions (Fig. 2). TS-based snow cover duration (SCDST) estimates suggest that SCDST in YB were 48, 49, and 51 days longer (p < 0.01) in open, partially forested, and forested sites, respectively, than those in YW (Fig. 3c). The number of FTEs was substantially higher in YW compared to YB (Fig. 3d), increasing by 53% (p = 0.86), 657% (p < 0.01), and 69% (p = 0.07) in open, partially forested, and forested sites, respectively.

Discussion

YW average winter TS at 1–2 cm depths were, depending on site conditions, 0.43–1.22 °C lower than those in YB (Fig. 3a). This result clearly supports the ‘cold soil in a warm world’ hypothesis in the boreal forest context. Snow manipulation studies2,15,16 and model results18,19 in other ecosystems have also found evidence in support of this hypothesis. Although studies have predicted a rise in spring TS5,17, our data suggest that YW average minimum TS was 0.45–1 °C lower compared to YB with a considerable reduction in magnitude with increasing Leaf Area Index (LAI) (open to forested sites). In contrast, spring mean and maximum TS did not exhibit any consistent pattern (Fig. 3b and Supplementary Fig. S5b). The opposite pattern in spring minimum and maximum TS and the number of days with daily average TS ≤ −5 °C (Fig. 3d), corresponds to a higher frequency of FTEs in YW than in YB. This effect was more pronounced in forested sites than in open sites. Climate-warming-induced spring FTEs have been suggested by other studies5,10. Cold winter soil and frequent FTEs in warm future years are likely to substantially impact terrestrial plants and microorganisms8. Winter soil freezing can adversely affect tree growth and functioning21 and alter soil carbon dynamics11,12. FTEs have also been reported to increase nitrogen mineralization in high-latitude ecosystems11,12.

Snow cover and its interaction with forest stand structure were the major drivers of TS differences between YW and YB in the present study. Early soil freezing events were associated with delayed snowfall in YW (Fig. 2). Likewise, reduced SCDST (by 48–51 days) (Fig. 3c), higher relative humidity (RH) (5.55–9.15%; indicative of high latent heat from melting snow) (Fig. 2), and data from nearby weather stations22 (maximum snow thickness and total precipitation in winter and spring were ~40 cm and 190.5 mm, respectively, in YW and were ~100 cm and 165.3 mm, respectively, in YB) imply that reduced insulation from thinner or less spatially continuous snow cover decreased TS2,6,15 in YW compared to YB. It is also evident from Fig. 2 that TS in YW was tracking TA more closely than in YB2. Higher forest cover was associated with an increased magnitude of differences in TS and number of FTEs between YW and YB (Figs 2 and 3a,b,d). It is likely that in YW with a shallower snowpack tree stems and other vegetation cover reduce TS by creating small pockets in the snowpack that allow penetration of cold air into the subnivean space, increasing FTE frequency by a ‘tree well effect’23,24.

‘Proxy year’ or ‘analog year’ approaches have been widely used to examine potential effects of climate change on hydrology and agriculture, but only recently applied to ecological processes25. This approach allowed us to test the ‘cold soil in a warm world’ hypothesis for the first time in the boreal forest by realistically simulating composite effects of future climate warming8. Although we found reduced TS at shallower depth under warming conditions, the findings are still consistent with projected long-term warming in the deep soil. TS at shallower depths are prone to rapid changes modulated by soil moisture and insulating effects of snow, litter, and vegetation26, while deep soils respond to the integrated transfer of thermal energy. Because of soil’s high thermal capacity and low heat conductivity, diurnal/seasonal TS changes attenuate with increasing depth and lag considerably behind those of shallower soils. Studies have found the usual soil frost depth is ~15 cm in high-latitude ecosystems27. We thus can assume the ‘cold soil in a warm world’ effect is limited to a similar depth. Since carbon in boreal forest soils is primarily stored in the uppermost soil horizons and organic layer28, wintertime reductions in surface TS might create a negative climate feedback by reducing soil heterotrophic respiration6,12. By assuming simple linear relationships between TA and TS, most existing models will miss this feedback and likely over-estimate warming effects on soil C loss. Conversely, increases in FTE are predicted to negatively affect boreal forest regeneration and productivity, which could constitute a positive climate feedback. The insights from our study are thus an important input to development of credible predictions of climate-induced TS change at shallow soil depths that are most important to carbon processes in high-latitude ecosystems14.

Methods

Study area

Chronosequence plots were established in the boreal forest of northwestern Ontario, Canada ~200 km north of Thunder Bay and 100 km south of Armstrong. Three 10-m radius circular plots (314.16 m²) were established in each of two post-fire (fires occurred in 1998 and 2006) and three post-harvest (harvested in 1998, 2006, and 2013) stands (Supplementary Fig. S1); microenvironmental measurements were made from July 2013 until June 2017.

The study area is generally flat with an average elevation of 416 m a.s.l. The soil in this area is a moderately deep Brunisol (coarse loamy texture) with organic layer thickness (LFH) 1–25 cm29 and average pH ~5.3. The growing season for this area varies from 110–120 days29. Climate normals for annual temperature and precipitation (measured at Armstrong), and snow depth (measured at Thunder Bay) are –1.1 °C, 738.4 mm, and 9 cm, respectively. Mean annual daytime and nighttime windspeeds, measured at Armstrong at 10 m aboveground over the study period, were 0.7 ms−1 (maximum 1.2 ms−1) and 0.4 ms−1 (maximum 1 ms−1), respectively22.

The study area is a mixedwood boreal forest characterized by trembling aspen (Populus tremuloides Michx.), black spruce (Picea mariana (Mill.) BSP), white spruce (P. glauca (Moench) Voss), jack pine (Pinus banksiana Lamb.), eastern white cedar (Thuja occidentalis L.), balsam fir (Abies balsamea (L.) Mill.), and paper birch (Betula papyrifera Marsh.). Stand structural attributes are presented in Supplementary Table S1.

Instrumentation and measurements

Plots were established in locations with at least 1 ha of identical disturbance (either harvest or fire) of similar age-class and were at least 1 km away from each other and from any water body. We used fire maps (obtained from the Ontario Ministry of Natural Resources) and forest management plans (obtained from Resolute Forest Products) to collect information about the forest management history, disturbance type, and stand age in aiding the selection of plot locations.

At the center of each plot a micrometeorological tower was set up to measure air temperature (TA) and relative humidity (RH) every hour at 1.5 m height from the ground (data collected using a LogTag HAXO-8; range (TA/RH): –40 to +85 °C/0 to 100%; minimum accuracy (TA/RH): ±0.5 °C/0.1%). Additionally, we installed nine soil temperature (TS) sensors (LogTag TRIX-8) in each plot at ~1–2 cm soil depth (following the guidelines of Lundquist and Lott24), which recorded measurements at hourly intervals (Supplementary Fig. S2). The sensors used are rated by the manufacturer from –40 to +85 °C with a minimum accuracy: ±0.5 °C; in lab calibration trials we found the RMSE to be ±0.11 °C in the range –10 to 35 °C (see Supplementary Texts for accuracy reports on this sensor). Each TS sensor was sealed in thin (0.09 mm) waterproof plastic film and was placed at least 50 cm away from tree trunks. Sensor locations were recorded as bearings from the center of the plot and marked with flagging stakes. Microclimatic data were collected annually in summer, and any compromised sensors were replaced.

Leaf Area Index (LAI) was determined using hemispherical photographs (HPs) taken with a Nikon CoolPix 4500 (4 Megapixels) camera with a Nikon FC-E8 fisheye converter (angle of view 183°) mounted on a tripod. Except in 2013, summer and winter HPs were taken each year in early July and late September/October, respectively, in three equally spaced locations within each plot at 1 m above ground as shown in Fig. S2. Exposure settings and analysis of HPs, using Gap Light Analyser30, were done as per the guidelines of Zhang et al.31. The average of the three LAI-4 (LAI estimated over zenith angle 0–60°) values was used as the LAI for a plot in a given season/year.

Stand density was measured every year as the number of trees (diameter at breast height ≥5 cm and height >1 m) within each plot and converted to stems/ha. Heights (m) of these trees were measured every year using a Suunto PM-5 Clinometer. Similarly, litter depths (mm) were measured using a ruler in locations adjacent to each soil temperature sensor within each plot. We set up four 1-m2 subplots within each plot and visually determined the percent cover of ground-layer vegetation every year (Supplementary Fig. S2).
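The conversion from a per-plot tree count to stems/ha follows directly from the 10-m plot radius. A minimal sketch in Python (the function name is hypothetical and the language is illustrative only; the study's own analyses were run in R):

```python
from math import pi

def stems_per_ha(tree_count, plot_radius_m=10.0):
    """Scale a tree count from one circular plot to stems per hectare."""
    plot_area_m2 = pi * plot_radius_m ** 2   # about 314.16 m2 for a 10-m radius
    return tree_count * 10_000 / plot_area_m2
```

Under this conversion, each tree counted in a plot represents roughly 32 stems/ha.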

Proxy year establishment

Proxy baseline (YB) and future warm (YW) years were determined by comparing local (study area) TA anomalies with the northern hemisphere (NH) anomalies. The HadCRUT4 NH monthly TA anomaly data32 over 1840–2017 were used for this purpose. A simple polynomial regression curve was fitted to the annual NH TA anomaly data to show a general trend over the 21st century (Fig. 1). The GHCN-D (v2) (Global Historical Climatology Network - Daily data) daily average TA data33 for weather stations around the study area (48°–50° N and 88°–90° W) were analysed via the KNMI Climate Explorer (https://climexp.knmi.nl) platform to calculate local monthly TA anomalies with respect to the 1961–1990 baseline period (since HadCRUT4 anomaly data are also based on 1961–1990).
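The anomaly calculation can be illustrated with a short sketch. It assumes that an anomaly is a monthly mean minus the 1961–1990 mean for the same calendar month; the exact procedure used by the KNMI Climate Explorer may differ, and the function name and input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def monthly_anomalies(records, base_start=1961, base_end=1990):
    """records: list of (year, month, mean_temp_c) tuples.
    Returns {(year, month): anomaly} relative to the baseline-period
    mean of the same calendar month."""
    by_month = defaultdict(list)
    for year, month, temp in records:
        if base_start <= year <= base_end:
            by_month[month].append(temp)
    climatology = {m: mean(v) for m, v in by_month.items()}
    # Months with no baseline observations are skipped rather than guessed.
    return {
        (year, month): temp - climatology[month]
        for year, month, temp in records
        if month in climatology
    }
```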

Results from a one-way ANOVA with robust estimation indicated that December 2013–November 2014 had the lowest annual TA anomaly among the years over the study period (2013–2017) and did not differ significantly (p = 0.11) from NH average annual anomaly. December 2015–November 2016, however, had the highest TA anomaly for all seasons over the study period and the annual TA anomaly was significantly higher (p < 0.01) than the NH average annual anomaly (Fig. 1). These years are representative of the historical baseline years and projected warm future years. So, for this study, we chose December 2013–November 2014 as the YB and December 2015–November 2016 as the YW.

LAI-based site categorization

To assess the generality of the ‘cold soil in a warm world’ hypothesis in different site conditions, we classified plots based on their LAI values as: ‘open’, ‘partially forested’, and ‘forested’. K-means clustering algorithm was used to determine the LAI cluster centers; lowest center value was assigned to ‘open’, medium value to ‘partially forested’, and highest value to ‘forested’. The LAI cluster center for the summer was 0.05 in open, 0.72 in partially forested, and 1.36 in forested sites. The winter LAI cluster center was 0.05 for open, 0.42 for partially forested, and 0.79 for forested sites.
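The clustering step can be sketched with a one-dimensional version of Lloyd's k-means algorithm. The deterministic initialisation, function names, and example values below are assumptions for illustration; the source does not state which implementation or settings were used.

```python
from statistics import mean

def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalar values; returns sorted cluster centers."""
    ordered = sorted(values)
    # Assumed deterministic start: spread initial centers across the sorted data.
    centers = [ordered[round(i * (len(ordered) - 1) / (k - 1))] for i in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the previous center if a cluster ends up empty.
        updated = [mean(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)

def classify_site(lai, centers):
    """Assign the label of the nearest center (lowest center = 'open')."""
    labels = ("open", "partially forested", "forested")
    nearest = min(range(len(centers)), key=lambda j: abs(lai - centers[j]))
    return labels[nearest]

# Illustrative LAI values only, not the study's plot data.
example_lai = [0.02, 0.05, 0.08, 0.70, 0.72, 0.74, 1.30, 1.36, 1.42]
centers = kmeans_1d(example_lai)
```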

TS-based snow cover duration (SCDST)

To determine snow cover duration (SCDST) from TS for each sensor in each plot, hourly sensor data were converted to daily TS ranges (ΔTS = daily maximum TS – daily minimum TS). If ΔTS remained ≤1 °C over 48 hours and the daily maximum TS was <2 °C, we considered ‘snow present’ for that day. The resulting daily snow present/absent time series were checked against TS sensor data and snowfall event data from nearby Armstrong airport weather station to ensure prediction quality. A number of studies24,34,35 have successfully used similar algorithms to determine SCDST.
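A minimal sketch of this rule for a single sensor is given below. It assumes that "over 48 hours" means the daily range stayed at or below 1 °C on both the current and the preceding day; the source does not spell out this detail, and the function names are hypothetical.

```python
def daily_min_max(hourly_by_day):
    """hourly_by_day: list of per-day lists of hourly soil temperature (deg C)."""
    return [(min(day), max(day)) for day in hourly_by_day]

def snow_present(hourly_by_day, range_limit=1.0, max_limit=2.0):
    """Return one True/False snow flag per day for a single sensor."""
    stats = daily_min_max(hourly_by_day)
    flags = []
    for i, (t_min, t_max) in enumerate(stats):
        damped_today = (t_max - t_min) <= range_limit
        # Assumption: the 48-hour condition covers today and the previous day.
        damped_prev = i > 0 and (stats[i - 1][1] - stats[i - 1][0]) <= range_limit
        flags.append(damped_today and damped_prev and t_max < max_limit)
    return flags

def snow_cover_duration(flags):
    """Snow cover duration as the number of snow-present days."""
    return sum(flags)
```

The first day of a series can never be flagged under this reading, because no preceding day is available to complete the 48-hour window.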

TS, TA, RH, SCDST, and freeze-thaw events (FTEs) data analysis

Hourly sensor data were first converted to daily mean, minimum, and maximum values that were then used to calculate plot-wise monthly mean, minimum, and maximum TS/TA/RH for each sensor. Plot-wise seasonal mean TS, average minimum and maximum TS, and mean TA/RH for each sensor were calculated from monthly data. Seasons in this study were defined as: winter (December–February), spring (March–May), summer (June–August), and fall (September–November).
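The season definitions imply that December is grouped with the following January and February, so that a study year runs from December to November. A sketch of the monthly-to-seasonal step, with hypothetical names and with each study year labelled by the calendar year in which it ends (an assumption made here for convenience):

```python
from collections import defaultdict
from statistics import mean

SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}

def study_year(year, month):
    """December counts toward the next study year (e.g. Dec 2013 -> 2014)."""
    return year + 1 if month == 12 else year

def seasonal_means(monthly_means):
    """monthly_means: {(year, month): monthly mean value}.
    Returns {(study_year, season): mean of the monthly means}."""
    grouped = defaultdict(list)
    for (year, month), value in monthly_means.items():
        grouped[(study_year(year, month), SEASON_OF_MONTH[month])].append(value)
    return {key: mean(values) for key, values in grouped.items()}
```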

The frequency of freeze-thaw events (FTEs) for each year in all site conditions was calculated as the number of days with daily average TS ≤ −5 °C (there was no more than 1 FTE per day). Instead of using TS < 0 °C, we chose −5 °C because studies have found that at TS ≤ −5 °C soil microbial activities are inhibited substantially in high-latitude ecosystems11.
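This counting rule is simple enough to state as code. The sketch below uses a hypothetical function name and takes one list of hourly readings per day for a single sensor:

```python
from statistics import mean

def count_fte_days(hourly_by_day, threshold=-5.0):
    """Count days whose daily average soil temperature is <= threshold (deg C)."""
    return sum(1 for day in hourly_by_day if mean(day) <= threshold)
```

Because the criterion is evaluated once per day, at most one event per day can be counted, consistent with the definition above.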

Site-specific differences in TS, TA, RH, SCDST, and FTE between YB and YW were tested using linear mixed effect (LME) models. For comparison of TS we focused both on mean and minimum/maximum values, because in projected future warm years maxima/minima of the extreme climate events can have more serious consequences for plants and microorganisms than changes in projected mean values1. In LME models, sensor replications nested within each plot were considered random effects, and proxy year and site conditions (and their interactions) were considered as main effects. Dependent variables (TS, TA, RH, SCDST, FTE) were log-transformed where necessary to meet the residual normality assumption of LME models. All analyses were performed using the R language platform36.

Snow cover duration from satellite data (SCDS)

Remote sensing assessments of snow cover duration in spatially heterogeneous sites require high-resolution spatiotemporal satellite data. None of the freely available satellite images meet this resolution requirement; for example, the MODIS (Moderate Resolution Imaging Spectroradiometer) sensor provides daily global data at a low spatial resolution (maximum 250 m) and the Landsat satellites provide high spatial resolution (30 m) global data at a 16-day interval (pixels are often cloud contaminated). Thus, integrating high-temporal-resolution MODIS data with high-spatial-resolution Landsat data to produce synthetic data with high spatiotemporal resolution is necessary to study highly dynamic land surface processes that operate at a small scale.

MODIS-Landsat fusion has been achieved by a number of models and algorithms, including the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM)37, the Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM)38 and spatiotemporal image-fusion models39. We have chosen the STARFM, originally proposed by Gao et al.37 and tested in a Canadian boreal forest, to generate daily snow cover maps to supplement TS-based findings. In this algorithm, a first order approximation of the relationship between coarse MODIS (M) data and Landsat (L) reflectance data for a pixel located at (xi, yi) and acquired on date tk was assumed as:

$$L(x_i,\,y_i,\,t_k)=M(x_i,\,y_i,\,t_k)+\epsilon_k$$

where εk represents the error between observed MODIS and Landsat reflectance resulting from differing bandwidth and solar geometry37. STARFM is one of the most extensively tested fusion techniques; it requires only one MODIS-Landsat input pair (but performs better with two pairs) and less computational power than alternative approaches.
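If εk is assumed to stay unchanged between the pair date tk and a prediction date t0, the relation above rearranges to L(t0) = M(t0) + L(tk) − M(tk). The sketch below applies only this first-order step to each pixel. It deliberately omits the neighbourhood weighting that STARFM adds on top, so it is not the STARFM algorithm itself; the function names are hypothetical, and the fill value follows the −9999 convention described in the data-preparation section.

```python
FILL = -9999  # missing-value marker

def predict_pixel(landsat_tk, modis_tk, modis_t0):
    """First-order prediction of Landsat reflectance at t0 for one pixel."""
    if FILL in (landsat_tk, modis_tk, modis_t0):
        return FILL
    return modis_t0 + (landsat_tk - modis_tk)

def predict_image(landsat_tk, modis_tk, modis_t0):
    """Apply predict_pixel to co-registered images stored as lists of rows."""
    return [
        [predict_pixel(l, m, p) for l, m, p in zip(row_l, row_m, row_p)]
        for row_l, row_m, row_p in zip(landsat_tk, modis_tk, modis_t0)
    ]
```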

Data preparation for fusing

We used the Normalized Difference Snow Index (NDSI) approach to determine SCDS. It is a widely used satellite-image-based spectral index usually calculated from reflectance in green and shortwave infrared bands40. To properly set the NDSI threshold in forested areas, Normalized Difference Vegetation Index (NDVI), calculated from the reflectance in red and near infrared (NIR) bands, was used as an auxiliary input in the snow-mapping algorithm41. So, for this study, we used green, red, near infrared (NIR), and shortwave bands to map snow cover duration.

Radiometrically, atmospherically, and geometrically corrected MODIS (horizontal tile: 12, vertical tile: 4) (MOD09GA V006)42 and Landsat 8 (Level 2)43 (WRS2 path/row: 25/26, 26/25) surface reflectance products for the study area over October 2013–May 2014 and October 2015–May 2016 were used in this study. MOD09GA daily surface reflectance data in green (band 4: 545–565 nm), red (band 1: 620–670 nm), NIR (band 2: 841–876 nm), and shortwave infrared 2 (SWIR2) (band 6: 1628–1652 nm) bands were in 500-m resolution (total images = number of bands × day = 4 × 440 = 1760). The equivalent Landsat 8 surface reflectance data in green (band 3: 525–600 nm), red (band 4: 630–680 nm), NIR (band 5: 845–885 nm), and SWIR2 (band 6: 1560–1660 nm) bands were in 30-m resolution, and land cloud cover per scene was less than 20% (total images = 4 × 19 = 76).

Landsat 8 shares similar sensor geometry with MODIS and both visit the same place at almost the same time. It can thus be assumed that they have an almost identical viewing and illumination geometry, and can be used in the fusion process without further angular adjustments44. MODIS images, however, were re-projected to UTM (Universal Transverse Mercator, Zone 16 N) and pixels were resampled (using the nearest-neighbor method) to 30-m resolution to match the Landsat 8 images. MODIS and Landsat 8 images were also precisely co-registered using the common point comparison method and brought into the same spatial extent.

Only cloud and water-body free high-quality pixels were used as input to STARFM. The MOD09GA surface reflectance 500 m quality assurance band was used to mask pixels with a status bit flag other than ‘0000’ for each of the four bands. Similarly, for Landsat 8 images, the level-2 pixel-quality band and radiometric saturation QA bands were used to mask radiometrically saturated, cloud-contaminated, and low-quality pixels. Finally, the pixels with missing values were set to –9999 and images were converted to signed 16-bit binary format. A series of R scripts36 was used to prepare satellite images for input in STARFM. C codes to implement STARFM were adapted from Gao et al.37. Algorithm details and information on data preparation can also be found in Gao et al.37 and Zhu et al.38.
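The final masking and conversion step can be sketched as follows. The quality test is reduced here to a precomputed True/False flag per pixel, and little-endian byte order is an assumption; the source states only that missing pixels were set to −9999 and that images were stored as signed 16-bit binary. The function name is hypothetical, and the study itself used R scripts for this step.

```python
import struct

FILL = -9999  # value assigned to masked or missing pixels

def mask_and_pack(values, is_good):
    """Replace low-quality pixels with FILL and pack as signed 16-bit integers.

    values: scaled integer reflectances; is_good: matching quality flags.
    Byte order is assumed little-endian ('<').
    """
    masked = [v if ok else FILL for v, ok in zip(values, is_good)]
    return struct.pack(f"<{len(masked)}h", *masked)
```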

Inputs to STARFM for producing synthetic Landsat 8 images

Two pairs of same-day MODIS-Landsat 8 images within two months either side of the prediction date45, along with MODIS image of the prediction date, were used as inputs to STARFM to predict synthetic Landsat 8 images for the dates for which Landsat 8 images were either not available or cloud contaminated (>20%) (Supplementary Fig. S3). Landsat 8 and MODIS equivalent bands were used to produce synthetic Landsat 8 images of the equivalent band. For example, Landsat 8 green (band 3) and MODIS green (band 4) bands were used to produce the synthetic Landsat 8 green-band image.

Accuracy assessment of predicted Landsat 8 images

To assess the accuracy of STARFM predicted images, synthetic Landsat 8 images were produced in green, red, NIR, and SWIR2 bands for three dates (2013-12-07, 2014-02-16, and 2016-03-18) spanning the study period. Predicted synthetic images were compared pixel-to-pixel with actual Landsat 8 images of the corresponding dates, and Spearman Rho (using the complete observation method), root-mean-square-error (RMSE) and mean absolute error (MAE) estimates were calculated to assess the accuracy. STARFM prediction maintained a reasonable accuracy over the study period compared to other studies37,45 (see Supplementary Table S2).
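The three accuracy measures can be computed from pixel pairs as sketched below. "Complete observation" is interpreted here as dropping any pair in which either value is missing (marked −9999), which is an assumption about the method; Spearman's rho is computed as the Pearson correlation of average ranks. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean

FILL = -9999

def complete_pairs(predicted, observed):
    """Keep only pixel pairs in which both values are present."""
    return [(p, o) for p, o in zip(predicted, observed) if p != FILL and o != FILL]

def rmse(pairs):
    return sqrt(mean([(p - o) ** 2 for p, o in pairs]))

def mae(pairs):
    return mean([abs(p - o) for p, o in pairs])

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(pairs):
    """Pearson correlation of the ranks; undefined if either side is constant."""
    rp = average_ranks([p for p, _ in pairs])
    ro = average_ranks([o for _, o in pairs])
    mp, mo = mean(rp), mean(ro)
    cov = sum((a - mp) * (b - mo) for a, b in zip(rp, ro))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_o = sum((b - mo) ** 2 for b in ro)
    return cov / sqrt(var_p * var_o)
```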

NDSI-based snow-mapping algorithm

Using the daily high-resolution synthetic data, NDSI was calculated as:

$$NDSI=\frac{reflectance\,in\,green\,band-reflectance\,in\,SWIR2\,band}{reflectance\,in\,green\,band+reflectance\,in\,SWIR2\,band}$$

Similarly, NDVI was calculated as:

$$NDVI=\frac{reflectance\,in\,NIR\,band-reflectance\,in\,red\,band}{reflectance\,in\,NIR\,band+reflectance\,in\,red\,band}$$

Using reflectance property of clouds in SWIR2 band, NDSI can successfully separate clouds from snow. For mapping snow cover with NDSI, a physically based threshold value > 0.4 is usually used to indicate snow cover40. There is, however, evidence46 that in conifer-dominated forests NDSI < 0.4 can also be snow. Moreover, Hall et al.41 found that NDSI values < 0.4 can also indicate snow if NDVI value is ~0.1. It is, therefore, important to identify area-specific NDSI threshold values to delineate snow cover.

After extensive visual inspections, using Google Maps, Sentinel-2 images (red, green, and blue bands), and predicted NDVI maps, for this study 1 ≥ NDSI > 0.35 was used for October 2013–May 2014 and 1 ≥ NDSI > 0.3 was used for October 2015–May 2016 to define snow cover. This year-specific NDSI threshold mainly stemmed from snow patchiness as a result of high difference in winter TA and TS between study years47,48. To prevent NDSI overestimation, green band reflectance values ≤ 0.1 were masked before NDSI calculation49. To prevent snow-cover underestimation, pixels with 0.1 < NDSI < 0.3 were also considered snow, but only if 0.08 ≤ NDVI ≤ 0.12 (ref. 41).
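The decision rule described above can be written out for a single pixel as follows. The sketch applies the thresholds literally as stated, takes reflectances as fractions between 0 and 1, and returns None for masked pixels; the function names and the None convention are assumptions for illustration.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), or None when the denominator is zero."""
    return (a - b) / (a + b) if (a + b) != 0 else None

def is_snow(green, red, nir, swir2, ndsi_threshold):
    """Classify one pixel: True (snow), False (no snow), or None (masked).

    ndsi_threshold: 0.35 for October 2013-May 2014, 0.3 for October 2015-May 2016.
    """
    if green <= 0.1:                      # dark pixels masked before NDSI
        return None
    ndsi = normalized_difference(green, swir2)
    ndvi = normalized_difference(nir, red)
    if ndsi is None:
        return None
    if ndsi_threshold < ndsi <= 1:
        return True
    # Low-NDSI snow, accepted only within the narrow NDVI window.
    if 0.1 < ndsi < 0.3 and ndvi is not None and 0.08 <= ndvi <= 0.12:
        return True
    return False
```

Applied literally, a pixel with 0.3 ≤ NDSI ≤ 0.35 in the 2013–2014 period falls under neither condition and is classified as snow-free.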

To determine snow-start and snow-end dates, and to compare the plot-wise results with SCDST, decisions on presence/absence of snow were made based on the information (NDSI and NDVI) extracted from each plot center with a 15-m buffer around it. The NDSI-based snow cover mapping algorithm works only for pixels with at least 50% snow cover41. To ensure consistent comparisons between SCDST and SCDS, for SCDST we considered snow present only if ≥50% of the TS sensors in a plot agreed that there was snow on the particular day.
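The plot-level agreement rule for SCDST can be sketched as below. Treating missing sensors as None and excluding them from the denominator is an assumption; the source does not say how sensor gaps were handled, and the function name is hypothetical.

```python
def plot_snow_present(sensor_flags, min_fraction=0.5):
    """Plot-level snow flag for one day from per-sensor True/False flags.

    Missing sensors (None) are ignored; returns None if no sensor reported.
    """
    valid = [flag for flag in sensor_flags if flag is not None]
    if not valid:
        return None
    return sum(valid) / len(valid) >= min_fraction
```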

Accuracy assessment of NDSI-based snow cover maps

A confusion matrix was generated to show the overall agreement in snow cover duration estimated from TS (SCDST) and satellite (SCDS) data (Supplementary Table S3). Considering SCDST as the reference and SCDS as the predicted, results from the confusion matrix show that 72.8% of the time SCDST agreed with SCDS. The all-weather overall accuracy of our fused daily snow-cover map is higher than the MODIS daily snow products (MYD10A1 and MOD10A1) (31–45%)50. It is interesting to note that the overall accuracy of NDSI-based snow cover maps varied between years with a higher overall accuracy in YB (80%) than in YW (67%). This suggests that, in a warmer world, spatial variability in TS and snowpack will likely be higher than what it is now17, and we may need high-spatiotemporal-resolution satellite images complemented by field measurements35 to accurately capture this variability.
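Overall agreement of this kind is the share of days on which the reference (SCDST) and predicted (SCDS) snow flags coincide. A minimal sketch with hypothetical names and made-up example flags:

```python
def confusion_counts(reference, predicted):
    """Tally daily snow flags; keys read '<reference>_<predicted>'."""
    counts = {"snow_snow": 0, "snow_bare": 0, "bare_snow": 0, "bare_bare": 0}
    for ref, pred in zip(reference, predicted):
        key = ("snow" if ref else "bare") + "_" + ("snow" if pred else "bare")
        counts[key] += 1
    return counts

def overall_accuracy(counts):
    """Fraction of days on which reference and prediction agree."""
    agree = counts["snow_snow"] + counts["bare_bare"]
    return agree / sum(counts.values())

# Illustrative flags only, not the study's data.
reference = [True, True, False, False, True]
predicted = [True, False, False, True, True]
counts = confusion_counts(reference, predicted)
```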

Code availability

Data preparation and all statistical analyses were implemented in the R programming language (version 3.4.3) (ref.36). R code can be requested from M.A.H. (abdul.halim@mail.utoronto.ca).