Background & Summary

Local near-surface air temperature variations and extreme temperatures in Antarctica resulting from mesoscale climate processes can have a large effect on the biodiversity1,2,3 and have a significant influence on hydrological4 and glaciological processes5. With a globally warming climate, Antarctica has been the focus of climate research for its impacted melting ice shelves, sea ice changes, and surface mass balance6,7,8,9. There has been a growing focus throughout several regions of Antarctica on localized climate extremes caused by foehn wind warming events10,11,12 impacting the ice shelf mass balance13 and thereby being an important component in the breakup of ice shelves14. Hydrological extremes in areas of high biodiversity have also been linked to foehn warming events15. Shorter term climate perturbations and extreme meteorological events have an important indirect influence on species distribution and ecosystem functioning through its contribution to temporal availability of meltwater16,17. The availability of a high-resolution air temperature dataset over multiple decades is therefore important to analyse spatio-temporal variability and understand the effects of these mesoscale temperature variations and trends on the physical and biological systems across entire Antarctica.

Hourly in-situ measurements of air temperature are available from Automatic Weather Stations (AWS), but with a low spatial density due to cost and logistical issues in operating in such a remote location. Therefore, AWS data are not sufficient for a comprehensive spatio-temporal analysis of climate variables. Atmospheric reanalysis products such as ERA5 also provide Antarctic-wide near-surface temperatures with a high temporal resolution, but with a grid spacing of 31 km, which is not sufficient for resolving local patterns and processes causing near-surface temperature variability. Numerical weather prediction models such as The Antarctic Mesoscale Prediction System (AMPS), provide both a high temporal and spatial resolution air temperature with up to an hourly temporal and approximately 1 km spatial resolution18, but solely for selected regions such as the Antarctic Peninsula and the Ross Sea Region and currently only available for a limited span of years.

Remote sensing data in the thermal bands can be used as proxies for air temperature19, and may therefore be a solution to this demand for high spatial-temporal resolution products capable of estimating mesoscale dynamics. The MODerate-resolution Imaging Spectroradiometer (MODIS) is onboard the Aqua and Terra satellites and with their polar orbits, they provide Antarctic wide imagery of a large range of products several times a day with 1 km spatial grid resolution20. Studies from all over the world including the polar regions have shown that the MODIS land surface temperature (LST) and ice surface temperature (IST) products are good proxies for air temperatures21,22,23,24,25,26,27,28,29,30,31,32. As in-situ skin temperature measurements are sparse for Antarctica, AWS measurements of air temperature have in previously studies been used for validating the MODIS skin temperature products. Wang et al.27 compared AWS measurements with MODIS LST over the Lambert Glacier Basin in East Antarctica and proposed that there is great potential in using MODIS LST for improving the reconstruction of spatio-temporal variability of temperature in Antarctica. A study of the MODIS IST by Shuman et al.33 concluded that the MODIS IST product has the consistency to provide knowledge of the surface temperature of the Greenland Ice sheet. They furthermore suggested an empirical correction to refine the correlation between air temperature measured at 2 m height by AWS and MODIS IST values due to a cold bias between the MODIS LST and IST and air temperature measured in AWS at 1 m to 3 m height in arctic regions29,33,34. This apparent cold bias is due to a near-surface temperature inversion with thermal stratification near the snow surface29. Using AWS measurements as ground truth for creating a validated air temperature dataset from MODIS surface temperature products was published by Hooker et al.30 but Antarctica was excluded from their global dataset. Recently Zhang et al.31 developed an Antarctic Near-Surface Air Temperature based on MODIS observations and AWS measurements but in monthly means and a 5.6 km spatial grid resolution which is not enough to resolve these mesoscale warming phenomena. The previous case studies in Antarctica are also only based on either MODIS LST or MODIS IST which does not provide comprehensive temperature information for identifying warming that extend along the terrestrial costal margin.

Meyer et al.28 showed the utility of training machine learning models and a linear regression technique to make predictions of the near-surface air temperature over Antarctica based on MODIS LST values for 2013. The models were trained on air temperature measured at 3 m height at the exact time of a satellite overpass. It was concluded that the methods were promising but further research needed to include more training samples from a longer period to improve the models. The study by Meyer et al.28 further showed that a linear model was comparable to the more advanced machine learning algorithms. Tree-based machine learning models have large issues with extrapolation outside the range of the original training dataset35,36. With a shift in the distribution of measured temperatures in Antarctica in the past decades and an increasing number of extreme high temperatures37 a model must be robust towards these extremes and capable of extrapolation. This would favour the applicability of a linear model as simpler but yet robust model for this purpose, over machine learning models.

Here, we develop AntAir ICE, a near-surface air temperature dataset for the entire terrestrial Antarctica, the ice shelves, and the seasonal sea ice around Antarctica by using MODIS IST and LST as a proxy for near-surface temperature. This study uses air temperature measured by 117 AWSs at 3 m height for ground truth for this calibration. The linear relation between the daily mean of MODIS IST and LST and daily mean air temperature measured at 3 m height was established and then applied on the full MODIS record for the past 19 years. The developed AntAir ICE dataset has a spatial grid resolution of 1 km and represents daily average air temperature. The dataset is developed for the purpose of researching local temperature variations and extremes that are currently not well represented in available numerical climate models and reanalysis products.

Methods

The flowchart (Fig. 1) shows the schematic overview of the processing steps to generate the near-surface air temperature dataset, AntAir ICE.

Fig. 1
figure 1

Schematic overview of the processing steps to generate AntAir ICE. The output of this process was the coefficients for the linear relationship between the MODIS LST and IST and air temperature measured in AWS separately. These linear models (LMLST and LMIST) were then applied to the full MODIS record creating the AntAir ICE near-surface air temperature dataset.

Remote sensing data

The MODIS sensor delivers the land surface temperature and ice surface temperature as a standard product also referred to as skin temperature or surface radiometric temperature38. The MODIS IST product is a sea ice product available for ice shelves and surrounding sea ice and the MODIS LST product is available for continental Antarctica. The MODIS skin temperature is estimated using a generalized split-window algorithm from the emissivity from MODIS sensor radiance data product band 31 and 3239. The emissivity for the LST product is estimated based on information about the land cover type, the lower-boundary air surface temperature and atmospheric column water vapour39, whereas the IST algorithm uses a fixed snow/ice emissivity40. In the presence of clouds, the MODIS cloud mask product, MOD35 and MYD35, is used for removing cloud contaminated pixels in both the MODIS LST and IST product. For MODIS IST a pixel is considered cloud free if the cloud mask indicates that there is a 66% chance or higher that the pixel is clear and this probability is increased to 95% or higher for the MODIS LST products40. For the MODIS all pixels are furthermore quality controlled and the quality information for LST is saved in the products quality flag. Besides a swath for each satellite overpass, the MODIS LST and MODIS IST products are compiled into a night and daytime product by selecting the best quality measurements for that day. The MODIS LST product is divided into night and daytime products based on local solar time40 and the products over Antarctica are therefore not reflecting the solar position but a time interval, whereas the MODIS IST products are divided into a day and night product based on the solar zenith angle, and 0° to 85° is defined as daytime40.

In this study, the MOD11A1 and MYD11A1 LST products41 containing both a daily daytime and nighttime skin temperature product each at a 1 km grid resolution, were used. The data is retrieved from the NASA LPDAAC website (https://lpdaac.usgs.gov/) part of collection 6 and automatically downloaded using the R package RGISTools42. For IST the day products40 MOD29P1D, MYD29P1D, and night products MOD29P1N and MYD29P1N that is available daily and in a 1 km grid resolution from collection 6 was used. The data was accessed through NASA NSIDC DAAC website (https://nsidc.org/) and downloaded using the available python script from the NSIDC DAAC Data Access Tool. It is a well-known issue that the MODIS LST product still contains pixels with clouds that the cloud mask product cannot detect43,44. In this case, LST or IST values reflect cloud top temperatures which are usually far below the skin temperature measured on surrounding days. A threshold of two times the standard deviation below the mean was applied for each pixel to remove this cloud contamination as also used in previous studies of MODIS skin temperature products44. Further preprocessing was applied to the MODIS IST since a snow or ice surface with a skin temperature above 0 °C is not possible in the MODIS IST product45, but the MODIS IST is known to have these wrongly estimated pixels44,45. Following the procedure from Yu et al.44, all pixels with a MODIS IST above 0 °C were removed.

Automatic weather station data

Measurements of daily mean air temperature from AWS (Tair) were used as the ground truth in this study for modelling and validation. The spatial distribution of the available AWS from five different providers used for AntAir ICE throughout Antarctica are illustrated in Fig. 2. Data from AWS were obtained from Meteo-Climatological Observatory of Italy46, the Long-Term Ecological Research program47, the United States Department of Agriculture or the Antarctic Meteorological Research Center at the University of Wisconsin48, Institute for Marine and Atmospheric Research, Utrecht University49 and Antarctic Climate Data Collected by Australian Agencies50. There is a total of 117 stations (compared to 70 stations used for AntAir version 1 by Meyer et al.51) available in this study. All stations provided 15-minute to hourly sampling intervals of air temperature in 2 or 3 m height and only station records with the complete 24-hour period of measurements aggregated into daily averages. The temperature sensors on the stations have a shield for natural ventilation that prevents heating from direct sunlight and the uncertainty lies within ± 0.5 °C52.

Fig. 2
figure 2

Automatic weather stations, AWS used in this study. Colors indicate source of data: The Long-Term Ecological Research program (LTER), the United States Department of Agriculture or the Antarctic Meteorological Research Center at the University of Wisconsin (AMRC), Institute for Marine and Atmospheric Research, Utrecht University (IMAU), Meteo-Climatological Observatory of Italy (PNRA), and Antarctic Climate Data Collected by Australian Agencies (BOM). Basemap from Quantarctica56.

Compilation of training and validation data

For each AWS location a 19-year record of MODIS skin temperature was derived by extracting the MODIS time series of the grid cell containing the station. The daily mean of either MODIS LST or IST were matched with the Tair for the corresponding station depending on the location of the station. For LST only days with 4 available scenes with good quality were selected based on the pixel’s qualify flag. There were in total 47,865 cloud free data points of matching daily MODIS LST and measured Tair within the period from 2003 to 2021. For the IST the day and night products are based on the solar zenith angle and due to the high latitude, there will only be a day (or night) product available during the austral summer (austral winter). However, due to the extent of Antarctica, this is only valid for the most southern part of Antarctica. A latitude threshold of 75°S was therefore chosen. For 75°S latitude and northwards there must be either a day and night product together or all 4 scenes available. For 75°S to 90°S latitude one scene during the summer period and at least two scenes during the winter period were selected. There were in total 38,809 cloud free data points of matching daily MODIS IST and measured Tair within the period from 2003 up to and including 2021. Every third year was used for model validation (2003, 2006, 2009, 2012, 2015, 2018, 2021), and the rest for fitting the model.

Model training and validation

A linear regression model between the daily skin temperature from MODIS and Tair was derived using the MODIS data as predictor and the corresponding air temperature as measured by the stations as the response. Due to the difference in the MODIS LST and IST products’ temperature range and environment, two separate models were made for each product referred to as LMLST and LMIST. The linear relation between the measured Tair and daily mean MODIS LST and MODIS IST are shown in Fig. 3a,b respectively (p-value < 0.01).

Fig. 3
figure 3

Linear regression line in black between daily mean air temperature measured in Automatic Weather Stations and daily mean MODIS skin temperature. (a) MODIS LST and (b) MODIS IST.

The trained models were then used to predict the held-back validation data for the purpose of model testing. The models were used to make predictions for the entire spatial time series for each product and then spatially merged into one product. The final predictions are the continues near-surface air temperature dataset in a daily temporal resolution and 1 km grid resolution for the period 2003–2021, AntAir ICE. The daily mean near-surface air temperature from the LMLST and the LMIST model will be denoted TAntAir in the following.

Data Records

The AntAir ICE dataset is published on PANGEA53. The geographic coverage of the AntAir ICE dataset includes the entire continent of Antarctica, the surrounding ice shelves, and partly sea ice in 1 km spatial resolution and a daily average temporal resolution for the period 2003–2021, inclusive. The dataset is in GeoTIFF format and in the Antarctic Polar Stereographic projection (EPSG 3031) with one file per day. Each day is a spatial raster group with two layers; the first layer is the predicted air temperature for that day in degree Celsius using a scaling factor of 0.1, the second layer is the number of available MODIS scenes for each grid cell for that day, ranging from 0 to 4. Areas with cloud contamination or without sea ice are marked with no data. Files are named AntAir_ICE_<YYYY>_<DOY>.tif, where <YYYY> represents the year and <DOY> represents the day of the year. Files are divided into quarters with January, February, March as 1, April, May, June as 2, July, August, September as 3 and October, November, and December as 4, for each year (2003–2021) and compressed to a ZIP files. Data are also available on New Zealand modelling consortium open environmental digital library (https://www.envlib.org) with access through the tethysts python package.

Technical Validation

Temporal validation of the LMLST and LMIST models was done based on every third year (2003, 2006, 2009, 2012, 2015, 2018, 2021) of measurements from all AWS and on days with no clouds and good quality according to the flag. As performance measures, the Mean Absolute Error (MAE), Root Mean Square Error (RMSE), R-squared (R2) and Mean Bias (MB) were used (Table 1). The LMLST model has a MAE of 2.26 °C and a R2 value of 0.97 which indicates that the model could well estimate the overall patterns in air temperature. The LMIST model does have a slightly lower, but still very high R2 value of 0.93 and a MAE of 2.66 °C. Since the final AntAir ICE temperature dataset is containing days with cloud contaminated scenes and lower quality pixels, the RMSE, MAE, R-squared and MB were additionally calculated for the validation years for the full non-filtered days (Table 1). The models performance slightly decreased for both, the LMLST model (MAE = 3.48 °C, R2 = 0.92) and LMIST model (MAE = 3.52 °C, and R2 = 0.82) for the entire dataset but the models were still capable of predicting air temperature within reasonable error. The relationship between the missing MODIS scenes and the performance of the models is illustrated in Fig. 4 where it is clear that the LMIST model is not as sensitive to requiring all 4 scenes as the LMLST model. The daily availability of 4 cloud free scenes in the MODIS LST is also more occurrent (Fig. 4a) than for the MODIS IST (Fig. 4e). For the LMLST model, the R-squared was low for pixels with only one available scene (R2 = 0.64) but it increases drastically when two scenes were available (R2 = 0.87). As the final datasets contain information in the metadata about the number of available scenes the end user can choose to not include the cloud contaminated days. The continued validation and comparison were based on the temporal validation set with four available scenes and good quality flags.

Table 1 Temporal validation of the LMLST and LMIST using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), R-squared (R2) and Mean Bias (MB).
Fig. 4
figure 4

The relationship between the number of cloud free pixels, and the RMSE, MAE, and R2. (ad) from the linear model for LST and (eh) from the linear model for IST.

The newly published Antarctic AWS dataset AntAWS52 contains quality controlled daily temperature records from 216 AWS records from the period 1980–2021. For validation AntAir ICE, temperatures for the temporal validation period were compared to the 176 available AWS records from AntAWS using performance measures MAE, RMSE, R2 and MB (Table 1). The R-squared values were still high compared to the two models individually and similarly the MAE for cloud free days (MAE = 2.73 °C) was only slightly increased when the quality controlled AntAWS dataset is used for validation.

Seasonal variation

Due to the location of Antarctica, there is a large seasonal variation driven by the change in incoming solar radiation and it is therefore important to test the LMLST and the LMIST models’ performance for each season in the validation years (Table 1). There was a clear trend that the summer months, December, January, and February, (DJF) have the lowest RMSE and MAE for both LMLST and LMIST models. The spring season, September, October, and November (SON) also showed a low RMSE and MAE for both models. The autumn, March, April, and May (MAM) and winter months, June, July, and August, (JJA) had the highest RMSE and MAE for both models. This larger bias is expected due to the highly variable temperature conditions in winter caused by wind, which is only partly reflected in AntAir ICE, however, the MAE did not exceed 3 °C or any season for either the LMLST or LMIST models. The bias, calculated as the difference between the Tair from the AWS measurements and the TAntAir, for each season for the temporal validation years is illustrated in Fig. 5. The box plot indicates that for the majority of data, the bias was close to zero.

Fig. 5
figure 5

Seasonal bias between model and in-situ air temperature measurements. Bias in °C is the difference between the measured air temperature from automatic weather stations and the models’ prediction. (a) is for the LMLST and (b) is for the LMIST. The Bias is measured for each seasons December, January, and February, (DJF), September, October, November (SON), March, April, and May (MAM) and June, July, and August (JJA) for all weather stations and for the temporal validation years.

Trend comparison

To analyse how the AntAir ICE is comparing to Tair time series and trend, the records of daily mean air temperature in the AWS Henry, AmeryG3, AWS09 and Minna Bluff were compared using the AWS data from AntAWS52 along with the temporal trend in annual mean for both the station and correlating AntAir ICE grid cell (Fig. 6). The annual mean from AntAir ICE also includes cloud interfered days where not all four scenes were available, nevertheless the correlation between the Tair from the AWS measurements and the TAntAir is significant for all four stations throughout the year. The annual mean trends were furthermore very similar and significant for all stations except AmeryG3. At AmeryG3 and Minna bluff there were cold spikes that could be related to cloud contamination.

Fig. 6
figure 6

Time series of daily mean air temperature from AntAir ICE in blue and AntAWS data in green for AWS (a) Henry 2008, (c) AmeryG3 2019, (e) AWS09 2003 and (g) Minna Bluff 2004. Mean annual temperature from AntAir ICE in blue and AntAWS data in green for AWS (b) Henry, (d) AmeryG3, (f) AWS09 and (h) Minna Bluff with temporal trend line for years with available AWS record.

Comparison with ERA5 reanalysis

ERA554 is a widely used climate reanalysis dataset for air temperature, however, with a grid spacing of 31 km, it is in some cases not capable of resolving local processes causing near-surface temperature variability. By comparing AntAir ICE to ERA5 performance in multiple Antarctic regions this study can illustrate how the high resolution of AntAir ICE is improving the understanding of spatial temperature patterns. The ERA5 temperature data were obtained from ECMWF (https:// apps.ecmwf.int/datasets/) from 2003 to 2021. Using a similar method as Zhu et al.55 the hourly 2 m temperature record from the ERA5 was extracted from the nearest grid locations of all 117 AWS and a daily mean was produced, TERA5. Every third year was used for model validation (2003, 2006, 2009, 2012, 2015, 2018) were selected for comparing ERA5 to the in-situ measured Tair, which excluding 2021 the same years used for validation of AntAir ICE. To account for the elevation difference between the selected ERA5 grid cell and AWS location, a correction in elevation using the dry adiabatic lapse rate of 9.8 °C/km was applied for all 117 stations, and the MAE, RMSE, and R2 from the comparison was calculated (Table 2). The lapse rate correction improves ERA5 performance significantly, with a RMSE = 4.31 °C, MAE = 3.02 °C, and R2 = 0.92, which were a higher value for RMSE and MAE and a lover R-squared than the LMLST and LMIST models’ temporal validation.

Table 2 Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R-squared (R2) for temporal validation of ERA5 for all 117 Automatic Weather Stations.

The spatial variations in AntAir ICE and ERA5 performance measured as MAE when compared to in-situ measured Tair (Fig. 7a,b) indicated a very similar MAE for the two temperature datasets. For TAntAir 70% of the AWS locations have a MAE below 3 °C whereas it was 67.5% for ERA5. The AWS located in West Antarctica seemed to have a lower MAE for TAntAir than for TERA5. The Ross Sea Region around Ross Island have for TERA5 two AWS locations with very high MAE, which could be due to the very complex terrain that ERA5 was not as capable of resolving. The spatial variations in AntAir ICE and ERA5 performance measured as R2 were very similar for the two datasets with most stations above R2 = 0.8 (Fig. 7c,d). For the stations located in West Antarctica and the Antarctica Peninsula the mean bias were lower for AntAir ICE than for ERA5 (Fig. 7d,e). The East Antarctic Plateau stations also had a low mean bias in AntAir ICE (Fig. 7e) showing a good performance.

Fig. 7
figure 7

Prediction error in the estimation of air temperature. Mean absolute Error, MAE for all 117 automatic weather stations for (a) AntAir ICE and (b) ERA5 for the temporal validation using dry lapse rate correction. R-squared for c) AntAir ICE and d) ERA5. Mean Bias for e) AntAir ICE and f) ERA5. Plotted on top of a terrain model from the Reference Elevation Model of Antarctica57.

To compare the similarity spatially between ERA5 and AntAir ICE the annual mean from 2003 to 2021 was calculated for both datasets and AntAir ICE was resampled to ERA5 resolution (Fig. 8a,b). The difference between the annual mean in Fig. 8c shows a very similar pattern but there is a clear higher warming pattern in AntAir ICE along the Transantarctic Mountains and a colder bias on the Antarctic Peninsula. To investigate the structural similarity in the two annual means a structural similarity index (SSIM) was used. SSIM can be used as a tool to assess perceived changes in structural information between two spatial inputs and the metric has been found to be a more meaningful assessment of structural changes compared to a traditional metrics like Mean Square Error (MSE). A SSIM of 1 represents a perfect match and it was found that the mean SSIM between AntAir ICE and ERA5 annual mean was SSIM = 0.74. There was a clear difference in the structural similarity index (Fig. 8d) and difference between ERA5 and AntAir ICE in areas with complex terrain. Especially around the Transantarctic Mountains there was a warmer trend in AntAir ICE which could be due to this area is exposed to warming katabatic outflows and these warming patterns were not resolved in ERA5. This was also supported by the lower MAE and MB in AntAir ICE than ERA5 for the stations in this area (Fig. 7).

Fig. 8
figure 8

(a) Annual mean temperature from AntAir ICE from 2003–2021 resampled to ERA5 grid resolution. (b) Annual mean temperature from ERA5 from 2003–2021. (c) difference between AntAir ICE and ERA5 annual mean temperature and (d) structural similarity index between AntAir ICE and ERA5 annual mean temperature.

To illustrate the uniqueness of this high-resolution near-surface air temperature dataset in comparison to the ERA5, the accumulative sum of days with temperature above 0 °C for each pixel in three different regions of Antarctica for the full 19-year temperature record for DJF were calculated from ERA5 and AntAir ICE respectively (Fig. 9). It is clear to see how AntAir ICE resolves inter-valley variability in the McMurdo Dry valleys in the Ross Sea Region (Fig. 9c) compared to ERA5 (Fig. 9d). The sea ice areas in the Antarctic Peninsula seemed to have a higher occurrence of above 0 °C for ERA5 (Fig. 9f) than for AntAir ICE (Fig. 9e). This is caused by the fact that AntAir ICE were only available for that area when sea ice was present due to the nature of the MODIS IST product whereas ERA5 provides air temperature for the full period independently of sea ice. This detailed spatial resolution highlights its potential for monitoring of temperature patterns in Antarctica.

Fig. 9
figure 9

Accumulative days with above 0 °C air temperature for summer season DJF 2003–2021. Data from AntAir ICE for (a) Amery Ice Shelf, (c) Ross Sea and (e) Antarctic Peninsula. Data from ERA5 for (b) Amery Ice Shelf, (d) Ross Sea and (f) Antarctic Peninsula. Basemap from Quantarctica56.