Sea ice surface temperatures from helicopter-borne thermal infrared imaging during the MOSAiC expedition

The sea ice surface temperature is important to understand the Arctic winter heat budget. We conducted 35 helicopter flights with an infrared camera in winter 2019/2020 during the Multidisciplinary Drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition. The flights were performed from a local, 5 to 10 km scale up to a regional, 20 to 40 km scale. The infrared camera recorded thermal infrared brightness temperatures, which we converted to surface temperatures. More than 150000 images from all flights can be investigated individually. As an advanced data product, we created surface temperature maps for every flight with a 1 m resolution. We corrected image gradients, applied an ice drift correction, georeferenced all pixels, and corrected the surface temperature by its natural temporal drift, which results in time-fixed surface temperature maps for a consistent analysis of one flight. The temporal and spatial variability of sea ice characteristics is an important contribution to an increased understanding of the Arctic heat budget and, in particular, for the validation of satellite products. Measurement(s) Surface temperature Technology Type(s) Infrared camera Sample Characteristic - Environment Sea Ice Sample Characteristic - Location Arctic Ocean


Background & Summary
The measurement program was part of the Multidisciplinary Drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition (https://mosaic-expedition.org), which took place in the Arctic from October 2019 to October 2020 and was divided into five legs 1 . RV Polarstern 2 was drifting attached to an ice floe along the transpolar drift from about 85° N north of the Laptev Sea towards Fram Strait between Svalbard and Greenland 1,3 . To study the Arctic climate, a research camp, called Central Observatory (CO), was installed on a second-year ice floe that survived the summer and originated from the Laptev Sea 4 . The ice of the MOSAiC floe was very rotten and porous from the summer melt at the beginning of the drift. The level ice areas were characterized by a high fraction (>60%) of refrozen melt ponds, which had physical properties more similar to first-year ice. First-year ice was developing in the surroundings of the floe (later reaching about 30% fraction) and in cracks frequently cutting through the CO throughout the drift. Additional measurements were performed on the surrounding sea ice within a distributed network (DN) of autonomous measurement platforms and from the air by helicopters and drones.
From October 2019 to April 2020, we performed 35 helicopter flights with an infrared camera (IR camera) on the Airbus Helicopter MBB-BK 117 C-1 over Arctic sea ice. All flights covered a similar sea ice area in the near vicinity of the drifting RV Polarstern. The IR camera took more than 150000 images in total, each with 640 × 480 brightness temperature values, which we converted to surface temperatures. All images of each flight are combined into one surface temperature map per flight with a high spatial resolution of 1 m. The maps allow an investigation of spatial and temporal variability on a seasonal time scale based on the series of flights throughout the whole winter season. Measurements of the thermal radiation can be performed without daylight. Thus, thermal infrared observations are a valuable source of data in the polar night during the Arctic winter. Detecting the thermal radiation is an appropriate tool because there are large temperature differences between open water

Methods
The processing of the temperature images resulted in two types of datasets. First, the processed images are provided for every flight, including the corrections applied here. Second, based on the corrected images, gridded data are provided as temperature maps, which combine the whole set of images of one flight into an advanced dataset. The processing steps are displayed in Fig. 1. The list of flights and the variables are described and, if necessary, updated in our data manual 16 .
Camera details. The images were recorded with the infrared camera VarioCAM HD head 680 from InfraTec 17 , which was mounted in a helicopter with a downward (nadir) view of the sea ice surface (Fig. 2). All essential technical details about the IR camera are highlighted in this paragraph and listed in Table 1. The broadband thermal infrared imaging was performed in the thermal infrared region of the electromagnetic spectrum from 7.5 to 14 μm, which includes the Earth's radiation peak at 10 μm [ 18 , page 293]. The accuracy of the measured brightness temperature is 1 K and its precision is 0.02 K. To distinguish features on the ground within one image, the precision is more important than the accuracy. The device was calibrated down to a target temperature of −8 °C with a measured deviation of 0.6 K. It has to be considered that the measurements were mainly performed at lower temperatures. The minimum temperature specified by the manufacturer is −40 °C, while we could measure even lower temperatures of the sky. During the measurements we reached this limit a few times, but did not exceed it extensively. Therefore, the accuracy could slightly change for colder temperatures, but this is neglected in this study. A sufficient overlap of the images is guaranteed by an acquisition rate of 1 Hz. The raw data were measured at an even higher frequency of 4 Hz, but down-sampled here to 1 Hz.
During the recording, every 15 to 60 seconds a non-uniformity compensation (NUC) ensured that there was a minimized camera-internal
Fig. 1 The flow chart presents the processing steps from the raw data of the IR camera to the time-fixed surface temperature map as part of the finished product (NC = NetCDF). The grey boxes indicate the data status and the grey, black-framed boxes with the bold font indicate the published data products. The blue boxes show the performed processing steps according to the methods explained in this study. The dashed arrow shows the input of the single images, which is used for finding the best fit of the temporal temperature drift correction.
Sea ice surface temperature. In the thermal infrared range, the measured brightness temperature is close to the actual surface temperature. As a first approximation, the surface temperature T s was calculated from the up-welling brightness temperature T b,up , measured with the IR camera, and the emissivity e [ 18 , page 294] with

$$T_s = \frac{T_{b,\mathrm{up}}}{e} \qquad (1)$$

For emissivities close to one, as in our case, the measured brightness temperature can be approximated to change linearly with the physical temperature. The Planck law is already considered during the recording of the brightness temperature. The influence of the down-welling brightness temperature was neglected for simplicity. The emissivity e is assumed to be a constant value of 0.996 for all incidence angles and surface types following Høyer et al. (2017) 19 . Here the down-welling contribution with (1 − e) T s would be 1 K at 250 K surface temperature, which is in the same range as the accuracy of the measured temperature. We used this emissivity because the instrument of the Høyer et al. study, i.e., a Campbell Scientific IR120 radiometer, has a very similar spectral range (8 to 14 μm) to our instrument. However, the spectral weight function might differ. An emissivity of 0.97 was used for other infrared observations, e.g., by the NSIDC during Operation IceBridge flights with a KT19 thermometer with a spectral response from 9.6 to 11.5 μm. The lower emissivity value of 0.97 would lead to higher surface temperatures than our value of 0.996. The accuracy of the surface temperature is determined by the accuracy of the brightness temperature (1 K, see Camera details) as well as by the accuracy of the emissivity. The additional uncertainty of the emissivity can be influenced by the incidence angle or a change of surface type 20 . In future studies, the aim is to apply a more detailed theory to retrieve a higher accuracy of the surface temperature T s .
www.nature.com/scientificdata
Additionally, it has to be considered that the penetration depth of thermal infrared radiation is very small (a few μm), so that the measured temperature is influenced by atmospheric parameters like the air temperature, wind speed, or cloud cover [ 18 , page 294].
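The sensitivity to the emissivity can be illustrated with a short sketch. It assumes the first-order relation T_s = T_b,up / e, which is consistent with the linear dependence and the emissivity comparison described above; the function name and the example brightness temperature of 250 K are illustrative choices and not part of the published processing.

```python
def surface_temperature(t_b_up: float, emissivity: float = 0.996) -> float:
    """First-order conversion T_s = T_b,up / e (temperatures in kelvin).

    Assumption of this sketch: the down-welling contribution is neglected.
    """
    if not 0.0 < emissivity <= 1.0:
        raise ValueError("emissivity must be in (0, 1]")
    return t_b_up / emissivity


t_b = 250.0                                   # illustrative brightness temperature in K
t_s_high_e = surface_temperature(t_b, 0.996)  # emissivity used in this study
t_s_low_e = surface_temperature(t_b, 0.97)    # emissivity used for the KT19 observations
difference = t_s_low_e - t_s_high_e

print(f"e = 0.996: {t_s_high_e:.2f} K, e = 0.97: {t_s_low_e:.2f} K, "
      f"difference: {difference:.2f} K")
```

Under this assumption, the choice between the two emissivities changes the retrieved surface temperature by several kelvin, which is larger than the 1 K accuracy of the brightness temperature.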
Image correction. Corner mask. In all flights, warm temperature anomalies were found in one corner of the images. The IR camera looked through an open cutout in the helicopter bottom structure. The higher values were probably caused by a shielding effect from the structure of the helicopter, which slightly influenced the temperature recording in the respective corner. To remove false high temperatures in the respective corner, a corner mask is provided for every flight and can be applied to each image. We generally recommend excluding the corner areas of the images in any case due to distortions at higher incidence angles.
Radial image gradient. In every image, a radial gradient of the recorded temperature occurs. This gradient is likely an artefact of the camera. We corrected it with an empirical radial gradient, which is calculated by averaging images of a flight and is determined and applied for every flight independently. For this, only images with an average temperature below the 25th percentile of all images are considered, to exclude warm features like leads, which could influence the resulting gradient. The gradient is relative to the center temperature, which is assumed to be the truth. This lens effect correction was applied to all datasets, and the correction array is included in the image-based dataset for every flight so that the original, uncorrected images can be reproduced if needed.
Mapping. After processing the image data, we produced flight maps by combining all images in one grid.
The gridded data set consists of temperature values with assigned coordinates (longitude and latitude) as well as relative coordinates to Polarstern for co-location with other measurements. This mapping was done in three steps, which are described in more detail in the following and illustrated in Fig. 6. First, the ice drift correction was performed to correct the coordinates to the target time, which is the middle of the flight. Second, all image pixels had to be georeferenced under consideration of the helicopter position and rotation. Third and last, all pixels were assigned to the closest grid cell in the equidistant grid.
Some images were discarded if they were outside the main flight pattern, not yet at the main flight altitude level of around 300 m, or not usable due to the internal calibration (NUC). Here one image per NUC had to be discarded. As a benefit of excluding the discarded images, the maps are more compact in terms of data coverage of the covered area and better suited for the statistical and visual interpretation of the spatial variability in the defined area. In the following part, we explain the detailed methods for the image mapping.
Ice drift correction. Assuming the average ice drift of 8.52 km/d according to Krumpen et al. 3 , the sea ice drifted about 500 m during a 90-minute flight. Thus, a piece of ice at a specific location at the beginning of the flight was at another location later during the flight. Therefore, we corrected the coordinates of every image based on the drift of RV Polarstern to be in the coordinate system of the target time in the middle of each flight. The position and heading data 21-23 of RV Polarstern, which was anchored directly to the ice floe, were recorded with the high-performance inertial navigation system (INS) Motion Sensor Hydrins 1 24 . For the ice drift correction, we used Polarstern positions recorded at 10-minute intervals and linearly interpolated them to correct each image accordingly, assuming homogeneous ice movement across the entire flight area. We took the GPS position at the target time as the reference point for the stereographic projection, which is used for the final gridded temperature maps. RV Polarstern is always in the center of the stereographic projection. For the rotated relative coordinates, RV Polarstern's heading defines the positive y-axis and the starboard direction is the positive x-axis. In this way, relative distances of the flight data are kept comparable between all flights. The resulting map contains the locations of the observed sea ice surface at the target time. This was used to calculate a relative coordinate system with distances towards the reference point at RV Polarstern.
Georeferencing. Based on the corrected image location and the stereographic projection, every pixel is geolocated. All projections are done with the WGS84 ellipsoid. The geolocation of each pixel is determined based on a pinhole camera model and a transformation from the camera to the plane 25 .
While the pitch angle was generally low during the flight, the roll angle increased during turns. Images with a roll angle of more than 40 degrees were discarded to avoid large distortions in the projection. Values from images with a high roll angle were also discarded if data with a roll of less than 20 degrees were available. In general, we recommend using the data with a high roll angle with caution because large distortions can cause uncertainties in the geolocation. We correct the ellipsoid height, measured by the helicopter GNSS, by the mean sea surface height by Andersen and Knudsen (2009) 26 . Additionally, we perform a distortion correction for the images to reach a more accurate pixel position. We use the positioning data recorded with an embedded GNSS inertial system Applanix AP 60-AIR 27 , which was installed in the helicopter next to the IR camera (Fig. 2).
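The first two mapping steps can be sketched as follows. This is a simplified illustration and not the published processing chain: it works directly in map coordinates in metres, assumes zero roll and pitch and no lens distortion, and assumes that the image rows point opposite to the flight direction. All function names are hypothetical, and `focal_px = 600` is a made-up value chosen only so that one pixel covers 0.5 m at 300 m altitude.

```python
import math
from bisect import bisect_right


def interpolate_track(times, values, t):
    """Linearly interpolate one ship-track coordinate at time t (seconds)."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the ship track")
    i = min(bisect_right(times, t), len(times) - 1)
    w = (t - times[i - 1]) / (times[i] - times[i - 1])
    return values[i - 1] + w * (values[i] - values[i - 1])


def drift_correct(x, y, t, t_target, track_t, track_x, track_y):
    """Move a position observed at time t to where that ice is at t_target."""
    dx = interpolate_track(track_t, track_x, t_target) - interpolate_track(track_t, track_x, t)
    dy = interpolate_track(track_t, track_y, t_target) - interpolate_track(track_t, track_y, t)
    return x + dx, y + dy


def pixel_offset(col, row, altitude_m, heading_deg, focal_px, width=640, height=480):
    """Ground offset (dx, dy) in metres of a pixel from the nadir point.

    heading_deg is measured clockwise from the map y-axis (sketch assumption).
    """
    scale = altitude_m / focal_px                # metres on the ground per pixel
    right = (col - (width - 1) / 2.0) * scale    # across the flight direction
    forward = ((height - 1) / 2.0 - row) * scale # along the flight direction
    h = math.radians(heading_deg)
    dx = right * math.cos(h) + forward * math.sin(h)
    dy = -right * math.sin(h) + forward * math.cos(h)
    return dx, dy


# Synthetic ship track: uniform drift of 8.52 km/d towards +x, sampled every 10 min.
speed = 8520.0 / 86400.0                      # m/s
track_t = [600.0 * k for k in range(10)]      # 0 ... 5400 s (90-minute flight)
track_x = [speed * t for t in track_t]
track_y = [0.0 for _ in track_t]
total_drift = track_x[-1] - track_x[0]        # drift over the whole flight

t_target = 2700.0                             # middle of the flight
nadir_x, nadir_y = drift_correct(1000.0, 0.0, 0.0, t_target, track_t, track_x, track_y)

# Pixel one column to the right of the image centre, helicopter heading along +y.
off_x, off_y = pixel_offset(320.5, 239.5, 300.0, 0.0, focal_px=600.0)
pixel_x, pixel_y = nadir_x + off_x, nadir_y + off_y
print(f"total drift: {total_drift:.1f} m, pixel position: ({pixel_x:.2f}, {pixel_y:.2f}) m")
```

With the average drift of 8.52 km/d, the synthetic track moves by about 530 m in 90 minutes, in line with the roughly 500 m stated above. An image taken at the start of the flight is therefore shifted by about half of that distance to match the target time.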
For an increased accuracy in the geolocation, we performed an optimization for the following six input parameters:
• Time (time offset between IR camera and GNSS)
• Three components of a fixed camera rotation with respect to the helicopter (roll, pitch, heading)
• Internal camera parameters (effective focal length, radial distortion coefficient)
The optimization procedure consists of different steps. At first, common features between different images were determined using a consecutive execution of edge detection, pattern matching, homographic fitting, and cleaning using a RANSAC (RANdom SAmple Consensus 28 ) based method. The Euclidean distance between the same feature in pairs of two different images in the final coordinate system is subject to minimization. Since the feature detection, despite initial outlier correction, was not perfect, remaining mismatches caused a high bias in the sum or mean of the determined distances, even for a well-matching parameter set. Therefore, we used a loss function employing the 70th percentile of the distances of ground-projected point matches in one image pair and then another 70th percentile of these values from all used image pairs as loss. This loss was then minimized using an adaptive differential evolution optimizer 29 .
For our processing, we determined two different sets of correction parameters because there was a change of the camera setup at the start of Leg 2. The parameters are listed in the order (time offset, roll offset, pitch offset, heading offset, effective focal length, radial distortion coefficient):
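The nested percentile loss described above can be sketched in a few lines. The sketch assumes that the match distances are already available as one list per image pair; the feature matching and the optimizer are not reproduced. The function names and the synthetic distances are illustrative, and the percentile uses linear interpolation between sorted samples, which may differ from the definition used in the original processing.

```python
import math


def percentile(values, q):
    """Percentile q (0-100), linearly interpolated between sorted samples."""
    if not values:
        raise ValueError("values must not be empty")
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def matching_loss(pair_distances, q=70.0):
    """Percentile of the per-pair percentiles of the match distances (metres)."""
    per_pair = [percentile(distances, q) for distances in pair_distances if distances]
    return percentile(per_pair, q)


# Synthetic match distances for three image pairs; the first pair contains one
# gross mismatch (1000 m) that survived the outlier removal.
pair_a = [float(k) for k in range(1, 11)] + [1000.0]
pair_b = [2.0] * 11
pair_c = [0.5 * k for k in range(1, 12)]

loss = matching_loss([pair_a, pair_b, pair_c])
mean_a = sum(pair_a) / len(pair_a)   # the mean is dominated by the mismatch
print(f"percentile loss: {loss:.2f} m, mean of first pair: {mean_a:.1f} m")
```

The single mismatch barely affects the percentile-based loss, whereas it dominates the mean distance of the first pair. This is the motivation given above for not minimizing the sum or mean.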

Fig. 6 The schematic illustrates the different parts of the mapping method applied to the images. We calculate the positions on the ground based on the helicopter position (a1), which change with the helicopter position and rotation (a2). We also consider the reduction of the ellipsoid height from the GPS by the mean sea surface height h mss because it changes the actual position of the pixel on the plane. The pixel locations are determined based on the reference point (x i , y i ) at the known nadir location. In the next step, the projection method starts with the data array from the image (b1), which is further processed to the georeferenced images with x,y coordinates (b2). Finally, the pixels of the image are assigned to an equidistant grid with a stereographic projection (b3).
(2022) 9:364 | https://doi.org/10.1038/s41597-022-01461-9
For the respective leg, the optimization was done for one specific flight, i.e., 20191002_01 for Leg 1 and 20191224_01 for Leg 2. Since the same camera was used on all three legs, the focal length and the radial distortion parameter should not change. Indeed, the optimization yielded very similar values for these parameters. For all flights, we use these optimized parameters. Additionally, we manually determined a time offset for each flight because the IR camera records the time only to the full second. At a flight speed of around 45 m/s, a time difference of 0.5 s already causes a significant position shift of the images (about 22 m). As expected, the time correction is mostly below 1 s. In cases of a higher time shift, the required initial time synchronisation was probably not performed. The time offset for every flight is included in the image NetCDF file, while the time variable contains the final (corrected) time used.
In addition, the following modifications had to be done during the processing:
Gridding. After georeferencing all pixels in every image, we assigned the pixels of all images to an equidistant grid of 1 m. The original images have a higher resolution of approximately 0.5 m at nadir. However, 1 m guarantees full coverage of all grid points for the final map, even under a higher incidence angle. After the ice drift correction, the pixel coordinates of all images are assigned to a grid cell of the stereographic grid map referenced to the target time (Fig. 6). If several values fall into one grid cell, the value closest to the target time is taken and the others are discarded. In addition to the gridded surface temperature, we included the longitude, latitude, time, roll, and pitch, measured by the Applanix AP 60-AIR 27 , as gridded data as well. This set of parameters contains all necessary information for the interpretation of the physical parameters. We also included relative coordinates, which are comparable for the whole dataset period. They have one constant reference point at RV Polarstern, whereas the longitude and latitude coordinates were changing constantly with the drift. The rotated relative coordinates always have the same orientation, also with RV Polarstern as reference (see Ice drift correction). The 5 m resolution data are block averages of the 1 m resolution data.
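The gridding rule can be sketched as follows, assuming that every pixel is given as a tuple of map coordinates in metres, acquisition time in seconds, and temperature in kelvin. The function names, the dictionary-based grid, and the alignment of the 5 m blocks are choices of this sketch and are not taken from the published processing code.

```python
import math


def grid_pixels(pixels, t_target, cell_size=1.0):
    """Assign pixels to the closest grid node; keep the value closest in time to t_target."""
    best = {}
    for x, y, t, temp in pixels:
        key = (math.floor(x / cell_size + 0.5), math.floor(y / cell_size + 0.5))
        current = best.get(key)
        if current is None or abs(t - t_target) < abs(current[0] - t_target):
            best[key] = (t, temp)
    return {key: temp for key, (t, temp) in best.items()}


def block_average(grid, factor=5):
    """Average all filled cells of the fine grid that fall into one coarse block."""
    sums = {}
    for (ix, iy), temp in grid.items():
        key = (ix // factor, iy // factor)
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + temp, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}


# Three synthetic pixels; the first two fall into the same 1 m cell.
pixels = [
    (0.2, 0.1, 100.0, 250.0),    # recorded early in the flight
    (0.4, -0.3, 2000.0, 252.0),  # recorded at the target time, so this value is kept
    (3.1, 0.0, 1900.0, 254.0),
]
grid_1m = grid_pixels(pixels, t_target=2000.0)
grid_5m = block_average(grid_1m, factor=5)
print(grid_1m, grid_5m)
```

Keeping only the value closest to the target time, rather than averaging overlapping swaths, avoids mixing observations from different times in one cell.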
Uncertainties of the geolocation. Uncertainties in the mapping can be caused by the ice drift correction, which is well-defined for the area of the CO, close to RV Polarstern, but can differ for outer areas. The surrounding sea ice could have drifted differently than the CO during the flight, which cannot be considered because there are no data available. However, this effect is small for the duration of one flight. Three to five Automatic Identification System (AIS) base-stations were installed in the CO, which could have served as an alternative to provide the ice drift correction. However, compared to the ship GNSS and INS system, the AIS stations showed jitter of tens of meters, i.e., noise in the GPS position with time, which could lead to a temporal variability of the signal and therefore some inconsistencies. Therefore, we decided to use the RV Polarstern position data. The only disadvantage would occur if RV Polarstern moved relative to the CO during a helicopter flight, which did not happen for the flights presented here. The uncertainty of the ship's navigation system Hydrins is 0.01 deg for the heading and 0.03 m for the position 24 , which is negligible for our 1 m resolution data. Additionally, there are uncertainties caused by the installation of the instruments in the helicopter. The IR camera and the INS were deployed at different positions on the instrument plate in the helicopter, both with independent spring dampers (Fig. 2), which could respond differently to vibrations of the helicopter. This would result in small angular noise, which was neglected. The uncertainty from the positional difference is small and is mostly overcome by the optimization, which provides us with the optimal input parameters for the given setup. The pixel distance at the surface depends on the flight altitude, which we corrected by the mean sea surface height (Fig. 6(a1)), but we did not consider changes due to tides, currents, or atmospheric pressure loading. In addition, IR camera images were projected to a plane at the distance of the helicopter to the mean sea surface height, which is probably lower than the sea ice surface. But, compared to the difference to the actual surface height, we consider this error negligible. The outer areas of the images (100 pixels in y-direction, the small axis of the image) were cut off to reduce the effect of perspective and the corresponding lowering of the resolution in the mapped data. Even with this cut, the coverage along the flight direction is sufficient since the overlap is large. There might be a small time-drift in the IR camera time (<2 s/day, i.e., about 0.1 s per flight), which can reduce the matching of the position data and the image. Based on the results of the georeferencing and gridding, we estimate the error of geolocation to be at most 10 m. For the L-site flights with large distances to RV Polarstern, a larger shift is possible due to, e.g., inhomogeneous ice drift. Since the flights were performed quite far north, it has to be considered that the INS data quality was degraded due to the close proximity to the Earth's rotational axis 30 . Further quantification of the actual pixel-wise uncertainty in geolocation cannot be performed with the methods available to us. Small shifts in the geolocation can cause issues in the matching of the same features from different images, but this does not change the statistics of the surface temperatures. Further quantification of geolocation uncertainty requires co-location with other point measurements on the floe, which was not performed for this dataset.
Temperature-drift correction. During most flights, the temperature changes with time. This becomes clearly visible where different swaths of one flight cross.
As an example, two neighboring images from different times within a flight have a temperature difference of 2.6 K for the local flights 20191105_01 or 20191230_01 and even 3.95 K for the regional flight 20191230_01. That is physically reasonable because changes in atmospheric parameters like cloud cover, wind speed, or air temperature can alter the surface temperature within the flight duration of about 90 minutes. A further investigation of the reasoning for this temperature drift will be performed in another study. However, this temporal temperature change would make it challenging or not possible to analyze the temperature maps of a whole flight coherently. Thus, we derived the time-fixed surface temperature by a temporal correction of the temperature change. This makes surface features in the data more easily interpretable because the temporal temperature changes do not have to be taken into account. For this, we corrected the surface temperatures of a whole flight by relative differences to the target time in the middle of the flight. The fixed surface temperature T s,fixed was corrected based on the fit function f i of the 10th percentile as lowest surface temperature T s values for each image (Eq. 2). The 10th percentile was chosen because it represents the fairly constant background temperature of cold snow-covered thick ice and is not influenced by varying warmer features like cracks in the sea ice.
The temporal temperature change was corrected by a fit function with the target time t 0 and the actual time of the pixel t (Eq. 2). The best fit type was selected by the smallest Chi-squared statistic. The selected, flight-specific fit type and parameters are listed in Table 2. The fit parameters a 1 , a 2 , a 3 , a, b, and c are fixed for each flight. The time t is given in seconds since the start of the day. For the exponential fit, the initial guess was (a = 0, b = 0.0001, c = 200). Some flights needed a slightly different initial guess with (a = −0.1, b = 0.0001, c = 200) or (a = 0, b = −0.0001, c = 200) because otherwise no solution could be found. The fit was performed only on data in the vicinity of the center of the CO, i.e., the position of RV Polarstern with 1 km extent in every direction, plus 100 measurement points from the start and end of the flight to cover the whole period. Often the flight starts and ends close to RV Polarstern anyway. As an exception, the mean longitude and latitude were used as the center point for the first CO and the three L-Site grids because RV Polarstern was not overflown during these flights. This restriction ensures that mostly the same sea ice surfaces of the CO are considered for the fit. That is especially important for the regional flights, where a larger area and therefore a larger variability of surface temperatures is included. The temperature range for every flight (Table 2) is an indicator of the magnitude of temperature change in the respective flight. Here we considered the range between start and endpoint as an estimation of the homogeneity of the respective flight.
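The idea of the temporal correction can be sketched with synthetic data. Two simplifications are assumptions of this sketch and not taken from the study: a straight-line least-squares fit stands in for the flight-specific fit types of Table 2, and the correction is written as T − (f(t) − f(t0)), i.e., as a difference relative to the target time. The function names and all numbers are illustrative.

```python
import math


def percentile(values, q):
    """Percentile q (0-100), linearly interpolated between sorted samples."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def fit_line(times, temps):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(temps) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, temps))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def time_fix(temp, t, t0, slope):
    """Remove the fitted drift relative to the target time t0: T - (f(t) - f(t0))."""
    return temp - slope * (t - t0)


# Synthetic flight: cold background warming by 0.001 K/s, one warm lead pixel per image.
image_times = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
images = [[240.0 + 0.001 * t] * 9 + [260.0] for t in image_times]

# The 10th percentile of each image follows the background and ignores the lead pixel.
background = [percentile(image, 10.0) for image in images]
slope, intercept = fit_line(image_times, background)

t0 = 2000.0                                  # target time in the middle of the flight
fixed = time_fix(244.0, 4000.0, t0, slope)   # background pixel recorded at the end
print(f"drift: {slope * 3600:.2f} K/h, time-fixed temperature: {fixed:.2f} K")
```

Because only differences relative to the target time are removed, pixels recorded at the target time remain unchanged, and the warm lead pixels keep their contrast to the background.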

Data Records
All data of the IR camera, including the images, the maps, and the plots of the maps, are published in PANGAEA with open access 31 . Specific information about the data, like the list of processed and unprocessed flights as well as definitions of variables and their dimensions, can be found in our data manual 16 , which will be updated if changes occur. The data were converted from the raw, binary format IRB to the NetCDF4 file format. There are two types of datasets: one contains the image-based data and the other contains processed maps with 1 m and 5 m horizontal resolution. The 5 m grid is added for smaller file size and easier data handling but is otherwise identical to the 1 m dataset. Each dataset type contains one NetCDF file for each flight. For the single images dataset, all images of one flight are combined in one NetCDF file with the single images aligned along the third dimension (time). All files include the stereographic projection parameters used for the mapping. The flights have two identifications that can be used: (i) the Flight-ID, including the date of the flight, and (ii) the Device Operation, a unique ID, which was created during the MOSAiC expedition. The main variable is the surface temperature, complemented by additional data necessary to process and interpret the data. All the position data of the helicopter are required to reference each pixel to the correct geolocation. The position data are published with the temperature data from the IR camera but were recorded with the embedded GNSS inertial system Applanix 27 , which was installed in the helicopter as well. The position data of RV Polarstern 21-23 from the device Hydrins 1 are used in a temporal resolution of 10 minutes, but were interpolated to 1 second for the ice drift correction. The 10 min weather data from Polarstern are used as well 32-34 . Both position and weather data were downloaded from the AWI data platform (https://dship.awi.de).
The mean sea surface height data is based on Andersen and Knudsen (2009) 26 and taken from https://ftp.space.dtu.dk/pub/DTU21/1_MIN/. The processing steps of the data are summarized in Fig. 1.

Technical Validation
Temperature structures in images. Single images contain a lot of information and illustrate the potential of the data. Here we present three example images from 05 November 2019, 25 January 2020, and 23 April 2020. Please note that the three cases have different temperature ranges. Figure 7(a) illustrates the thermal signature due to human influence. RV Polarstern is visible in the image due to the warm structures of the ship. A warmer area on the starboard side of the ship was the logistics area and appears because some snow was compressed or removed and therefore the heat conductivity increased. Even single tracks can be identified because they are warmer than the surroundings, which can be explained with the same principle of heat conduction. Other features in the bottom half of the image are linear and cold structures, which indicate the deformation of the ice, i.e., ridges. We can also identify blurry, cold spots that could be caused by snowdrift. Thus, topographical features can be investigated from this two-dimensional temperature variation.
In Fig. 7(b), a lead with internal surface temperature variability is present. The warmer temperatures in the middle of the image are caused by a newly opened lead, which is covered by thin ice, whereas the surrounding thick ice is substantially colder. There is a temperature change orthogonal to the lead edges with warmer temperatures on the left-hand side and colder on the right-hand side. The temperature difference can be explained by different thin ice thicknesses, which could be caused by ice movement and associated rafting due to wind forces. Wind pressure can cause compressed and therefore colder ice, whereas reduced ice thickness leads to higher temperatures.
For Fig. 7(c), most of the surface is thick, snow-covered ice, which is characterized by colder temperatures. The warm linear structure, i.e., a lead, reaches from the lower left to the upper middle part. Next to the lead, there is a slightly warmer line and in the lower left a high variability of warm and cold surface temperatures. The corresponding RGB image, taken at the same time, illustrates the visual appearance of the same surface (Fig. 7(d)). We selected this particular case because sufficient daylight was available for the RGB images, whereas during the polar night it was too dark to see the surface in the visible range. The white area corresponds to the cold, snow-covered sea ice, whereas the warm crack has the dark ocean color. The comparison illustrates that the surface of leads is affected predominantly by the warmer ocean compared to the colder atmosphere. Also, the thin line next to the lead can be identified in the visible range. The high variability in the lower left part is caused by broken-up ice, i.e., small floes, which result in partly missing insulation from consolidated snow-covered ice.
High resolution temperature maps. Combining the images to a map provides the opportunity to analyze the spatial characteristics and variability of the surface temperature for a larger area. The gridded temperature data are corrected for ice drift as well as for the temporal temperature change as explained above and are referred to as time-fixed surface temperature. Figure 8(a) shows a map of the time-fixed surface temperature for the local flight on 02 October 2019 with the initial MOSAiC ice floe and its surroundings. This flight covered the area of and around the CO, which was not installed yet. In the center, very low temperatures can be associated with thicker ice, which survived the summer and was the main area for the CO. The surroundings have higher temperatures, which represent thinner ice. The central thicker ice area is framed by warm thin structures from the north-east and south-west. These are cracks in the ice with a surface of open water or thin ice.
The time-fixed surface temperatures for the regional flight on 23 January 2020 appear cold and rather homogeneous (Fig. 8(b)). This is plausible because the air temperature had been very low for an extended period. Consequently, the ice thickness increased and the MOSAiC ice floe consolidated with the surrounding ice. For this triangular flight pattern, the maps are more difficult to interpret because the area is larger and the coverage of surface temperature data is sparser. Nevertheless, warmer linear structures, i.e., leads, are present in this flight, for example at the northernmost L-Site as well as between the two southern L-Sites.
The temperature distributions of the two flights have different characteristics (Fig. 8(c,d)). The flights differ in their temperature regime and in their range (1st to 99th percentile). The local flight on 02 October 2019 reaches from 264.42 K to 268.77 K, resulting in a range of 4.35 K, whereas the regional flight on 23 January 2020 has a wider range from 233.56 K to 252.51 K, resulting in a range of 18.96 K. Local minima in the distributions discriminate different ice classes or surface types.
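The two statistics used here, the 1st to 99th percentile range and the local minima of the temperature histogram, can be sketched as follows. This is a minimal illustration in plain Python on synthetic values, not the published processing code; the function names, the bin width, and the strict-neighbor definition of a local minimum are assumptions made for this example.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0 to 100) of a non-empty sequence."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def histogram(values, t_min, bin_width, n_bins):
    """Count values in n_bins equal-width bins starting at t_min."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - t_min) // bin_width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def local_minima(counts):
    """Indices of interior bins that are strictly lower than both neighbors."""
    return [
        i
        for i in range(1, len(counts) - 1)
        if counts[i] < counts[i - 1] and counts[i] < counts[i + 1]
    ]


# Synthetic surface temperatures in K (illustrative only, not MOSAiC data):
# three modes separated by two sparsely populated temperature intervals.
sample = (
    [264.75] * 5 + [265.25] * 40 + [265.75] * 10 + [266.25] * 30
    + [266.75] * 8 + [267.25] * 3 + [267.75] * 15
)

t_range = percentile(sample, 99) - percentile(sample, 1)
counts = histogram(sample, t_min=264.5, bin_width=0.5, n_bins=7)
minima = local_minima(counts)
```

On real data, the histogram is usually noisy, so some smoothing or a minimum-depth criterion would be needed before the local minima can be interpreted as boundaries between surface classes.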
The local flight (Fig. 8(c)) has two peaks in the colder part and one at the warmer end of the temperature distribution. These peaks in the distribution reflect the sea ice properties described above. The warmest peak at 268.6 K represents the cracks in the ice covered by open water or thin ice. The coldest peak at 265.0 K is caused by the thick ice of the future CO in the center of the map whereas the middle peak at 265.8 K is related to the slightly thinner ice in the surroundings.
The regional flight (Fig. 8(d)) has mainly very low surface temperatures, and the global maximum of its distribution therefore lies in the coldest part at 234.2 K. Consistent with the two-dimensional appearance in the map, the ice conditions on 23 January 2020 were dominated by thicker ice. Still, there is variation in the warmer part of the surface temperature distribution, associated with thin ice, which is separated from the lowest temperatures by the minimum at around 247.5 K. This thin ice cover can still appear under very cold conditions because it started to grow later in cracks that were opened by ice dynamics. The several local minima in the warmer part of the distribution indicate that there are several distinct thin ice classes. These are likely associated with lead opening events at different lag times before the overflight, i.e., the length of the ice growth period differs between the temperature classes. Another possible explanation is rafting of thin ice in leads.
Artefacts in the maps. As already discussed in the methodology, there are some uncertainties in the data, which influence the outcome of our processing, i.e., the maps. In Fig. 9, we present possible artefacts the user should be aware of. Within the surface temperature maps, temperature jumps can exist (Fig. 9(a)). There are two possible reasons. 1) The regular internal calibration adjusts the temperature to compensate for a temperature drift and thus can cause an abrupt shift. 2) (shown here) The temporal drift, which could not be fully corrected, results in a temperature shift along the intersection of two swaths recorded at different times. The geolocation uncertainty (Fig. 9(b)) can cause shifts between two overlapping tracks, which results in a discontinuity of the sea ice structures. Figure 9(c) shows the influence of high incidence angles. Images taken at large incidence angles (i.e., large roll) during a turn are more strongly distorted and can be slightly shifted in geolocation. This can blur structures, and we therefore recommend using temperature values with incidence angles (roll) smaller than 20 degrees. For statistical interpretation, however, the high-incidence areas can be used as additional data. In scenes with low temperature contrast, vertical lines can be visible. These lines originate from the individual images and are a known issue of microbolometer measurement technology, as described in Alhussein and Haider (2016) 35 .
Atmospheric conditions. As described above, the surface temperature is influenced by the atmosphere.
Here we provide the basic meteorological information for all flights. The available data are insufficient for a quantitative analysis: the atmospheric data lack the spatial information that would be needed to cover our flight area, especially for the spatially variable cloud coverage. The atmospheric parameters from Polarstern are presented with start/end conditions in Fig. 10, and the cloud conditions from the flight weather reports 36 are listed in Table 3. This helps to better assess the potential atmospheric effect on the surface temperature. The flights were performed only during calm conditions, which reduces the atmospheric influence to a minimum. The atmospheric parameters measured at Polarstern are mostly stable within one flight and change mainly from flight to flight (Fig. 10). The air temperature (Fig. 10(a)) was stable within each flight, with changes of at most 2 K. The relative humidity (Fig. 10(b)) changes during the flights on 27 February 2020 and 21 March 2020 and thus can influence the measured surface temperature due to changes in emissivity within the atmospheric pathway. Figure 10(c) shows that the wind speed varies the most within the flights in October/November and on 21 March 2020. The given cloud conditions are expected to have a minor influence, except for potential local fog patches or sea fog over leads. The reported clouds are considerably higher than the flight altitude. Nevertheless, a change in cloud cover could change the incoming longwave radiation and therefore the surface temperature.

Usage Notes
All data are provided in NetCDF format and can be read with standard tools without specialized software.
Images. The image-based data include time, positioning of the device, and georeferenced coordinates for each pixel, which allows gridding of all images without repeating the georeferencing. The overlap of the images can therefore be used differently than done here, or other gridding methods can be applied. Based on the temperature arrays and the position and rotation data of the helicopter, the georeferencing can be reproduced.
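As one example of an alternative gridding, the sketch below averages all pixels that fall into the same grid cell. It is a minimal illustration, not the method used for the published maps; it assumes that the georeferenced pixel coordinates have already been converted to metric x/y values, and the function name `grid_mean` and its parameters are hypothetical.

```python
import math
from collections import defaultdict


def grid_mean(x, y, temperature, cell_size=1.0):
    """Average all pixel temperatures that fall into each square grid cell.

    x, y        : pixel coordinates in meters (assumed already projected)
    temperature : surface temperature per pixel in K
    cell_size   : edge length of a grid cell in meters
    Returns a dict mapping (column, row) cell indices to the mean temperature.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for xi, yi, ti in zip(x, y, temperature):
        key = (math.floor(xi / cell_size), math.floor(yi / cell_size))
        sums[key] += ti
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Two overlapping pixels share cell (0, 0); a third pixel lies in cell (1, 0).
grid = grid_mean(
    x=[0.2, 0.7, 1.5],
    y=[0.1, 0.9, 0.3],
    temperature=[250.0, 252.0, 260.0],
)
```

Replacing the mean with, for example, the value from the image with the smallest incidence angle would be another way to use the image overlap.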

Maps.
The map-based data are provided with longitude/latitude coordinates as well as relative coordinates, which are distances relative to the position of RV Polarstern. The rotated relative coordinates provide a consistent coordinate system for all flights. This allows co-locating measurements taken at different times during the expedition, even though longitude and latitude change due to the ice drift. For co-location with measurements without direct relation to the expedition, longitude and latitude can be used. For the analysis of time series of gridded data, the RV Polarstern-centered coordinate system, based on a stereographic projection with constant relative distances, is recommended. Within the range of 5 to 40 km, the distortion is assumed to be negligible.
Application examples. Further applications of the surface temperature could include the use of temperature distributions of the sea ice surface to derive other surface properties, such as ice types. Two-dimensional temperature patterns can be used to determine topographic features, which is difficult based on the one-dimensional temperature distribution alone because of the small temperature differences. Besides statistical analysis, digital image processing methods could be applied to the image data. Generally, there is considerable potential for analysis and comparison, in particular in the context of the broad spectrum of measurements during the MOSAiC expedition 37,38,39 . The variety of measurements taken during the expedition should motivate comparisons of findings among different variables to gain an improved understanding of the connected processes between ocean, ice, and air as well as the biochemical components of the Arctic.
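The ship-centered coordinates described under Maps can be illustrated with a stereographic projection centered on the position of RV Polarstern. The sketch below is a minimal example under stated assumptions: it uses a spherical Earth with a mean radius of 6371 km, the standard oblique stereographic formulas with unit scale at the center, and no rotation of the axes. The projection parameters and the rotation used for the published maps may differ, and the function name `relative_xy` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean spherical Earth radius (assumption)


def relative_xy(lat_deg, lon_deg, ship_lat_deg, ship_lon_deg):
    """Project a point onto a stereographic plane centered on the ship.

    Returns (x, y) in meters, with x pointing east and y pointing north
    at the ship position. The axes are not rotated in this sketch.
    """
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    phi0, lam0 = math.radians(ship_lat_deg), math.radians(ship_lon_deg)
    dlam = lam - lam0
    # Cosine of the angular distance between the point and the ship.
    cos_c = (math.sin(phi0) * math.sin(phi)
             + math.cos(phi0) * math.cos(phi) * math.cos(dlam))
    k = 2.0 * EARTH_RADIUS_M / (1.0 + cos_c)  # local scale factor times radius
    x = k * math.cos(phi) * math.sin(dlam)
    y = k * (math.cos(phi0) * math.sin(phi)
             - math.sin(phi0) * math.cos(phi) * math.cos(dlam))
    return x, y


# A point 0.1 degrees north of a ship located at 85 N, 120 E.
x_north, y_north = relative_xy(85.1, 120.0, 85.0, 120.0)
```

Because the projection is re-centered on the ship for every flight, a feature that drifts together with the ice floe keeps approximately the same x/y position over time, provided that the rotation of the floe is also accounted for.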
Based on the data presented here, further investigations of the thermal properties of sea ice will be performed and compared to other measurements taken at the same time and/or within the same area to gain a better process understanding of the Arctic climate system with the focus on the snow/ice-air interface. While the dataset presented here is suitable for most applications, there is still room for improvement for specific cases, mainly in terms of accuracy for the surface temperature values and the geolocation. All presented data are available under open access. New approaches for the data are encouraged and discussions with the authors of this paper are welcome.

Code availability
The data processing and analysis were performed with Python 3. The code for processing and analyzing the maps and the images is published under open access 40 . The colormap used for the temperature was custom-defined and is provided as well. It is suited to visualizing temperature structures under Arctic winter conditions. In case of specific requests, please contact the corresponding author directly.