District-scale surface temperatures generated from high-resolution longitudinal thermal infrared images

This paper describes a dataset collected by infrared thermography, a non-contact, non-intrusive technique to acquire data and analyze the built environment in various aspects. While most studies focus on the city and building scales, an observatory installed on a rooftop provides high temporal and spatial resolution observations with dynamic interactions on the district scale. The rooftop infrared thermography observatory with a multi-modal platform capable of assessing a wide range of dynamic processes in urban systems was deployed in Singapore. It was placed on the top of two buildings that overlook the outdoor context of the National University of Singapore campus. The platform collects remote sensing data from tropical areas on a temporal scale, allowing users to determine the temperature trend of individual features such as buildings, roads, and vegetation. The dataset includes 1,365,921 thermal images collected on average at approximately 10-second intervals from two locations during ten months.


Background & Summary
Urban ecosystems are the biggest, most dynamic, and most complicated man-made systems, with millions of people interacting and hundreds of governing agencies [1,2]. Urban modernization has encouraged a significant portion of the global population to move to urban regions. Consequently, the built environment has grown rapidly, and building energy consumption has increased, which augments the release of CO₂ [3,4]. Improving energy efficiency has been urged by various sectors [5]. The scientific community has exerted significant efforts to enhance its understanding of the built environment by leveraging a wide range of sensing technologies [4]. Furthermore, the advent of computational techniques has revolutionized data collection and analysis, enabling us to process vast quantities of information [1].
Infrared thermography has been widely used in the built environment for many research purposes, such as urban heat island [6,7], building diagnostics [8,5,9], and urban heat fluxes [10]. Infrared thermography can provide images showing the surface temperature of different elements in the built environment with a lower cost and effort compared to similar kinds of sensor networks [11]. It detects infrared electromagnetic radiation emitted by the inspected objects and is commonly used for non-destructive testing and non-contact diagnostic technology [8,5]. It is based on an infrared imaging system calibrated to measure surfaces' emissive power distribution at various temperature ranges. The infrared camera generates a series of two-dimensional, readable thermal images, with different colors and tones representing different temperatures [8,5].
Infrared systems have been leveraged to analyze the built environment across various scales. Satellites, capable of wide-ranging data collection, are primarily used for city-scale analyses, which typically cover entire cities or large urban regions. Aerial vehicles, possessing a more focused reach, are frequently employed for investigations at both city and district scales. The term district scale indicates a specific urban district or neighborhood: a geographical area larger than a single building or block but smaller than an entire city or region. This level of analysis allows for considering the collective impact of multiple buildings and infrastructure elements within a defined urban district. Due to their unique vantage point, rooftop observatories are particularly effective for district-scale studies, offering a detailed view of the urban microclimate within a specific district or neighborhood. Lastly, drones and handheld devices are used for detailed investigations at smaller scales. Their portability and close-range capabilities make them particularly suited for analyses at the building scale (focusing on individual buildings) and human scale (focusing on the immediate surroundings of individuals or small groups) [11].
Many previous and current studies focus on the city and building scales and could not show the dynamic interactions on the district scale with high resolution [4,1]. Furthermore, urban climate studies have been focusing on the temperate climate zones, especially in North America and Europe, and there is a lack of such studies focusing on the tropical climate zones [12,13,14].
This work introduces the first thermal observatory deployed in Singapore, a tropical city, with a flexible, high-resolution multi-modal platform for assessing a wide range of dynamic processes in urban systems. The thermal observatory was set up on the roof of buildings that overlook several educational buildings on the National University of Singapore campus. The platform gathers remote sensing data from a tropical area at a high temporal resolution of approximately 10-second intervals, providing the temperature trends of specific urban elements such as buildings, roads, and vegetation. Demonstration code, including data preprocessing steps such as segmentation, is provided to help users handle and analyze the raw data flexibly according to their research needs.

Thermal images data collection
Thermal images are taken from a FLIR A300 (9 Hz) thermal camera atop an urban-scale IR observatory, as shown in Figure 1. The detailed specifications of the thermal camera are listed in Table 1. The thermal camera is protected against rain by an IP67-rated housing. It was attached to a pan/tilt device that can rotate 360° about its horizontal axis to record thermal images in multiple locations. These structures (thermal camera, housing, pan/tilt) were placed on a 2-m-high truss tower to avoid obstructions while taking thermal images. Concrete blocks serve as a stabilizing support, and an air terminal serves as lightning protection for the truss tower. Two plugs were added to the truss tower to allow a backup battery in the water tank room to power both the thermal camera and the motor of the pan/tilt device. The backup battery is continually charged from the building's electrical supply so that the pan/tilt unit and thermal camera remain operational for up to two hours during a power outage. A laptop with a weatherproof casing in the water tank room was also linked to the pan/tilt device and thermal camera for customizing and reviewing the collection of thermal images.
The thermal images are collected from two separate locations, namely Kent Vale and S16. The Kent Vale observatory is located on the rooftop of a 42-m-tall building in a residential area, facing a university campus consisting of office and educational buildings, to enable district-scale analysis. The pan/tilt unit was configured using two separate software programs to allow the thermal camera to take images from four positions (I, II, III, and IV), as shown in Figure 2. The first program was installed in a video encoder to control the positions at which the pan/tilt unit takes a thermal image. The second program, developed by NAX Instruments Pte Ltd, was installed on the laptop to control the moment at which a thermal image is taken; the image could then be saved in either JPEG or FFF file format. The observatory can capture thermal images of four buildings, as shown in Figure 1. Building A, known as CREATE, is centered at Position I and has a height of 68 meters. Curtain walls cover a significant portion of its facade. Buildings B (E1A) and C (EA) are centered at Positions II and III, respectively. Both buildings are roughly 27 meters tall and are characterized by concrete walls and single-pane windows. Building D, or SDE4, can be observed at Position IV. It is structured with metal grids mounted on a concrete frame and is designed to be a net-zero building. It stands at a height of approximately 24 meters. Surface temperatures of trees are also observed from the above positions. A road with traffic runs in front of Buildings B, C, and D, and it can be observed in thermal images taken from Position IV. The observatory moved from Position I to IV sequentially and recorded thermal images at various intervals. The capture time of each image is given in its file name. The captured thermal images are then uploaded to a Google Drive repository through a 4G Internet connection fitted on the laptop.
The S16 observatory is located on the rooftop of a nine-story building in an educational area on a university campus. The pan/tilt unit was configured to capture images of three distinct targets, denoted as Positions 1, 2, and 3 in Figure 3. The observatory can capture information from three different buildings with thermal images (Figure 4). It moved from Position 1 to 3 sequentially and recorded thermal images at various time intervals per view. The thermal images obtained in this study show a range of elements, including air conditioning units, glass materials, vegetation, and solar panels. These elements can be systematically analyzed to understand their thermal behavior and their impact on the ambient environment. Specifically, this research is poised to investigate the influence of these elements on local temperature variations, energy consumption patterns, and urban heat islands. These insights carry significant implications for urban planning and the development of efficient energy management strategies.

Image
Due to the necessary adjustments to the second observatory's positioning system, thermal images at the S16 site were not consistently captured from the exact same locations; slight variations to the left or right were noted. Consequently, automated segmentation tools are strongly recommended to account for these minor positional differences. It is acknowledged that these variations may impact the interpretation of temporal patterns in the data. Future enhancements to the observatory system are intended to improve the stability of image capture, leading to more consistent data collection. Furthermore, the positions from which images were captured were altered due to network issues with the observatory system at both Kent Vale and S16. In an effort to achieve greater stability in the image capture locations on S16, measures were taken to ensure these positions were closely aligned with each other. Figure 5 depicts this more stable configuration.
Figure 1: Observatory installed on the rooftop of the residential area in Singapore and its captured information. Source of the imagery: Google Earth. [15]
Figure 2: Positions where the observatory captured thermal images on Kent Vale [15]
The thermal camera comprises an optical system that directs radiation from the scene onto a microbolometer [16]. Longwave infrared radiation changes the resistance of the detector, which is translated to apparent temperature ($T_{obj}$) readings according to Equation 1 [16], where $U_{obj}$ is the total signal response to the incident long-wave infrared radiation on the detector. This response is influenced by the material's emissivity, denoted by $\epsilon$, and the current environmental conditions. For a comprehensive computation of the influence factors for $U_{obj}$, please refer to Equation 6. The other terms, $B$, $R_1$, $R_2$, $O$, and $f$, are calibration constants determined by the camera manufacturer in a controlled environment and are stored in the thermal image metadata. However, these parameters usually need to be re-calibrated in an outdoor environment where a significant discrepancy can be observed between $U_{obj}$ and $T_{obj}$. The re-calibrated parameters are presented in Table 5.
Emissivity ($\epsilon$) measures a material's ability to emit infrared energy. Different materials have different emissivities, which can significantly impact the surface temperature measurements. Therefore, accounting for the emissivity of the material under observation is crucial when processing thermal images. The dataset is presented in a radiometric format to preserve the raw thermal data, facilitating necessary corrections based on specific emissivity values. The total signal response $U_{obj}$ is also affected by ambient conditions, such as air temperature and humidity. By incorporating local weather data into the calculations, these influences can be corrected, and the accuracy of the surface temperature measurements can be improved. Please refer to the Usage Notes section for further information.
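As an illustration of the signal-to-temperature conversion discussed above, the sketch below uses the inverse-Planck form commonly applied to FLIR radiometric data, $T = B / \ln\left(R_1 / (R_2 (U + O)) + f\right)$, which relies on the same five constants. This is a minimal sketch, not the authors' code: the function names are hypothetical, the numeric constants are placeholders chosen only for illustration, and the real values should be read from the image metadata or taken from Table 5.

```python
import math


def signal_to_temperature(u_obj, B, R1, R2, O, f):
    """Convert an object signal to temperature in kelvin (inverse-Planck form)."""
    return B / math.log(R1 / (R2 * (u_obj + O)) + f)


def temperature_to_signal(t_kelvin, B, R1, R2, O, f):
    """Algebraic inverse of signal_to_temperature, handy for round-trip checks."""
    return R1 / (R2 * (math.exp(B / t_kelvin) - f)) - O


# Placeholder constants for illustration only (assumption, not from the paper).
EXAMPLE_CONSTANTS = dict(B=1428.0, R1=14906.216, R2=0.010956882, O=-7261.0, f=1.0)

# Round trip: 30 degrees Celsius -> signal -> temperature.
signal = temperature_to_signal(303.15, **EXAMPLE_CONSTANTS)
recovered_kelvin = signal_to_temperature(signal, **EXAMPLE_CONSTANTS)
print(f"{recovered_kelvin - 273.15:.2f} degC")
```

Keeping the constants as explicit arguments makes it straightforward to swap the manufacturer values for re-calibrated ones without touching the conversion itself.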

Data preprocessing
Data preprocessing was done before analyzing the images for the Kent Vale dataset. First, data filtering was conducted to manually remove images unsuitable for analysis. Thermal images captured during rainy periods were disturbed by rain droplets and needed to be removed. Then, a convolutional neural network (CNN) classification model was applied to exclude blurred images taken while the pan/tilt unit was moving. The CNN model is also used to classify the building type for the first

Weather station data collection
The weather station locations are shown in Figure 6. The weather stations measured meteorological data in multiple locations on the university campus around 2 meters above the ground, except for Locations 1 and 7, which are on rooftops. The six recorded parameters are air temperature (°C), relative humidity (RH, %), dew point (°C), wind speed (m/s) and direction (degree), gust speed (m/s), and solar radiation (W/m²) at a 1-minute interval. For Location 9, only the first three parameters were recorded. The weather stations were calibrated before installation. The weather station specifications are listed in Table 2. The weather station installation details are available in Chen et al. [17] and Yu et al. [18].
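Because the weather stations log at 1-minute intervals while thermal images arrive roughly every 10 seconds, pairing the two sources requires a nearest-timestamp lookup. The sketch below shows one way to do this with the Python standard library. The function name, the sample values, and the one-minute tolerance are assumptions made for illustration; a missing weather record simply yields `None`.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_record(times, values, when, max_gap=timedelta(minutes=1)):
    """Return the value whose timestamp is closest to `when`.

    `times` must be sorted ascending and aligned with `values`.
    Returns None when no record lies within `max_gap` of `when`.
    """
    if not times:
        return None
    i = bisect_left(times, when)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
    best = min(candidates, key=lambda j: abs(times[j] - when))
    if abs(times[best] - when) > max_gap:
        return None
    return values[best]


# Hypothetical 1-minute air temperature log in degrees Celsius.
start = datetime(2021, 11, 8, 12, 0, 0)
times = [start + timedelta(minutes=k) for k in range(5)]
air_temp = [30.1, 30.3, 30.2, 30.6, 30.8]

image_time = datetime(2021, 11, 8, 12, 2, 40)  # hypothetical capture time
matched = nearest_record(times, air_temp, image_time)
print(matched)
```

A tolerance is used rather than always returning the closest record so that gaps in the weather data are not silently bridged.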

Data Records
The thermal image and weather station data for the same periods are published in Zenodo with open access [20]. The thermal image dataset covers two locations: Kent Vale and S16. A set of 483,915 thermal images was taken from Kent Vale between November 8, 2021, and March 8, 2022, while 882,006 thermal images were taken from S16 between August 3, 2022, and December 14, 2022. The capture time of each image is indicated in its file name. The data were converted and stored in JPEG format. The thermal images were stored in folders according to locations and views and listed sequentially. The detailed hierarchy of the data folders is listed in the data/README file on GitHub (https://github.com/buds-lab/project-iris-dataset) and in Figure 7. Gaps in the image sequence are intentional, as preprocessing filters out images unsuitable for analysis. The CNN model classified the images taken at Kent Vale into five different groups: CREATE (I), E1A (II), EA (III), SDE4 (IV), and Corrupted. The Corrupted group collects images that could not be assigned to any of the four views, meaning that they were of low quality and indistinguishable; hence, they needed to be excluded. The images taken at S16 were not entirely classified: some were divided into three different viewpoints (view_1, view_2, and view_3), while others were not and require further classification. The summary of the provided dataset is shown in Table 3. In Zenodo, the Kent Vale dataset is stored in four independent zip files in sequential order, while the S16 dataset is stored in six independent zip files in sequential order. In each subcategory, the data are zipped by day.
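Since the capture time is encoded in each file name, a small parser is usually the first step in building a time series. The sketch below assumes the naming pattern "smap-YYYY-MM-DDTHH-MM-SS.MS.jpeg" given in the Code Availability section and treats the "MS" part as a decimal fraction of a second; that interpretation, the function name, and the example file name are assumptions for illustration.

```python
import re
from datetime import datetime

# Pattern assumed from the naming scheme "smap-YYYY-MM-DDTHH-MM-SS.MS.jpeg".
_NAME = re.compile(
    r"^smap-(\d{4})-(\d{2})-(\d{2})T(\d{2})-(\d{2})-(\d{2})(?:\.(\d+))?\.jpe?g$"
)


def capture_time(file_name):
    """Parse the capture time from an image file name."""
    match = _NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    # Treat the optional trailing digits as a fraction of a second and
    # truncate (or pad) them to microsecond precision.
    fraction = match.group(7) or "0"
    microsecond = int((fraction + "000000")[:6])
    return datetime(year, month, day, hour, minute, second, microsecond)


# Hypothetical file name used only to exercise the parser.
example = capture_time("smap-2022-08-03T14-05-10.25.jpeg")
print(example.isoformat())
```

Sorting a folder listing by `capture_time` rather than by raw file name avoids surprises if the fractional part has a varying number of digits.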

Sensitivity analysis
Before evaluating $T_{ij}$, the surface temperature estimated from the observatory's infrared receptor, a sensitivity analysis was carried out on the factors that might significantly impact its variance. The first-order Sobol index [21] was used to estimate the contribution of each variable to the variance of $T_{ij}$. The total Sobol index was also computed to account for interactions between the variables.
The variables considered in the sensitivity analysis and their corresponding constraints are shown in Table 4 [15]. The Saltelli sampler [22] was applied to thermal images taken at the four positions during a sunny day in Singapore to calculate the first-order and total Sobol indices. $T_{ij}$ was determined for each sample at various times of the day and thermal camera placements. The sensitivity of each variable at different periods was obtained as the mean of the indices across the four positions.
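To make the two indices concrete, the sketch below estimates first-order and total Sobol indices for a toy linear model using pick-freeze estimators (the Saltelli 2010 form for the first-order index and the Jansen form for the total index). It is a simplified illustration, not the analysis performed for the dataset: it draws plain pseudo-random samples on the unit hypercube instead of using the Saltelli sampler, and the model, function names, and sample size are assumptions.

```python
import random


def sobol_indices(model, n_vars, n_samples=20_000, seed=0):
    """Estimate first-order and total Sobol indices on the unit hypercube."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_vars)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_vars)] for _ in range(n_samples)]
    f_A = [model(row) for row in A]
    f_B = [model(row) for row in B]

    outputs = f_A + f_B
    mean = sum(outputs) / len(outputs)
    variance = sum((y - mean) ** 2 for y in outputs) / (len(outputs) - 1)

    first, total = [], []
    for i in range(n_vars):
        # Rows of A with column i replaced by the matching column of B.
        f_AB = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        v_first = sum(
            (fb - mean) * (fab - fa) for fb, fab, fa in zip(f_B, f_AB, f_A)
        ) / n_samples
        v_total = sum((fa - fab) ** 2 for fa, fab in zip(f_A, f_AB)) / (2 * n_samples)
        first.append(v_first / variance)
        total.append(v_total / variance)
    return first, total


def toy_model(x):
    """Additive toy model; its analytic indices are 0.8 and 0.2."""
    return 2.0 * x[0] + 1.0 * x[1]


first_order, total_order = sobol_indices(toy_model, n_vars=2)
print([round(s, 2) for s in first_order], [round(s, 2) for s in total_order])
```

For an additive model such as this one the first-order and total indices coincide; a gap between them signals interactions between variables, which is why both are reported.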

Parameters calibration
The thermal images' surface temperature estimates were calibrated in accordance with the information gathered by contact surface sensors. The output voltage recorded by the infrared receptor at position $ij$ in the thermal image is represented by an array $U^{r}_{ij}$ and a header in these files. Equation 2 can be used to transform $U^{r}_{ij}$ into longwave radiation ($L^{r}_{ij}$) between 7.5 and 13 micrometers, where the term $c$ is not truly constant but depends on the emissivity of the surface material being observed, with lower emissivities reflecting more radiation from other sources. The FLIR A300 (9 Hz) thermal camera installed in the observatory was calibrated such that Equation 3 holds for the surface temperature of a target element detected by the infrared receptor ($T^{r}_{ij}$), where $b$, $r_1$, $r_2$, $O$, and $f$ are calibrated parameters. These values are acquired in a controlled setting where the target element's actual surface temperature $T_{ij}$ at location $ij$ in the thermal picture is almost identical to $T^{r}_{ij}$. In an outdoor environment, however, the parameters in Equation 3 need to be re-calibrated to minimize the discrepancy between $T^{r}_{ij}$ and $T_{ij}$.
The FLIR A300 camera was calibrated by manually tuning the parameters in Equation 3 until there was a satisfactory agreement between surface temperature estimations and observations. The agreement was assessed in terms of the Mean Bias Error (MBE) and Root Mean Square Error (RMSE), as shown in Equations 4 and 5, where $T^{n}_{S,m}$ is the surface temperature obtained by contact surface sensors, and $T^{n}_{S}$ is the surface temperature calculated from thermal pictures captured by the FLIR A300 camera at time $t = t_0 + n \cdot \Delta t$. The calibration aims to achieve an MBE as close to 0 as possible with the lowest RMSE. The RMSE and MBE were calculated with different $\Delta t$ at three different positions to optimize the temporal resolution for each location (30 minutes at Position A and 5 minutes for Positions B and C). The RMSE and MBE obtained after calibration at Positions A-C are shown in Table 5; the RMSE is below 2 degrees Celsius and the MBE is within ±1 degree Celsius. Furthermore, Figure 9 compares the estimated and measured surface temperatures after calibration at Positions A-C, offering additional information that enables the data to be freely accessed and tailored to suit specific needs and interests in further research.
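The two agreement metrics can be computed in a few lines. The sketch below uses the usual definitions, $\mathrm{MBE} = \frac{1}{N}\sum_n (T^{n}_{S} - T^{n}_{S,m})$ and $\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_n (T^{n}_{S} - T^{n}_{S,m})^2}$. The sign convention (estimated minus measured) is an assumption, and the function names and sample values are hypothetical.

```python
import math


def _paired(estimated, measured):
    """Validate two equally long, non-empty series and return them as pairs."""
    if len(estimated) != len(measured) or not estimated:
        raise ValueError("series must be non-empty and of equal length")
    return list(zip(estimated, measured))


def mean_bias_error(estimated, measured):
    """Mean of (estimated - measured); positive means overestimation."""
    pairs = _paired(estimated, measured)
    return sum(e - m for e, m in pairs) / len(pairs)


def root_mean_square_error(estimated, measured):
    """Square root of the mean squared difference."""
    pairs = _paired(estimated, measured)
    return math.sqrt(sum((e - m) ** 2 for e, m in pairs) / len(pairs))


# Hypothetical surface temperatures in degrees Celsius.
from_camera = [30.0, 32.0, 35.0]
from_contact_sensor = [31.0, 32.0, 33.0]

mbe = mean_bias_error(from_camera, from_contact_sensor)
rmse = root_mean_square_error(from_camera, from_contact_sensor)
print(f"MBE = {mbe:.3f} degC, RMSE = {rmse:.3f} degC")
```

Reporting both values matters because an MBE near zero can hide large errors of opposite sign, which the RMSE exposes.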

Factors consideration
When evaluating $T_{ij}$ from the rooftop observatory, several factors must be considered. First, the buildings, streets, and trees that can be seen from the observatory are exposed to longwave radiation coming mainly from the skydome. Then, in tandem with the longwave radiation released by the observed element ($L$), the longwave radiation from the sky ($L^{sky}$) is reflected into the air. Both $L^{sky}$ and $L$ pass through the atmosphere before arriving at the observatory and then, together with the longwave radiation emitted by the atmosphere ($L^{atm}$), are transmitted through the window of the housing before reaching the infrared receptor. Along with these contributions, the infrared receptor also captures longwave radiation from the window ($L^{win}$). The camera output voltage $U_{ij}$, combining the relationship defined in Equation 3 and the aforementioned factors, can be expressed in terms of $\varepsilon_{ij}$, the thermal emissivity of the element at position $ij$ in the thermal image; $\tau^{atm}_{ij}$, the transmissivity of the atmosphere between the element and the rooftop observatory; and $\tau^{win}$, the transmissivity of the window. $T_{ij}$ can be calculated through Equation 3 if $U_{ij}$ is known. $\varepsilon_{ij}$ and $\tau^{win}$ can be estimated from the material properties of the element and the window, respectively. $\tau^{atm}_{ij}$ can be calculated from weather data collected from a weather station.
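The radiation chain described above can be sketched as a simple signal balance. The code below is an illustration under stated assumptions, not the formulation used for the dataset: it treats the window as non-reflective, so that it emits with weight $(1 - \tau^{win})$, and it estimates the atmospheric transmissivity with the empirical model and coefficient values commonly found in open-source FLIR conversion tools. All function names and coefficient values are assumptions and should be checked against the cited references before use.

```python
import math

# Empirical coefficients widely used in open-source FLIR conversion tools
# (assumption: not taken from this paper).
X = 1.9
ALPHA = (0.0066, 0.0126)
BETA = (-0.0023, -0.0067)
GAMMA = (1.5587, 6.939e-2, -2.7816e-4, 6.8455e-7)


def water_vapour_content(air_temp_c, rel_humidity):
    """Water vapour content from air temperature (degC) and RH as a fraction."""
    g1, g2, g3, g4 = GAMMA
    t = air_temp_c
    return rel_humidity * math.exp(g1 + g2 * t + g3 * t**2 + g4 * t**3)


def atmospheric_transmissivity(distance_m, air_temp_c, rel_humidity):
    """Transmissivity of the air path between the element and the camera."""
    root_d = math.sqrt(distance_m)
    root_w = math.sqrt(water_vapour_content(air_temp_c, rel_humidity))
    return (X * math.exp(-root_d * (ALPHA[0] + BETA[0] * root_w))
            + (1 - X) * math.exp(-root_d * (ALPHA[1] + BETA[1] * root_w)))


def _background(u_sky, u_atm, u_win, eps, tau_atm, tau_win):
    """Signal reaching the detector that does not originate from the element."""
    return ((1 - eps) * tau_atm * tau_win * u_sky   # reflected sky radiation
            + (1 - tau_atm) * tau_win * u_atm       # emitted by the air path
            + (1 - tau_win) * u_win)                # emitted by the window


def total_signal(u_obj, u_sky, u_atm, u_win, eps, tau_atm, tau_win):
    """Signal at the detector for a given object signal."""
    return eps * tau_atm * tau_win * u_obj + _background(
        u_sky, u_atm, u_win, eps, tau_atm, tau_win)


def object_signal(u_total, u_sky, u_atm, u_win, eps, tau_atm, tau_win):
    """Invert total_signal to isolate the element's own emission."""
    rest = _background(u_sky, u_atm, u_win, eps, tau_atm, tau_win)
    return (u_total - rest) / (eps * tau_atm * tau_win)


# Hypothetical conditions: 100 m path, 30 degC air, 80 % relative humidity.
tau = atmospheric_transmissivity(100.0, 30.0, 0.8)
print(round(tau, 3))
```

The recovered object signal can then be passed to the signal-to-temperature relation of Equation 3. Separating the background terms makes it easy to see which inputs (emissivity, distance, humidity) dominate the correction for a given scene.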
Considering building material properties is paramount in analyzing heat fluxes, as various materials exhibit distinct thermal behaviors. Specifically, Building A's facade comprises two steel walls and one glass wall. In contrast, Buildings B and C predominantly feature concrete walls, while Building D's facade incorporates both steel and concrete walls. This diversity in materials necessitates a detailed understanding of their emissive properties, which are critical for calculating temperatures accurately. Any oversight in this regard could lead to misleading results. Therefore, the thermal properties of the building facades and vegetation, including the corresponding emissivities, are defined and presented in Table 6, which has been constructed based on the relevant literature.

Weather stations data
The weather information obtained from the weather stations could be used together with the thermal images gathered by the rooftop observatory to estimate urban heat fluxes. Eight weather stations measured air temperature, relative humidity, dew point, wind speed and direction, gust speed, and solar radiation.
Weather stations near the observatory camera could be used to calculate $\tau^{atm}_{ij}$ following Waldemar and Klecha [34,15], where $d_{ij}$ is the distance between the element and the observatory at position $ij$ of the thermal image, $\omega$ is the atmosphere's water vapor content, and $\alpha$, $\beta$, and $x$ are empirical coefficients. $\omega$ can be estimated from the air temperature ($T_{air}$) and relative humidity ($\phi$) measured by the weather stations, where $\gamma$ is another set of empirical coefficients described in Waldemar and Klecha [34,15].

Preprocessing
Following the CNN classification, segmentation of the desired region and extraction of its radiometric data are required in order to evaluate the dataset. The Flirextractor Python package [35] is recommended, as it allows extracting temperature data from regions of interest in the thermal images and storing it in CSV (comma-separated values) format with a simple process. The temperature data from thermal images were converted using Equation 1. The Python package Labelme [36] is helpful in the segmentation of regions of interest for analysis. The relevant codes are provided in the Code Availability section.
Table 6: Thermal properties of building facades and vegetation [15]

Image information
The thermal image data are classified and stored based on their positions. The image datasets contain information including time, device positioning, and observatory locations (Kent Vale or S16). The building details and surrounding environment can be seen in the corresponding real images in this work. Analysis can be conducted in the time domain, frequency domain, or both, based on the nature of the time series and the degree of information to be retrieved [16].
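The segmentation and extraction step can be illustrated without third-party packages. The sketch below takes a temperature grid (for example, one exported to CSV) and a Labelme-style annotation, and returns the mean temperature inside each labeled polygon. It is a minimal sketch under assumptions: the grid, the label name, and the function names are hypothetical, only polygon shapes are handled, and shapes sharing a label overwrite one another.

```python
import json


def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of [x, y] vertices."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def roi_means(temperatures, labelme_json):
    """Mean temperature per labeled polygon.

    Pixel (row, col) is tested at its centre (col + 0.5, row + 0.5).
    """
    annotation = json.loads(labelme_json)
    means = {}
    for shape in annotation["shapes"]:
        if shape.get("shape_type", "polygon") != "polygon":
            continue
        values = [
            temperatures[row][col]
            for row in range(len(temperatures))
            for col in range(len(temperatures[row]))
            if point_in_polygon(col + 0.5, row + 0.5, shape["points"])
        ]
        if values:
            means[shape["label"]] = sum(values) / len(values)
    return means


# Hypothetical 4 x 4 temperature grid (degrees Celsius) and annotation.
grid = [
    [30.0, 31.0, 40.0, 40.0],
    [32.0, 33.0, 40.0, 40.0],
    [40.0, 40.0, 40.0, 40.0],
    [40.0, 40.0, 40.0, 40.0],
]
annotation_text = json.dumps({
    "shapes": [{
        "label": "facade",
        "shape_type": "polygon",
        "points": [[0, 0], [2, 0], [2, 2], [0, 2]],
    }]
})

result = roi_means(grid, annotation_text)
print(result)
```

Because the S16 views shift slightly between captures, a fixed polygon should be checked against each image, or re-estimated automatically, before its mean is used in a time series.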

Potential applications
The thermal images provide surface temperatures of urban elements such as buildings, roads, and vegetation. Some example applications of surface temperature data are urban heat island analysis [15,37], urban energy monitoring [38], and building thermal performance monitoring [8,39,16]. A detailed application of an urban observatory is available in Dobler's work [1]. Thermal images collected by the observatory can provide high-resolution data to study dynamic interactions in buildings at the district scale, much smaller than the city scale captured by satellite thermal image data, which are used frequently in relevant fields. Further studies of the thermal properties of non-residential buildings can be conducted and compared with those estimated at the city scale. The image data also lends itself to several digital image processing techniques, including edge detection, image enhancement, segmentation, and statistical analysis. For instance, edge detection could help discern different elements within the urban environment, such as buildings, vegetation, and transportation infrastructure. Image enhancement techniques could enhance the visibility of thermal patterns, and segmentation could be used to isolate areas of interest. The dataset provided here is well-suited for various applications, especially those involving urban heat island effects, building energy efficiency studies, and urban microclimate analysis. However, there remains potential for improving the accuracy of surface temperature measurements. Discussions about the data and innovative approaches to its interpretation and application are welcomed.

Privacy and safety control
These thermal images form the foundation for various research analyses related to urban living environments. For instance, they can be used to study temperature variations within an urban block, understand the effect of building materials on heat retention, or analyze the influence of vegetation on local microclimates. Researchers are encouraged to use various analytical techniques to probe the data, including computational statistics, machine learning, and sequential analysis. Machine learning, for instance, could be used to predict future temperature patterns based on past data, while sequential analysis could help understand temporal changes in the thermal landscape. However, data misuse needs to be prevented [1,40]. Since IoT cameras can be easily abused [41], the dataset collected here should include sufficient privacy protections for relevant individuals and cities [1,42]. Accordingly, the images collected in this work do not contain personally identifiable information [43] such as facial features, and individuals cannot be tracked. All images were strictly limited in pixel resolution so that the interiors of buildings cannot be seen [1].

Code Availability
Python 3 was used for data processing and analysis. The simple code demonstration suggested in this work can be found in the GitHub repository stated in the Data Records section. The code demonstrates the methodology for extraction and subsequent processing of the thermal images. A few images were selected from the dataset for illustration purposes. The code uses two specific GitHub packages: Flirextractor [35] and Labelme [36]. Flirextractor is an efficient Python package for extracting temperature data from thermal images and converting it into an array, then saving it as a CSV file for further access. Its usage is described in detail at https://github.com/aloisklink/flirextractor. Labelme is a graphical image annotation tool written in Python. It allows users to demarcate the desired area of any shape with simple mouse clicks. Detailed instructions for Labelme can be found at https://github.com/wkentaro/labelme.
The GitHub code repository has a clearly structured hierarchy designed to ease navigation and understanding. It incorporates three primary directories: 'data', 'notebook', and 'src'. The 'data' directory segregates original, converted, and processed files, whereas the 'notebook' directory contains Jupyter notebooks detailing file conversion, data visualization, and data analysis procedures. The 'src' directory hosts a Python script dedicated to image conversion. The 'data' folder is organized to manage the different stages of data processing. It consists of three primary subdirectories: 'original', 'convert', and 'labelme'. The 'original' and 'processed' subdirectories mirror each other, containing dated folders representing different acquisition dates, with each date folder further divided into 'view_1', 'view_2', and 'view_3' subdirectories. These subdirectories contain the respective images captured from each view, named "smap-YYYY-MM-DDTHH-MM-SS.MS.jpeg". During the data processing phase, corresponding 'json' files for each image are generated in the 'labelme' directory. In addition, for each image, a dedicated directory is created within the 'convert' subdirectory, containing the original, processed, and labeled images and a text file listing the detected labels. File structure tree plots in the respective README files show the file and folder organization and the relationships among the components of the codebase. Moreover, an example with a representative image illustrates the image processing techniques; the results show the segmentation, further processing, and final presentation of the image with enhanced colors and informative labels. Please contact the corresponding author directly if you have any particular requirements.

Figure 3: Observatory installed on the rooftop of the university campus in Singapore and its captured information. Source of the imagery: Google Earth.

Figure 4: Positions where the observatory captured thermal images on S16

Figure 5: Positions where the observatory captured thermal images on S16 after adjustment

Figure 7: File structure demonstration example for the dataset

Figure 8: Locations where sensors were placed to calibrate thermal images and measure the surface temperature [15]

Figure 9: Comparison of estimated and measured surface temperatures after calibration of the A300 infrared camera at the observatory. [15]

Table 1: Information about the FLIR A300 thermal camera

Table 2: Specifications of weather station sensors
The code structure demonstration is also available in the Readme file in the GitHub repository.
directory runs preprocessing scripts. A more detailed description of the code file organization and explanation is available in the Code Availability section. The weather station data is in Excel format and stored in Zenodo as WS_data.zip. It contains some missing values; therefore, prior data preprocessing is needed.

Table 4: Constraints of each variable considered during the sensitivity analysis of $T_{ij}$