Introduction

Body temperature is a key physiological parameter that affects a host of physiological processes and can be utilised as a scientific and humane endpoint in biomedical research1,2,3,4. Despite its physiological relevance, body temperature is often ignored in rodent studies because of the practical difficulties associated with its measurement, especially when continuous measurements are required.

The measurement of body temperature in small rodents is typically performed by contact methods, implanted telemeters or thermal imaging5. Each of these methods has its pros and cons5. Contact measurements such as inserting a rectal temperature probe are relatively easy to perform and do not require expensive equipment, but they raise body temperature through handling stress, carry a risk of health complications from probe insertion, and provide only a snapshot of a continuously changing variable. Telemeter implantation enables the continuous and accurate recording of core body temperature (Tcore) but requires invasive surgery that causes stress, may alter physiology, and demands substantial time and skill from the researcher; in addition, telemeter malfunctions typically cannot be remedied. Thermal imaging and other non-contact temperature measurements provide a non-invasive method to record body temperature6, but these methods typically require animal handling, are not continuous, and do not measure Tcore7,8,9,10,11,12,13,14,15,16,17,18. The non-invasive nature of thermography measurements does, however, provide the potential to assess body temperature in undisturbed freely-moving laboratory mice, removing the confounding factor of handling stress and representing an obvious refinement in terms of animal welfare.

Here, we develop and optimise an algorithm for processing thermal imaging data of freely-moving mice with the goal of assessing whether the resulting Tcore estimate, based on continuous measurements of the highest temperature on the outside of the body (Tskin), can be used to describe (changes in) Tcore. Mice were implanted intraperitoneally with a body-temperature telemeter while Tskin was recorded every second by thermal imaging for three days. This was done under standard laboratory conditions as well as in a subgroup of food-restricted mice exhibiting daily torpor, a transient hypometabolic state associated with a marked decrease in body temperature3. Assessment of different algorithm parameters (summary statistics, sampling and averaging intervals) identified averaging of the maximum Tskin per 60 s over 30-min intervals (Tskin,max) as the most reliable way to estimate Tcore. Tskin,max provides an accurate description of relative changes in Tcore within individual animals. Between-animal variation in the relationship between Tcore and Tskin,max does however limit the utility of Tskin measurements as a measure of absolute within-animal changes in Tcore or differences in Tcore between animals.

Results

Estimating core body temperature non-invasively by continuously recording skin temperature

Measuring Tskin using near-infrared thermography enables the continuous assessment of body temperature during both day and night in freely-moving animals. Here, Tskin was recorded in five wildtype mice housed in open-topped cages at an ambient temperature of 22 ± 1 °C (Fig. 1a). A limited amount of nesting material was provided, to ensure that the mice were fully visible at all times. As expected, the warmest spot in each image was associated with the location of the mouse in the cage (Fig. 1b), thus enabling the description of Tskin by recording the temperature of the warmest pixel each second (Fig. 1c). As illustrated in the representative 10-min recordings (Fig. 1c), Tskin often changed rapidly (< 1 min) by 1–2 °C while the simultaneously recorded Tcore did not reveal corresponding changes. Based on observations of the mice during these recordings, we established that these rapid changes in Tskin were typically associated with movement of the animal. The observed Tskin was typically higher and more variable during movement, likely as a result of changes in the exposed parts of the skin due to the animal’s change in position and posture (Fig. S1). Consistent with this interpretation, periods of high Tskin variability were more common during the night when mice are most active. The high variability of Tskin compared to Tcore (Fig. 1c) highlights the importance of processing Tskin measurements to obtain a reliable proxy for Tcore rather than relying on raw Tskin measurements.
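
The warmest-pixel readout described above can be sketched as follows. This is a minimal illustration only: the function name `warmest_pixel` and the 4 × 4 example frame are hypothetical, and the frame is assumed to be already calibrated in °C.

```python
def warmest_pixel(frame):
    """Return (temperature, (row, col)) of the warmest pixel in a 2-D frame.

    `frame` is assumed to be a list of rows of temperatures in degrees Celsius.
    """
    best_temp, best_pos = float("-inf"), None
    for r, row in enumerate(frame):
        for c, temp in enumerate(row):
            if temp > best_temp:
                best_temp, best_pos = temp, (r, c)
    return best_temp, best_pos


# Hypothetical frame: background near ambient, one warm pixel where the mouse is.
frame = [[22.0] * 4 for _ in range(4)]
frame[2][1] = 33.5

tskin, position = warmest_pixel(frame)
```

Storing only `tskin` for each frame (1 Hz) yields the raw Tskin trace, which, as noted above, still requires further processing before it can serve as a proxy for Tcore.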

Figure 1

Continuous recording of skin (Tskin) and core temperature (Tcore) in freely-moving laboratory mice. (a) Mice were each individually housed in an open-topped cage placed under a thermal camera. (b) Thermal images of three mice at rest (left) and while moving through the cage (right). Skin temperature was recorded by storing the temperature of the warmest pixel in view (1 Hz). (c) Representative 10-min traces of Tcore (1 min−1, black dots) and Tskin (1 Hz, grey dots) for three mice in the middle of the light- (left) and dark-phase (right). The distribution of Tskin measurements within each minute is quantified by the minimal (blue), mean (pink) and maximal (red) Tskin. Large fluctuations in Tskin can be observed especially at night, likely as a result of variability in the warmest observed pixel due to movement of the animal.

Reducing skin temperature variability by optimising algorithm parameters

The algorithm developed here was designed to estimate Tcore based on Tskin measurements taken every second. For this, a summary statistic was used to describe Tskin during each short sampling interval (1 s–10 min) and these values were averaged over a longer averaging interval (30 s–12 h). This averaged measure of Tskin (Tskin,max) was subsequently transformed to obtain an estimate of Tcore using the slope and intercept describing the linear relationship between Tskin,max and Tcore (Fig. 2). The present paper describes the optimisation of algorithm parameters with the objective of estimating Tcore with high accuracy, equal variance at different levels of Tcore, and ideally a relationship between Tskin,max and Tcore with a slope of 1 (i.e. Tcore = Tskin,max + constant).
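
The three steps described above (summary statistic per sampling interval, averaging over an averaging interval, linear transformation) can be sketched as below. This is an illustrative sketch and not the published software template: the function names are hypothetical, the Tskin trace is synthetic, a 1 Hz sampling rate is assumed so that one sample equals one second, and the slope and intercept are placeholder values.

```python
from statistics import mean


def summarise(samples, sampling_interval, statistic=max):
    """Apply a summary statistic to consecutive, non-overlapping blocks.

    Assumes 1 Hz data, so `sampling_interval` (s) equals the block length.
    An incomplete trailing block is dropped.
    """
    n = sampling_interval
    return [statistic(samples[i:i + n])
            for i in range(0, len(samples) - n + 1, n)]


def discrete_average(values, values_per_interval):
    """Average consecutive, non-overlapping groups of summarised values."""
    k = values_per_interval
    return [mean(values[i:i + k])
            for i in range(0, len(values) - k + 1, k)]


def estimate_tcore(tskin_max, slope, intercept):
    """Linear transformation of the averaged Tskin measure into a Tcore estimate."""
    return [slope * t + intercept for t in tskin_max]


# Synthetic one-hour Tskin trace at 1 Hz (not real data).
tskin = [30.0 + (i % 7) * 0.1 for i in range(3600)]

per_minute_max = summarise(tskin, sampling_interval=60, statistic=max)
tskin_max = discrete_average(per_minute_max, values_per_interval=30)

# Placeholder slope and intercept, chosen only for illustration.
tcore_estimate = estimate_tcore(tskin_max, slope=1.0, intercept=5.0)
```

Passing `min`, `statistics.median` or `statistics.mean` as `statistic`, and varying the two interval lengths, gives the parameter combinations whose fit can then be compared.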

Figure 2

Dataflow used to optimise the estimation of Tcore based on Tskin measurements.

The optimal algorithm parameters were determined by assessing how the possible parameter combinations (sampling interval, summary statistic, averaging interval) affected the reliability of the Tcore estimate. The optimal sampling interval and summary statistic for estimating Tcore based on Tskin measurements were determined by comparing the goodness of fit associated with each combination of algorithm parameters (Fig. 3a). Using the minimal Tskin per sampling interval to estimate Tcore resulted in a progressively worse fit with increasing sampling interval length while all other summary statistics resulted in an improved fit with longer sampling intervals. In all five mice, use of the maximum temperature as a summary statistic resulted in a better fit of Tcore compared to the median, arithmetic mean or geometric mean, especially at intermediate sampling interval lengths (Fig. 3a, Fig. S2). The superiority of using the maximum per sampling interval as opposed to calculating the arithmetic mean over the whole averaging interval is illustrated in subsequent analyses (Fig. 3b,c) by the improved goodness of fit associated with different sampling intervals compared to the 1-s interval (since Tskin was also sampled once per second, the 1-s sampling interval estimate is equivalent to taking the mean over all measurements within an averaging interval). The optimal sampling interval also depended on the chosen averaging interval, with sampling intervals of 60 s or 120 s resulting in the best estimate of Tcore while both shorter and longer sampling intervals were associated with a reduced goodness of fit (Fig. 3b,c). The quality of the Tcore estimate was strongly influenced by the length of the averaging interval (Fig. 3b,c). The accuracy of discrete Tcore estimates increased consistently with longer averaging intervals in all five individual mice, although the most pronounced increase occurred between averaging intervals of 10 and 60 min (Fig. 3b).
The increasing accuracy with which progressively longer (> 60 min) discrete averaging intervals estimated the mean Tcore over that same (long) interval (Fig. 3b) was, however, inherently coupled with a decreasing ability to describe Tcore changes over time (Fig. S3). This trade-off between the accuracy of the average and the description of Tcore changes over time was optimised by sliding the averaging interval in 30 s steps to estimate a rolling average for Tcore (Fig. 3c). This analysis demonstrated that averaging intervals of 30 or 60 min maximised the accuracy of the average and the description of the temporal changes in Tcore. Furthermore, an averaging interval of 30 min resulted in a relationship between Tskin,max and Tcore with an average slope close to 1 (Fig. S4). Based on these outcomes, we conclude that the optimal algorithm to estimate Tcore based on Tskin measurements taken every second samples Tskin,max per 60 s and averages these values over 30 min intervals.
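
The rolling variant with the selected parameters (maximum per 60 s, averaged over 30 min, window moved in 30 s steps) might be sketched as follows. Several details are assumptions made for illustration rather than taken from the text: the window is treated as trailing (ending at the time of the estimate), each window is split into 60 s blocks from its own start, and the function name and synthetic trace are hypothetical.

```python
from statistics import mean


def rolling_tskin_max(tskin, sampling_s=60, averaging_s=1800, step_s=30):
    """Rolling Tskin,max from a 1 Hz trace.

    For each window of `averaging_s` seconds (moved in `step_s` steps), take
    the maximum of every `sampling_s` block and average those maxima.
    Returns a list of (window_end_in_seconds, tskin_max) tuples.
    """
    if averaging_s % sampling_s != 0:
        raise ValueError("averaging_s must be a multiple of sampling_s")
    results = []
    for end in range(averaging_s, len(tskin) + 1, step_s):
        window = tskin[end - averaging_s:end]
        block_maxima = [max(window[i:i + sampling_s])
                        for i in range(0, averaging_s, sampling_s)]
        results.append((end, mean(block_maxima)))
    return results


# Synthetic one-hour trace: constant 31.0 degrees C with one brief 33.0 spike,
# standing in for a movement-related Tskin excursion (not real data).
tskin = [31.0] * 3600
tskin[100] = 33.0

rolling = rolling_tskin_max(tskin)
```

In this toy trace the single spike raises only one of the thirty per-minute maxima in the first window, which illustrates how averaging over the longer interval dampens short-lived excursions.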

Figure 3

Optimisation of algorithm parameters used to estimate Tcore based on thermal camera measurements of Tskin. (a) Goodness of fit associated with different summary statistics calculated over intervals between 1 s and 10 min. The outcomes associated with the arithmetic mean and geometric mean are indistinguishable. (b) Goodness of fit associated with discrete estimates of Tcore for each averaging-interval duration based on Tskin,max over different sampling intervals (1–600 s) and averaged over intervals between 30 s and 12 h. Fill and line colour become progressively darker with increasing sampling interval duration. Sampling of Tskin,max over intermediate sampling intervals (30–120 s) consistently resulted in a better description of Tcore compared to shorter and longer sampling intervals. Sampling Tskin,max over an intermediate sampling interval resulted in an ~ 15% improvement of fit compared to taking the mean temperature (i.e. 1 s sampling interval) over the same averaging interval. (c) Goodness of fit associated with estimating each measurement of Tcore (30 s time resolution) using a rolling average based on Tskin,max over different sampling intervals (1–600 s) and averaged over intervals between 30 s and 12 h. Fill and line colour become progressively darker (lightest: 1 s → darkest: 600 s) with increasing sampling interval duration. Data represents the between-individual mean and SEM goodness of fit associated with the presented combination of algorithm parameters.

Relationship between skin and core temperature

Combining the Tskin-derived estimate obtained using the algorithm described above with the slope and intercept describing the linear relationship between Tskin,max and Tcore optimised for each individual mouse resulted in an excellent description of Tcore over the three-day test period (Fig. 4). Such an individualised optimisation does however require the implantation of a telemeter, thus negating the main benefit of using non-invasive thermal imaging to estimate Tcore. Our goal here is to describe the average relationship between Tskin,max and Tcore and assess whether these group-level parameters enable an adequate estimation of Tcore based on Tskin measurements in individual mice.

Figure 4

Three-day core temperature recordings measured directly (Tcore, red) and estimated based on skin temperature (Tskin,max, black) in five mice. Core temperature estimates based on Tskin,max are depicted as a rolling average (black line) and as discrete averages (black dots, 1 per 30 min). Tskin,max was calculated by averaging the maximum Tskin per minute over a 30-min interval. Slope and intercept describing the linear relationship between Tskin,max and Tcore were optimised for each mouse individually. Day and night are represented by the white- and light-grey background, respectively. As expected for a nocturnal species, both Tcore and Tskin,max measurements show that body temperature is highest during the night in mice. ZT: Zeitgeber time.

The variance in the difference between Tcore and Tskin,max was independent of Tcore for all five mice (Fig. 5a), this temperature difference was only minimally influenced by the time of day (range of hourly averages: 4.7–5.1 °C, p < 0.0001; Fig. 5b), and the temperature difference was consistent across measurement days within each of the five mice (Fig. 5c). The difference between Tcore and Tskin,max did not correlate with Tcore in two of the five mice but in the other three mice a significant positive correlation was observed between Tcore and the difference between Tcore and Tskin,max (Fig. 5a). The relationship between Tcore and the difference between Tcore and Tskin,max was strongly dependent on the chosen sampling interval but not the averaging interval duration (Fig. S5A). Between-animal variance in this relationship was substantial, however, and precluded the selection of algorithm parameters that would prevent a correlation between Tcore and the difference between Tcore and Tskin,max in all mice (Fig. S5B). As noted above, the selected algorithm parameters resulted in a relationship between Tcore and Tskin,max with a slope of ~ 1 (Fig. 5d). When this average slope was used to estimate Tcore based on Tskin,max, the observed residual difference between Tcore and Tskin,max (intercept) was consistent between days within all mice, although the between-animal variance was substantial (Fig. 5e). Overall, the selected algorithm parameters (averaging maximum Tskin per 60 s over 30 min) resulted in an estimate of Tcore that was highly consistent between days with equal variance at different Tcore values, a minimised correlation between Tcore and the difference between Tcore and Tskin,max, and a relationship between Tskin,max and Tcore with a slope and intercept of 0.93 and 7.1 °C respectively.
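
The slope and intercept referred to above come from a least-squares linear fit of Tcore on Tskin,max. A minimal sketch of such a fit is given below. The data points are synthetic and constructed to lie exactly on the reported group-average line (slope 0.93, intercept 7.1 °C), and the use of R² as the goodness-of-fit measure is an assumption, since the measure is not named in this section.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y, slope, intercept):
    """Coefficient of determination (assumed goodness-of-fit measure)."""
    my = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Synthetic 30-min averages lying on the group-average line (not real data).
tskin_max = [30.0, 30.5, 31.0, 31.5, 32.0]
tcore = [0.93 * t + 7.1 for t in tskin_max]

slope, intercept = fit_line(tskin_max, tcore)
fit_quality = r_squared(tskin_max, tcore, slope, intercept)

# Difference between Tcore and Tskin,max for each 30-min interval.
difference = [c - s for c, s in zip(tcore, tskin_max)]
```

With a slope slightly below 1, the difference between Tcore and Tskin,max shrinks a little as Tskin,max rises, which is why a slope of exactly 1 was the ideal stated earlier.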

Figure 5

Core temperature (Tcore) estimation by continuous skin temperature (Tskin,max) measurements; accuracy assessment. (a) Correlations of Tcore with the difference between Tcore and Tskin,max for all five mice. Comparisons are made between the average Tcore per 30 min and the average of the maximum Tskin per minute over the same averaging interval. Solid lines represent the least-squares linear fit while dashed lines enclose the 2-standard-deviations area surrounding this fit. (b) The difference between Tcore and Tskin,max was only marginally correlated with time of day (~ 0.3 °C, p < 0.0001). Traces of individual mice (dark grey lines) are slightly offset on the x-axis to improve visibility. (c) The difference between Tcore and Tskin,max on each of the three measurement days in all five mice. (d) The slope of the relationship between Tcore and Tskin,max in all five mice. Mean and SD summarise the within-individual variance in slope between the three measurement days. (e) The intercept of the relationship between Tcore and Tskin,max in all five mice. This assessment incorporated the group-average (0.93) as the slope for all mice. Within-individual variance in the difference between Tcore and Tskin,max (c), slope (d) and intercept (e) was substantially lower than the between-individual variance. Solid lines in (c–e) represent the group mean while dashed lines enclose the 2-standard-deviations area surrounding this average. Error bars represent SD.

Between-animal variability in the relationship between skin and core temperature

A key aim of this study was to determine whether thermal imaging could be used to reliably estimate Tcore non-invasively in freely-moving mice. To this end, it would be essential that Tcore can be estimated without having to determine the relationship between Tcore and Tskin,max for each individual animal. As a minimal assessment of this requirement, the group-average slope and intercept were used to estimate Tcore based on Tskin,max in the five mice for which these group averages had been optimised. The use of the group average slope and intercept dramatically reduced the quality of Tcore estimates in some of the mice (Fig. 6a) because it resulted in a systematic under- or overestimation of Tcore (Fig. 6e) due to between-animal differences in slope and intercept. Changing of algorithm parameters could not further reduce the between-animal variance in slope and intercept (Figs. S4B, S6). As a result of the high between-animal variance in the relationship between Tcore and Tskin,max observed in the current group of five mice, Tskin,max did not provide a reliable estimate of the absolute value of Tcore in individual mice (systematic deviation range: − 0.6 to + 0.9 °C; Fig. 6b). To place these values in context, these deviations span approximately half the observed Tcore range (3.1–3.8 °C; Fig. 4). Conversely, between-animal comparisons of absolute changes in Tcore based on Tskin,max could be made with greater accuracy (systematic deviation range: − 0.5 to + 0.5 °C per 2.5 °C Tcore change; Fig. 6c). Within-animal comparisons of relative changes in Tcore could be estimated with the highest accuracy (systematic deviation: 0 °C, within-animal day-to-day intercept range: 0.2 °C, within-day intercept SD: 0.3–0.4 °C; Figs. 5e, 6d), thus demonstrating the utility of thermography for comparisons of relative Tskin changes between days (or treatments) within animals.
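
The systematic deviations discussed above arise when group-average parameters replace an animal's own slope and intercept. A minimal sketch of that comparison follows. The individual parameters of the example mouse are hypothetical, the sign convention (group-based minus individually optimised estimate) is an assumption, and the deviation is evaluated at chosen Tskin,max values for simplicity.

```python
def systematic_deviation(tskin_max, individual, group):
    """Group-based minus individually-optimised Tcore estimate (degrees C).

    `individual` and `group` are (slope, intercept) tuples.
    """
    ind_slope, ind_intercept = individual
    grp_slope, grp_intercept = group
    group_estimate = grp_slope * tskin_max + grp_intercept
    individual_estimate = ind_slope * tskin_max + ind_intercept
    return group_estimate - individual_estimate


group = (0.93, 7.1)          # group-average slope and intercept from the text
hypothetical_mouse = (1.00, 5.0)  # illustrative values, not from the data

# Deviation at a low, intermediate and high Tskin,max value (illustrative).
deviations = {t: systematic_deviation(t, hypothetical_mouse, group)
              for t in (30.0, 31.0, 32.0)}
```

Because the deviation depends on Tskin,max whenever the slopes differ, a mismatch in slope distorts estimated changes in Tcore, whereas a mismatch in intercept alone shifts all estimates by a constant and leaves relative changes intact.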

Figure 6

Between-animal variability in the relationship between Tskin,max and Tcore limits utility of Tskin-derived estimates of Tcore. (a) Goodness of fit associated with estimating Tcore based on Tskin,max using group average or individually optimised values for the slope and/or intercept in individual mice (dark grey). Group averages are plotted in black. (b–d) Expected systematic temperature deviations at low, mean and high Tcore for models using group averages or individually optimised values for the slope and/or intercept in individual mice (dark grey). Expected systematic deviations are calculated based on the difference between the group-average and individually-optimised slope and intercept for each individual mouse. Group averages are by definition 0, with greater SD values representing higher between-animal variation in the Tcore estimation error. (e) Two representative examples of the measured Tcore (red) and the estimated temperature based on Tskin,max (black) with group averages used as slope and intercept. Core temperature estimates based on Tskin are depicted as a rolling average (black line) and as discrete averages (black dots, 1 per 30 min). Day and night are represented by the white- and light-grey background, respectively. Error bars represent SD. ZT: Zeitgeber time.

Estimating core body temperature during daily torpor

Exposure to energetically challenging conditions (e.g. hunger, cold) induces energy saving strategies such as daily torpor in mice3. Here, food intake of three mice was restricted to a single daily meal consisting of ~ 70% of their ad libitum intake resulting in daily torpor bouts in all mice (duration: 4–8 h, minimum core temperature: 25–27 °C; Fig. 7a). In line with our findings in mice fed ad libitum, averaging Tskin,max per 60 s over a 30-min interval resulted in an accurate estimate of Tcore (Fig. S7A–C) with comparable variance at different values of Tcore (Fig. 7b). The relationship between Tcore and Tskin,max had a slope that was consistently higher than that observed in homeothermic mice (Fig. S7D,E), reflecting an altered relationship between Tcore and Tskin,max in mice under energetically challenging conditions. The difference between Tcore and Tskin,max decreased linearly with lower values of Tcore (Fig. 7b), thus complicating the Tskin-derived estimation of Tcore (i.e. slope > 1). This correlation between Tcore and the difference between Tcore and Tskin,max could not be eliminated by altering algorithm parameters (Fig. S7F,G). Although the sample size was insufficient to reliably estimate between-animal variance in the relationship between Tcore and Tskin,max, the observed difference in slopes in individual mice (range: 1.42–1.53, Fig. S8) would translate to systematic deviations of ± 0.5 °C between mice over the 10 °C temperature difference observed under these energetically challenging conditions. In line with our observations in ad libitum fed mice, the day-to-day within-animal variance in the relationship between Tcore and Tskin,max was very limited (Fig. S7H–K). This demonstrated the utility of non-invasive continuous thermography measurements to perform within-animal comparisons of relative changes in Tcore in mice during daily torpor.

Figure 7

Body temperature during daily torpor in mice exposed to food restriction. (a) Three-day core temperature recordings measured directly (Tcore, red) and estimated based on skin temperature (Tskin,max, black) of three mice during food restriction. Daily torpor was induced by chronic food restriction to ~ 70% of ad libitum food intake. Food was provided daily, three hours before lights-off (dashed line). Day and night are represented by the white- and light-grey background, respectively. Skin temperature was calculated by averaging the maximum Tskin per minute over a 30-min interval. Slope and intercept describing the linear relationship between Tskin,max and Tcore were optimised for each mouse individually. ZT: Zeitgeber time. (b) The difference between Tcore and Tskin,max was strongly correlated with Tcore in all three individual mice. Comparisons are made between the average Tcore per 30 min and Tskin,max over the same averaging interval. Solid lines represent the least-squares linear fit while dashed lines enclose the 2-standard-deviations area surrounding this fit.

Discussion

Monitoring body temperature provides important information about the physiological and metabolic state of animals. Established techniques for measuring body temperature are associated with restraint stress, do not allow continuous recordings, and/or require complicated invasive surgery5. The use of infrared thermography has the potential to provide a non-invasive method to measure Tcore but its own methodological limitations have to be taken into account5,6. Previous applications of non-contact Tskin measurements in biomedical research have been limited by the requirement that animals needed to be handled by the experimenter7,8,9,10,11,12,13,14,15, measurements were taken at a limited number of timepoints7,8,10,11,12,13,14,15,16,17, and/or measurements resulted in large datafiles requiring complex data analysis10,12,13,16,17,18. The present study developed, optimised and validated an algorithm that enables estimation of relative changes in Tcore based on the continuous and non-invasive automated measurement of Tskin of mice housed at room temperature. The high variability in Tskin compared to Tcore measurements (Fig. 1c) necessitates data processing to obtain a less-variable estimate of Tskin. Here we show that averaging the maximum Tskin per 60 s over a 30 min interval (Tskin,max) provides the most accurate estimate of Tcore. High between-animal variability in the linear relationship between Tskin,max and Tcore (i.e. slope and intercept) severely limits the accuracy of Tskin recordings as a measure of absolute Tcore. Instead, because of the low day-to-day within-animal variability in the relationship between Tskin,max and Tcore, Tskin recordings provide an excellent tool to assess relative differences in Tcore within individual animals.

Given the aforementioned strengths and limitations in estimating relative changes in Tcore based on Tskin, this method provides an excellent tool to continuously monitor relative Tcore changes in undisturbed, individually-housed, freely-moving mice. This was illustrated here by characterising the temporal fluctuations in body temperature throughout day and night as well as during daily torpor. The automated and continuous nature of the measurement and data processing steps presented here compares favourably with previous approaches using thermography to assess Tcore changes10,12,13,16,17, albeit at the expense of accuracy of its absolute Tcore estimate13,17. Although the inability to accurately estimate absolute Tcore values compares negatively to telemeter implantation, this cost will often be outweighed by welfare, time and financial benefits associated with not having to perform surgery, especially in cases where (physiologically-relevant) changes in body temperature are the prime concern14,16,18. When used as a humane endpoint, body temperature is often compared to a reference value at a single timepoint8,11,14. Although such a between-animal comparison does not suit the current method, the continuous nature of its Tcore estimate enables welfare decisions to be based on multiple characteristics such as the daily body temperature profile, its timing, and an individually calibrated set point. The requirement that animals are individually housed in open-top cages with reduced access to nesting materials (to ensure visibility of the animal) is also a limitation of the current approach, although depending on the experimental paradigm this might be a worthwhile trade-off. Overall, we view the method presented here as a useful addition to a repertoire of different approaches to monitor body temperature5 that, depending on the specific research question, might provide benefits compared to other established techniques.

Methods

All animal procedures were approved by the ACER AWERB of the University of Oxford and performed under a UK Home Office licence in accordance with all relevant laws and regulations. Five wildtype C57BL/6J mice were implanted intraperitoneally with an Anipill temperature telemeter. Following post-operative recovery, mice were housed at an ambient temperature of 22 ± 1 °C in open-top cages, each positioned under a thermal camera. Tskin was measured every second by storing the temperature of the warmest pixel. Tcore was measured every 30 s by the implanted Anipill. The quality of the Tskin-derived Tcore estimate was optimised based on the goodness of fit and variance distribution associated with each combination of different summarising statistics (minimum, median, arithmetic mean, geometric mean and maximum), sampling intervals (1 s–10 min), and averaging intervals (30 s–12 h). The linear relationship (slope and intercept) between Tskin,max and Tcore was assessed in 5 ad libitum fed mice and subsequently under energetically challenging conditions in 3 of these mice. Systematic deviations represent the difference between the estimated Tcore calculated based on individually-optimised versus group-average based descriptions of the relevant relationship between Tskin,max and Tcore for each of the animals and presented assessments. Extended methodological details are available in the SI Methods, and software templates to calculate Tcore estimates based on the methods described here have been uploaded to Figshare (10.6084/m9.figshare.12587909).