Continuous and non-invasive thermography of mouse skin accurately describes core body temperature patterns, but not absolute core temperature

Body temperature is an important physiological parameter in many studies of laboratory mice. Continuous assessment of body temperature has traditionally required surgical implantation of a telemeter, but this invasive procedure adversely impacts animal welfare. Near-infrared thermography provides a non-invasive alternative by continuously measuring the highest temperature on the outside of the body (Tskin), but the reliability of these recordings as a proxy for continuous core body temperature (Tcore) measurements has not been assessed. Here, Tcore (30 s resolution) and Tskin (1 s resolution) were continuously measured for three days in mice exposed to ad libitum and restricted feeding conditions. We subsequently developed an algorithm that optimised the reliability of a Tskin-derived estimate of Tcore. This identified the average of the maximum Tskin per minute over a 30-min interval as the optimal way to estimate Tcore. Subsequent validation analyses did however demonstrate that this Tskin-derived proxy did not provide a reliable estimate of the absolute Tcore due to the high between-animal variability in the relationship between Tskin and Tcore. Conversely, validation showed that Tskin-derived estimates of Tcore reliably describe temporal patterns in physiologically-relevant Tcore changes and provide an excellent measure to perform within-animal comparisons of relative changes in Tcore.

returned to their home cage once locomotor behaviour was observed (after 15-30 min). Post-surgery, analgesia was provided orally (meloxicam in jelly) for at least 3 days, and animals were checked once or twice daily to confirm proper recovery.

Experimental setup and procedures
Following full post-operative recovery, mice were transferred to open-topped recording cages (sawdust bedding) that were each positioned under a thermal camera (Optris PI 160 with standard 61° lens, Optris GmbH, Berlin, Germany). Accurate calibration of the thermal camera was confirmed pre-experimentally by comparison to a common heat source. The camera was positioned above the middle of the cage to ensure that the whole mouse was always in view. Similarly, only a limited amount of nesting material was provided to ensure that the mouse was unable to shield itself from the overhead camera. Food was provided on the cage floor and replenished regularly under ad libitum feeding conditions. Tskin was recorded every second for a 3-day period by storing the temperature of the warmest pixel in view using the software provided by the camera's manufacturer (Optris PIX Connect, Optris GmbH). During this experimental period, Tcore was recorded and stored every 30 s using the standard Anipill recording module. Post-experimental confirmation of the accuracy of the Anipill temperature telemeters/loggers at different temperatures showed that individual device calibration was not required.
The relationship between Tskin and Tcore during daily torpor was assessed in a subgroup of 3 of the 5 mice following the 3-day ad libitum feeding condition described above. Daily torpor was induced by restricting daily food intake to a single meal (~70% of ad libitum) provided 3 h before lights off (zeitgeber time 9). The exact meal size was calibrated daily based on the body mass (measured daily at lights off) to maintain body mass at 85-90% of ad libitum feeding weight. After 1-2 weeks of this torpor-induction protocol, Tskin and Tcore were recorded for multiple days and a 3-day experimental interval during which the mice exhibited daily torpor bouts on each day was selected for each mouse individually.

Data analysis
All data analyses were performed using custom-written scripts in Scilab 6.0.1 (www.scilab.org). Recorded Tskin measurements (1 s recording interval) were subdivided into sampling intervals of different durations (range: 1 s to 10 min), with the goal of finding a description of Tskin that would yield a better estimate of Tcore than simply taking the mean Tskin over the averaging interval. Sampling intervals of 5 s and longer were expressed as a fraction or multiple of the Tcore sampling interval to compensate for minor differences in the sampling interval duration of the Tcore measurements (range: 28-31 s). Sampling and averaging intervals were centred on the timing of Tcore measurements. For sampling intervals shorter than the Tcore interval duration, Tcore was assumed to be equal to the nearest measurement; for longer sampling or averaging intervals, Tcore was averaged over the full assessment interval. The Tskin distribution during each sampling interval was described by 5 different summary statistics (minimum, median, arithmetic mean, geometric mean and maximum), which were subsequently assessed to determine the optimal summary statistic for estimating Tcore. The duration of averaging intervals (30 s to 12 h) was defined in terms of the expected number of Tcore measurements during the chosen interval (duration / 30 s). Discrete averages were calculated by averaging the summary statistics describing all sampling intervals occurring during each of the non-overlapping averaging intervals. Rolling averages were calculated by shifting the averaging window by 30 s for each interval. The quality of Tskin-derived Tcore estimates was assessed based on the associated goodness of fit, the distribution of residuals, and within- and between-animal variability. As part of these analyses, the optimal slope and intercept describing the linear relationship between Tskin,max and Tcore were estimated for each assessment separately.
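The sampling and averaging steps described above can be illustrated with a minimal Python sketch. This is not the authors' Scilab code: the function names, the synthetic data, and the simulated relationship between Tskin and Tcore are assumptions made purely for illustration. The sketch also simplifies the procedure by using consecutive, non-overlapping sampling intervals rather than intervals centred on each Tcore measurement, and its rolling average shifts by one sampling interval rather than by 30 s.

```python
import numpy as np

def interval_max(tskin, samples_per_interval):
    """Maximum Tskin in consecutive, non-overlapping sampling intervals."""
    n = len(tskin) // samples_per_interval
    trimmed = np.asarray(tskin[: n * samples_per_interval], dtype=float)
    return trimmed.reshape(n, samples_per_interval).max(axis=1)

def discrete_average(values, per_window):
    """Mean of consecutive, non-overlapping windows of `per_window` values."""
    n = len(values) // per_window
    trimmed = np.asarray(values[: n * per_window], dtype=float)
    return trimmed.reshape(n, per_window).mean(axis=1)

def rolling_average(values, window):
    """Moving mean that shifts by one value per step (simplified)."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

# Synthetic 6-hour recording (hypothetical numbers, not study data).
rng = np.random.default_rng(0)
seconds = 6 * 3600
t = np.arange(seconds)
tcore_1s = 36.2 + 1.0 * np.sin(2 * np.pi * t / seconds)
# Assumed skin signal: linearly related to Tcore, with readings that drop
# below the warmest value whenever the warm spot is partly out of view.
tskin_1s = 0.8 * tcore_1s + 4.0 - rng.uniform(0.0, 2.0, seconds)
tcore_30s = tcore_1s[::30]                       # Tcore logged every 30 s

tskin_max = interval_max(tskin_1s, 60)           # maximum Tskin per minute
tskin_30min = discrete_average(tskin_max, 30)    # 30 one-minute maxima
tcore_30min = discrete_average(tcore_30s, 60)    # 60 Tcore samples = 30 min

slope, intercept = fit_line(tskin_30min, tcore_30min)
tcore_est = slope * tskin_30min + intercept      # Tskin-derived Tcore estimate
```

Taking the maximum within each sampling interval before averaging reflects the idea that the warmest reading is least affected by posture or partial shielding; in this synthetic example the choice is built into the simulated noise, so the quality of the fit here says nothing about real recordings.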
Group averages of these individually optimised slope and intercept values were subsequently assessed for their ability to estimate Tcore based on each mouse's Tskin,max measurements. The systematic deviation in estimated Tcore was calculated for each mouse at different body temperatures (low: 35 °C, mean: 36.166 °C, high: 37.5 °C) as the within-individual average deviation resulting from the difference between the individually optimised and group-average relationships in the part of the relationship between Tskin,max and Tcore that is relevant for the presented assessments (i.e. between-individual absolute differences, within-individual absolute differences, within-individual relative differences). Statistical tests were performed as mixed-effects general linear models, with animal ID included as a random variable where appropriate, while residuals were inspected visually to confirm the assumptions of normality and homogeneity of variance.

Figure S1: Representative thermal images of a single individually-housed mouse. The warmest spot on the body of the mouse depends on the positioning of the animal relative to the camera but is typically associated with the head or upper back. Reflections of the mouse coming off the walls of the cage can be observed in most images. The colour scale represents temperature in °C.
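The systematic deviation between individually optimised and group-average relationships, described in the Data analysis section, can be sketched in Python as follows. This is one possible reading of that calculation and not the authors' implementation: it assumes the fitted relationship has the form Tcore = slope × Tskin,max + intercept, and the function name and example values are hypothetical.

```python
import numpy as np

def systematic_deviation(ind_slopes, ind_intercepts, tcore_levels):
    """Deviation (estimated minus true Tcore) per mouse and Tcore level
    when the group-average fit replaces each mouse's own fit.

    Assumes Tcore = slope * Tskin_max + intercept for every fit.
    Returns an array of shape (n_mice, n_levels).
    """
    slopes = np.asarray(ind_slopes, dtype=float)
    intercepts = np.asarray(ind_intercepts, dtype=float)
    levels = np.asarray(tcore_levels, dtype=float)

    # Group-average relationship (mean of the individual parameters).
    group_slope = slopes.mean()
    group_intercept = intercepts.mean()

    # Tskin_max each mouse is expected to show at each Tcore level,
    # obtained by inverting that mouse's own linear fit.
    tskin = (levels[None, :] - intercepts[:, None]) / slopes[:, None]

    # Tcore estimated from that Tskin_max with the group-average fit.
    group_estimate = group_slope * tskin + group_intercept
    return group_estimate - levels[None, :]

# Hypothetical example: two mice with equal slopes but different intercepts,
# evaluated at the low, mean and high Tcore levels named in the text.
deviation = systematic_deviation([1.0, 1.0], [5.0, 7.0], [35.0, 36.166, 37.5])
```

In this example the first mouse is overestimated by 1 °C and the second is underestimated by 1 °C at every level, which shows how between-animal differences in intercept translate into errors in absolute Tcore even when temporal patterns are preserved.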

Figure S2: Correlation between Tskin and Tcore for all nine sampling intervals in all five mice. Dark-grey dots represent the correlation between the maximum Tskin sampled during each interval and the average Tcore over that same interval. Black lines represent the least-squares linear fit.

Figure S3: Tcore measurements and estimates during a three-day period for all 10 possible averaging intervals in all five mice. Tcore was either measured directly (dark-grey dots) or estimated based on Tskin,max (discrete averages: black dots, rolling averages: black lines). The maximum Tskin was sampled every 60 s and averaged over the specified averaging interval. The relationship between Tskin,max and Tcore (slope and intercept) was optimised for each mouse individually and separately for discrete and rolling averages. ZT: zeitgeber time.

Goodness of fit associated with discrete estimates of Tcore for each averaging interval based on Tskin,max over different sampling intervals (1-600 s) and averaged over intervals between 30 s and 12 h. (C) Goodness of fit associated with estimating each measurement of Tcore (30 s time resolution) using a rolling average based on Tskin,max over different sampling intervals (1-600 s) and averaged over intervals between 30 s and 12 h. (D, E) The slope describing the relationship between Tskin,max and Tcore depends on the sampling interval but not on the averaging interval. (F, G) The slope describing the relationship between Tcore and the difference between Tskin,max and Tcore depends on the sampling interval but not on the averaging interval. (H, J) The slope of the linear relationship between Tcore and Tskin,max in all three mice on the three assessment days. (I, K) The intercept of the linear relationship between Tcore and Tskin,max in all three mice on the three assessment days. This assessment used the group-average as the slope for all mice. Solid lines in (H, K) represent the group mean while dashed lines enclose the 2-standard-deviations area surrounding this average.
Calculations were based on discrete averages (B, D, F, H, I) or rolling averages (C, E, G, J, K). Error bars represent between-animal SEM (A-G) or within-animal within-day SD (I, K). Fill and line colour become progressively darker with increasing sampling interval duration (B-G).

Figure S8: The slope describing the relationship between Tskin,max and Tcore in energetically challenged mice exhibiting daily torpor. Dark-grey lines represent the observed slope in each of the three mice. Black lines represent group averages and are identical to data presented in S7D. Tskin,max per sampling interval was used as the summary statistic and all slopes were based on analysis of the discrete averages.