External auricle temperature enhances ear-based wearable accuracy during physiological strain monitoring in the heat

Body core temperature (Tc) monitoring is crucial for minimizing heat injury risk. However, validated strategies are invasive and expensive. Although promising, aural canal temperature (Tac) is susceptible to environmental influences. This study investigated whether incorporation of external auricle temperature (Tea) into an ear-based Tc algorithm enhances its accuracy during multiple heat stress conditions. Twenty males (mean ± SD; age = 25 ± 3 years, BMI = 21.7 ± 1.8, body fat = 12 ± 3%, maximal aerobic capacity (VO2max) = 64 ± 7 ml/kg/min) donned an ear-based wearable and performed a passive heating (PAH), running (RUN) and brisk walking trial (WALK). PAH comprised of immersion in hot water (42.0 ± 0.3 °C). RUN (70 ± 3%VO2max) and WALK (50 ± 10%VO2max) were conducted in an environmental chamber (Tdb = 30.0 ± 0.2 °C, RH = 71 ± 2%). Several Tc models, developed using Tac, Tea and heart rate, were validated against gastrointestinal temperature. Inclusion of Tea as a model input improved the accuracy of the ear-based Tc algorithm. Our best performing model (Trf3) displayed good group prediction errors (mean bias error = − 0.02 ± 0.26 °C) but exhibited individual prediction errors (percentage target attainment ± 0.40 °C = 88%) that marginally exceeded our validity criterion. Therefore, Trf3 demonstrates potential utility for group-based Tc monitoring, with additional refinement needed to extend its applicability to personalized heat strain monitoring.

monitoring can complement existing strategies by improving work rest cycle development and individualized safety monitoring.
Particularly, accurate prediction of Tc, either prospectively or in real-time, may be crucial in preventing over- or under-protection from heat-related illness 9,13. Yet, there are currently no accurate and practical methods for monitoring Tc in occupational and/or athletic settings. Although rectal or oesophageal thermistors are valid for continuous monitoring of human Tc, such sensors are highly invasive, single-use, and can cause significant user discomfort, making them unfeasible for daily implementation 14. Furthermore, despite improved user comfort when utilising ingestible telemetric pills, this strategy comes with a prohibitively high cost and is complex to implement due to the need to account for individual differences in gastrointestinal motility 9. While non-invasive surrogates such as oral and axillary temperature have been used to record Tc in clinical settings, these strategies remain unsuitable for use during physical activity due to a high susceptibility to environmental factors and an inability to provide continuous Tc measurement 14.
Amongst the host of measurement sites explored, the ear has notably emerged as a viable option for human Tc measurement. Tympanic membrane temperature (Tty) was proposed due to the vascularisation of the tympanic membrane by the internal carotid artery, which also supplies the hypothalamus 15. Measurement of Tty is achieved by direct contact with the tympanic membrane or indirect measurement of heat emitted from the tympanic membrane and aural canal 16. While the former demonstrated acceptable correlations with Tc 17, it is unsafe for use in exertional heat strain monitoring as shifting of the thermistor during physical movement can lead to tympanic membrane injury or cause pain should the sensor contact the richly innervated portion of the aural canal 16. Indirect Tty measurement using infrared sensors provides better comfort and safety. However, as a line of sight to the tympanic membrane is necessary for accurate reflection of Tc, factors such as aural canal shape and/or inadequate depth of insertion can lead to discrepancies 18.
Monitoring of aural canal temperature (Tac) is a promising alternative. Indeed, Tac measurements displayed good correlation with rectal temperature (Tre) when the sensor was placed 10 mm away from the tympanic membrane 19. Furthermore, Nagano, et al. 20 demonstrated small deviations between Tac and Tre during intermittent cycling. This is further supported by recent findings which reported that oesophageal temperature (Tes) was reliably predicted following modelling of Tac inputs from multiple sensors along the aural canal 21. Importantly, no subject discomfort was reported as a result of the sensor placement 16,20,22, which supports the notion that Tac monitoring can be an ideal method for monitoring of heat strain.
Despite its promise, an algorithm based on Tac inputs alone has limitations. Prediction of Tc from Tac alone is challenging as the accuracy of Tac measurements can be influenced by variations in ambient temperature 20. In this regard, we postulate that changes in external auricle temperature (Tea) and heart rate (HR) can be incorporated into the wearable ear-based algorithm to account for the effect of ambient temperature and metabolic heat on Tac. The external auricle (site of measurement for Tea) consists largely of skin and cartilage 23 and is thus unable to generate metabolic heat due to a lack of skeletal muscle tissue. As such, changes in Tea would primarily stem from dissipation and/or absorption of heat from the surrounding environment. This allows the external auricle to serve as a suitable measurement site to account for the effect of different ambient conditions on Tac. Thus, the inclusion of Tea as an additional physiological variable has the potential to enhance the predictive accuracy of an ear-based Tc algorithm.
To achieve this, we modified a commercially available Tac-measuring ear-based wearable with an additional sensor for Tea measurement. We then sought to develop an algorithm to predict Tc during passive- and exercise-induced heat stress by using Tea and a host of other physiological variables from the ear-based wearable as inputs for model development. Finally, we evaluated the validity of the algorithm developed for non-invasive heat strain monitoring under hot and humid environmental conditions.

Methodology

Participants
Twenty healthy physically active males (mean ± SD; age = 25 ± 3 years, BMI = 21.7 ± 1.8, body fat = 12 ± 3%, maximal aerobic capacity (VO2max) = 64 ± 7 ml/kg/min) were recruited for this study. Participants were native to Singapore and had a 10-km run time of less than 60 min. Only individuals certified fit for participation by an independent medical practitioner, with no existing musculoskeletal injury, anal piles or respiratory diseases, and/or history of digestive tract surgery, heat injuries or heart diseases were recruited.
All procedures were approved by the Institutional Review Board of the National University of Singapore (reference number: H-20-017) in accordance with the Declaration of Helsinki. The purpose, procedures, benefits and risks of the study were verbally explained, and participants provided their written informed consent prior to participation.

Experimental design
Participants performed a VO2max test on the first laboratory visit to assess their aerobic fitness and to individualize the exercise intensity employed in subsequent trials. Anthropometric measurements were also recorded on their first visit. Subsequently, participants underwent three experimental trials: a passive heating (PAH), a running (RUN) and a brisk walking (WALK) trial (Fig. 1C). Three different modes of heating/exercise were employed to facilitate the development of a robust ear-based Tc algorithm, with broad applicability over a variety of activities and exercise intensities. Participants completed all experimental trials in a randomly assigned order (Fig. 1A).

Anthropometric measures
Nude body mass and height were recorded using a floor weighing scale (BBA211 Bench Scale, Mettler-Toledo, Germany) and a stadiometer (Seca, Brooklyn, NY, USA) respectively. Body mass index (BMI) was calculated as (body mass in kg)/(height in m)². Skin folds were measured at four sites (bicep, tricep, subscapular and suprailiac) using a Harpenden skinfold calliper (Model HSK-BI; British Indicators, West Sussex, UK). Skin fold measurements from these four sites were used to estimate body surface area 24, body density 25 and body fat percentage 26.

Maximal aerobic capacity (VO2max) test
An incremental treadmill protocol was used to measure each participant's VO2max 27. The first phase consisted of a treadmill run at four different speeds, with an initial speed that was 1 km/h slower than the participant's expected 10 km race pace. Treadmill speed was increased by 1 km/h every three min, for a total duration of 12 min. Following a five min rest, participants proceeded to the second phase which consisted of a treadmill run at a fixed individualized speed of moderate intensity (treadmill speed ranged from 9 km/h to 12 km/h as determined by the researcher based on the previous phase), with an initial elevation of 1%. Treadmill elevation was increased by 1% every min until participants reached volitional exhaustion. Oxygen uptake (VO2) was measured using a metabolic cart (TrueOne 2400, Parvo Medics, East Sandy, UT, USA; accuracy ± 0.1%) and VO2max was derived from the mean VO2 measured during the final minute prior to test termination.

Experimental trials
Participants were requested to avoid alcoholic beverages, have at least eight hours of sleep, consume sufficient water to stay hydrated and repeat a similar diet and any physical activity performed 24 h prior to each experimental trial. To facilitate their compliance with the study requirements, participants completed a 24-h dietary and physical activity questionnaire. Participants provided a mid-stream urine sample for measurement of urine specific gravity (USG) using a refractometer (UG-alpha, Atago, Bellevue, WA, USA). All participants were euhydrated (defined as USG < 1.025 28; observed USG = 1.000 to 1.024) prior to the commencement of the trials.
Gastrointestinal temperature (Tgi) was monitored using an ingestible telemetric sensor (e-Celsius®, BodyCap, Hérouville-Saint-Clair, France) with a sampling interval of 15 s. Owing to its established validity when compared against rectal and oesophageal temperature 29, Tgi was utilized as the gold standard reference for Tc in the present study. The telemetric sensor was either ingested eight to ten hours before each session or rectally inserted by participants upon arrival at the trial site. Heart rate (HR) was continuously measured every second by a chest-based monitor (M430 with H10 HR monitor, Polar Electro, Kempele, Finland). An ear-based wearable device (233621 Sense Headphones, Grandsun Electronic Co. Ltd, Shenzhen, China) was utilized to collect data for model development (Fig. 2A). The device continuously measured aural canal temperature (Tac) using two thermocouple sensors (Tac1 and Tac2, maximal error of ± 0.1 °C between − 20 °C and 50 °C 30) and HR using a photoplethysmography (PPG) sensor (HRear) (Fig. 2B). In addition, a commercial earpiece was modified by adding an infrared (IR) sensor which was placed in close proximity to the skin to measure external auricle temperature (Tea, Fig. 2B). IR thermometry is commonly preferred over thermocouples for the measurement of peripheral skin temperature in clinical settings as IR sensors are not required to be in continuous contact with the skin 31. This feature is especially important in instances where thermometry is performed during constant movement or physical activity. Tac1, Tac2, Tea and HRear data were transmitted to a mobile application via Bluetooth and logged every second (Fig. 2C).
VO2 was measured at baseline and at 15-min intervals during RUN and WALK. Additionally, every 15 min, participants were provided with 2 g/kg body mass of ambient water maintained at 26.0 °C to prevent hypohydration (> 2% reduction in body mass due to water loss 32) during the trials. The experimental trial design is depicted in Fig. 1B.

Passive heating trial (PAH)
Participants donned running shorts and completed a 10 min seated baseline in an air-conditioned laboratory environment (dry bulb temperature (Tdb) = 21.6 ± 0.5 °C, relative humidity (RH) = 68 ± 3%, wet-bulb globe temperature (WBGT) = 19.2 ± 0.5 °C). Subsequently, participants immersed themselves up to chest level in an inflatable tub containing water maintained at 42.0 ± 0.3 °C by an external heating unit (Compact XP Dual Temp, iCoolsport, Gold Coast, Australia). Light facial fanning was applied during heating to minimize participant discomfort. Participants were passively heated until either a Tgi of 39.5 °C or a total duration of 60 min was reached.

Running trial (RUN) and brisk walking trial (WALK)
The RUN and WALK trials were conducted in a controlled environmental chamber set to simulate a warm and humid tropical environment (Tdb = 30.0 ± 0.2 °C, RH = 71 ± 2%, WBGT = 27.1 ± 0.3 °C). Participants donned running attire with sports shoes and completed a 10 min seated baseline prior to commencement of the exercise.
In RUN, participants exercised on a motorized treadmill (h/p/cosmos Mercury, Germany) at a speed that corresponded to 70 ± 3% of their VO2max. In WALK, participants performed a treadmill walk at a speed of 6 km/h with an elevation of 7%. The exercise was terminated if participants' Tgi reached 39.5 °C. Participants whose Tgi was still below that safety threshold after 60 min underwent an extended exercise phase to elicit a further rise in Tgi. The extension was a treadmill walk at a speed of 6 km/h with an elevation of 1%, for a maximum duration of 30 min. Subsequently, participants underwent a seated recovery until Tgi returned below 38.0 °C.

Model development
Physiological data recorded by the ear-based wearable (Tac1, Tac2, Tea and HR) were used as base parameters for data modelling. All base parameters were pre-processed into 15 s averages and time-aligned with Tgi data from the telemetric capsule. The temperature gradient (Tgrad) between the internal and external regions of the ear was computed as a parameter that accounts for heat exchange between the environment and the aural canal. Tgrad was quantified by the following equation:
Feature engineering was undertaken to generate new modelling parameters from the base physiological parameters and modality parameters. While physiological parameters are continuous, modality parameters are categorical data indicating the activity modality (passive heating, running, walking) and the phase of trial (pre-trial baseline, heating, post-trial recovery). The feature engineering methods employed encompassed mathematical transformations, linear regression transformations, polynomial regression transformations up to order three, and data segmentation. Data smoothing techniques, namely a Savitzky-Golay filter and a rolling average, were employed to reduce noise and improve the overall signal-to-noise ratio.
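The pre-processing steps described above (15 s averaging, time alignment with the capsule data and rolling-average smoothing) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's pipeline: the function names, the sample values and the trailing-window form of the rolling average are hypothetical, and the gradient is assumed here to be canal temperature minus auricle temperature because the defining equation is not reproduced in this text.

```python
from statistics import mean


def block_average(samples, block_s=15):
    """Average (time_s, value) samples into consecutive block_s-second bins.

    Returns {bin_start_s: mean_value}; bins without samples are omitted.
    """
    bins = {}
    for t, v in samples:
        start = int(t // block_s) * block_s
        bins.setdefault(start, []).append(v)
    return {start: mean(vals) for start, vals in sorted(bins.items())}


def rolling_mean(values, window=3):
    """Trailing rolling average; early points use the samples available so far."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(mean(chunk))
    return out


def align(wearable_bins, reference_bins):
    """Keep only the bins present in both series, in time order."""
    common = sorted(set(wearable_bins) & set(reference_bins))
    return [(t, wearable_bins[t], reference_bins[t]) for t in common]


# Hypothetical 1 Hz readings (deg C) over 45 s; these are not study data.
t_ac1 = [(t, 36.0 + 0.01 * t) for t in range(45)]
t_ea = [(t, 34.0) for t in range(45)]
# Hypothetical capsule readings already on a 15 s grid.
t_gi = {0: 37.0, 15: 37.1, 30: 37.2}

ac_bins = block_average(t_ac1)
ea_bins = block_average(t_ea)
# ASSUMPTION: gradient taken as canal minus auricle temperature.
t_grad = {t: ac_bins[t] - ea_bins[t] for t in ac_bins}
aligned = align(ac_bins, t_gi)
```

For the Savitzky-Golay step, `scipy.signal.savgol_filter` is a commonly used implementation; the window length and polynomial order used in the study are not stated, so they are left out of this sketch.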
Three regression algorithms, namely linear regression (Tlin), second-order polynomial regression (Tpoly) and random forest regressor (Trf), were evaluated in the study. These algorithms were selected for their reported potential to predict Tc from various physiological parameters 20,21,33. For each algorithm, an iterative feature selection approach was employed to compare model performance with different subsets of parameters. Algorithm development was performed with the machine learning package scikit-learn in Python version 3.10.
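As an illustration of how the three algorithm families can be set up side by side in scikit-learn, the sketch below fits each to synthetic data. The inputs, the target relationship and the hyperparameters (`n_estimators=100`, `random_state=0`) are placeholders chosen for the example and are not taken from the study; the dictionary keys simply mirror the model labels used in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins for the wearable inputs (NOT study data).
t_ac = rng.uniform(35.0, 38.5, n)   # aural canal temperature, deg C
t_ea = rng.uniform(30.0, 36.0, n)   # external auricle temperature, deg C
hr = rng.uniform(50.0, 190.0, n)    # heart rate, bpm
X = np.column_stack([t_ac, t_ea, hr])

# Arbitrary synthetic target, chosen only so the models have something to fit.
y = 0.6 * t_ac + 0.1 * t_ea + 0.005 * hr + 11.0 + rng.normal(0.0, 0.1, n)

models = {
    "T_lin": LinearRegression(),
    "T_poly": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    "T_rf": RandomForestRegressor(n_estimators=100, random_state=0),
}

predictions = {}
for name, model in models.items():
    model.fit(X, y)
    predictions[name] = model.predict(X)  # in-sample predictions, for illustration only
```

In-sample predictions overstate accuracy, particularly for the random forest, so any comparison between models should be made on held-out subjects.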
A five-fold cross-validation technique was employed, where training was repeated five times with different training subsets. At each fold, the training dataset consisted of 75% of the subjects, and the testing dataset consisted of the remaining subjects. The performance of each model was averaged across all five folds to minimise any random biases. To assess the performance of the models, the selected evaluation metrics were mean bias error (MBE), mean absolute error (MAE) and 95% confidence intervals (CI). This set of metrics captures accuracy, precision, and reliability of individual estimates respectively. Optimal model performance is characterised by smaller values of these metrics, signifying that the predicted values are good estimates of Tc.
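Splitting by subject rather than by sample keeps all data from one participant on the same side of the split. The sketch below is one possible reading of the procedure, written with the standard library only: it draws five random subject-level splits with 75% of subjects for training, as stated in the text. Note that a conventional five-fold partition of 20 subjects would instead train on 16 subjects (80%) per fold, so the repeated-random-split form, the function name and the seed are assumptions.

```python
import random


def subject_splits(subject_ids, n_repeats=5, train_fraction=0.75, seed=0):
    """Yield (train_ids, test_ids) pairs, splitting by subject, not by sample."""
    subjects = sorted(set(subject_ids))
    n_train = round(train_fraction * len(subjects))
    rng = random.Random(seed)
    for _ in range(n_repeats):
        shuffled = subjects[:]
        rng.shuffle(shuffled)
        yield sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


# Twenty hypothetical subject identifiers, 1 to 20.
splits = list(subject_splits(range(1, 21)))
```

With 20 subjects this gives 15 training and 5 testing subjects per repeat. In scikit-learn, `GroupKFold` and `GroupShuffleSplit` provide comparable subject-level splitting.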

Data analysis
All statistical computations were performed using IBM SPSS Statistics version 29 (IBM SPSS Statistics 29.0, Armonk, NY, USA) and figures were produced using GraphPad Prism version 10.0.0 (GraphPad Software, San Diego, CA, USA). Normality of data was evaluated using a Shapiro-Wilk test. Bland-Altman plots were used to assess the agreement between ear-based wearable data and gold standard references. The MBE was calculated by subtracting gold standard reference values from the corresponding ear-based wearable values at each 15 s time-point and subsequently averaging all errors, such that a negative MBE indicates underestimation by the wearable. The MAE was quantified by averaging all absolute errors. The 95% CI were calculated as 1.96 × standard deviation (SD) of errors. Percentage target attainment of errors within ± 0.40 °C (PTA ± 0.40 °C) was quantified for the ear-based Tc algorithm(s). Root mean square error (RMSE) was calculated as the square root of the mean squared error between estimated Tc and Tgi. The degree of correlation was determined as follows: very strong (r > 0.90), strong (r = 0.70 to < 0.90), moderate (r = 0.50 to < 0.70), low (r = 0.30 to < 0.50) and negligible (r < 0.30) 34. The following criteria were used to determine the validity of the ear-based Tc algorithm for prediction of Tgi: (a) individual prediction errors: 95% PTA within ± 0.40 °C 29, (b) group prediction errors: MBE < ± 0.27 °C 35. Mean absolute percentage error (MAPE) and two-way mixed-effects Intraclass Correlation Coefficient (ICC) were calculated to assess the accuracy of the ear-based HR sensor. ICC was interpreted as follows: excellent (> 0.90), good (> 0.75 to 0.90), moderate (0.50 to 0.75), poor (< 0.50) 36. Validity of the ear-based HR sensor was determined by a MAPE < 10% 37 and ICC > 0.90. All data are presented as mean ± SD.
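The agreement metrics and validity criteria defined above can be computed as in the following sketch. It is illustrative only: the function names and example values are hypothetical, errors are taken as predicted minus reference so that a negative MBE corresponds to underestimation, and the sample (n − 1) standard deviation is assumed because the text does not specify which form was used.

```python
import math
from statistics import mean, stdev


def agreement_metrics(predicted, reference, tolerance=0.40):
    """Return MBE, MAE, RMSE, 1.96 x SD of errors, and PTA (%) within +/- tolerance."""
    errors = [p - r for p, r in zip(predicted, reference)]  # predicted minus reference
    return {
        "MBE": mean(errors),
        "MAE": mean(abs(e) for e in errors),
        "RMSE": math.sqrt(mean(e * e for e in errors)),
        "CI95": 1.96 * stdev(errors),  # ASSUMPTION: sample SD (n - 1)
        "PTA": 100.0 * sum(abs(e) <= tolerance for e in errors) / len(errors),
    }


def meets_validity(metrics, pta_target=95.0, mbe_limit=0.27):
    """Apply the two criteria: PTA of at least 95% and |MBE| below 0.27 deg C."""
    return {
        "individual": metrics["PTA"] >= pta_target,
        "group": abs(metrics["MBE"]) < mbe_limit,
    }


def mape(measured, reference):
    """Mean absolute percentage error, e.g. ear-based HR against chest-strap HR."""
    return 100.0 * mean(abs(m - r) / r for m, r in zip(measured, reference))


# Hypothetical values in deg C, not study data.
predicted = [37.0, 37.5, 38.0, 38.5]
reference = [37.1, 37.4, 38.6, 38.5]
example = agreement_metrics(predicted, reference)
verdict = meets_validity(example)
```

In this toy example three of the four errors fall within ± 0.40 °C, so the individual criterion is not met even though the mean bias is small, which mirrors the distinction the criteria draw between individual and group prediction errors.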

Results
Data were collected across 60 experimental trials (20 participants completed three trials each). However, ear-based wearable data were unavailable during eight trials due to battery and/or intermittent connectivity issues. These incomplete datasets were excluded from data modelling and analysis. Thus, the ear-based Tc algorithm was developed and evaluated across 52 trials.

Agreement between Tac and Tgi
The agreement between Tac and Tgi was assessed to determine the validity of Tac as a surrogate measure of Tc. Neither PTA ± 0.40 °C (10%) nor MBE (− 1.25 ± 0.86 °C) met the validity criteria set in the present study (Fig. 3). Moreover, the 95% CI (± 1.69 °C) was large when comparing Tac with Tgi (Fig. 3).

Model selection and parameter importance
To identify prediction models capable of enhancing the accuracy of Tc predictions derived from Tac1, Tac2, Tea and HR inputs, we compared a linear regression model (Tlin), second order polynomial regression model (Tpoly) and random forest regressor model (Trf1). The ear-based HR data used for data modelling in our study met both validity criteria, as indicated by an acceptable MAPE of 2.1 ± 3.4% and an excellent ICC of 0.992.

Parameter engineering
Physiological parameters measured by the ear-based wearable displayed only low to moderate correlations (r = 0.34 to 0.56) with Tgi, which could explain the sub-optimal performances observed from the selected Tc prediction models (Fig. 4). Interestingly, we found that Tac1 + Tgrad, which accounts for the gradient between internal and external temperature at the ear, displayed a strong correlation with Tgi (r = 0.77). Tgrad calculated in the present study ranged from 0.0 to 4.7 °C. Hence, feature engineering was performed using these basic parameters to generate additional highly correlated model inputs for data modelling. Sixteen new parameters were developed, each demonstrating strong to very strong correlations with Tgi (Table 1).
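The correlation bands defined in the Data analysis section can be applied programmatically, as in the sketch below. The function names are hypothetical. Two choices are assumptions: the absolute value of r is used, and r = 0.90 exactly is placed in the "strong" band because the published thresholds leave that boundary unassigned.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Map r onto the bands used in the text (very strong to negligible)."""
    r = abs(r)  # ASSUMPTION: bands are applied to the magnitude of r
    if r > 0.90:
        return "very strong"
    if r >= 0.70:
        return "strong"  # includes r == 0.90, which the stated bands leave unassigned
    if r >= 0.50:
        return "moderate"
    if r >= 0.30:
        return "low"
    return "negligible"


# Correlations reported in the text for the base parameters and Tac1 + Tgrad.
labels = {r: correlation_strength(r) for r in (0.34, 0.56, 0.77)}
```

Under these bands the reported range of 0.34 to 0.56 spans the low and moderate categories, while 0.77 falls in the strong category.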

Validity of ear-based Tc algorithm
To derive the best performing Tc prediction model, we then performed an iterative evaluation involving different combinations of the base, engineered and activity parameters. We found that the best-performing model (Trf3) was a random forest regressor which utilized Teng16 (polynomial regression with Tac1, Tac2, Tea, HR and Tac1 + Tgrad) and trial phase (pre-trial baseline, heating, post-trial recovery) as model parameters.

Validity of ear-based Tc algorithm during different modes of activity
In order to assess model performance during various activity modalities, the dataset was split into five separate trial phases: passive heating, running, walking, pre-trial baseline and post-trial recovery.
Table 1. Correlation, mean absolute error and 95% CI between basic parameters (Tac1, Tac2, Tea, HR, Tac1 + Tgrad) and engineered data modelling parameters, against telemetric capsule (Tgi) across all trials.

Discussion
We developed an algorithm to predict Tc during passive- and exercise-induced heat stress by modifying a commercially available multi-sensor ear-based wearable. In doing so, we investigated whether the inclusion of external auricle temperature (Tea) as a model input could enhance the predictive accuracy of the ear-based Tc algorithm. Tac markedly underestimated Tgi, which indicates that it is unsuitable as a sole surrogate measure of Tc. Inclusion of Tea as a model input improved the predictive abilities of the ear-based algorithm, suggesting that Tea can account for environmental influences on the aural canal. The Trf3 model (best performing Tc model) had individual prediction errors (PTA ± 0.40 °C = 88%, 95% CI = ± 0.52 °C) that marginally exceeded the study validity criterion (95% PTA within ± 0.40 °C). However, Trf3 exhibited acceptable group prediction errors (MBE < ± 0.27 °C) across all modes of heating. As such, this highlights its potential utility for group-based Tc monitoring, with additional refinement needed to extend its applicability to personalized heat strain monitoring.
We observed that Tac significantly underestimated Tgi in the present study. Aural canal temperature displayed large negative individual prediction errors (Fig. 3) which culminated in an overall negative MBE (− 1.25 ± 0.86 °C) when compared against Tgi. This is in line with previous investigations which have reported that Tac measurements were consistently lower compared to gold standard Tc references during continuous exercise 16,38 and simulated work-rest cycle protocols 20,39. Moreover, Tac measurements derived from our ear-based wearable demonstrated large individual and group prediction errors that markedly exceeded the study's predetermined validity criteria (Fig. 3). As such, this indicates that Tac should not be employed as a sole surrogate of Tgi when used for heat strain monitoring.
Variations in ambient conditions can alter the level of agreement between Tac and Tc 20. Hence, several studies have sought to mitigate external environmental influences by insulating the aural canal with a padded ear patch 16 or medical film 40. While these strategies are shown to slightly improve the agreement between Tac and gold standard Tc references, these approaches are impractical in real-world scenarios. A novel finding in the present study was that Tea can be utilized to account for environmental influences on the aural canal. Inclusion of Tea data as a model input led to a notable improvement in PTA ± 0.40 °C (Trf1 = 71%, Trf2 = 65%) and narrower 95% CI (Trf1 = ± 0.82 °C, Trf2 = ± 0.94 °C) in Trf1 (Tea included in model) relative to Trf2 (Tea excluded from model). This suggests that Tea can augment the predictive abilities of an ear-based Tc algorithm. Our findings agree with prior work which underscores the importance of including an external temperature sensor to account for environmental effects on the aural canal 21. It is worth noting that while our approach shares similarities with Nakada, et al. 21, the external sensor employed in their study directly measures alterations in ambient temperature. In contrast, our Tea sensor derives temperature readings from the skin at the external auricle, offering insights into the heat exchange dynamics between the environment and the auricular region. In doing so, this could provide a valuable physiological perspective into how the external environment influences Tac.
Consideration of individual prediction errors is necessary when determining the validity of a Tc algorithm for personalized heat strain monitoring 41. Yet, few published Tc algorithms have met the validity thresholds set in the present study 42. To date, only Nazarian, et al. 43 have published a Tc prediction algorithm that confers an ideal 95% PTA of errors within ± 0.27 °C. Nevertheless, it is noteworthy that their algorithm was developed within a narrower Tc range (maximum Tgi < 39.0 °C), with treadmill walking employed as the sole activity modality 43.
We utilized feature engineering and selection to improve the accuracy of the random forest regressor models in the present study. Mathematically transforming and/or combining multiple physiological parameters can generate supplementary model inputs that exhibit enhanced correlations with the intended parameter of interest 44. Accordingly, our Trf3 model displayed an 88% PTA of errors within predetermined thresholds of ± 0.40 °C which considerably out-performed earlier model iterations (Trf1 = 71%, Trf2 = 65%). Moreover, Trf3 conferred a better agreement with Tgi when compared with ear-based wearables evaluated in previous research 16,20,45. Roossien, et al. 45 validated a commercially available ear-based wearable (Cosinuss° type C-med) and reported underestimations of Tgi during rest (− 0.4 ± 0.7 °C), activity (− 1.4 ± 1.5 °C) and recovery (− 1.5 ± 1.2 °C) which were larger than in the present study (Fig. 5). However, Cosinuss° was tested in the field during firefighting task simulations which might have contributed to the poorer agreement observed when compared with our fixed intensity laboratory protocol. When considering a narrower PTA of errors within ± 0.30 °C, Trf3 (79%) also surpassed other commercially available wearables such as Kenzen (70%) and the CORE heat flux sensor (40-59%) 33,46,47. Although the Trf3 and Kenzen Tc algorithms appear to exhibit a higher accuracy, it is worth noting that the models developed here and in Moyen, et al. 33 implemented the same dataset for training and validity testing. Thus, further research is necessary to ascertain whether the accuracy of the Trf3 model can be maintained when validated across new and independent datasets.
Additionally, comparison between the various heating modalities revealed that Trf3 displayed fewer incidences and smaller magnitudes of individual prediction errors when estimating Tc during exertional settings relative to passive heating (Fig. 6A-C). Our findings diverged from those presented by Kato, et al. 40 who reported little difference in 95% CI whilst testing their ear-based wearable during passive heating (± 0.5 °C) and exercise (± 0.4 °C). This discrepancy is likely attributable to methodological differences between the passive heating protocols used in both studies. Notably, Kato, et al. 40 opted for a lower leg immersion protocol which resulted in comparatively lower levels of heat strain (Trec < 38.0 °C) relative to our study. Furthermore, participants were required to soak up to chest level in the present study, thereby resulting in a closer proximity between the ear-based wearable and the hot water surface. We postulate that radiative heat from the hot water along with cooler external ambient conditions may exert contrasting influences on aural canal and external auricle temperature. In turn, these conflicting signals could potentially lead to a reduction in algorithm accuracy during passive heating. Further work is thus required to better account for these dynamic influences and enhance the applicability of our ear-based Tc algorithm during passive heat stress.
Although individual prediction errors were not sufficiently precise for personalized heat strain monitoring, Trf3 displayed an acceptable accuracy for estimation of group-based Tc responses. Accurate measurement of group-based Tc responses could offer valuable information to improve training standards and aid in the estimation of training stimulus when implementing HA protocols 48,49. HA protocols typically aim to maintain Tc above an endogenous thermal criterion of 38.5 °C to elicit an optimal adaptation stimulus 48. It is thus worth noting that Trf3 exhibited an acceptable MBE (− 0.03 ± 0.28 °C) at Tgi ranging from 38.0 to 39.0 °C (Fig. 7a-c). As such, this highlights the potential utility of our ear-based Tc algorithm to function as a non-invasive tool to quantify group-based Tc responses during HA.

Limitations
The present study was designed to develop and validate our ear-based Tc algorithm over a variety of activity modalities and a wide Tgi range. As such, we employed continuous passive and exercise heating protocols under controlled laboratory environments to impose adequate environmental and/or metabolic heat stress for elevated Tgi readings to be attained. In doing so, we are unable to ascertain whether the Trf3 model would confer a similar accuracy when employed in the field. Given that environmental conditions were also tightly controlled in our protocol, the present study design may not have been able to elucidate the true benefits of including a Tea sensor. It is thus crucial for future investigations to train and test Trf3 under a wider range of ambient temperatures, fluctuating environmental conditions and during dynamic real-world activities to fully utilize Tea inputs and develop a robust ear-based wearable algorithm 42. Additionally, aerobically fit participants were recruited due to their enhanced ability to tolerate high endogenous heat loads 50. Yet, recruitment of a broader participant demographic is necessary in future investigations to assess the applicability of Trf3 in other vulnerable populations (e.g. sedentary adults, elderly). Our ear-based wearable Tc algorithm was also developed and tested in a male cohort. Given that sex-based differences in wearable accuracy have been demonstrated in previous research 51, future work to train and validate Trf3 in a female cohort is necessary.

Conclusion
A novel finding in this study was that the predictive abilities of an ear-based algorithm can be enhanced by inclusion of Tea as a model input to account for environmental influences on the aural canal. Despite its promise, Trf3 displayed individual prediction errors that marginally exceeded the study validity criterion. However, the Trf3 model demonstrated an acceptable accuracy for estimation of group-based Tc responses when predicting Tgi readings ranging from 38.0 °C to 39.0 °C across all modes of heating. Taken together, Trf3 demonstrates potential utility for group-based Tc monitoring, with additional refinement needed to extend its applicability to personalized heat strain monitoring. Given the prevalent use of ear-based devices in heat-exposed occupations (e.g. radio communication sets), sports and day-to-day living (e.g. Bluetooth-enabled earbuds), this research seeks to lay the foundation for future development of a wearable ear-based physiological monitoring system that may offer protection in numerous heat-exposed activities (e.g. sports, physical labour) and/or vulnerable populations (e.g. older adults, young children).

Figure 1. (A) Schematic representation of the overall study design, (B) experimental trial design and (C) experimental trial photos (from left: PAH, RUN, WALK). Participants performed a seated baseline. During PAH, participants immersed themselves up to chest level in water maintained at 42.0 ± 0.3 °C. During RUN, participants ran on a motorised treadmill at a speed that corresponded to 70 ± 3% of their VO 2max. During WALK, participants performed a treadmill walk at a speed of 6 km/h with an elevation of 7%. Passive and/or exercise-induced heating was terminated when participants' T gi reached 39.5 °C. During WALK, participants who did not achieve the target T gi within a 60 min duration underwent an extended exercise phase. This consisted of a treadmill walk at a speed of 6 km/h with an elevation of 1%, for a maximum duration of 30 min. Subsequently, participants underwent a seated recovery until T gi returned below 38.0 °C. VO 2max, maximal aerobic capacity; RH, relative humidity; T gi, gastrointestinal temperature.

Figure 2. (A) The ear-based wearable device placed in a participant's ear. (B) Schematic representation of sensor placement on the ear. Aural canal temperature was measured by two thermocouple sensors, while external auricle temperature and heart rate were measured by an infrared sensor and a photoplethysmography (PPG) sensor, respectively. (C) Logging of physiological parameters on mobile application.
14:12418 | https://doi.org/10.1038/s41598-024-63241-2 | www.nature.com/scientificreports/

A wide range of T gi and HR measurements were recorded during the three experimental trials, as intended by our study design. The T gi dataset consisted of 18,592 data points (15 s averages) ranging from 36.4 to 40.0 °C, while the HR dataset comprised 32,816 data points (5 s averages) ranging from 45 to 201 bpm. The ear-based PPG HR sensor met both validity criteria implemented in the present study, as demonstrated by an acceptable MAPE of 2.1 ± 3.4% and an excellent ICC of 0.992. Participants reached the study's T gi cutoff in 22 trials (PAH = 9, RUN = 12, WALK = 1).
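As an illustration of how the MAPE reported above could be obtained, the following is a minimal sketch, assuming time-aligned pairs of criterion HR and ear-based PPG HR readings. The function name `mape`, the sample-wise pooling of errors and the example values are hypothetical and are not the study's code or data; the ICC calculation is not shown.

```python
import numpy as np

def mape(reference, estimate):
    """Mean absolute percentage error (%) and its SD over paired samples.

    Assumption: errors are pooled sample by sample; the study may have
    aggregated them differently (e.g. per participant or per trial).
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    # Absolute error of each pair, expressed as a percentage of the reference
    ape = np.abs(estimate - reference) / reference * 100.0
    return ape.mean(), ape.std(ddof=1)

# Hypothetical paired readings in bpm (illustrative only, not study data)
ref_hr = [100.0, 120.0, 150.0, 200.0]
ppg_hr = [102.0, 117.0, 150.0, 196.0]

mean_ape, sd_ape = mape(ref_hr, ppg_hr)
print(f"MAPE = {mean_ape:.2f} ± {sd_ape:.2f} %")
```

With real recordings, the two series would first need to be resampled to a common interval (e.g. the 5 s averages described above) before pairing.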

Figure 3. (A) Histogram depicting percentage distribution of errors and (B) Bland-Altman plots comparing aural canal temperature (T ac) data against the telemetric capsule (T gi) across all trials. The solid blue line represents the mean bias error while the red dashed lines represent fixed upper and lower limits of agreement of ± 0.40 °C.

Figure 4. Histogram depicting percentage distribution of errors when comparing T c data predicted by (A) linear regression model (T lin), (B) polynomial regression model (T poly), (C) random forest regressor model including T ea data (T rf1) and (D) random forest regressor model excluding T ea data (T rf2) against the telemetric capsule (T gi) across all trials. Bland-Altman plots comparing T c data predicted by (a) linear regression model (T lin), (b) polynomial regression model (T poly), (c) random forest regressor model including T ea data (T rf1) and (d) random forest regressor model excluding T ea data (T rf2) against the telemetric capsule (T gi) across all trials. The solid blue line represents the mean bias error of each model while the red dashed lines represent fixed upper and lower limits of agreement of ± 0.40 °C.

Figure 5. (A) Histogram depicting percentage distribution of errors and (B) Bland-Altman plots comparing best performing ear-based T c algorithm (T rf3) data against the telemetric capsule (T gi) across all trials. The solid blue line represents the mean bias error while the red dashed lines represent fixed upper and lower limits of agreement of ± 0.40 °C.

Figure 7. Histogram depicting percentage distribution of errors when comparing best performing ear-based T c algorithm (T rf3) data against the telemetric capsule (T gi) at (A) low (T gi < 38.0 °C), (B) moderate (T gi = 38.0-39.0 °C) and (C) high (T gi > 39.0 °C) endogenous heat loads. Bland-Altman plots comparing best performing ear-based T c algorithm (T rf3) data against the telemetric capsule (T gi) at (a) low (T gi < 38.0 °C), (b) moderate (T gi = 38.0-39.0 °C) and (c) high (T gi > 39.0 °C) endogenous heat loads. The solid blue line represents the mean bias error during each trial phase while the red dashed lines represent fixed upper and lower limits of agreement of ± 0.40 °C.
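The agreement statistics summarised in these figures (mean bias error, its SD, and the percentage of errors falling within the fixed ± 0.40 °C limits) could be computed per heat-load band along the following lines. This is a minimal sketch under stated assumptions: error is taken as predicted minus T gi, the band boundaries follow the caption above, and the function names (`agreement`, `by_heat_load`) and sample temperatures are hypothetical rather than the study's implementation or data.

```python
import numpy as np

LIMIT = 0.40  # fixed limits of agreement in °C, as stated in the captions

def agreement(pred, ref, limit=LIMIT):
    """Return (mean bias error, SD of error, % of errors within ±limit).

    Assumption: error = predicted - reference (T gi).
    """
    err = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    sd = err.std(ddof=1) if err.size > 1 else float("nan")
    pta = np.mean(np.abs(err) <= limit) * 100.0  # percentage target attainment
    return err.mean(), sd, pta

def by_heat_load(pred, ref):
    """Split paired data into low / moderate / high bands using T gi."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    bands = {
        "low": ref < 38.0,
        "moderate": (ref >= 38.0) & (ref <= 39.0),
        "high": ref > 39.0,
    }
    # Skip bands with no samples to avoid statistics on empty arrays
    return {name: agreement(pred[mask], ref[mask])
            for name, mask in bands.items() if mask.any()}

# Hypothetical paired values in °C (illustrative only, not study data)
t_gi = [37.2, 37.8, 38.3, 38.9, 39.2, 39.6]
t_pred = [37.3, 37.6, 38.4, 39.4, 39.1, 39.5]

results = by_heat_load(t_pred, t_gi)
for band, (bias, sd, pta) in results.items():
    print(f"{band}: bias = {bias:+.2f} °C, SD = {sd:.2f} °C, within ±{LIMIT} °C = {pta:.0f}%")
```

Calling `agreement` on the full arrays, without banding, gives the pooled equivalents of the all-trial statistics shown in the earlier figures.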