Proper use of noncontact infrared thermometry for temperature screening during COVID-19

Among the myriad of challenges healthcare institutions face in dealing with coronavirus disease 2019 (COVID-19), screening for the detection of febrile persons entering facilities remains problematic, particularly when paired with CDC and WHO spatial distancing guidance. Aggressive source control measures during the outbreak of COVID-19 have led to re-purposed use of noncontact infrared thermometry (NCIT) for temperature screening. This study was commissioned to establish the efficacy of this technology for temperature screening by healthcare facilities. We conducted a prospective, observational, single-center study in a level II trauma center at the onset of the COVID-19 outbreak to assess (i) method agreement between NCIT and temporal artery reference temperature, (ii) diagnostic accuracy of NCIT in detecting referent temperature ≥ 100.0 °F and ensuing test sensitivity and specificity, and (iii) technical limitations of this technology. Of 51 healthy, non-febrile healthcare workers surveyed, the mean temporal artery temperature was 98.4 °F (95% confidence interval (CI) = [98.2, 98.6] °F). Mean NCIT temperatures measured from 1 ft, 3 ft, and 6 ft distances were 92.2 °F (95% CI = [91.8, 92.67] °F), 91.3 °F (95% CI = [90.8, 91.8] °F), and 89.6 °F (95% CI = [89.2, 90.1] °F), respectively. From statistical analysis, the only method in sufficient agreement with the reference standard was NCIT at 1 ft. This demonstrated that the device offset (mean temperature difference) between these methods was −6.15 °F (95% CI = [−6.56, −5.74] °F) with 95% of measurement differences within −8.99 °F (95% CI = [−9.69, −8.29] °F) and −3.31 °F (95% CI = [−4.00, −2.61] °F). By setting the NCIT screening threshold to 93.5 °F at 1 ft, we achieve diagnostic accuracy with 70.9% test sensitivity and specificity for temperature detection ≥ 100.0 °F by reference standard. In comparison, reducing this screening criterion to the lower limit of the device-specific offset, such as 91.1 °F, produces a highly sensitive screening test at 98.2%, which may be favorable in high-risk pandemic disease. For future consideration, an infrared device with a higher distance-to-spot size ratio approaching 50:1 would theoretically produce similar results at 6 ft, in accordance with CDC and WHO spatial distancing guidelines.

Since the emergence of COVID-19, over 100 guidance documents have been produced by the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC) 1,2. Interim guidance for United States healthcare facilities recommends aggressive universal source control measures and well-equipped triage procedures at the entrances of facilities to actively screen individuals for fever. Fever is defined as either measured temperature greater than or equal to 100.0 °F or subjective fever 1 and symptoms. Respiratory symptoms consistent with COVID-19 are cough, shortness of breath, and sore throat 1. Maintaining spatial separation of 6 ft is also advised by the CDC, and keeping at least 1 m apart is recommended by the WHO [1][2][3].
Under this guidance, our center aimed to investigate the diagnostic accuracy of temperature measurement using the Fluke 561 Noncontact Infrared Thermometer (NCIT) from a distance of 6 ft . A cursory investigation into this device's use revealed variable results in temperature measurement, which degraded as a function of distance, and a notable offset compared to the expected body temperature.
Furthermore, there is a paucity of high-quality research comparing thermal measurements obtained by noncontact infrared thermometry (IRT) with those obtained by conventional methods in public health applications such as pandemic disease, including COVID-19.
Several studies support this dissimilitude in diagnostic accuracy across a myriad of available devices [4][5][6]. In a review of six studies by Bitar et al. 7 assessing measurement of forehead temperature using an NCIT, test sensitivity ranged widely from 4 to 89.6%, the specificity from 75.4 to 99.6%, and the Positive Predictive Value (PPV) from 3.5 to 65.4%. Four studies failed to provide technical information about the NCIT device used. Three studies failed to report environmental conditions or stabilization factors, and a majority of studies did not detail the procedural methods employed for measurement testing, such as distance away from the target. Studies using handheld devices at less than 7.9 in (20 cm) from the forehead showed improved accuracy 4,8.
This variability called into question the efficacy of screening practices that similarly equipped healthcare facilities are implementing to reduce the spread of COVID-19 and comply with spatial distancing guidance. This study will analyze temperature data collected using the Fluke 561 infrared thermometer at 1 ft , 3 ft and 6 ft distances to assess the accuracy of this device compared to conventional temporal artery thermometry and survey the technological constraints inherent to NCITs. Our intent is to inform a concerned audience of explicit and implicit limitations and recommend an optimal screening process for infection prevention and control measures during pandemic disease.

Methods
Study design. This nonblinded prospective single-center study was designed to examine the validity of the Fluke 561 NCIT (Fluke Corp, Everett, WA) at variable distances compared to conventional temporal artery reference temperature of employees in a U.S. healthcare facility during the outbreak of COVID-19.
The study was conducted at Guthrie Robert Packer Hospital, a rural 267-bed tertiary care level II trauma center serving the southern tier of New York and the northern tier of Pennsylvania 9 .
Participants were recruited by study investigators at random from different hospital wards. An institutional review board-approved letter was used to inform participants about the study and informed consent was obtained prior to enrollment.
Ethical approval. The study protocol and informed consent documentation were reviewed and approved by the Institutional Review Board of The Guthrie Clinic. All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation and with the Helsinki Declaration of 1964 and its later amendments.
Participants. Eligible participants were healthy adults aged 18 or older who were employed by the Guthrie Clinic, working on-site during the COVID-19 pandemic. Participants were excluded if they were pregnant.
Procedures and data collection. Data was collected by a single researcher between April 6 and April 10, 2020, on different hospital wards. Age and gender were included in data collection.
A fever threshold ≥ 100.0 °F was adopted in accordance with CDC guidelines for U.S. healthcare facility screening 1. If participants were found to have a measured temperature ≥ 100.0 °F, they would be instructed to report to the Employee Health Office.
Mode of use, device calibration, temperature scanning and disinfection were followed according to manufacturer instructions for each apparatus and in compliance with CDC infection prevention standards. Similarly, Personal Protective Equipment (PPE) compliant with institution and CDC precautions for COVID-19, including the use of an N95 respirator, was worn.
Temporal artery thermometer (TAT). Each participant's baseline body temperature was measured using the Exergen TemporalScanner (Model TAT5000, Exergen Corp, Watertown, MA) Temporal Artery Thermometer (TAT). Technical specifications for the Exergen TemporalScanner are given in Table 1.
This accurate and noninvasive thermometer uses infrared technology to measure heat emitted from the skin's surface overlying the temporal artery 10 . Temporal artery temperature is a close approximate of rectal temperature and therefore accurately reflects a measure of core body temperature [10][11][12] .
Non-contact infrared thermometer (NCIT). Next, a digital ruler was used to measure and mark 1 ft , 3 ft , and 6 ft distances on the floor of each testing location, relative to a fixed reference point.
Three sequential measurements were obtained using the Fluke 561 NCIT at device-to-target [Target, in this context, denotes the forehead of each study participant at the specified IR measurement distance] distances of 6 ft, 3 ft, and 1 ft (see Fig. 1a) with emissivity set to Hi (see Table 1). The focal point of the device, marked by a laser, was centered on each participant's forehead, and participants were asked to close their eyes during collection.
NCITs measure the amount of infrared energy emitted by an object's surface; it is not a direct measure of core body temperature and can be influenced by environmental conditions and physiological factors.
To ensure accurate temperature measurement, device-to-target distance is crucial since it governs the measurement spot size according to the NCIT's distance-to-spot size (D:S) specification. The spot size, i.e. the diameter of the measurement area, indicates 90% encircled energy 13 and increases proportionally with distance. For that reason, the spot size should not exceed the intended measurement area [Fluke recommends that the measurement area be twice as large as the measured spot size 14 ]. The Fluke 561 D:S ratio is 12:1, which implies a 6 in spot diameter at 6 ft (see Fig. 1).
Blinding. All participants and data collectors were unblinded to data collection. Fully deidentified objective outcome data was provided to an independent data analyst.
Data analysis. All data analyses were performed using MATLAB, version R2019a (The MathWorks Inc., Natick, MA). First, the validity of NCIT data at 1 ft, 3 ft, and 6 ft had to be established. Central tendency and dispersion characteristics for each dataset were computed and graphically visualized through histogram plots. Formal judgment of normality was performed using robust statistical testing. Those results, in combination with the limitations of the NCIT device, led to rejection of NCIT measurements at 3 ft and 6 ft. Then, methodological comparison between TAT and NCIT at 1 ft was conducted using Bland-Altman plots, the benchmark approach for assessing between-method differences [15][16][17]. This provided an indication of bias and 95% Limits of Agreement (LOA) between the measurement methods. Diagnostic accuracy for detecting fever was then explored.
Prior to presenting the main results, we introduce the following nomenclature: T_X (°F) denotes the temperature measured by method X. That is, T_TAT represents temperature measured by the TAT, while T_1ft, T_3ft, and T_6ft correspond to NCIT measurements.
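To make the Bland-Altman step of the data analysis concrete, the following is a minimal Python sketch (the study itself used MATLAB). It computes the bias as the mean of the paired differences and the 95% limits of agreement as bias ± 1.96 standard deviations. The function name `bland_altman` and the four paired readings are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(reference, test):
    """Return (bias, lower LOA, upper LOA) for paired measurements."""
    if len(reference) != len(test):
        raise ValueError("paired samples must have equal length")
    # Difference is test minus reference, so a negative bias means
    # the test method reads lower than the reference method.
    diffs = [t - r for r, t in zip(reference, test)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # 95% limits of agreement
    return bias, bias - half_width, bias + half_width


# Hypothetical paired readings in °F (illustration only)
t_tat = [98.2, 98.6, 98.4, 98.0]
t_1ft = [92.2, 91.6, 93.4, 92.0]

bias, lower, upper = bland_altman(t_tat, t_1ft)
print(f"bias = {bias:.2f} °F, LOA = [{lower:.2f}, {upper:.2f}] °F")
```

With the reported values for NCIT at 1 ft, the same arithmetic gives limits of −6.15 ± 2.84 °F, i.e. −8.99 °F and −3.31 °F.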

Results
A total of 51 healthy adults were sampled. Ages ranged from 21 to 67 with a median age of 31 [Median excludes one participant who did not report their age.]. Of the participants, 37 (72.5%) were female and none exhibited fever or symptoms of illness (N = 51, P = ∅). The mean temporal artery temperature was 98.4 °F with all measurements falling between 97.1 and 99.6 °F. No data was excluded from the study.
Descriptive statistics. Measures of central tendency and dispersion characteristics for each measurement method were computed and summarized in Table 2. Histograms were also generated for each measurement method. No distributions displayed readily apparent asymmetry or skew; however, formal statistical tests are performed in the next section to determine whether TAT and NCIT measurements are drawn from a normally distributed population.
Distribution tests for normality. Statistical test decisions, along with corresponding p values, are summarized in Table 3 for the Lilliefors' Composite and Anderson-Darling Goodness of Fit (AD GOF) hypothesis tests.
Lilliefors and AD GOF test decisions for both TAT (p = 0.19 and p = 0.28) and NCIT at 1 ft (p = 0.25 and p = 0.11) were false, indicating that the null hypothesis, namely that the data come from a normal distribution, cannot be rejected at the 5% significance level. Conversely, the Lilliefors and AD GOF results for NCIT measurements at 3 ft and 6 ft (p < 0.1, p < 0.05) reject the same null hypothesis at the 5% significance level.
Thus, 1 ft NCIT data is the only measurement set whose distribution coincides with the referent method's distribution.
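As a rough illustration of the normality check described above, the sketch below computes the Lilliefors statistic in plain Python: the largest gap between the empirical distribution of a sample and a normal distribution fitted to that sample. It is not the MATLAB procedure used in the study and it produces no p value. The decision rule relies on the commonly tabulated large-sample critical value 0.886/√n at the 5% level, which is an approximation assumed here for n > 30. Both samples are idealized, hypothetical data.

```python
from math import log, sqrt
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # Compare with the empirical CDF just before and just after x.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d


def rejects_normality(sample):
    """Approximate 5% decision; the critical value assumes n > 30."""
    return lilliefors_statistic(sample) > 0.886 / sqrt(len(sample))


# Hypothetical, idealized samples of size 51 (not study data)
n = 51
probs = [(i + 0.5) / n for i in range(n)]
bell_shaped = [NormalDist(98.4, 0.6).inv_cdf(p) for p in probs]
right_skewed = [-log(1 - p) for p in probs]

print("bell-shaped rejected:", rejects_normality(bell_shaped))
print("right-skewed rejected:", rejects_normality(right_skewed))
```

The Anderson-Darling test weights the tails more heavily and is not reproduced here.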
Non-normal distributions of 3 ft and 6 ft NCIT data, combined with insufficient distance-to-spot size ratings [The corresponding spot sizes were approximately 3 in and 6 in, respectively, which exceed reliable forehead area thresholds] for the intended measurement target using the Fluke 561 (see Fig. 1 and the "Basics of noncontact infrared thermal measurements" section), led to rejection of the NCIT measurements at 3 ft and 6 ft.

Assessing diagnostic accuracy. In this section, we quantify test sensitivity and specificity of NCIT at 1 ft for detecting temperature ≥ 100.0 °F by the reference method based on NCIT screening thresholds.
With a mean measurement bias (−6.15 °F) and LOA (2.84 °F) established using Bland-Altman, diagnostic specificity can be computed for various NCIT thresholds, denoted T*_IR. Figure 4 demonstrates the effect T*_IR has on specificity (SP) with threshold values encircled and presented in Table 4. For example, T*_IR = 93.9 °F corresponds to a test specificity of 88.2% (FPR = 11.8%) with True Negative (TN) and False Positive (FP) group sizes equal to 45 and 6, respectively.
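The worked example at T*_IR = 93.9 °F can be checked with a few lines of Python. This is only a sketch of the arithmetic: specificity is TN / (TN + FP) and the false positive rate is its complement. The helper name `specificity` is hypothetical.

```python
def specificity(true_negatives, false_positives):
    """Fraction of afebrile subjects correctly screened as negative."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no afebrile observations")
    return true_negatives / total


# Group sizes quoted in the text for a 93.9 °F NCIT threshold
sp = specificity(45, 6)
fpr = 1 - sp
print(f"SP = {sp:.1%}, FPR = {fpr:.1%}")
```

With 45 true negatives among 51 afebrile participants, this reproduces the 88.2% specificity and 11.8% FPR stated above.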

Bland-Altman method comparison. A Bland-Altman plot is provided in
Open source data analysis: STRIDE cohort. Given the sampled population was constrained to a group of healthy adults, complete assessment of diagnostic accuracy, including test sensitivity, necessitates a population with fever occurrence. Thus, an open source cohort from the Stanford Translational Research Integrated Database Environment (STRIDE) 18 was imported and synthesized to validate this study's findings on a larger population with fever prevalence. The STRIDE database comprises 578,522 adult outpatient temperature measurements at Stanford Health Care, gathered between 2007 and 2017. The mean of the STRIDE cohort was 98.0 °F with a standard deviation of 0.7 °F.
The method of measurement used in the STRIDE cohort was oral thermometry. Therefore, the cohort observations were adjusted by +0.8 °F to account for differences between oral and Exergen TAT thermometry 10,12. The adjusted mean is 98.8 °F with P = 19,584 febrile observations. A simulated 1 ft NCIT measurement set is then synthesized from the adjusted cohort by repeated random sampling from the distribution of between-method differences, with mean µ = −6.15 °F and dispersion taken from the Bland-Altman analysis.
Diagnostic accuracy for detecting referent temperature greater than 100.0 °F using three different NCIT temperature thresholds is summarized in Table 5.
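The synthesis step can be sketched in Python under loudly stated assumptions. The real analysis resampled the adjusted STRIDE observations; here the reference temperatures are instead drawn from a normal distribution with the adjusted mean (98.8 °F) and the cohort standard deviation (0.7 °F), and the NCIT offset is drawn from a normal distribution whose standard deviation is backed out of the ±2.84 °F limits of agreement (2.84 / 1.96). All names are hypothetical, and the output is not expected to reproduce Table 5.

```python
import random

FEVER_F = 100.0             # referent fever threshold from the text
OFFSET_MEAN_F = -6.15       # mean NCIT minus TAT difference at 1 ft
OFFSET_SD_F = 2.84 / 1.96   # assumption: SD implied by the limits of agreement


def simulate_pairs(n, ref_mean=98.8, ref_sd=0.7, seed=0):
    """Draw (reference, simulated NCIT) pairs; the normal cohort is an assumption."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        ref = rng.gauss(ref_mean, ref_sd)
        ncit = ref + rng.gauss(OFFSET_MEAN_F, OFFSET_SD_F)
        pairs.append((ref, ncit))
    return pairs


def sensitivity_specificity(pairs, ncit_threshold):
    """Classify each pair and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for ref, ncit in pairs:
        febrile = ref >= FEVER_F
        flagged = ncit >= ncit_threshold
        if febrile and flagged:
            tp += 1
        elif febrile:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


pairs = simulate_pairs(20000)
for threshold in (91.1, 93.5, 93.9):
    sens, spec = sensitivity_specificity(pairs, threshold)
    print(f"T*_IR = {threshold} °F: sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Raising the NCIT threshold can only lower sensitivity and raise specificity, which is the trade-off examined in the remainder of this section.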
A receiver operating characteristic (ROC) curve is shown in Fig. 5 with the points in Table 5 encircled. This graph plots test sensitivity against (1 − specificity), providing a graphical depiction of the implicit trade-off between sensitivity and specificity at select screening criteria.
For the Fluke 561, this curve provides a visual guide for a priori selection of fever screening criteria adjusted to an acceptable sensitivity and specificity for a certain clinical context.
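For readers who want to see how points on such a curve arise, here is a small Python sketch that turns paired reference and NCIT readings into (1 − specificity, sensitivity) pairs, one per candidate NCIT threshold. The function name `roc_points` and the six paired readings are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def roc_points(reference, ncit, thresholds, fever=100.0):
    """Return one (false positive rate, sensitivity) pair per NCIT threshold."""
    febrile = [n for r, n in zip(reference, ncit) if r >= fever]
    afebrile = [n for r, n in zip(reference, ncit) if r < fever]
    if not febrile or not afebrile:
        raise ValueError("need both febrile and afebrile observations")
    points = []
    for t in thresholds:
        tpr = sum(n >= t for n in febrile) / len(febrile)    # sensitivity
        fpr = sum(n >= t for n in afebrile) / len(afebrile)  # 1 - specificity
        points.append((fpr, tpr))
    return points


# Hypothetical paired readings in °F (illustration only)
reference = [98.0, 98.5, 99.0, 100.2, 100.8, 101.5]
ncit = [91.0, 93.0, 92.5, 93.2, 95.0, 96.0]

for t, (fpr, tpr) in zip((91.1, 93.1, 94.0), roc_points(reference, ncit, (91.1, 93.1, 94.0))):
    print(f"threshold {t} °F: FPR = {fpr:.2f}, sensitivity = {tpr:.2f}")
```

Sweeping a dense grid of thresholds instead of three values traces out the full curve.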

Discussion
Basics of noncontact infrared thermal measurements. Thermal radiation (first principles) An infrared thermometer measures the thermal radiation energy of an object and computes the temperature according to the fundamental Stefan-Boltzmann law
M = εσT^4, (2)
where M is the radiant exitance, a measure of radiation power emitted by an object into an imperfect vacuum, σ = 5.670373 × 10^−8 W m^−2 K^−4 is the Stefan-Boltzmann constant, ε denotes the emissivity of the emitting object, and T (K) represents its absolute temperature 19. The emissivity ε is a measure of how well an object can emit energy as thermal radiation. Almost perfect emitters, such as skin, have high emissivity (ε_skin = 0.98), while highly reflective surfaces such as polished metals have low emissivity.
Sources of error in the computation shown in Eq. (2) stem from the focal resolution of the device, introduced in the "Non-contact infrared thermometer (NCIT)" section and discussed further below, along with the prescribed ε value, which can lead to large temperature errors due to the fourth power dependence.
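The role of the prescribed emissivity can be illustrated with a short Python sketch of the Stefan-Boltzmann relation M = εσT^4. It is a deliberately simplified model: it ignores reflected ambient radiation and the limited spectral band of a real detector, so it shows only the direction of the error, not what any particular device would display. The skin temperature of 307 K and the mis-set emissivity of 0.95 are hypothetical values.

```python
SIGMA = 5.670373e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def radiant_exitance(temp_k, emissivity):
    """Radiant exitance M = eps * sigma * T^4 in W m^-2."""
    return emissivity * SIGMA * temp_k ** 4


def inferred_temperature(exitance, assumed_emissivity):
    """Invert the law for T using whatever emissivity the device assumes."""
    return (exitance / (assumed_emissivity * SIGMA)) ** 0.25


skin_k = 307.0                                  # hypothetical skin temperature
emitted = radiant_exitance(skin_k, 0.98)        # skin emissivity from the text
matched = inferred_temperature(emitted, 0.98)   # device setting matches the surface
mismatched = inferred_temperature(emitted, 0.95)  # hypothetical mis-set emissivity

print(f"matched setting:    {matched:.2f} K")
print(f"mismatched setting: {mismatched:.2f} K")
```

In this model, assuming an emissivity lower than the true one makes the inferred temperature read high, and the opposite mismatch makes it read low.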
Choosing a device with a sufficient D:S ratio such that the spot size is completely inscribed by the measurement area is crucial. If the spot size exceeds the target measurement area, such as the participant's forehead, then the NCIT samples extraneous thermal radiation leading to incorrect forehead temperature measurements.
Lastly, thermal radiation is subject to atmospheric interference and distortion. End users should mitigate the effects of environmental factors affecting measurement error 5 , such as controlling for ambient temperature gradients, minimizing surface irregularity, and stabilizing device-to-target alignment in an effort to prevent scattering.

Discussion of results.
Infected patients are the primary source of pathogen dissemination in healthcare settings 3 . If initial screening and containment efforts fail, the ramifications can be particularly severe. Healthcare personnel may fall ill and transmit disease to others, including high-risk patients 2 . Infected staff may also require healthcare services themselves, placing additional strain on the medical system as healthcare providers become healthcare receivers.
In a case series by Wang et al. 20 of 138 hospitalized patients with COVID-19, 41% were presumed to be due to hospital-related transmission, affecting 40 (29%) health professionals and 17 (12.3%) hospitalized patients.
The lack of a highly sensitive screening test for COVID-19 undermines efforts to contain viral spread 21 , an issue compounded by imperfect COVID-19 diagnostic tests with low-moderate sensitivities estimated around 70% 22 . In a study analyzing 1099 COVID-19 patients in China this year, fever was present in 43.8% of cases upon admission and 88.7% during hospitalization 23 .
A highly sensitive screening strategy allows timely detection of outbreaks and detects almost all cases of pandemic illness, which thereby limits disease spread and optimizes infection prevention and control measures. This approach, however, is much more resource and time consuming as a consequently higher rate of false positive tests require second or multi-stage screening surveys or confirmatory diagnostic testing to correctly classify diseased versus nondiseased individuals.
Alternatively, employing a screening strategy with lower sensitivity (i.e., lower level of detection) affords higher feasibility but at the cost of a higher false negative test rate 24 . Using this approach, a negative test result cannot safely assume someone is uninfected 22 . This surveillance method is discouraged and more consequential since misdiagnosed infected persons are not identified or contained by screening strategies and can go on to infect others 22 .
Results in this study show that utilizing the Fluke 561 within manufacturer specifications yields an average temperature offset of 6.15 °F below temporal artery thermometer values when the device is held 1 ft away from the test subject's forehead.
If the screening threshold is selected at 93.9 °F in healthcare facilities, the Fluke 561 NCIT will provide a 60.5% sensitive and 80% specific test for the detection of temperature ≥ 100.0 °F by temporal artery thermometry. Lowering the screening criterion to 91.1 °F greatly increases sensitivity to 98.3%, but this is at the expense of an unacceptably high FPR of 82.9%, producing an inefficient test. This phenomenon is concisely illustrated in Fig. 6.
Designing a reference standard for the measurement of temperature elevation in individuals being screened at healthcare facilities during COVID-19 using noncontact infrared thermometry to maintain spatial separation remains a value judgement.
At our institution, it is reasonable to propose a reference standard set to 93.5 °F (Fig. 5). This would yield moderate test sensitivity and specificity of 70.9%, acknowledging a non-negligible FPR and FNR of 29.1%. Thus, the use of multi-step screening tools including issuance of symptom questionnaires and confirmatory TAT testing could further stratify individuals who screen positive by NCIT criteria.
That said, appropriate education and training of relevant staff, control of local conditions and environmental factors to reduce measurement error, and the acquisition of an NCIT with improved accuracy and a higher resolution D:S, such as the Fluke 568 with a distance to spot size ratio of 50:1 (see Table 1), may yield improved specificity while also maintaining a 6 ft distance between screening personnel and target individuals.
Individual facility resources and staff available for screening, population risk characteristics, epidemiological factors, and personnel throughput are factors that should be considered when selecting screening criteria.
Limitations. The following limitations were identified in our study: (i) Because data collection was performed in the background of usual clinical activity, researchers did not adjust heating, ventilation, or air conditioning properties for each testing area per hospital ward. (ii) Incident angle of measurement varied due to height difference between the researcher and participants. (iii) Lack of blinding of researchers during temperature measurement and data collection. (iv) Sample population contained no febrile observations, the majority of participants were female, and the sample was drawn from healthcare providers and therefore not a random distribution of the general population.

Future work.
A second-phase investigation is currently underway using the Fluke 568 infrared device (see Table 1), which has a D:S ratio of 50:1. Theoretically speaking, the main results of this paper extend to the Fluke 568 at 6 ft since the spot-size is comparable to the Fluke 561 at 1 ft , and both devices have similar instrument accuracy. Thus, the aim is to validate the main results of this paper for temperature screening at further distances that comply with WHO and CDC separation guidelines (1 m and 6 ft , respectively).
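The spot-size comparison behind this argument follows directly from the D:S ratio: the spot diameter is the device-to-target distance divided by the ratio. A minimal Python sketch, with the hypothetical helper name `spot_diameter_in`:

```python
def spot_diameter_in(distance_ft, ds_ratio):
    """Spot diameter in inches for a given distance (ft) and D:S ratio."""
    return distance_ft * 12.0 / ds_ratio


# D:S ratios quoted in the text: Fluke 561 is 12:1, Fluke 568 is 50:1
for label, distance_ft, ratio in (
    ("Fluke 561 at 1 ft", 1, 12),
    ("Fluke 561 at 3 ft", 3, 12),
    ("Fluke 561 at 6 ft", 6, 12),
    ("Fluke 568 at 6 ft", 6, 50),
):
    print(f"{label}: {spot_diameter_in(distance_ft, ratio):.2f} in")
```

Under this relation the Fluke 561 spot grows from 1 in at 1 ft to 6 in at 6 ft, whereas a 50:1 device at 6 ft has a spot of about 1.44 in, close to the 1 in spot of the Fluke 561 at 1 ft.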

Conclusions
Measuring skin surface temperature in mass public screening applications is an imperfect method for detecting elevated body temperature in individuals potentially infected with coronavirus. Doing so with an NCIT, in an effort to maximize social distancing and minimize risks of exposure to screening staff and healthcare personnel, introduces the potential for additional measurement error. However, a moderately sensitive screening test in the setting of high-risk pandemic virus is possible with NCIT. This requires proper device selection to match the intended application, i.e. specifying a D:S requirement that ensures a sufficient target spot size for the measurement distance, and determination of the device-specific temperature offset compared to an institution's reference standard. Under this method, the authors strongly advocate for appropriate training of staff, clear instructions-for-use, robust device calibration, and stabilization of environmental and procedural factors to increase success and maximize diagnostic accuracy.
Figure 6. Overlay of probability distribution functions (PDF) for the reference measurement method, i.e. TAT, and the alternate or test measurement method, NCIT. Intersections of the PDFs define the FN, TN, FP, and TP groups given a referent fever screening temperature and a corresponding NCIT screening temperature. This plot illustrates the trade-off in sensitivity and specificity for temperature screening thresholds.
The incorporation of an initial fever screening and subsequent quarantine, PPE, and sanitation regimens for febrile patients may prove beneficial beyond the course of this pandemic. A growing body of clinical research points to a decline in a number of infections normally endemic in hospital settings since the emergence of COVID-19. Studies such as the analysis of Clostridium difficile infection in health care settings by Bentivegna et al. 25 attribute a statistically significant reduction of this most common pathogen in health-care associated infections with a combination of such practices. A positive externality of NCIT screening for COVID-19 induced fever likely has been the identification and isolation of other 'sentinel' fever producing pathogens and reduction of secondary infections. Such screening and measures may prove invaluable post COVID-19.
With respect to further research, the authors recommend future studies be inclusive of febrile patients and welcome collaboration testing the conclusions of this work. Data for febrile patients was necessarily synthesized for this study, but the authors hold that sound statistical methodology and understanding of the given technology was applied in drawing the conclusions presented. The authors also recommend the inclusion and analysis of NCITs from additional manufacturers, infrared cameras, and nascent technologies.
With respect to NCITs, institutions must realize this is just one of several requisite components of an effective screening program. At their most effective, NCITs such as the Fluke 561, may simultaneously achieve sensitivity and specificity results of approximately 70%. Thus, when employed properly, such tools may correctly identify a majority of febrile individuals, but a non-trivial number of febrile and afebrile persons will be misclassified. It is therefore crucial to understand that temperature measurement through IR thermometry is simply a supplemental screening mechanism. It is not a substitute for accurate and reliable diagnostic tests. For this reason, confirmatory temperature measurement by reference thermometry should be used for those who are identified as positive by NCIT. Lastly, multi-step screening tools such as exposure risk and symptom questionnaires are essential and necessary components of an effective program.

Preliminaries
This study and the analyses herein were conducted and reported in the imperial system of units to ensure consistency of reporting in conjunction with U.S. federal agencies, healthcare agencies, and public and private healthcare systems. As such, all temperature measurements were taken, recorded, and analyzed in degrees Fahrenheit ( • F) and all distances in feet (ft) or inches (in). Thus, we refer the reader to the following conversion factors.
Conversion from T_°F to T_°C is achieved through T_°C = (T_°F − 32) × 5/9.