Traditional screening for COVID-19 typically includes survey questions about symptoms and travel history, as well as temperature measurements. Here, we explore whether personal sensor data collected over time may help identify subtle changes indicating an infection, such as in patients with COVID-19. We have developed a smartphone app that collects smartwatch and activity tracker data, as well as self-reported symptoms and diagnostic testing results, from individuals in the United States, and have assessed whether symptom and sensor data can differentiate COVID-19 positive versus negative cases in symptomatic individuals. We enrolled 30,529 participants between 25 March and 7 June 2020, of whom 3,811 reported symptoms. Of these symptomatic individuals, 54 reported testing positive and 279 negative for COVID-19. We found that a combination of symptom and sensor data resulted in an area under the curve (AUC) of 0.80 (interquartile range (IQR): 0.73–0.86) for discriminating between symptomatic individuals who were positive or negative for COVID-19, a performance that is significantly better (P < 0.01) than a model [1] that considers symptoms alone (AUC = 0.71; IQR: 0.63–0.79). Such continuous, passively captured data may be complementary to virus testing, which is generally a one-off or infrequent sampling assay.
Owing to the current lack of fast and reliable testing, one of the greatest challenges for preventing transmission of SARS-CoV-2 is the ability to quickly identify, trace and isolate cases before they can further spread the infection to susceptible individuals. As regions across the United States start implementing measures to reopen businesses, schools and other activities, many rely on current screening practices for COVID-19, which typically include a combination of symptom and travel-related survey questions and temperature measurements. However, this method is likely to miss pre-symptomatic or asymptomatic cases, which make up ~40–45% of those infected with SARS-CoV-2, and who can still be infectious [1,2]. An elevated temperature (>100 °F (>37.8 °C)) is not as common as frequently believed, being present in only 12% of individuals who tested positive for COVID-19 [3] and just 31% of patients hospitalized with COVID-19 (at the time of admission) [4].
Smartwatches and activity trackers, which are now worn by one in five Americans [5], can improve our ability to objectively characterize each individual’s unique baseline for resting heart rate [6], sleep [7] and activity and can therefore be used to identify subtle changes in that user’s data that may indicate that they are coming down with a viral illness. Previous research from our group has shown that this method, when aggregated at the population level, can significantly improve real-time predictions for influenza-like illness [8]. Consequently, we created a prospective app-based research platform, called DETECT (Digital Engagement and Tracking for Early Control and Treatment), where individuals can share their sensor data, self-reported symptoms, diagnoses and electronic health record data with the aim of improving our ability to identify and track individual- and population-level viral illnesses, including COVID-19.
A previously reported study that captured symptom data in over 18,000 SARS-CoV-2-tested individuals via a smartphone-based app found that symptoms were able to help distinguish between individuals with and without COVID-19 [1]. The aim of this study is to investigate whether adding individual changes in sensor data to symptom data can improve our ability to identify COVID-19-positive versus COVID-19-negative cases among participants who self-reported symptoms.
Between 25 March and 7 June 2020, our research study enrolled 30,529 individuals, with representation from every state in the United States. Among the consented individuals, 62.0% are female and 12.8% are 65 or more years old. Of the participants, 78.4% connected their Fitbit devices to the study app, 31.2% connected data from Apple HealthKit and 8.1% connected data from Google Fit (note that an individual can connect to multiple platforms). In addition, 3,811 participants (12.5%) reported at least one symptom; of those, 54 also reported testing positive for COVID-19 and 279 reported testing negative. The number of days of data for each data type and data aggregator system is reported in Table 1, while the symptom distribution for symptomatic individuals tested for COVID-19, or not tested, is shown in Fig. 1.
A minority of symptomatic participants (30.3%) who were tested for COVID-19 had a resting heart rate (RHR) greater than two standard deviations above the average baseline value during symptoms. The change in RHR on its own (Table 1) did not allow significant discrimination between COVID-19-positive and COVID-19-negative participants using the RHRMetric (area under the curve (AUC) of 0.52 (interquartile range (IQR): 0.41–0.64)) (Fig. 2a).
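The two-standard-deviation criterion mentioned above can be illustrated with a short sketch. The function name and the rule that a single symptomatic day above the threshold is enough are assumptions made for illustration; the text does not specify how the comparison was applied across days.

```python
from statistics import mean, stdev

def rhr_elevated(baseline_rhr, symptomatic_rhr, n_sd=2.0):
    """Return True if any symptomatic-day RHR exceeds baseline mean + n_sd * SD.

    baseline_rhr and symptomatic_rhr are lists of daily resting heart rates
    (b.p.m.). Flagging on "any day" is an assumption of this sketch.
    """
    threshold = mean(baseline_rhr) + n_sd * stdev(baseline_rhr)
    return any(value > threshold for value in symptomatic_rhr)

# Hypothetical daily values, for illustration only.
baseline = [60, 61, 59, 60, 62, 58]
print(rhr_elevated(baseline, [61, 64]))
print(rhr_elevated(baseline, [61, 62]))
```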
Sleep and activity did show a significant difference between the two groups (Table 1), with an AUC of 0.68 (0.57–0.79) for the SleepMetric (Fig. 2b) and 0.69 (0.61–0.77) for the ActivityMetric (Fig. 2c), suggesting that the sleep and activity of COVID-19-positive participants were affected significantly more than those of COVID-19-negative participants. Sleep and activity were weakly negatively correlated (correlation coefficient of −0.28, P < 0.01).
To evaluate the contribution of all the data types commonly available through personal devices, we combined the RHR, sleep and activity metrics into a single metric (SensorMetric, Fig. 2d). This improved the overall performance relative to each of the three individual sensor metrics, to an AUC of 0.72 (0.64–0.80).
We also considered a model based only on self-reported symptoms (SymptomMetric, Fig. 2e), along with age and sex. Compared with the previously published model [1], we measured a slightly lower AUC of 0.71 (0.63–0.79).
When participant-reported symptoms and sensor metrics were jointly considered in the analysis (OverallMetric, Fig. 2f), performance improved significantly (P < 0.01) relative to either alone, with an AUC of 0.80 (0.73–0.86).
Our results show that individual changes in physiological measures captured by most smartwatches and activity trackers are able to significantly improve the distinction between symptomatic individuals with and without a diagnosis of COVID-19 beyond symptoms alone. Although encouraging, these results are based on a relatively small sample of participants.
This work builds on our earlier retrospective analysis demonstrating the potential for consumer sensors to identify individuals with influenza-like illness, which has subsequently been replicated in a similar analysis of over 1.3 million wearable users in China for predicting COVID-19 [8,9]. In response to the COVID-19 pandemic, a number of prospective studies, led by device manufacturers and/or academic institutions, including DETECT, have accelerated deployment to allow interested individuals to voluntarily share their sensor and clinical data to help address the global crisis [10–14]. The largest of these efforts, Corona-Datenspende, was developed by the Robert Koch Institut in Germany and has enrolled over 500,000 volunteers [15].
As different individuals experience a wide range of symptomatic and biological responses to infection with SARS-CoV-2, it is likely that their measurable physiological changes will also vary [16–18]. For that reason, it is possible that biometric changes may be more valuable in identifying those at highest risk for decompensation rather than just a dichotomous distinction in infection status. Because of the limited testing in the United States, especially early in the spread of the COVID-19 pandemic, individuals with more severe symptoms may have been more likely to be tested. In fact, the majority of symptomatic participants in our study did not undergo testing. However, using the optimal tradeoff of sensitivity and specificity on the ROC, we would predict that, of the 3,478 symptomatic participants who did not undergo diagnostic testing, 1,061 would have tested positive. Consequently, the ability to differentiate between COVID-19-positive and COVID-19-negative cases based on symptoms and sensor data may change over time as testing increases, and as other upper respiratory illnesses such as seasonal influenza increase this fall.
The early identification of symptomatic and pre-symptomatic infected individuals would be especially valuable as transmission is common and people may potentially be even more infectious during this period [19–21]. Even when individuals have no symptoms, there is evidence that the majority have lung injury (according to computed tomography (CT) scans), and a large number have abnormalities in inflammatory markers, blood cell counts and liver enzymes [18,22–24]. As the depth and diversity of data types from personal sensors continue to expand, with examples including heart rate variability (HRV), respiratory rate, temperature, oxygen saturation and even continuous blood pressure, cardiac output and systemic vascular resistance, the ability to detect subtle individual changes in response to early infectious insults will potentially improve and enable the identification of individuals without symptoms.
In the past, the normality of a specific biometric parameter, such as RHR, duration of nightly sleep or daily activity, was based on population norms. For example, a normal RHR is generally considered anything between ~60 and 100 b.p.m. However, recent work looking at individual daily RHRs over two years found that each person has a relatively consistent RHR, for them, that fluctuates by a median of only 3 b.p.m. weekly [6]. On the other hand, what would be considered a normal RHR for an individual can vary by as much as 70 b.p.m. (between 40 and 109 b.p.m.) between individuals. The potential value in identifying important changes in an individual’s RHR as an early marker for COVID-19 infection is suggested by the description of 5,700 patients hospitalized with COVID-19 [4]: at the time of admission, a greater percentage of individuals had a heart rate of >100 b.p.m. (43.1%) than had a fever (30.7%). Similarly, work in primate models of other viral and bacterial infections found that a significant increase in heart rate can be detected ~2 days before a fever [25].
Just as individuals have heart rate patterns that are unique to them, the same is true for sleep patterns. Although population norms for sleep duration have been defined by one-time survey data [26], longitudinal analysis of daily sleep over several years supports much greater variation in what is normal for a specific individual [7]. Recognizing what is normal for an individual enables much earlier detection of deviations from that normal.
A strategy of test, trace and isolate has played a central role in helping control the spread of COVID-19. However, testing comes with many challenges, including the enormous logistical and cost hurdles of recurrently testing asymptomatic individuals. In addition, testing in a population with very low prevalence can lead to a high proportion of false positive cases. A refined predictive model, based on personal sensors, could enable an early, individualized testing strategy to improve performance and lower costs. Early testing may make the use of a contact tracing app more effective by identifying positive cases in advance and allowing for early isolation.
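The effect of low prevalence on false positives noted above follows from Bayes' rule. The sketch below is a worked illustration only: the sensitivity and specificity values are hypothetical and are not taken from this study or from any particular assay.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive test results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical test characteristics, chosen only for illustration.
SENSITIVITY = 0.95
SPECIFICITY = 0.95

for prevalence in (0.001, 0.01, 0.10):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}")
```

With the test characteristics held fixed, the share of positive results that are false grows as prevalence falls, which is why pre-selecting individuals with a higher prior probability of infection could improve the yield of testing.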
DETECT (and similar studies) also represent the transitioning of research from a dependence on brick-and-mortar research centers to a remote, direct-to-participant approach now possible through a range of digital technologies, including an ever-expanding collection of sensors, applications of machine learning to massive datasets, and the ubiquitous connectivity that enables rapid two-way communications 24/7 [27,28]. The promise of digital technologies is that their evolution will continue to bring us closer to identifying the best combination of measures and associated algorithms that identify infection with SARS-CoV-2 or other pathogens. However, it is equally critical to develop and continuously improve on an engaging digital platform that provides value to participants and researchers. This has proven to be extremely challenging, with a recent analysis of eight different digital research programs involving 100,000 participants having a median duration of retention of only 5.5 days [29]. Digital trials such as DETECT also come with unique challenges to assure privacy and security, which can only be dealt with by effectively informing participants before consent, storing the data with the appropriate level of security and providing access to the data only for research purposes [30]. App-based contact tracing, which is not part of DETECT, is an especially sensitive and ethically complicated use of digital technology that can be used to address the pandemic [31].
Our analyses are dependent entirely on participant-reported symptoms and testing results, as well as the biometric data from their personal devices. Although this is not consistent with the historically more common direct collection of information in a controlled laboratory setting or via electronic health records, previous work has confirmed their value and their accuracy beyond data captured during routine care [32–34]. Additionally, individuals owning a smartwatch or activity tracker and having access to COVID-19 diagnostic testing are unlikely to be representative of the general population and may exclude those most affected by COVID-19. Although a recent survey found no racial or ethnic variation in smartwatch or activity tracker usage (23%, 26% and 21% for Black, Hispanic and White individuals, respectively), the lowest percentage of users were identified in those with the lowest annual earnings (12%), the lowest educational attainment (15%) and in those over age 50 (17%) [5]. In the future, if the value of wearable devices to improve individual health is confirmed, this gap in usage will need to be proactively addressed to assure health equity. The decreasing cost of these devices, some now less than US$35, will help decrease the financial barriers to accomplishing this. Finally, in the early version of the DETECT app we were not able to track the duration or trajectory of individual symptoms, care received and eventual outcomes.
These results suggest that sensor data can incrementally improve symptom-only-based models to differentiate between COVID-19-positive and COVID-19-negative symptomatic individuals, with the potential to enhance our ability to identify a cluster before more spread occurs. Such a passive monitoring strategy may be complementary to virus testing, which is generally a one-off, or infrequent, sampling assay.
Any person over the age of 18 years living in the United States is eligible to participate in the DETECT study by downloading the iOS or Android research app, MyDataHelps. After consenting to the study, participants are asked to share their personal device data (including historical data collected prior to enrollment), report symptoms and diagnostic test results, and connect their electronic health records. Participants can opt to share as much or as little data as they like. Data can be pulled in via direct application programming interface (API) with Fitbit devices, and any device connected through Apple HealthKit or Google Fit data aggregators. Participants were recruited via the study website (www.detectstudy.org), media reports and outreach from our partners at Fitbit, Walgreens, CVS/Aetna and others.
The protocol for this study was reviewed and approved by the Scripps Office for the Protection of Research Subjects (IRB 20–7531). All individuals participating in the study provided informed consent electronically.
Only participants with self-reported symptoms and COVID-19 test results were considered in this analysis. For each participant, two sets of data were extracted: the baseline data, which included signals spanning from 21 to 7 days before the reported start date of symptoms, and the test data, which included signals beginning at the first date of symptoms to seven days after symptoms. Three types of data were considered from personal sensors: daily resting heart rate (DailyRHR), sleep duration in minutes (DailySleep) and activity based on daily total step count (DailyActivity). The daily resting heart rate is calculated by the specific device [35]. The total amount of sleep for a given day was based on the total period of sleep between 12 noon of the current day to 12 noon of the next day. When multiple devices from the same individual provided the same information, Fitbit device data were prioritized, for consistency. Overlapping data were combined minute by minute, before aggregating for the whole day.
A single baseline value per individual was extracted for each data type by considering the median value over the individual’s baseline data. This value is representative of a participant’s ‘normal’ before the reported symptoms. The baseline value was compared to the test data as follows:
Values were normalized to have a unit IQR using normalization parameters calculated on all data recorded. For all these metrics, values close to zero indicate small variations from baseline values. This allows us to focus on intra-individual changes, which are minimally affected by the inter-individual variability due to the specific sensor’s hardware and estimation algorithms. For the metric based on symptoms only, we adapted the results from the study by Menni et al. [1] to our available data:
The multivariate logistic regression model from Menni et al. combined symptoms, age and gender to predict an infection. The parameters were optimized by the authors on a large dataset including over 2 million people, 18,401 of whom had undergone a COVID-19 test.
A simple manual metric aggregation strategy without optimization was used to enable a clear understanding of the benefits provided when data from multiple sources were considered together. The aggregated metrics were
The main outcomes are ROC curves for each of the proposed metrics. The curves are obtained by considering a binary classification task between participants self-reported as COVID-19-positive and COVID-19-negative. The models are based on a single decision threshold, which is directly compared to the metric values, with the aim of minimizing overfitting issues while providing a fair comparison. Confidence intervals, reported with a confidence level of 95%, are estimated using a bootstrap method by repeatedly sampling the dataset with replacement. The sampling is performed in a stratified manner; that is, the balance of the classes is maintained over all experiments. Values for sensitivity (SE), specificity (SP), positive predictive value (PPV) and negative predictive value (NPV) were also calculated (Fig. 2). SE and SP are defined as the fraction of positive and negative individuals correctly classified, respectively, while PPV and NPV are the fraction of individuals predicted as positive and negative that are correctly classified, respectively. These values are based on the point in the ROC with the optimal tradeoff between sensitivity and specificity, which may vary depending on the shape of the curve. For each metric analyzed, we applied the one-sided Mann–Whitney U test with the alternative hypothesis that the underlying model of the positive class is stochastically greater than the negative class. All statistical tests were evaluated using the Python package scipy version 1.5.2. The comparison metric to assess the overall performance was the AUC of the ROC.
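The evaluation procedure described above might be sketched as follows. This is a minimal illustration, not the authors' code. It uses only the Python standard library, so it does not compute the Mann–Whitney P value that the study obtained with scipy. The scores are simulated, the function names are hypothetical, and the "optimal tradeoff" is taken to be the threshold maximizing SE + SP − 1 (Youden's J), which is an assumption because the text does not name the criterion.

```python
import random

def auc(pos, neg):
    """AUC as the probability that a positive scores above a negative.

    Ties count one half. This equals the Mann-Whitney U statistic
    divided by len(pos) * len(neg).
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_bootstrap_auc(pos, neg, n_boot=200, seed=0):
    """Resample each class with replacement, keeping class sizes fixed."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_boot):
        pos_sample = rng.choices(pos, k=len(pos))
        neg_sample = rng.choices(neg, k=len(neg))
        aucs.append(auc(pos_sample, neg_sample))
    return sorted(aucs)

def percentile_interval(sorted_vals, level=0.95):
    """Approximate percentile interval from sorted bootstrap replicates."""
    last = len(sorted_vals) - 1
    return sorted_vals[int((1 - level) / 2 * last)], sorted_vals[int((1 + level) / 2 * last)]

def confusion_metrics(pos, neg, threshold):
    """SE, SP, PPV and NPV when scores >= threshold are called positive."""
    tp = sum(p >= threshold for p in pos)
    fn = len(pos) - tp
    tn = sum(n < threshold for n in neg)
    fp = len(neg) - tn
    return {
        "SE": tp / len(pos),
        "SP": tn / len(neg),
        "PPV": tp / (tp + fp) if tp + fp else float("nan"),
        "NPV": tn / (tn + fn) if tn + fn else float("nan"),
    }

def best_threshold(pos, neg):
    """Threshold maximizing SE + SP - 1 (Youden's J, an assumed criterion)."""
    def youden(t):
        m = confusion_metrics(pos, neg, t)
        return m["SE"] + m["SP"] - 1
    return max(sorted(set(pos) | set(neg)), key=youden)

# Simulated metric values; only the class sizes (54 and 279) come from the text.
rng = random.Random(1)
pos_scores = [rng.gauss(1.0, 1.0) for _ in range(54)]
neg_scores = [rng.gauss(0.0, 1.0) for _ in range(279)]

print("AUC:", round(auc(pos_scores, neg_scores), 3))
print("95% interval:", percentile_interval(stratified_bootstrap_auc(pos_scores, neg_scores)))
threshold = best_threshold(pos_scores, neg_scores)
print("metrics at chosen threshold:", confusion_metrics(pos_scores, neg_scores, threshold))
```

Resampling the two classes separately keeps 54 positives and 279 negatives in every replicate, which is what the stratified sampling in the text is meant to guarantee for an imbalanced dataset.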
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
All interested investigators will be allowed access to the analysis dataset following registration and pledging to not re-identify individuals or share the data with a third party. All data inquiries should be addressed to the corresponding author.
1. Menni, C. et al. Real-time tracking of self-reported symptoms to predict potential COVID-19. Nat. Med. 26, 1037–1040 (2020).
2. Oran, D. P. & Topol, E. J. Prevalence of asymptomatic SARS-CoV-2 infection. Ann. Intern. Med. https://doi.org/10.7326/M20-3012 (2020).
3. New COVID-19 Test Data (Color Genomics, 2020); https://www.color.com/new-covid-19-test-data-majority-of-people-who-test-positive-for-covid-19-have-mild-symptoms-or-are-asymptomatic
4. Richardson, S. et al. Presenting characteristics, comorbidities and outcomes among 5,700 patients hospitalized with COVID-19 in the New York City area. JAMA 323, 2052–2059 (2020).
5. Vogels, E. A. About One-in-Five Americans Use a Smart Watch or Fitness Tracker (Pew Research Center, 2020); https://www.pewresearch.org/fact-tank/2020/01/09/about-one-in-five-americans-use-a-smart-watch-or-fitness-tracker/
6. Quer, G., Gouda, P., Galarnyk, M., Topol, E. J. & Steinhubl, S. R. Inter- and intraindividual variability in daily resting heart rate and its associations with age, sex, sleep, BMI and time of year: retrospective, longitudinal cohort study of 92,457 adults. PLoS ONE 15, e0227709 (2020).
7. Jaiswal, S. J. et al. Association of sleep duration and variability with body mass index: Sleep measurements in a large US population of wearable sensor users. JAMA Intern. Med. https://doi.org/10.1001/jamainternmed.2020.2834 (2020).
8. Radin, J. M., Wineinger, N. E., Topol, E. J. & Steinhubl, S. R. Harnessing wearable device data to improve state-level real-time surveillance of influenza-like illness in the USA: a population-based study. Lancet Digit. Health 2, e85–e93 (2020).
9. Zhu, G. et al. Learning from large-scale wearable device data for predicting epidemics trend of COVID-19. Discrete Dynamics Nat. Soc. 2020, 6152041 (2020).
10. Mishra, T. et al. Early detection of COVID-19 using a smartwatch. Preprint at medRxiv https://doi.org/10.1101/2020.07.06.20147512 (2020).
11. Natarajan, A., Su, H.-W. & Heneghan, C. Assessment of physiological signs associated with COVID-19 measured using wearable devices. Preprint at https://doi.org/10.1101/2020.08.14.20175265 (2020).
12. Evidation Health and BARDA Partner on Early Warning System for COVID-19 (Evidation, 2020); https://evidation.com/news/evidationhealthandbardapartner/
13. Tempredict Study (Oura Health, 2020); https://ouraring.com/ucsf-tempredict-study
14. Covidentify (Duke University, 2020); https://covidentify.covid19.duke.edu/
15. Corona Datenspende (Robert Koch Institut, 2020); https://corona-datenspende.de/science/en
16. Shen, B. et al. Proteomic and metabolomic characterization of COVID-19 patient sera. Cell 182, 59–72 (2020).
17. Sharma, R., Agarwal, M., Gupta, M., Somendra, S. & Saxena, S. K. in Coronavirus Disease 2019 (COVID-19) (ed. Saxena, S.) 55–70 (Springer, 2020).
18. Tabata, S. et al. Clinical characteristics of COVID-19 in 104 people with SARS-CoV-2 infection on the Diamond Princess cruise ship: a retrospective analysis. Lancet Infect. Dis. 20, 1043–1050 (2020).
19. Ferretti, L. et al. Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing. Science 368, eabb6936 (2020).
20. Chau, N. V. V. et al. The natural history and transmission potential of asymptomatic SARS-CoV-2 infection. Clin. Infect. Dis. https://doi.org/10.1093/cid/ciaa711 (2020).
21. Jing, Q. L. et al. Household secondary attack rate of COVID-19 and associated determinants in Guangzhou, China: a retrospective cohort study. Lancet Infect. Dis. 20, 1141–1150 (2020).
22. Meng, H. et al. CT imaging and clinical course of asymptomatic cases with COVID-19 pneumonia at admission in Wuhan, China. J. Infect. 81, e33–e39 (2020).
23. Inui, S. et al. Chest CT findings in cases from the cruise ship ‘Diamond Princess’ with coronavirus disease 2019 (COVID-19). Radiol. Cardiothorac. Imaging 2, e200110 (2020).
24. Long, Q. X. et al. Clinical and immunological assessment of asymptomatic SARS-CoV-2 infections. Nat. Med. 26, 1200–1204 (2020).
25. Milechin, L. et al. Detecting pathogen exposure during the non-symptomatic incubation period using physiological data. Preprint at bioRxiv https://doi.org/10.1101/218818 (2017).
26. Sleep and Sleep Disorders (Centers for Disease Control and Prevention, 2020); https://www.cdc.gov/sleep/data_statistics.html
27. Steinhubl, S. R., Wolff-Hughes, D. L., Nilsen, W., Iturriaga, E. & Califf, R. M. Digital clinical trials: creating a vision for the future. NPJ Digit. Med. 2, 126 (2019).
28. Steinhubl, S. R., McGovern, P., Dylan, J. & Topol, E. J. The digitised clinical trial. Lancet 390, 2135 (2017).
29. Pratap, A. et al. Indicators of retention in remote digital health studies: a cross-study evaluation of 100,000 participants. NPJ Digit. Med. 3, 21 (2020).
30. Coravos, A. et al. Modernizing and designing evaluation frameworks for connected sensor technologies in medicine. NPJ Digit. Med. 3, 37 (2020).
31. Bradford, L. R., Aboy, M. & Liddell, K. COVID-19 contact tracing Apps: a stress test for privacy, the GDPR and data protection regimes. J. Law Biosci. https://doi.org/10.1093/jlb/lsaa034 (2020).
32. Rivera, S. C. et al. The impact of patient-reported outcome (PRO) data from clinical trials: a systematic review and critical analysis. Health Qual. Life Outcomes 17, 156 (2019).
33. Basch, E. et al. Overall survival results of a trial assessing patient-reported outcomes for symptom monitoring during routine cancer treatment. JAMA 318, 197–198 (2017).
34. Bell, S. K. et al. Frequency and types of patient-reported errors in electronic health record ambulatory care notes. JAMA Netw. Open 3, e205867 (2020).
35. Heneghan, C., Venkatraman, S. & Russell, A. Investigation of an estimate of daily resting heart rate using a consumer wearable device. Preprint at medRxiv https://doi.org/10.1101/19008771 (2019).
This work was funded by grant no. UL1TR002550 from the National Center for Advancing Translational Sciences (NCATS) at the National Institutes of Health (NIH; E.J.T. and S.R.S.). We thank N. Dalton for his support of DETECT. We also thank D. Oran, T. Peters, R. Kamyar and C. Nowak for their contributions to DETECT.
S.R.S. reports grants from Janssen and personal fees from Otsuka and Livongo, outside the submitted work. The other authors declare no competing interests.
Editor recognition statement Michael Basson was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Quer, G., Radin, J.M., Gadaleta, M. et al. Wearable sensor data and self-reported symptoms for COVID-19 detection. Nat Med 27, 73–77 (2021). https://doi.org/10.1038/s41591-020-1123-x