## Main

COVID-19 is an acute respiratory illness caused by the novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Since its outbreak in China in December 2019, over 2,573,143 cases have been confirmed worldwide (as of 21 April 2020; https://www.worldometers.info/coronavirus/). Although many people have presented with flu-like symptoms, widespread population testing is not yet available in most countries, including the United States (https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/testing-in-us.html) and United Kingdom1. Thus, it is important to identify the combination of symptoms most predictive of COVID-19, to help guide recommendations for self-isolation and prevent further spread of the disease2.

Case reports and mainstream media articles from various countries indicate that a number of patients with diagnosed COVID-19 developed anosmia (loss of smell)3,4. Mechanisms of action for the SARS-CoV-2 viral infection causing anosmia have been postulated5,6. Other studies indicate that a number of infected individuals present with anosmia in the absence of other symptoms7,8, suggesting that this symptom could be used as a screening tool to help identify people with potential mild cases who could be recommended to self-isolate9.

We investigated whether loss of smell and taste is specific to COVID-19 in 2,618,862 individuals who used an app-based symptom tracker10 (Methods). The symptom tracker is a free smartphone application that was launched in the United Kingdom on 24 March 2020, and in the United States on 29 March 2020. It collects data from both asymptomatic and symptomatic individuals and tracks in real time how the disease progresses by recording self-reported health information on a daily basis, including symptoms, hospitalization, reverse-transcription PCR (RT-PCR) test outcomes, demographic information and pre-existing medical conditions.

Between 24 March and 21 April 2020, 2,450,569 UK and 168,293 US individuals reported symptoms through the smartphone app. Of the 2,450,569 participants in the United Kingdom, 789,083 (32.2%) indicated having one or more potential symptoms of COVID-19 (Table 1). In total, 15,638 UK and 2,763 US app users reported having had an RT-PCR SARS-CoV-2 test and having received its outcome. In the UK cohort, 6,452 participants reported a positive test and 9,186 reported a negative test. Of the 6,452 UK participants who tested positive for SARS-CoV-2, 4,178 (64.76%) reported loss of smell and taste, compared with 2,083 of the 9,186 participants (22.68%) who tested negative (odds ratio (OR) = 6.40; 95% confidence interval (CI) = 5.96–6.87; P < 0.0001 after adjusting for age, sex and body mass index (BMI)). We replicated this result in the US subset of participants who had been tested for SARS-CoV-2 (adjusted OR = 10.01; 95% CI = 8.23–12.16; P < 0.0001) and combined the adjusted results using inverse variance fixed-effects meta-analysis (OR = 6.74; 95% CI = 6.31–7.21; P < 0.0001).
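As an illustration of the pooling step, the following minimal sketch applies inverse variance fixed-effects weighting to the two adjusted ORs quoted above. It assumes that the standard error of each log(OR) can be recovered from the reported 95% CI using z = 1.96; the function names are illustrative and are not taken from the study's analysis code.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR), recovered from a 95% CI given on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def fixed_effects_pool(estimates, z=1.96):
    """Inverse variance fixed-effects pooling of (OR, lower, upper) tuples."""
    weights = []
    weighted_logs = []
    for odds_ratio, lower, upper in estimates:
        weight = 1.0 / se_from_ci(lower, upper, z) ** 2  # weight = 1 / variance
        weights.append(weight)
        weighted_logs.append(weight * math.log(odds_ratio))
    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )

# Adjusted ORs and 95% CIs as reported for the UK and US cohorts.
uk = (6.40, 5.96, 6.87)
us = (10.01, 8.23, 12.16)
pooled_or, pooled_lo, pooled_hi = fixed_effects_pool([uk, us])
print(f"pooled OR = {pooled_or:.2f} ({pooled_lo:.2f}, {pooled_hi:.2f})")
```

Under these assumptions the pooled estimate should land close to the reported OR of 6.74 (6.31–7.21). The UK cohort dominates because its narrower CI gives it a much larger weight.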

We re-ran logistic regressions adjusting for age, sex and BMI to identify other symptoms besides anosmia that might be associated with being infected by SARS-CoV-2. All ten symptoms queried (fever, persistent cough, fatigue, shortness of breath, diarrhea, delirium, skipped meals, abdominal pain, chest pain and hoarse voice) were associated with testing positive for COVID-19 in the UK cohort, after adjusting for multiple testing (Fig. 1a). In the US cohort, only loss of smell and taste, fatigue and skipped meals were associated with a positive test result.

We performed stepwise logistic regression in the UK cohort, by randomly dividing it into training and test sets (ratio: 80:20) to identify independent symptoms most strongly correlated with COVID-19, adjusting for age, sex and BMI. A combination of loss of smell and taste, fatigue, persistent cough and loss of appetite resulted in the best model (with the lowest Akaike information criterion). We therefore generated a linear model for symptoms that included loss of smell and taste, fatigue, persistent cough and loss of appetite to obtain a symptoms prediction model for COVID-19:

$$\begin{array}{l}{\rm{Prediction}}\,{\rm{model}} = - 1.32 - \left( {0.01 \times{\rm{age}}} \right)\\ + \left( {0.44 \times{\rm{sex}}} \right) + (1.75 \times{\rm{loss}}\,{\rm{of}}\,{\rm{smell}}\,{\rm{and}}\,{\rm{taste}})\\ + \left( {0.31 \times{\rm{severe}}\,{\rm{or}}\,{\rm{significant}}\,{\rm{persistent}}\,{\rm{cough}}} \right)\\ + \left( {0.49 \times{\rm{severe}}\,{\rm{fatigue}}} \right) + \left( {0.39 \times{\rm{skipped}}\,{\rm{meals}}} \right)\end{array}$$

where all symptoms are coded as 1 if the person self-reports the symptom and 0 if not. The sex feature is also binary, with 1 indicating male participants and 0 indicating female participants. The obtained value x is then transformed into a predicted probability using the logistic transformation exp(x)/(1 + exp(x)); individuals with a predicted probability >0.5 are assigned as predicted COVID-19 cases and those with a predicted probability <0.5 as controls.
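The prediction model can be written out as a short sketch. The coefficients are copied from the equation above; the function and argument names are illustrative assumptions, and the example participant is hypothetical.

```python
import math

def covid_symptom_score(age, is_male, loss_of_smell_taste,
                        persistent_cough, severe_fatigue, skipped_meals):
    """Linear predictor from the published equation; flags are coded 0 or 1."""
    return (-1.32
            - 0.01 * age
            + 0.44 * is_male
            + 1.75 * loss_of_smell_taste
            + 0.31 * persistent_cough
            + 0.49 * severe_fatigue
            + 0.39 * skipped_meals)

def predicted_probability(score):
    """Logistic transformation exp(x) / (1 + exp(x))."""
    return math.exp(score) / (1.0 + math.exp(score))

def is_predicted_case(score, threshold=0.5):
    """Predicted COVID-19 case if the probability exceeds the threshold."""
    return predicted_probability(score) > threshold

# Hypothetical 40-year-old woman with and without symptoms.
no_symptoms = covid_symptom_score(40, 0, 0, 0, 0, 0)
anosmia_and_cough = covid_symptom_score(40, 0, 1, 1, 0, 0)
print(round(predicted_probability(no_symptoms), 3), is_predicted_case(no_symptoms))
print(round(predicted_probability(anosmia_and_cough), 3), is_predicted_case(anosmia_and_cough))
```

For this hypothetical participant, reporting loss of smell and taste together with a persistent cough moves the linear predictor from −1.72 to 0.34, which is enough to cross the 0.5 probability threshold.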

In the UK test set, the prediction model had a sensitivity of 0.65 (0.62–0.67), a specificity of 0.78 (0.76–0.80), an area under the curve (AUC) of the receiver operating characteristic curve (ROC) (that is, ROC-AUC) of 0.76 (0.74–0.78), a positive predictive value of 0.69 (0.66–0.71) and a negative predictive value of 0.75 (0.73–0.77) (Fig. 1b). The cross-validation ROC-AUC was 0.75 (0.74–0.76) in the 15,638 UK users who were tested for SARS-CoV-2. In this model, the strongest predictor was loss of smell and taste (Fig. 1a). Excluding loss of smell and taste from the model resulted in reduced sensitivity (0.33 (0.30–0.35)) but increased specificity (0.84 (0.83–0.86)). We also computed the ROC-AUC with stratification for sex and age groups and found that the results were similar in all groups, with no significant differences between strata, suggesting that our model works similarly within different sex and age groups. We validated the model in the US cohort and found an ROC-AUC of 0.76 (0.74–0.78), a sensitivity of 0.66 (0.62–0.69), a specificity of 0.83 (0.82–0.85), a positive predictive value of 0.58 (0.55–0.62) and a negative predictive value of 0.87 (0.86–0.89) (Fig. 1c).
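For readers unfamiliar with the ROC-AUC, the sketch below computes it through its rank interpretation: the probability that a randomly chosen test-positive individual receives a higher model score than a randomly chosen test-negative individual, with ties counted as one half. This is a generic illustration on made-up scores; the study itself used the R packages named in the Methods.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(positives) * len(negatives))

# Toy data only: 1 = tested positive, 0 = tested negative.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

In the toy data, eight of the nine positive/negative pairs are ordered correctly, giving an AUC of 8/9. The pairwise loop is quadratic in sample size, so a rank-based implementation would be preferable for cohorts of the size analyzed here.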

We also queried whether the association between loss of smell and taste and COVID-19 was influenced by mainstream media reports. We assessed the correlation between loss of smell and taste and being COVID-19 positive in different date ranges: (1) 24 March to 3 April 2020, following a number of reports in the UK mainstream media (for example, ref. 11) reporting anosmia as a symptom of COVID-19; (2) the week of 4–10 April 2020; and (3) 11–21 April 2020. In the United Kingdom, the OR (95% CI) values for the associations of self-reported loss of smell and taste and a positive test for COVID-19 across these periods were 4.98 (4.47–5.56), 6.64 (5.75–7.68) and 10.40 (9.08–11.91), respectively, suggesting that awareness of loss of smell and taste as symptoms of COVID-19 in the United Kingdom has increased following media reports. However, this temporal trend was not found in the US cohort: 24 March to 3 April: 8.13 (5.18–12.78); 4–10 April: 12.30 (8.96–16.90); 11–21 April: 9.13 (6.73–12.38).

Finally, we applied the predictive model to the 805,753 UK and US symptom-reporting individuals who had not been tested for COVID-19 and found that, according to our model, 140,312 (116,400–164,224) of these 805,753 participants (17.42% (14.45–20.39%)) reporting some symptoms were likely to be infected by the virus, representing 5.36% as a proportion of the overall responders to the app.

We report that loss of smell and taste is a potential predictor of COVID-19 in addition to other, more established, symptoms including high temperature and a new, persistent cough. COVID-19 appears to cause problems of smell receptors in line with many other respiratory viruses, including previous coronaviruses thought to account for 10–15% of cases of anosmia7,9.

We also identify a combination of symptoms, including anosmia, fatigue, persistent cough and loss of appetite, that together might identify individuals with COVID-19.

A major limitation of the current study is the self-report nature of the data included, which cannot replace physiological assessments of olfactory and gustatory function or nucleotide-based testing for SARS-CoV-2. Both false negative and false positive reports could be included in the dataset12, and because of the way the questions are asked, gustatory and olfactory losses are conflated. In addition, at present, we do not know whether anosmia was acquired before or after other COVID-19 symptoms, or during the illness or afterwards. This information could become available as currently healthy users track symptom development over time. As more accurate tests become available, we have the ability to optimize our model. One caveat of our study is that the individuals on which the model was trained are not representative of the general population because performing tests for SARS-CoV-2 is not random. Testing is more likely to be done if an individual develops severe symptoms requiring hospitalization, if an individual has been known to have had contact with people who have tested positive for SARS-CoV-2 infection, in health workers, and if an individual has traveled in an area of high risk of exposure. Therefore, our results may overestimate the number of expected positive cases of SARS-CoV-2 infection. Additionally, volunteers using the app are a self-selected group who might not be fully representative of the general population. Another limitation is the potential effect that mainstream media coverage of loss of smell and taste and COVID-19 might have had on app responses. We found that these reports might have influenced UK responders, for whom there was a temporal trend in the strength of the association. However, there was no such trend in the US cohort; therefore, we conclude that regardless of any bias introduced by mainstream media reports, the association between COVID-19 and loss of smell and taste remains strong.

Our work suggests that loss of sense of smell and taste could be included as part of routine screening for COVID-19 and should be added to the symptom list currently developed by the World Health Organization (www.who.int/health-topics/coronavirus). A detailed study on the natural history of broader COVID-19 symptoms, especially according to timing and frequency, will help us to understand the usefulness of symptom tracking and modeling, and to identify probable clusters of infection.

## Methods

### Study setting and participants

The COVID Symptom Study smartphone-based app (previously known as COVID Symptom Tracker) was developed by Zoe Global, in collaboration with King’s College London and Massachusetts General Hospital, and was launched in the United Kingdom on 24 March 2020 and in the United States on 29 March 2020. After 3 weeks, it had reached 2,618,862 users. It enables the capture of self-reported information related to COVID-19, as described previously10. The survey questions are available in Supplementary Table 1. On first use, the app records self-reported location, age and core health risk factors. With continued use and notifications, participants provide daily updates on symptoms, health care visits, COVID-19 testing results and whether they are self-quarantining or seeking health care, including the level of intervention and related outcomes. Individuals without apparent symptoms are also encouraged to use the app.

### Ethics

The King’s College London Ethics Committee approved the ethics for the app, and all users provided consent for non-commercial use. An informal consultation with TwinsUK members over email and social media before the app was launched found that they were overwhelmingly supportive of the project. The US protocol was approved by the Partners Human Research Committee.

### Statistical analysis

Data from the app were downloaded to a server and only records where the self-reported characteristics fell within the following ranges were utilized for further analysis: age: 16–90 years (18 years in the United States); height: 110–220 cm; weight: 40–200 kg; BMI: 14–45 kg m⁻²; and temperature: 35–42 °C. The individuals whose data were included to develop and test the prediction model were those who had completed the report for symptoms in the app and who declared that they had been tested for SARS-CoV-2 by RT-PCR and received the result. Only individuals who answered at least nine of the ten symptom questions, and who answered about loss of smell and taste, were included.
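The range filter described above can be sketched as follows. The record layout (a dictionary with the keys shown) is hypothetical, as are two further assumptions: that BMI is derived from self-reported height and weight, and that a missing temperature does not exclude a record. None of these details is specified in the text.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg per square metre."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)

def passes_range_checks(record, country):
    """Apply the plausibility ranges; the lower age limit is 18 in the United States."""
    min_age = 18 if country == "US" else 16
    checks = [
        min_age <= record["age"] <= 90,
        110 <= record["height_cm"] <= 220,
        40 <= record["weight_kg"] <= 200,
        14 <= bmi(record["weight_kg"], record["height_cm"]) <= 45,
    ]
    temperature = record.get("temperature_c")
    if temperature is not None:  # assumption: temperature is optional
        checks.append(35 <= temperature <= 42)
    return all(checks)

# Hypothetical record, not taken from the study data.
example = {"age": 17, "height_cm": 170, "weight_kg": 65, "temperature_c": 37.2}
print(passes_range_checks(example, "UK"), passes_range_checks(example, "US"))
```

The hypothetical 17-year-old passes the UK filter but fails the US filter, which reflects the different lower age limits in the two countries.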

Baseline characteristics are presented as the number (percentage) for categorical variables and the mean (standard deviation) for continuous variables. Multivariate logistic regression adjusting for age, sex and BMI was applied to investigate the correlation between loss of smell and taste and COVID-19 in 15,638 UK users of the symptom tracker app who were also tested in the laboratory for SARS-CoV-2 (6,452 UK individuals tested positive and 9,186 tested negative). The results were replicated in 726 US individuals who tested positive and 2,037 US individuals who tested negative. We then randomly split the UK sample into training and test sets with a ratio of 80:20. In the training set, we performed stepwise logistic regression combining forward and backward algorithms, to identify other symptoms associated with COVID-19 independent of loss of smell and taste. We included in the model ten other symptoms (fever, persistent cough, fatigue, shortness of breath, diarrhea, delirium, skipped meals, abdominal pain, chest pain and hoarse voice) as well as age, sex and BMI, and chose as the best model the one with the lowest Akaike information criterion. We then assessed the performance of the model both in the test set and via tenfold cross-validation in the entire UK sample of 15,638 individuals using the R package cvAUC13. We further validated the prediction model in the US cohort.
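The selection procedure can be sketched as a greedy search that, at each step, tries adding or removing one candidate symptom and keeps the move that lowers the Akaike information criterion (AIC) the most. This is a simplified stand-in for the R routine used in the study: `fit_aic` is a hypothetical callable that fits a logistic regression on the training set and returns its AIC, and `toy_aic` is a made-up scoring function used only so that the sketch runs.

```python
def stepwise_select(candidates, fit_aic, forced=()):
    """Greedy forward-backward selection that minimises AIC.

    forced holds covariates that stay in every model; candidates may be
    added or removed one at a time.
    """
    selected = list(forced)
    best_aic = fit_aic(selected)
    while True:
        trials = []
        for name in candidates:
            if name in selected:
                trials.append([s for s in selected if s != name])  # backward move
            else:
                trials.append(selected + [name])  # forward move
        if not trials:
            return selected, best_aic
        trial_aic, trial = min(
            ((fit_aic(t), t) for t in trials), key=lambda pair: pair[0]
        )
        if trial_aic >= best_aic:  # no single move improves the model
            return selected, best_aic
        selected, best_aic = trial, trial_aic

def toy_aic(features):
    """Made-up AIC: a penalty of 2 per feature; only some features reduce deviance."""
    gain = {"fatigue": 30.0, "persistent_cough": 12.0, "skipped_meals": 8.0}
    deviance = 1000.0 - sum(gain.get(f, 0.5) for f in features)
    return deviance + 2 * len(features)

symptoms = ["fever", "persistent_cough", "fatigue", "shortness_of_breath",
            "diarrhea", "delirium", "skipped_meals", "abdominal_pain",
            "chest_pain", "hoarse_voice"]
covariates = ["age", "sex", "bmi", "loss_of_smell_taste"]
chosen, chosen_aic = stepwise_select(symptoms, toy_aic, forced=covariates)
print(chosen, chosen_aic)
```

With the toy scoring function, only the three symptoms whose deviance reduction exceeds the per-parameter penalty of 2 are retained. The search stops because each accepted move strictly lowers the AIC and the number of feature subsets is finite.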

For our predictive model, using the R packages pROC and epiR, we further computed the AUC (that is, the overall diagnostic performance of the model), sensitivity (positivity in disease; that is, the proportion of subjects who have the target condition (reference standard positive) and give positive model results) and specificity (negativity in health; that is, the proportion of subjects with a negative SARS-CoV-2 RT-PCR test who give negative model results).
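These definitions, together with the positive and negative predictive values reported in the main text, reduce to simple ratios of the four cells of a confusion matrix. The sketch below uses hypothetical counts and is not a reconstruction of the study's test-set table.

```python
def diagnostic_metrics(true_pos, false_pos, true_neg, false_neg):
    """Ratios computed against the RT-PCR result as the reference standard."""
    return {
        # proportion of RT-PCR positives that the model calls positive
        "sensitivity": true_pos / (true_pos + false_neg),
        # proportion of RT-PCR negatives that the model calls negative
        "specificity": true_neg / (true_neg + false_pos),
        # proportion of model positives that are RT-PCR positive
        "ppv": true_pos / (true_pos + false_pos),
        # proportion of model negatives that are RT-PCR negative
        "npv": true_neg / (true_neg + false_neg),
    }

# Hypothetical counts for illustration only.
toy_metrics = diagnostic_metrics(true_pos=65, false_pos=22, true_neg=78, false_neg=35)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity condition on the true RT-PCR status, whereas the predictive values condition on the model output and therefore depend on the proportion of positives in the tested sample.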

Finally, we applied the predictive model to the 805,753 individuals reporting symptoms who had not had a SARS-CoV-2 test, to estimate the percentage of individuals reporting some COVID-19 symptoms who were likely to be infected by the virus. The proportion of estimated infections was calculated repeatedly by sampling the dataset (with replacement) to obtain the 95% CIs.
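Resampling with replacement to obtain a 95% CI can be illustrated with a percentile bootstrap. The sketch assumes one binary flag per individual (1 if the model predicts infection) and uses synthetic flags; how the study combined resampling with the model is not detailed in the text, so this should be read as a generic illustration, not as the exact procedure.

```python
import random

def bootstrap_proportion_ci(flags, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the proportion of 1s in flags."""
    rng = random.Random(seed)
    n = len(flags)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(flags, k=n)  # sample n items with replacement
        estimates.append(sum(resample) / n)
    estimates.sort()
    lower_index = int(n_resamples * alpha / 2)
    upper_index = n_resamples - 1 - lower_index
    point_estimate = sum(flags) / n
    return point_estimate, estimates[lower_index], estimates[upper_index]

# Synthetic data: 174 of 1,000 hypothetical individuals flagged by the model.
toy_flags = [1] * 174 + [0] * 826
point, lower, upper = bootstrap_proportion_ci(toy_flags)
print(f"{point:.3f} ({lower:.3f}, {upper:.3f})")
```

The fixed seed makes the sketch reproducible. In practice, the number of resamples would be chosen so that the interval endpoints are stable.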

### Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.