Saliva is more sensitive than nasopharyngeal or nasal swabs for diagnosis of asymptomatic and mild COVID-19 infection

We aimed to test the sensitivity of naso-oropharyngeal saliva and self-administered nasal (SN) swab compared to nasopharyngeal (NP) swab for COVID-19 testing in a large cohort of migrant workers in Singapore. We also tested the utility of next-generation sequencing (NGS) for diagnosis of COVID-19. Saliva, NP and SN swabs were collected from subjects who presented with acute respiratory infection, their asymptomatic roommates, and prior confirmed cases who were undergoing isolation at a community care facility in June 2020. All samples were tested using RT-PCR. SARS-CoV-2 amplicon-based NGS with phylogenetic analysis was done for 30 samples. We recruited 200 subjects, of whom 91 and 46 were tested twice and thrice, respectively. In total, 62.0%, 44.5%, and 37.7% of saliva, NP and SN samples, respectively, were positive. Cycle threshold (Ct) values were lower during the earlier period of infection across all sample types. The percentage of test-positive saliva was higher than that of NP and SN swabs. We found a strong correlation between viral genome coverage by NGS and Ct values for SARS-CoV-2. Phylogenetic analyses revealed Clade O and lineage B.6, known to be circulating in Singapore. We found saliva to be a sensitive and viable sample for COVID-19 diagnosis.


Scientific Reports
| (2021) 11:3134 | https://doi.org/10.1038/s41598-021-82787-z www.nature.com/scientificreports/
specimens, including NP swabs 16 . However, one caveat relates to how saliva is collected: saliva is a complex bio-mixture which can consist of salivary gland secretion, gingival crevicular fluid, sputum and/or mucosal transudate, in varying proportions depending on collection method. Some studies tested only secretions from the mouth 11,12 , others explicitly tested "posterior oropharyngeal" or "deep throat" saliva with secretions from the oropharynx 7-10 , while the rest were unspecified 13-16 . We aimed to test the sensitivity of "naso-oropharyngeal" saliva and SN swabs compared to NP swabs in a large cohort of migrant workers in Singapore using RT-PCR testing. We additionally used direct-from-RNA amplicon-based next-generation sequencing (NGS) for confirmatory detection of low-level SARS-CoV-2 signal and to establish phylogeny for tested samples.

Methods
Study population. Subjects were recruited between 2nd and 26th June 2020 from two sites: a 5400-bed purpose-built dormitory where migrant workers were housed in large rooms holding 7-20 workers, and a community care facility (CCF) where migrant workers diagnosed with COVID-19 but not requiring acute hospital care were sent for isolation and monitoring. All subjects at the CCF were prior confirmed cases (via RT-PCR), while subjects from the dormitory comprised two groups: (1) migrant workers presenting with symptoms of acute respiratory tract infection (ARI); and (2) asymptomatic roommates of newly diagnosed COVID-19 cases.
Ethics statement. This study was approved by the Director of Medical Services, Ministry of Health, under Singapore's Infectious Disease Act 17 . Under this Act, in the event of a major outbreak, the Director may require the collection of such information or samples (including human samples) as deemed appropriate or necessary that will be of significant public health benefit to the country 17 . Informed consent was obtained from all participants, and all methods were performed in accordance with Singapore guidelines and regulations for biomedical research.
Sample collection. Migrant workers from the purpose-built dormitory presenting with ARI were assessed by physicians at the medical post, who decided whether a diagnostic NP swab for COVID-19 testing was necessary. Workers requiring NP swabs were immediately approached for study participation, and consent was taken from those who agreed.
For the collection of SN swabs, participants were instructed to insert the swab (about 1 cm) into their nostrils (one at a time), tilt their head back slightly, and rotate the swab in a circular motion three times around the nasal wall. The swab was then inserted into the collection tube. For the collection of naso-oropharyngeal saliva samples, participants were asked to tilt their head back slightly, clear their throat and nose, and spit the saliva into the collection bottle. The steps were repeated until the required volume (2 mL) was achieved. For "naso-oropharyngeal" saliva collection, instructional videos (video link in English: https://youtu.be/4jGrJUbjBBs) in the major native languages of the migrant workers were shown, following which these samples were collected under the supervision of a trained researcher.
For consenting subjects from the CCF and asymptomatic roommates of newly diagnosed cases at the dormitory, the NP swab collection procedure was performed by a trained researcher. SN swab and saliva samples were collected in the same sitting.
Each subject was tested up to three times at 2-3-day intervals where possible, in order to compare the sensitivity of different samples across time. Subjects from the purpose-built dormitory who tested negative across all three samples during the first round of testing were not retested. Subjects from the CCF were not retested if all samples from the initial two rounds were negative.
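The retesting rules above can be written down as a small decision function. This is only an illustrative sketch of one reading of the protocol, not the study's actual scheduling logic: the function name `should_retest`, the site labels, and the representation of results as lists of booleans (True for a positive sample) are all assumptions introduced here, and the "where possible" caveat is not modelled.

```python
MAX_ROUNDS = 3  # each subject was tested up to three times


def should_retest(site, results_by_round):
    """Return True if another round of testing would be scheduled.

    site: "dormitory" or "ccf" (hypothetical labels).
    results_by_round: one list per completed round, each holding a boolean
    per sample type collected in that round (True = positive).
    """
    rounds_done = len(results_by_round)
    if rounds_done == 0:
        raise ValueError("at least one completed round is required")
    if rounds_done >= MAX_ROUNDS:
        return False
    if site == "dormitory":
        # Not retested when all samples in round 1 were negative.
        return any(results_by_round[0])
    if site == "ccf":
        # Not retested when all samples from the first two rounds were negative,
        # so a second round always follows the first.
        if rounds_done < 2:
            return True
        return any(any(r) for r in results_by_round[:2])
    raise ValueError(f"unknown site: {site!r}")
```

For example, `should_retest("dormitory", [[False, False, False]])` is False, whereas a CCF subject with one fully negative round would still be scheduled for a second.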
NP swabs from subjects with ARI were sent dry in cooler boxes to the Singapore General Hospital (SGH) molecular laboratory as part of routine clinical testing. NP swabs and self-administered nasal swabs from other subjects were sent in 3 mL of viral transport medium, while up to 2 mL of saliva was collected in a container with 2 mL of viral RNA stabilization fluid (SAFER-Sample Stabilization Fluid, Lucence, Singapore) before transfer to Lucence. All samples were processed within the same day. Both service laboratories are accredited by the College of American Pathologists (CAP), and Lucence is CLIA-licensed.
Laboratory testing. RT-PCR at SGH was performed on an automated cobas 6800 system (Roche, Branchburg, NJ, USA), with results inferred according to the manufacturer's specifications. NP and saliva samples sent to Lucence Laboratory underwent RNA extraction (200 μL of the sample) (GeneAid Biotech Ltd) and were tested with a laboratory-developed RT-PCR test (CDC-LDT) based on primers published by the Division of Viral Diseases, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA 18 , while saliva and SN swabs were additionally tested using the Fortitude 2.1 kit (MiRXES, Singapore). The analytical limit of detection of the CDC-LDT was determined to be 25 copies per reaction based on a synthetic SARS-CoV-2 genome (Twist Bioscience). Saliva was pre-processed by adding dithiothreitol (DTT) at 0.4-0.85% of total sample volume, vortexing, and incubating at room temperature for 15 min. Solubilization was visibly apparent after treatment, and RNA was extracted immediately afterwards.
A limited number of samples was selected for the initial stage of determining performance specifications for an NGS-based SARS-CoV-2 assay. Both saliva and SN swab samples were included to demonstrate compatibility of RNA extracts from samples collected in the viral RNA stabilization fluid. Thirty samples were selected, including high and low viral load samples and those that had discordant results from the two RT-PCR assays. Six of the 30 samples were paired sets of saliva and SN swab samples from the same time point for 3 individuals. To establish a direct-from-sample workflow, SARS-CoV-2 amplicon-based NGS was done using 330 primer pairs to generate amplicons (size range 130-178 bp) covering the entire virus genome (except the first 25 bases and 30 bases upstream of the final polyA tail). To rule out potential non-specific amplification of other viruses related in sequence, all amplicons were verified to have limited similarity to sarbecoviruses, outside of SARS-related coronaviruses (assumed not to be present in circulation). The threshold coverage (%) for making a positive call by NGS was established by performing NGS on 5 negative (by RT-PCR) samples from this study, 11 negative NP swab samples from community testing, and 10 no-template controls (NTC). For samples with complete viral genomes (100% of the genome at ≥ 1× depth of coverage), phylogenetic analysis was performed to identify lineages based on sequence variants.
Statistical methods. We described our data using frequencies/percentages and median/interquartile range. We assessed the comparability between sampling methods using the kappa statistic and percent agreement. STATA 13.1 (StataCorp, Texas, USA) was used for all statistical calculations.
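As an illustration of the two agreement measures named above, the following minimal Python sketch computes percent agreement and Cohen's kappa for paired test results. It is not the authors' analysis (which was run in STATA); the function names and the eight paired results are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter


def percent_agreement(a, b):
    """Percentage of paired results on which the two methods agree."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty result lists of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters or methods."""
    n = len(a)
    p_o = percent_agreement(a, b) / 100.0  # observed agreement
    count_a, count_b = Counter(a), Counter(b)
    labels = set(count_a) | set(count_b)
    # Agreement expected by chance from each method's marginal frequencies.
    p_e = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical paired results for eight test sets (not study data).
saliva = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
np_swab = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos"]

print(percent_agreement(saliva, np_swab))  # 75.0
print(cohen_kappa(saliva, np_swab))        # 0.5
```

In this toy example the methods agree on 6 of 8 sets (75%), chance agreement is 50%, and kappa is therefore 0.5. Kappa corrects raw agreement for the agreement expected by chance, which is why both measures are usually reported together.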

Results
We recruited 200 subjects: 149 from the dormitory and 51 from the CCF. There were 45 subjects with ARI and 104 asymptomatic close contacts recruited from the purpose-built dormitory, while 51 subjects with confirmed COVID-19 (8 asymptomatic at the time of diagnosis) were recruited at the CCF (Table 1).
Of 200 subjects, 91 and 46 completed second and third rounds of testing, respectively, resulting in 337 sets of tests (Table 2). Because COVID-19-positive migrant workers were rapidly transferred out of dormitories to CCFs, only one subject from the dormitory site completed the planned testing. The median time between the date of diagnosis and the first round of testing in asymptomatic subjects was 0 days (range 0-6 days), while the median time between symptom onset and the first round of testing was 5.5 days (range 0-28 days).
In total, there were 209 (62.0%) positive saliva samples tested via CDC-LDT, 167 (49.6%) positive saliva samples tested via Fortitude 2.1, 150 (44.5%) positive NP swabs tested via cobas SARS-CoV-2 or CDC-LDT, 127 (37.7%) positive SN swabs tested via CDC-LDT, and 119 (35.3%) positive SN swabs tested via Fortitude 2.1. Cycle threshold (Ct) values were lower during the earlier period of infection across all sample types, predominantly for symptomatic infections where the onset of illness could be better estimated (Fig. 1). The likelihood of a positive saliva test was higher than that of NP and SN swabs for samples collected within weeks 1 and 2 from initial diagnosis. The percentage testing positive for SARS-CoV-2 from any samples fell beyond 14 days of symptom onset in symptomatic subjects or from initial diagnosis for asymptomatic subjects, although this was less significant for saliva tested via CDC-LDT (Fig. 2).
Among 30 samples (saliva, NP and SN swabs) tested by NGS, there was a strong correlation between viral genome coverage by NGS and Ct values for SARS-CoV-2. Ten samples showed 100% coverage (7 unique subjects) (Fig. 3). Ten samples (4 saliva, 6 SN swabs) with discordant results between the 2 RT-PCR tests (CDC-LDT positive, Fortitude negative) were positive by NGS. Phylogenetic analyses of sequences of SARS-CoV-2 viral RNA from high-coverage saliva samples of the 7 unique subjects showed Clade O by GISAID nomenclature 19 and lineage B.6 by the PANGOLIN system of nomenclature 20 .
Table 1. Characteristics of recruited subjects (columns: Characteristics; Purpose-built dormitory, n = 149; CCF, n = 51). CCF community care facility, IQR interquartile range. a For asymptomatic subjects recruited at the purpose-built dormitory, all three samples (nasopharyngeal, self-administered nasal, and saliva) were taken at the same time during round 1. For asymptomatic subjects recruited at CCF, subjects were diagnosed with COVID-19 via prior RT-PCR testing and only two samples (self-administered nasal and saliva) were taken during round 1.
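The test-positive percentages reported in the Results can be re-derived from the stated counts, taking the 337 sets of tests (200 + 91 + 46) as the common denominator. The short Python sketch below does only this arithmetic; the variable and dictionary key names are illustrative labels, not terms from the study.

```python
# Subjects completing rounds 1, 2 and 3, as reported in the Results.
subjects_per_round = [200, 91, 46]
total_sets = sum(subjects_per_round)  # 337 sets of tests

# Positive counts per sample type and assay, as reported in the Results.
positives = {
    "saliva, CDC-LDT": 209,
    "saliva, Fortitude 2.1": 167,
    "NP swab, cobas SARS-CoV-2 or CDC-LDT": 150,
    "SN swab, CDC-LDT": 127,
    "SN swab, Fortitude 2.1": 119,
}


def percent(count, total):
    """Percentage rounded to one decimal place, as in the text."""
    return round(100.0 * count / total, 1)


rates = {name: percent(n, total_sets) for name, n in positives.items()}

for name, rate in rates.items():
    print(f"{name}: {positives[name]}/{total_sets} = {rate}%")
```

The computed values (62.0%, 49.6%, 44.5%, 37.7% and 35.3%) match the percentages given in the text.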

Discussion
Our study is concordant with multiple published works supporting saliva as an alternative sample for COVID-19 screening and diagnosis 7-15 , and is one of a minority in which saliva was shown to be more sensitive than the corresponding NP swab 8,9,13 , although the results by Leung et al. (53.7% saliva vs. 47.4% NP swab, 95 subjects) were not statistically different 8 . Several reasons may account for this difference between studies, including enrichment from nasal and oropharyngeal secretions, where the viral load is potentially higher 8,9 , or a higher volume of sample collected, where approximately 10 mL of saliva was collected for testing 13 . Steps were taken to minimize biases and errors: NP swabs were performed by trained healthcare staff, environmental testing of the CAP-accredited laboratory showed no evidence of contamination, most of the samples were tested in the same laboratory, and saliva samples were pre-processed with dithiothreitol before RNA extraction to resolve the issue of saliva specimen viscosity, which can lead to false negatives.
Interestingly but perhaps unsurprisingly, the use of different RT-PCR kits in the present study resulted in different test-positive rates in saliva, suggesting that this can potentially be an important consideration for clinical laboratories, where more sensitive laboratory protocols should be deployed for clinical diagnosis as opposed to mass screening for low-prevalence populations. More validation would be required to confirm this finding.
SN swabs, however, appeared less sensitive than both saliva and NP swabs for the diagnosis of COVID-19. Although they were convenient, less time-consuming to perform relative to saliva collection, and caused less discomfort compared to NP swabs, the markedly lower sensitivity should preclude their use where other sample types can be collected.
In our study, NGS provided efficient whole-genome profiling of SARS-CoV-2 for phylogenetic analysis directly from the clinical samples without culture. NGS detection sensitivity was excellent with a threshold of 1.7% genome coverage or 5 amplicon targets, confirming all CDC-LDT positives tested. Other groups have reported highly sensitive performance for NGS, with detection thresholds of 5% genome coverage or 84 genome-equivalents per mL 21 , or at least 5 SARS-CoV-2 targets 22 . The phylogeny results were consistent with the virus belonging to a viral type (Clade O, lineage B.6) known to be circulating in the geographical regions of Singapore and India.
There are several limitations to our work. Firstly, the study population was confined to young and middle-aged men who were either asymptomatic or had mild disease. The results cannot be extrapolated to other populations (e.g., paediatric), where there is a clear need for alternate sample types to NP swabs. Secondly, we did not extend the follow-up testing sufficiently to determine when saliva viral shedding stopped for the majority of subjects, although this has been explored in other studies 7,10 . Thirdly, we did not test for the difference, if any, between naso-oropharyngeal saliva and saliva obtained from the mouth alone, although it is biologically plausible that the latter would result in lower sensitivity for COVID-19 diagnosis 16 .
In conclusion, our study adds to the body of evidence supporting saliva as a sensitive and less intrusive sample for COVID-19 diagnosis and further defines the role of naso-oropharyngeal secretions and the impact of different RT-PCR kits in increasing the sensitivity of testing. In our study, SN swabs were inferior to both saliva and NP swabs. Our study also provides evidence to support NGS in challenging samples for sensitive COVID-19 molecular diagnosis. Such an NGS workflow can also provide direct-from-sample phylogenetic analysis for public health decision-making, such as contact tracing.
Received: 23 September 2020; Accepted: 22 December 2020
Figure 3. Correlation between viral genome coverage (%) by next-generation sequencing (NGS) and cycle threshold (Ct) values for SARS-CoV-2. NGS is a sensitive method of detecting low-level SARS-CoV-2 virus in clinical samples, and coverage (%) of the genome is correlated with the Ct value determined by RT-PCR. Thirty samples (17 saliva and 11 SN swabs and 1 NP swab) were tested by NGS. Ten samples (4 saliva, 6 SN swabs) with discordant results from the RT-PCR tests were confirmed to be positive by NGS (median genome coverage 29.7%, range 3.3-88.2%). Genome coverage (%) is defined as the proportion of the SARS-CoV-2 genome that has > 1× depth of coverage. The threshold coverage (%) for making a positive call by NGS was established by running 5 negative (by RT-PCR) samples from this study, 11 negative NP swab samples from community testing, and 10 no-template controls (NTC). The threshold was determined to be the detection of 5 amplicons, which corresponds to a coverage of ~ 1.7% of the SARS-CoV-2 genome. Among five saliva samples negative by both RT-PCR methods, one was called positive by NGS (red bar) with a genome coverage of 2.0%. This sample was from a patient with RT-PCR-positive NP swab and nasal swab samples collected at the same time, raising the possibility of a low-level signal detected on NGS. Ct values for samples with positive calls (by either or both Fortitude 2.1 and CDC-LDT RT-PCR assays) are represented on the graph with green circles, and samples negative for SARS-CoV-2 (by both RT-PCR tests) and NTC are represented as open circles. Average Ct values (of 2 targets) or single Ct value (when only 1 target was detected) from the CDC-LDT assay are plotted on the secondary axis.
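The genome coverage metric and the amplicon-count threshold described in the caption above can be sketched in a few lines of Python. This is a hedged illustration rather than the laboratory's pipeline: the function names, the per-base depth list, and the toy 20-base "genome" are hypothetical, the minimum depth is left as a parameter (defaulting to at least 1×), and reading "detection of 5 amplicons" as "at least 5 detected amplicons" is an assumption.

```python
def genome_coverage_percent(depths, min_depth=1):
    """Percentage of genome positions with read depth >= min_depth.

    depths: per-base read depths across the reference genome.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(d >= min_depth for d in depths)
    return 100.0 * covered / len(depths)


def ngs_positive(amplicons_detected, min_amplicons=5):
    """Positive call when at least min_amplicons amplicons are detected.

    The threshold of 5 amplicons comes from the text; treating it as an
    inclusive lower bound is an assumption of this sketch.
    """
    return amplicons_detected >= min_amplicons


# Toy example: a 20-base "genome" with 12 covered positions (not study data).
toy_depths = [0] * 5 + [3] * 10 + [1] * 2 + [0] * 3

print(genome_coverage_percent(toy_depths))  # 60.0
print(ngs_positive(5), ngs_positive(4))     # True False
```

Calling on detected amplicons rather than on read counts alone ties the decision to evidence spread across distinct regions of the genome, which is consistent with the use of negative samples and no-template controls to set the threshold.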