Following the development of genome analysis technology using next-generation sequencing, the relationships between human microbiota and disease are being clarified. Even in the oral cavity, it has become clear that an imbalanced microbiota can contribute to various oral diseases, such as periodontal disease, dental caries, and halitosis1,2,3,4,5. The microbial composition of saliva has also been reported to be associated with autoimmune diseases, such as inflammatory bowel disease, rheumatoid arthritis, and many other systemic diseases6,7,8. Therefore, the microbial composition of saliva has been proposed for use as a risk assessment measure for many oral and systemic diseases. Saliva collection usually takes about 1 to 5 minutes per person. Therefore, when collecting samples from a large number of participants such as large-scale cohort studies, a simpler alternative method that shortens the collection time is required. As an alternative to saliva collection, a solution obtained by mouth washing using a commercially available mouth rinse (containing 15% ethanol) can be used9,10, but since it contains ethanol, this may also cause problems such as irritation to the oral mucosa or acute alcohol intoxication due to accidental ingestion of child. The turbidity of mouth-rinsed water, which is a liquid obtained by rinsing the oral cavity with water, is reported to reflect both oral health state and malodor11,12. The microbiome of mouth-rinsed water is expected to be influenced by that of the saliva and the tongue; however, the similarity of these microbial profiles has not yet been investigated.

In this study, as a first step to show the validity of mouth-rinsed water as an alternative method to saliva collection for the analysis of oral microbiome, we collected mouth-rinsed water, stimulated saliva, unstimulated saliva, and tongue coating from 10 systemic healthy participants, and compared the microbial diversity and composition of these samples using next-generation sequencing of 16S rRNA-encoding genes. We clarified that the microbiome of mouth-rinsed water was similar to that of unstimulated and stimulated saliva from the same participant, and showed that collecting mouth-rinsed water is a valid alternative sampling method to saliva collection for analyzing the oral microbiome.


Comparison of oral microbial diversity

Changing microbial diversity of the oral microbiome is an important indicator of many diseases, such as dental caries, periodontal disease, and diabetes13,14,15. Therefore, we firstly compared microbial diversity among the unstimulated saliva (US), stimulated saliva (SS), tongue coating (TC), and mouth-rinsed water (MW). Shannon’s diversity index, an indicator of microbial diversity, showed comparable results among US, SS, and MW samples but was significantly lower in TC samples (Fig. 1A). The median number of OTUs in MW was slightly lower compared with SS and US, although no statistically significant differences were detected between MW and SS or US, (SS–MW, p = 0.0581; US–MW, p = 0.264) (Fig. 1B).

Figure 1
figure 1

Alpha-diversity of oral microbiome in each sampling methods. Boxplots show the shannon diversity index (A) and the number of observed OTUs (B) of each sampling methods with the number of sequences rarefied to 10000 reads per sample. TC, Tongue coating; US, unstimulated saliva; SS, stimulated saliva; MW, mouth-rinsed water. Statistically significant differences are marked with asterisks (Nemenyi test, *p < 0.05 **p < 0.005 ***p < 0.0005).

Comparison of oral microbiome profiles for each sampling method

Subsequently, in order to compare the microbiome profiles of each sampling method, Spearman’s rank correlation coefficients at the species level between each sampling methods were compared. The results showed that the correlation coefficients between MW, SS, and US were significantly higher than that between MW and TC (Fig. 2, Supplementary Table S1). For all participants, the correlation coefficients between MW and SS/US were higher than those between MW and TC (Supplementary Table S2). The correlation coefficient between SS and US was also high (r = 0.846 ± 0.033). Similar results were obtained when using weighted or unweighted UniFrac distances (Supplementary Fig. S1). Furthermore, as a result of hierarchical clustering analysis, samples from the same individual were clustered except B05_TC (Fig. 3). Regardless of the status of dental caries or periodontal disease, MW, SS, and US were collected for all participants. Finally, we tested whether the relative abundance of individual bacteria changes depending on the collection method used. As a result, no genera were found to show significantly different relative abundances among the three sampling methods (Friedman test q < 0.10), whereas at the species level, eight species showed significant differences among the three sampling methods, and these species also showed significant differences between MW and US or SS (Table 1, Nemenyi post-hoc test p < 0.05).

Figure 2
figure 2

Spearman Rank Correlation between each sampling method. TC, Tongue coating; US, unstimulated saliva; SS, stimulated saliva; MW, mouth-rinsed water. Statistically significant differences are marked with asterisks (Nemenyi test, *p < 0.05 **p < 0.005 ***p < 0.0005).

Figure 3
figure 3

Cluster analysis based on Spearman Rank Correlation (single linkage method) . The color of the sample name is different for each subject. TC, Tongue coating; US, unstimulated saliva; SS, stimulated saliva; MW, mouth-rinsed water.

Table 1 Bacterial species that showed significantly difference of relative abundance between mouth-rinsed water, and unstimulated saliva or stimulated saliva.


In this study, we showed that the microbial diversity and composition of MW was similar to US and SS but different from that of TC in 10 systemic healthy participants. These results indicate that MW sampling is a valid alternative method to saliva sampling for analyzing oral microbiome.

The microbial diversity of MW was not different from that of US and SS, indicating that it is possible to collect the entire microbiome from MW as well as saliva samples. On the other hand, the microbial diversity of TC was significantly lower than that obtained by the other three methods (Fig. 1). This result was consistent with that found in a previous study16.

The microbial composition of US was similar to MW and SS from the same individual, but B10 had a relatively lower similarity (US–MW: r = 0.736, US–SS: r = 0.771). The reason for this may be that the abundance of Neisseria flava in US was higher compared with SS and MW (the relative abundance of Neisseria flava was US, 0.255; SS, 0.094; and MW, 0.086). Therefore, the abundance of other minor species was changed.

As a result of hierarchical clustering, the same individuals were clustered. Therefore, it is considered that the characteristics of each participant’s oral condition are reflected even when the sample collection method is different. The reason why B05_TC did not cluster with the sample from the same participant in Fig. 3 is that the abundance of Neisseria flavescens in TC was higher compared with the other samples, and the abundance of some minor species in TC was considered to be low (the relative abundance of Neisseria flavescens was US, 0.162; SS, 0.157; MW, 0.209; and TC, 0.448). The relative abundance of most species also showed no significant differences among MW, US, and SS; however, some species, such as Veillonella rogosae, Haemophilus haemolyticus, and Streptococcus infantis showed significant differences. Further analysis in a large scale cohort study should be required for understanding the reason, but attention should be paid to these bacteria when comparing the results of salivary microbiome analysis even though there were no significant differences in the genera to which these bacteria belong.

It has been reported that the similarity of oral microbiome profile between US and SS were controversial17,18. In this study, the oral microbial diversity and composition of US and SS from the same participant were comparable. In the study showed the similar pattern of microbial profile between US and SS, the sample collection method was a spitting method17 (used in this study), whereas the other study used a paper pointing for unstimulated saliva collection18. Therefore, the method used for the collection of unstimulated saliva may affect the oral microbial profile.

MW is useful as a method for collecting oral samples from participants who have difficulty collecting saliva, such as dry mouse patients and elderly people. However, in order to prove the usefulness of MW for these participants, its evaluation for participants with low saliva flow, such as individuals who suffer with a dry mouth, is also necessary.

In conclusion, we have shown that the oral microbiome of MW is similar to that of US and SS. Since previous researchers have reported that the salivary microbiome profile can reflect oral and whole-body systematic diseases13,19,20, MW is expected to become a beneficial tool for easy evaluation of various disease risks.


Ethics statement

This study was given ethical approval by the ethics committee of Chiyoda Para Medical Care Clinic (Tokyo, Japan, Issuing number: UMIN000031334). All participants understood the purpose of the study and provided informed consent. All experiments were performed in accordance with approved guidelines.

Sample collection and dental examination

Three male and seven female systemic healthy volunteers aged 24 to 40 years (mean ± s.d., 31.1 ± 4.8 years) were recruited at the Hiyoshi Oral Health Clinics. All participants met the inclusion and exclusion criteria. Inclusion criteria were: (a) non-smoker, (b) non-denture wearer, and (c) non-brace wearer. Exclusion criteria were: (a) systemic disease, (b) received antibiotics in the last 6 months, and (c) pregnancy or breastfeeding.

Sample collection and dental examinations was conducted at the Hiyoshi Oral Health Clinics. Prior to sample collection, the participants were instructed not to brush their teeth from the previous night to the time of sampling, and were prohibited from eating or drinking for at least 1 hour prior to sampling. A minimum of 2 mL unstimulated saliva was collected by allowing saliva to accumulate on the floor of the mouth followed by spitting into a specimen tube every 60 seconds21. Following the collection of unstimulated saliva, participants were asked to swish their mouth vigorously for 10 seconds with 3 mL sterilized water, and then to spit into another specimen tube. Following the collection of this mouth-rinsed water, individuals were asked to chew paraffin gum to stimulate saliva, samples of which were then collected in further specimen tubes every 60 seconds. Tongue-coating samples were collected by scraping the dorsum of the tongue with Catch-All™ Specimen Collection Swabs22 (Epicenter Biotechnologies, Wisconsin, USA). All samples were placed in a freezer within 10 minutes of collection and stored at −80 °C until use.

After samples collection, the numbers of present, decayed, missing and filled teeth were examined. The number of decayed, missing and filled teeth signifies teeth with caries experience, and represents the caries history of the individual. The periodontal pocket depths and bleeding on probing (BOP) at four sites (mesiobuccal, distobuccal, mesiolingual, and distolingual) of all teeth were measured using a periodontal pocket probe following sample collection. Summary of the oral health condition of the participants is shown in Table 2.

Table 2 Summary of participants' information.

DNA extraction and sequencing of 16S rRNA gene amplicons

Genomic DNA was isolated from the collected samples using the nexttec™ 1-Step DNA Isolation Kit (nexttec Biotechnologie GmbH, Leverkusen, Germany). PCR was conducted using universal primers (27Fmod and 338R) for 16S rRNA gene sequencing, as previously described6. PCR was performed using Ex Taq polymerase (Takara Bio, Shiga, Japan) and approximately 20 ng of template DNA.

Thermal cycling was performed in a Veriti Thermal Cycler (Life Technologies Japan, Tokyo) with the following cycling conditions: initial denaturation at 96 °C for 2 min, followed by 25 cycles of denaturation at 96 °C for 30 s, annealing at 55 °C for 45 s, extension at 72 °C for 1 min, and final extension at 72 °C. PCR amplicons were purified using AMPure XP magnetic purification beads (Beckman Coulter, Brea, CA, USA) and quantified using the Quant-iT PicoGreen dsDNA Assay Kit (Life Technologies Japan). After quantification, mixed samples were prepared by pooling approximately equal amounts of each amplified DNA and sequenced using MiSeq Reagent Kit V3 (300 × 2 cycles) and a MiSeq sequencer (Illumina, CA, USA), according to the manufacturer’s instructions.

Sequence analysis

The sequenced paired-end reads were merged using VSEARCH (version: 1.9.3, options: -fastq_maxee 9.0, -fastq_truncqual 7, -fastq_maxdiffs 300, -fastq_maxmergelen 450, -fastq_minmergelen 250)23. Merged fragments were processed with Cutadapt (version: 1.16, options: -O 13 -m 50 -M 450 -q 0 -e option is not used in cut 5′ primer, used 0.3 in cut 3′ primer) and Bowtie2 (version: 2.1.0, options: -fast-local) for trimming the primer sequences and filtering PhiX fragments, respectively24,25. To ensure the quality of the processing reads, fragments with an average Phred quality score of less than 25 were removed using an in-house script. From the filter-passed fragments, 10,000 fragments were randomly subsampled from each sample to remove the effect of sequence-depth bias and then used for the following analysis. The selected fragments were primary dereplicated to remove redundancy, then the dereplicated fragments were clustered into operational taxonomic units (OTUs) using VSEARCH (version: 2.9.1, options:–id 0.99–strand plus). For determining genus/species level taxonomies, the OTUs were mapped onto a reference sequence database using Bowtie2 (version: 2.1.0, options: -I 280 -X 400–fr–no-discordant–phred33 -D 15 -R 10 -N 0 -L 22 -i S,1,1.15). We originally prepared the database by merging the SILVA LTP database (version: 132) and the HOMD 16 S rRNA RefSeq database (version: 15.1)26,27. If the edit distance of assigned taxonomy was more than 1% in species assignment or more than 3% in genus assignment, they were assigned as “Unclassified”.

Statistical analysis

Alpha diversity was measured by the number of OTUs and the Shannon diversity index, which was calculated using an in-house Python script. Beta diversity was measured by Spearman’s rank correlation distance and UniFrac distance, which were calculated using SciPy (version: 0.17.0) and scikit-bio (version: 0.5.2), respectively. An input phylogenetic tree for calculating UniFrac distances was prepared as follows. Multiple sequence alignment of clustered OTUs was performed using MAFFT (version: 7.245) and then a phylogenetic tree was constructed using FastTree (version: 2.1.8), followed by the addition of a midpoint by scikit-bio28,29. For the comparison of relative abundance in terms of genera and species, the Friedman test (a non-parametric test for multiple comparisons) and the Benjamini-Hochberg false discovery rate correction (FDR-BH) for posterior correction were adopted. The Friedman test and the Nemenyi post-hoc test were performed using the R package PMCMR (version 4.2, available at

Nucleotide sequence accession number

The microbiome analysis data (16S rRNA gene sequences) have been deposited in the DNA Data Bank of Japan (DDBJ) Sequence Read Archive under accession number DRA008586).