Quantitative Analysis of Salivary Oral Bacteria Associated with Severe Early Childhood Caries and Construction of Caries Assessment Model

To construct a saliva-based caries risk assessment model, saliva samples from 176 children with severe early childhood caries (S-ECC) and 178 healthy (H) children were screened by real-time PCR-based quantification of four selected species: Streptococcus mutans, Prevotella pallens, Prevotella denticola and Lactobacillus fermentum. Host factors including caries status, dmft indices, age, gender, and geographic origin were assessed for their influence on the abundance of the targeted species, which revealed host caries status as the dominant factor, followed by dmft indices (both P < 0.01). Moreover, levels of S. mutans and P. denticola in the S-ECC group were significantly higher than those in the healthy group (P < 0.001 for S. mutans and P < 0.01 for P. denticola). Interestingly, the co-occurrence network of these targeted species in the S-ECC group differed from that in the healthy group. Finally, based on the combined change pattern of S. mutans and P. pallens, we constructed an S-ECC diagnosis model with an accuracy of 72%. This saliva-based caries diagnosis model is of potential value in circumstances where sampling dental plaque is difficult.


Influence of host factors on species levels.
To compare the effect sizes of the various host factors on bacterial levels, permutational multivariate analysis of variance (PERMANOVA) was applied. As shown in Table 3, among the five host factors (age, gender, caries status, dmft indices and geographic origin), caries status had the strongest effect on bacterial levels, followed by dmft indices (F = 43.757 and F = 3.420, respectively; both P < 0.01). Age, gender, and geographic origin had no significant effect on the levels of these species in this analysis (P > 0.05). In comparisons of individual species, however, geographic origin was associated with a differential load: the amounts of S. mutans and P. pallens from the northern city were significantly higher than those from the southern city (all P < 0.001, Mann-Whitney U test). No significant difference in these species' levels was found between genders (P > 0.05, Mann-Whitney U test).
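To make concrete what the PERMANOVA pseudo-F statistic measures, the following is a minimal one-factor sketch in Python. It is an illustration only, not the analysis pipeline used in this study: the Euclidean distance, the function names (`pseudo_f`, `permanova`), the permutation count and the toy values are all assumptions, since the distance measure and software are not specified here.

```python
import random
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (an assumed choice of distance)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def pseudo_f(samples: Sequence[Sequence[float]], labels: Sequence[str]) -> float:
    """One-factor PERMANOVA pseudo-F computed from pairwise squared distances."""
    n = len(samples)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares: pairwise squared distances divided by N.
    ss_total = sum(
        sq_dist(samples[i], samples[j]) for i in range(n) for j in range(i + 1, n)
    ) / n
    # Within-group sum of squares: same quantity inside each group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            sq_dist(samples[i], samples[j])
            for k, i in enumerate(idx)
            for j in idx[k + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(
    samples: Sequence[Sequence[float]],
    labels: Sequence[str],
    n_perm: int = 999,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (observed pseudo-F, permutation P value)."""
    rng = random.Random(seed)
    observed = pseudo_f(samples, labels)
    shuffled: List[str] = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between labels and samples
        if pseudo_f(samples, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy values (hypothetical, not study data): two species measured in six children.
toy_samples = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
toy_labels = ["H", "H", "H", "S-ECC", "S-ECC", "S-ECC"]
f_stat, p_value = permanova(toy_samples, toy_labels, n_perm=199, seed=1)
print(f_stat, p_value)
```

A larger pseudo-F means that more of the total variation lies between groups than within them, which is the sense in which caries status is described above as having the strongest effect.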

Comparisons of the levels of specific species between healthy and caries-active children.
Levels of the four species were compared between the S-ECC and healthy groups (Table 4). P. denticola and S. mutans were significantly more abundant in the S-ECC group than in the healthy group (P < 0.01 and P < 0.001, respectively, Mann-Whitney U test). In contrast, no significant difference was found between the healthy and S-ECC groups for P. pallens and L. fermentum (P > 0.05). In addition, no significant difference in these species' levels was found between genders (P > 0.05, Mann-Whitney U test) (Table 4).
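The group comparisons above rely on the Mann-Whitney U test. As a rough illustration of how the statistic is obtained, the sketch below ranks the pooled values, computes U for the first group, and derives a two-sided P value from the normal approximation with a tie correction. The function names and the toy values are hypothetical, no continuity correction is applied, and the study's own software may differ in these details.

```python
import math
from typing import Dict, List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U for group x, two-sided P from the normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction for the variance of U.
    counts: Dict[float, int] = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Toy values (hypothetical log10 copy numbers, not study data).
healthy = [3.1, 3.4, 2.9, 3.0, 3.3]
s_ecc = [4.2, 3.9, 4.5, 3.8, 4.1]
print(mann_whitney_u(s_ecc, healthy))
```

For samples as large as those in this study the normal approximation is usually considered adequate; for very small groups an exact test would be preferable.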

Correlation between species level and dmft indices.
To understand the correlation between the selected species and dmft indices, we recorded the number of decayed, missing and filled teeth for each individual and calculated the dmft index. A strong positive correlation between S. mutans level and dmft indices was found (Table 5; Spearman's rank correlation coefficient r = 0.600, P < 0.001). No significant correlation between the levels of the other species and dmft indices was observed.
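Spearman's rank correlation coefficient, as used above, is the Pearson correlation computed on ranks. The following Python sketch shows that calculation with average ranks for ties; the function names and the toy values are hypothetical and are not taken from the study data.

```python
import math
from typing import List, Sequence


def midranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = midranks(x), midranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Toy values (hypothetical): dmft index and log10 S. mutans copies per child.
dmft_scores = [0, 0, 6, 8, 10, 12]
s_mutans_level = [2.8, 3.1, 3.0, 4.0, 4.4, 4.1]
print(spearman_rho(dmft_scores, s_mutans_level))
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distributions typical of bacterial load data.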
The co-occurrence networks of the targeted species in caries and health.
A co-occurrence network can reveal the ecological relationships between bacterial species in a microbial community. In the healthy group, there were very strong positive correlations among the levels of all four species, S. mutans, L. fermentum, P. pallens, and P. denticola (Fig. 2A). However, the correlation pattern changed in the S-ECC group (Fig. 2B). Specifically, the correlation between S. mutans and L. fermentum was weakened (the coefficient changed from 0.997 to 0.78), and the correlation between S. mutans and P. pallens disappeared (the coefficient changed from 0.999 to 0.103). This result suggested that the salivary community structure differed in the disease state, and the weakened or absent connections among the tested members in caries samples indicated a more diversely distributed community structure.

Building a diagnosis model of S-ECC using P. pallens and S. mutans.
To probe which of the four species exerts the greatest effect on model performance, a series of models was built, each from a single species.
The AUC values of the models derived solely from P. denticola, L. fermentum, P. pallens or S. mutans were 0.47, 0.51, 0.57 and 0.61, respectively (Fig. 3A-D). According to the "rfcv" function in the randomForest package, the two top-ranking taxa among the selected species (P. pallens and S. mutans) led to a reasonably good classification of S-ECC status. The two-species model built from P. pallens and S. mutans showed higher predictive power in distinguishing the S-ECC group from the healthy group (AUC = 0.72; Fig. 3E). Specifically, by measuring the absolute amounts of P. pallens and S. mutans, we could identify hosts with severe caries with an accuracy of 72%.

Table 3. Influence of host factors on the abundance of selected species. P < 0.05 was considered statistically significant. Asterisks denote statistical significance (**P < 0.01).

Discussion
In this study, we aimed to utilize qPCR technology, a simpler, cost-effective and time-saving method, for accurate, sensitive and rapid quantification of the selected species in the salivary microbiome 26,27. We found that the level of S. mutans in the S-ECC group was significantly higher than that in the healthy group (P < 0.001), which is consistent with previous studies 18,28. Dysbiosis of the oral microbiome driven by an overproduction of acid, for example by S. mutans, can result in increasing proportions of acidogenic and aciduric species 29. However, no significant difference was found between the healthy and S-ECC groups in the level of L. fermentum, which is also a recognized acidogenic caries pathogen. In contrast, other studies reported that the levels of Lactobacillus spp. in plaque were significantly elevated in children with severe ECC 18,30. These conflicting results might be attributed to different sampling methods: our study was based on saliva samples, while previous studies used carious dentin samples. This suggests that L. fermentum might contribute more at the advancing front of dentin caries, whereas saliva-based quantification of L. fermentum could not detect a significant difference between the groups.
Our previous study tracked the changes in microbiota over time from healthy status to caries onset and caries progression, developed a model for caries prediction, and suggested a panel of Prevotella species that may be closely related to caries 21,22. The association of Prevotella with caries has also been verified by other studies 24,25. These findings indicate that overexpressed collagenases for proteolytic metabolism in Prevotella species may lead to the progression of dental caries 23. Therefore, the design of this study included P. denticola and P. pallens, two of the seven most discriminant Prevotella species in the prediction and diagnosis model for ECC 22. Although both Prevotella spp. contribute to a great extent to caries risk prediction, only P. denticola showed a significantly elevated absolute amount in the S-ECC group (P < 0.01) by real-time PCR-based quantification; the amounts of P. pallens were nearly the same in the two groups. This result suggested that the amount of a species might not be directly linked to its effect on the construction of a disease prediction model.

Table 5. Correlation between the levels of targeted species in saliva and the dmft index scores. P < 0.05 was considered statistically significant. Asterisks denote statistical significance (***P < 0.001).
Scientific Reports | (2020) 10:6365 | https://doi.org/10.1038/s41598-020-63222-1

Caries status and dmft indices were the top two factors in defining the absolute abundance of the selected species, whereas age, gender and geographic origin did not significantly influence bacterial levels. This suggested that even when individuals came from different backgrounds, disease status could still discriminate the S-ECC community structures from those of the H group. Interestingly, in healthy children, there were very strong positive correlations between every pair of the four targeted species. In contrast, in children with S-ECC, the very strong positive correlation between S. mutans and L. fermentum found in healthy children was significantly weakened, and the very strong positive correlation between S. mutans and P. pallens found in healthy children disappeared altogether. This finding is consistent with our former result that healthy microbiomes were more conserved, while caries microbiomes were more diversely distributed 21,22. Because the members of the healthy group resembled one another more closely, we were able to detect more consistency in the close relationships among bacterial members within this group. For the caries group, however, a shifted balance of microbiota takes place in the oral environment 21,22,31, where any bacterial member with acid-producing and acid-resisting ability could potentially initiate caries. This might explain why the links between the levels of the chosen bacterial members, especially between acknowledged caries-associated bacteria such as S. mutans and L. fermentum, or S. mutans and P. pallens, were weakened or absent in the S-ECC group.

Figure 2. The co-occurrence networks of the targeted species in each of the two host groups. The connection lines between two nodes indicate positive correlation between the levels of two species, with color representing the degree of correlation. There was a very strong positive correlation between every pair of species in the healthy group (A). In the S-ECC group, however, the correlation between S. mutans and L. fermentum was weakened, and the correlation between S. mutans and P. pallens was no longer present (B).
In terms of the caries diagnosis model, no single species achieved satisfactory diagnostic power (AUC 0.47-0.61); even P. denticola, which was significantly differentially distributed, yielded the lowest AUC of 0.47. However, the combination of S. mutans and P. pallens resulted in a caries assessment model with an accuracy of 72%, which was nearly equal to that of our former ECC prediction model (74% accuracy) based on eight marker Prevotella species via pyrosequencing 22. In this study, based on a cross-sectional experimental design, we aimed to monitor levels of specific potential caries-associated bacterial markers and evaluate their contribution to the construction of a caries diagnosis model. Notably, levels of P. denticola (not P. pallens) were significantly higher in S-ECC, yet the model finally constructed was derived from P. pallens and S. mutans. This suggested that the abundance of a species might not be the sole predictor for caries 32 and that links among species can be exploited to discriminate caries status in the models. In addition, our result indicated that S. mutans played a significant role in the caries diagnosis model, and a model that combined S. mutans and P. pallens reached an accuracy of 72%. However, the caries prediction model we built before was composed of a panel of seven Prevotella species with an accuracy of 74% 22, and S. mutans did not contribute to that model. This indicated that the dominant pathogens differ between caries onset and caries progression.
In this study, using the rapid, accurate and economical qPCR technique, we developed an efficient and economical saliva-based S-ECC risk assessment model. Traditional methods of diagnosing ECC include visual-tactile detection combined with bitewing radiography. In addition, radiography, transillumination, the ECM device, and fluorescence-based methods are useful for caries detection 33. However, all these methods require a degree of cooperation from the child and an on-site examination at the dental chair, and they can also be time-consuming and laborious 33,34. Therefore, the caries diagnosis model built here can benefit preschool-age children, especially those who are anxious and thus unable to cooperate with oral examinations, and can also be adopted for remote screening or home-based surveys of caries risk in epidemiological studies.

Materials and methods
Selection of subjects for this study. The children enrolled in this study were from an oral health census (June 2017) in kindergartens in the southern city of Guangzhou (Guangdong Province) and the northern city of Qingdao (Shandong Province), which are separated by two thousand kilometers in mainland China. After an oral health survey, 354 children (3-5 years of age), including 190 boys and 164 girls, were chosen for saliva sample collection. All children were unrelated individuals of both genders 21. According to the number of carious teeth assessed with the decayed, missing, and filled teeth (dmft) index, 176 children were classified as S-ECC (dmft ≥ 6) and 178 children were classified as healthy (dmft = 0). All guardians of the children were informed of the nature of the experiment and gave written permission for participation. The written permission and study design were approved by the Ethical Committee of Qingdao University (Qingdao, China). All experiments were performed following relevant guidelines and regulations. No child wore a removable appliance or had taken antibiotics in the preceding three months. Children with systemic or other oral diseases such as mucosal diseases and/or were excluded 21.
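The grouping rule stated above can be expressed as a small Python function. This is a sketch under stated assumptions: the function names are hypothetical, and dmft values from 1 to 5 match neither threshold given in the text, so the sketch returns None for them rather than guessing how such children were handled.

```python
from typing import Optional


def dmft_index(decayed: int, missing: int, filled: int) -> int:
    """dmft index: number of decayed, missing and filled primary teeth."""
    if min(decayed, missing, filled) < 0:
        raise ValueError("tooth counts must be non-negative")
    return decayed + missing + filled


def classify_child(dmft: int) -> Optional[str]:
    """Apply the thresholds from the text: S-ECC if dmft >= 6, healthy if dmft == 0."""
    if dmft < 0:
        raise ValueError("dmft must be non-negative")
    if dmft >= 6:
        return "S-ECC"
    if dmft == 0:
        return "H"
    return None  # dmft 1-5: neither group as defined in the text


# Hypothetical example: 4 decayed, 1 missing, 2 filled teeth.
print(classify_child(dmft_index(4, 1, 2)))
```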
Sample collection and DNA extraction. Clinical examinations and caries assessments, as well as salivary sample collection, were carried out by dentists previously trained in caries assessment and sampling procedures. Unstimulated whole saliva (2 mL) was collected from each child into a tube containing an equal volume of lysis buffer (50 mM EDTA, 50 mM sucrose, 50 mM Tris, pH 8.0, 100 mM NaCl and 1% SDS) 35. Salivary samples were stored at −80 °C before DNA extraction 21. The extraction of DNA from bacterial cultures was performed using an optimized protocol based on the Qiagen DNeasy Blood & Tissue DNA kit (QIAGEN, Hilden, Germany) according to the manufacturer's instructions. DNA concentrations were determined using a Qubit Fluorometer 2.0 (Life Technologies, Grand Island, NY, USA). The purity of the extracted DNA was measured by the Qubit dsDNA HS Assay Kit (Invitrogen, Carlsbad, California, USA) following the manufacturer's instructions, with an inclusion criterion of above 1.8. Electrophoresis of DNA was performed to assess DNA integrity under ultraviolet light. The extracted DNA samples were stored at −80 °C before further processing.
Design of qPCR primers. Detection and quantification of the selected species in salivary samples were performed by qPCR. The presence of S. mutans, P. pallens, P. denticola and L. fermentum was detected by specific qPCR primers. Two pairs of primers (for S. mutans and L. fermentum) were taken from published primer protocols (Table 1). Another two pairs of new primers (for P. pallens and P. denticola) were designed for qPCR using AlleleID 6.0 (Premier Biosoft, Palo Alto, CA, USA) and then analyzed in BLASTn (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch) 1. The species specificity of the primers for P. denticola and P. pallens was tested by conventional PCR (Fig. 1). The thermal conditions of the qPCR reactions are listed in Table 2.
Quantitative real-time PCR. Each reaction mixture (20 μL) was composed of 10 μL of SYBR Green Master Mix, 0.5 μL of each forward/reverse primer (10 μM), 5 μL of sterilized DNase-RNase-free water, and 4 μL of DNA sample. The qPCR reaction was performed in MicroAmp Fast Optical 96-well reaction plates (Applied Biosystems, Foster City, CA, USA) using a LightCycler 480II (Roche, Basle, Switzerland). Each sample was run in triplicate, and a negative control (ddH2O as template) was included in each experiment 36.

Construction of risk assessment model for S-ECC. First, the Random Forests method was employed to discriminate between diseased and healthy subjects from the southern city cohort. The receiver operating characteristic (ROC) curve was used to evaluate the diagnostic value of the bacterial candidates in discriminating between diseased and healthy subjects. According to the "rfcv" function in the randomForest package, the two top-ranking taxa among the selected species led to a reasonably good classification of ECC status. Model performance was then assessed using a 10-fold cross-validation approach 22. Second, the southern city cohort was used as a training dataset and the northern city cohort as a testing dataset to evaluate the discriminatory power of the model, which was further evaluated using the area under the ROC curve (AUC).
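Two pieces of the evaluation described above are easy to illustrate in isolation: splitting subjects into folds for cross-validation, and computing the AUC from predicted scores. The Python sketch below does both with the standard library only. It does not reproduce the Random Forests model itself; the function names, the random seed and the toy scores are hypothetical, and the scores stand in for whatever probabilities a trained classifier would assign to the testing cohort.

```python
import random
from typing import List, Sequence


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle subject indices and deal them into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a diseased subject (label 1) receives a
    higher score than a healthy subject (label 0); ties count one half."""
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy scores (hypothetical): model output for eight children in a testing cohort.
test_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_scores = [0.9, 0.7, 0.6, 0.3, 0.8, 0.4, 0.2, 0.1]
print(roc_auc(test_labels, test_scores))
print(k_fold_indices(25, k=10, seed=0))
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why single-species values near 0.5 are read as uninformative while a value of 0.72 indicates moderate discriminatory power.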