The human oral cavity hosts a diverse microbial community comprised of bacteria, fungi, protozoa, archaea, and viruses1. The vast bacterial biota includes pathogenic bacteria that are responsible for local and systemic diseases2. For example bacteria have been shown to be responsible for oral ailments such as dental caries3, and periodontal diseases4. The scope of bacteria causing oral ailments is also vast with conditions such as mild gum disease and gingivitis affecting over 90% of adults5 at some point in their lives. Oral bacteria may also be related to diseases not localized to the oral cavity, such as diabetes6, cardiovascular disease7, chronic respiratory conditions8, rheumatoid arthritis9, malignancy10,11,13,14, preterm labor and low birth weight12. In addition to bacteria, the oral cavity hosts a variety of fungal species15. Despite this, the fungal constituents of the oral microbiome have thus far been understudied when compared to bacteria, but are now emerging as being important in human disease. For example, fungi have been recently shown to affect treatment outcomes in immunocompromised individuals16 as well as the development of colorectal cancer17. Moreover, studies also indicate that fungi operate together with bacteria in oral infections18.

Since the oral cavity is a potential reservoir for organisms implicated in oral and systemic health, it is essential to determine the appropriate molecular assays to study its entire microbiome including the fungal communities. Initially, studies of the oral microbiome focused exclusively on pathogenic organisms and utilized culture-based techniques. However, with the knowledge that the oral microbiome is dominated by non-culturable species19, use of culture-independent molecular methods has increased. One of the most commonly used techniques involves high-throughput, massively parallel amplicon-based sequencing and subsequent taxonomic assignment based on publicly available reference databases20. The characterization of the microbial communities using this platform can be influenced at several steps including sample collection, DNA extraction, PCR amplification, sequencing, data processing, and statistical analyses21. Additionally, each of these steps has associated labor and cost factors that may influence a researcher’s decision to use one method over another22. Previous research has shown that oral sampling techniques such as saliva, buccal swab, and oral rinse collection may influence overall DNA quantity and spectrum of microbes detected23,24,25. It has also been suggested that next-generation sequencing (NGS) may produce variable results particularly when analyzed using different classification algorithms26. Given that these processes can influence the understanding of microbial communities, investigating protocols for characterizing the biota of the oral cavity is important to allow inter-study comparisons.

Efficient and consistent methods of DNA extraction are central to accurately characterizing these communities. A number of studies have begun to examine the oral microorganisms using NGS with a variety of DNA extraction methods27,28,29,30,31,32,33. In addition, a large number of studies have collected and processed Scope mouthwash samples for genomic DNA that might be suitable for microbiome studies. The purpose of this investigation was to compare the most recent techniques to discern the most effective method of DNA extraction utilizing both enzymatic and mechanical lysis techniques across various human oral samples in order to determine the methods with the highest DNA yield and the most consistent results for characterization of both bacterial and fungal communities found in the oral cavity.


For this study, eight DNA extraction methods, utilizing different combinations of enzymatic and mechanical lysis techniques, were compared across six oral samples (Table 1). The methods were evaluated for DNA yield and variation in the detected oral microbiome. There was a significant difference in DNA quantity among the eight extraction methods (p < 0.001). The phenol-chloroform extraction technique (Method 1) generated the highest DNA yield (Fig. 1) while the UltraClean Microbial DNA Isolation Kit (Method 7) and the UltraClean Microbial DNA Isolation Kit (Method 8) resulted in significantly lower DNA yields (p < 0.01) than the three non-bead-beating methods (Table 2).

Table 1 Methods of DNA extraction used in this study, including additional enzymatic and mechanical (bead-beating) cell disruption steps.
Figure 1
figure 1

DNA Quantitation for each isolation method. DNA concentrations (ng/μl) of six oral samples were calculated after eight different DNA extraction methods described in Table 1 and corresponding to the categories shown on the x-axis. All methods used the same starting quantities of sample and final volumes were equal; concentrations are proportional to total DNA recovered. Statistical analyses of the differences in DNA amounts recovered are shown in Table 2.

Table 2 Tukey HSD post-hoc results of DNA yield between each DNA extraction method. Significant p-values are in bold.

DNA from the 48 DNA samples were amplified using 16 S rRNA V6 barcoded primers and recently described primers for the ITS1 region and submitted for Illumina NGS. Raw sequences were processed for quality control and chimera removal, resulting in a total of 373,840 bacterial reads (average of 7,788 ± 1,837 reads per sample), and 363,881 fungal sequence reads (average of 5,965 ± 1,579 reads per sample). The bacterial community composition and normalized abundances in the oral cavity are displayed in the heat map (Fig. 2A). Dendrogram clustering based on the top 20 species shows a tendency of samples to cluster by original subject. DNA extraction method did not show clustering. Community clustering based on the top 20 fungi (Fig. 2B) displays a closer distance between samples than seen with the bacterial 16 S data. However, the fungal heatmap also indicated that samples tended to cluster together based on subject and not extraction method.

Figure 2
figure 2

Heat Map of Bacterial and Fungal Species. (A) Bacterial heatmap. The top 20 bacterial OTUs for six oral samples processed by eight different extraction methods were used to construct a heatmap. OTUs were classified to species or lowest possible taxonomic level. Heatmap shows that samples cluster by patient (SampleID, 2nd row), not extraction method (Method, 1st row). (B) Fungal Heatmap. The top 20 fungal OTUs were used to construct a heatmap for the same samples described in panel A. Fungal OTUs were classified to species or lowest possible taxonomic level. Clustering demonstrated predominant grouping by individual (SampleID, 2nd row) vs. method of extraction (Method, 1st row). Legends to the left of the figures indicate color scheme for log transformed OTU abundance, method and sample in descending order.

Seven bacterial phyla were identified; Firmicutes, Bacteroidetes, Actinobacteria, Fusobacteria, Proteobacteria, Spirochaetes, and the candidate phylum TM7 (also known as Saccharibacteria), with the majority of OTUs assigned to Firmicutes and Bacteroidetes. At the genus/species level, Streptococcus dominated the oral cavity, consistent with published studies27. Rothia mucilaginosa34, an opportunistic pathogen in immunocompromised patients and Prevotella veroralis, a biofilm forming opportunistic pathogen35, were the second and third most abundant species, respectively (Fig. 2A).

The oral mycobiota was dominated by species from Ascomycota, Basidiomycota, an unidentified fungal phyla, and Zygomycota (order based on cumulative dominance across all samples). Constituents of the Candida genus were amongst the top identified OTUs consistent with previous reports on the oral mycobiome36. Several species of Malassezia were also identified in the oral cavity, including Malassezia restricta (Fig. 2B), a common lipid dependent human pathogen that is usually found on skin37.

Significant variation in sample evenness, based on the Shannon diversity index, was observed in the bacterial p < 0.001 and fungal p < 0.001 assays (Fig. 3A,B respectively). There was no significant difference in the Shannon diversity index among DNA extraction methods for either the bacterial p = 0.87 or fungal assays p = 0.93 (Fig. 3C,D, respectively). Similarly, β-diversity showed distinct clusters formed on the basis of subject in both the bacterial p < 0.001 and fungal p < 0.001 community analyses, which explained nearly all of the inter-sample community variance, R2 = 0.80 and R2 = 0.84, respectively (Fig. 4A,B). β-diversity analyses did not show significant sample clustering based on extracted method for either bacteria R2 = 0.086, p = 0.996 or fungi R2 = 0.039, p = 1.00 (Fig. 4C,D, respectively).

Figure 3
figure 3

Comparison of Fungal and Bacterial Shannon Alpha Diversity Measures. Shannon alpha diversity box plots of bacterial and fungal community composition based on variance in species evenness is shown for samples (panels A and B) and by methods (panels C and D). Significant variance is observed in bacterial sample evenness, p < 0.001 (panel A) as well as fungal community evenness, p < 0.001 (panel B). No significant differences are observed for Shannon diversity based on collection method for bacterial, p = 0.87 (panel C) or fungal diversity measures, p = 0.93 (panel D).

Figure 4
figure 4

Beta-diversity Visualized Using Non-metric Multidimensional Scaling (NMDS) Plot With Bray-Curtis Dissimilarity Distances. NMDS plots on rank order Bray-Curtis distances were used to assess significance in bacterial and fungal community composition between individuals (panels A and B) and methods (panels C and D). Plot ellipses represent the 95% confidence regions for group clusters. Clustering by sample is highly significant for bacterial R2 = 0.80 p < 0.001 (panel A) and fungal communities R2 = 0.84 p < 0.001 (panel B) communities. DNA isolation method did not exhibit significant clustering in either bacterial R2 = 0.086 p = 0.996 (panel C) or fungal communities R2 = 0.039 p = 1.00 (panel D). Significance was determined using PERMANOVA analyses.


In the current study, eight methods for DNA extraction from six oral cavity samples were used and DNA quantity and microbial community composition were compared. Our analysis revealed that DNA yield was significantly different among the eight DNA extraction methods with DNA recovery greatest after phenol-chloroform extraction (Fig. 1). The lower DNA yield of commercially available kits (Table 1) compared to the phenol-chloroform extraction method may be due to DNA loss during silica column purification. DNA yield tended to be greater with enzymatic digestion than using mechanical lysis (bead-beating) approaches. The lower DNA yield among bead-beaten samples is likely due to DNA degradation during mechanical disruption. Thus, for DNA yield, non-bead-beating methods, particularly phenol-chloroform extraction provides the greatest yield of total DNA.

Although DNA for human genetic studies has frequently been obtained using oral mouthwash and/or saliva collection methods38, compatibility of the DNA from these studies for future microbiome studies has not been examined in detail. Previous studies found differences in the oral bacterial microbiome based on DNA extraction methods32,33,39; whereas, other studies determined that DNA extraction methods did not seem to influence major differences in the oral microbiome22,31,40. Nevertheless, it is hard to do a direct comparison amongst studies in the literature, since many used saliva and/or plaque collection31,33,39,40, some compared crude DNA to purified DNA39,40, others used mock communities39, and one did not include NGS analysis of the microbiome32. Only one study examined both bacterial and fungal communities and surprisingly found no differences amongst 4 methods for bacterial communities, but found phenol-chloroform extraction best for fungal community diversity33.

Although we found that DNA extraction methods had an influence on DNA yield, we did not find a significant difference in oral microbiome composition across eight DNA extraction methods of oral rinse specimens. Shannon diversity measures for bacterial and fungal communities were similar across the employed extraction methods and did not achieve statistically significant differences. Similarly, PERMANOVA analysis on rank order Bray distances did not demonstrate differences in β-diversity for either assay. Our results instead demonstrated that individual subject differences drove diversity measures across both bacteria and fungi. Taken together, these data suggest that both α- and β-diversity measures were consistent for all eight-extraction measures, and that the choice of method does not have a major influence on the observed oral communities. The results of this study might have been influenced by the larger number of samples analyzed compared to previous studies.

All eight extraction methods were able to consistently recapitulate the original subject microbiotas as indicated by both alpha and beta diversity measures including Shannon diversity index and Bray-Curtis distances, respectively. These findings are consistent with previous studies that have demonstrated that each person’s oral microbiome is unique41,42. Additionally, all methods reported here detected hard to lyse gram-positive species, such as Streptococcus43, indicating sufficient lysis of cells. Moreover, the similarity of results for fungal community analyses across all methods is consistent with the one report that found phenol-chloroform extraction yielded the highest fungal diversity in saliva33.

In summary, our study compared eight DNA extraction methods tested on oral rinse specimens that are commonly collected in large epidemiological studies and are used or may be used in the future to study the oral microbiome. While the eight methods tested in this study had significantly different DNA recovery, there was no difference in the observed oral microbiotas among methods. This study provides empiric evidence that research studies can select an appropriate DNA extraction method with or without bead-beating for characterization of the oral microbiota without influencing differences between the oral microbiome/mycobiome of individuals.

Materials and Methods

Consent and Approval for Use of Human Participants

Oral rinse specimens from six individuals were collected as part of a pilot study on sampling procedures for the Health and Nutrition Examination Survey in New York City 2013 (NYC HANES 2013), a collaborative project between the City University of New York (CUNY) Graduate School of Public Health and Health Policy and the NYC Department of Health and Mental Hygiene. IRB approval for analysis of pilot oral specimens was obtained from the Human Research Protection Program (HRPP) of CUNY. All methods performed in this study were conducted in accordance with Hunter College (CUNY) university integrated IRB approved protocol (PT: 346358-9). Informed consent was obtained from study participants prior to sample collection. Upon receipt all used human specimens received a lab Sample ID and no information regarding, age, race, gender or any other identifying information was used in the presented study.

Specimen Collection

Consented study participants provided an oral sample by rinsing with 20 mL of Scope mouthwash for 20 seconds. The 20-second oral rinse was broken into two 5-second swish sessions and two 5-second gargle sessions. The oral rinse samples were frozen at −80 °C at the New York State Public Health Laboratory (NYPHL) office and were transported on dry ice to Albert Einstein College of Medicine, where they were immediately stored at −80 °C.

DNA Extraction

DNA was extracted from the oral rinse samples using eight DNA extraction methods based on physical and/or enzymatic lysis steps and isolation procedures (Table 1). Extraction methods with commercially available kits all used a silica-based column. One extraction method included a non-commercial method using phenol-chloroform. All DNA isolation methods evaluated in this study are either commonly used in DNA extraction or have previously been used in microbial analysis studies. For each method, 1 mL from each oral rinse sample was centrifuged (5,000 × g) for 5 minutes. The cell pellet was re-suspended in 100 μl TE buffer (10 mM Tris. Cl, pH 8.0, 1 mM EDTA) and used for DNA extraction. Upon completion of each extraction method, the purified DNA was eluted in 100 µl of elution buffer (pH 8.0) and DNA concentration was determined using a NanoDrop 2000 (Thermo Scientific, DE).

Method 1 (Proteinase K/SDS/phenol chloroform extraction)

The cell pellet was directly processed in 200 μl cell lysis buffer (10 mmol/L Tris/HCl pH 8.0, 10 mmol/L EDTA, 0.1 mol/L NaCl, 2% SDS pH 8.0) and 10 μl proteinase K (20 mg/ml, Roche Diagnostics), and incubated overnight at 55 °C. The samples were treated with RNase A (100 mg/ml, Qiagen, Valencia, CA) for 20 minutes at 37 °C followed by phenol/chloroform extraction using Phase Lock Gel Tubes (PLG, 5 Prime Inc., Gaithersburg, MD) as described by the manufacturer.

Method 2 (QIAamp DNA mini kit)

First, 20 μl of proteinase K (20 mg/ml) and 100 μl of Buffer AL were added to 100 μl of pelleted cells in TE. The samples were incubated at 56 °C for 10 minutes. After incubation, 100 μl of 100% ethanol was added to the samples and the DNA was purified following the manufacturer’s instructions.

Method 3 (Enzymatic lysis followed by QIAamp DNA mini kit)

The pelleted cells in 100 µl TE were treated with lysozyme (0.84 mg/ml, Sigma Aldrich), mutanolysin (0.25 U/ml, Sigma Aldrich) and lysostaphin (21.10 U/ml, Sigma Aldrich) at 37 °C for 30 minutes. Subsequently, 20 μl proteinase K and 100 μl Buffer AL were added followed by incubation at 56 °C for 10 minutes. DNA was purified using the QIAamp DNA mini kit as described above.

Method 4 (Enzymatic and bead-beating lysis followed by QIAamp DNA mini kit)

Pelleted cells were digested using enzymes as in Method 3. After incubation, the mixture was treated with 15 μl proteinase K (10 mg/ml) and 150 μl Buffer AL (Qiagen) at 56 °C for 10 minutes. The samples were then transferred to a clean screw-cap tube containing 300 mg of 0.1 mm-diameter zirconia/silica beads (BioSpec, Bartlesville, OK) and mechanically lysed using a FastPrep-24 Instrument (MP Biomedicals, Santa Ana, CA) at 6.0 m/s for 40 seconds. Next, the samples were centrifuged (10,000 × g) for 30 seconds and 200 μl of the supernatant was added to a clean microcentrifuge tube containing 100 μl of 100% ethanol. DNA was isolated using the QIAamp DNA Mini Kit (Qiagen) as described above.

Method 5 (Enzymatic lysis followed by PowerLyzer PowerSoil DNA Isolation Kit)

The pelleted cells were incubated with the enzymes described in Method 3. After incubation, the mixture was transferred to a PowerLyzer Glass Bead Tube (0.1 mm) containing 650 μl of Bead Solution. The remainder of the DNA isolation protocol was continued beginning with step 4 of the PowerLyzer PowerSoil DNA Isolation Kit instructions (MO BIO laboratories, Inc., Carlsbad, CA). The bead-beating step used a FastPrep-24 Instrument (MP Biomedicals) set at 6.0 m/s for 40 seconds.

Method 6 (PowerSoil DNA Isolation Kit)

DNA was extracted using the PowerSoil DNA Isolation Kit (MO BIO laboratories, Inc.) following the manufacturer’s protocol without additional enzymatic lysis. The cells were mechanically lysed using manufacturer’s provided bead tubes and a FastPrep-24 Instrument (MP Biomedicals) at 6.0 m/s for 40 seconds.

Method 7 (UltraClean Microbial DNA Isolation Kit)

DNA was extracted using the UltraClean Microbial DNA Isolation Kit (MO BIO laboratories, Inc.) following the manufacturer’s protocol. The cells were mechanically lysed using manufacturer’s provided bead tubes and a FastPrep-24 Instrument (MP Biomedicals) at 6.0 m/s for 40 seconds.

Method 8 (BiOstic Bacteremia DNA Isolation Kit)

DNA was isolated using the BiOstic Bacteremia DNA Isolation Kit (MO BIO laboratories, Inc.) following the manufacturer’s protocol. The cells were mechanically lysed using manufacturer’s provided bead tubes and a FastPrep-24 Instrument (MP Biomedicals) at 6.0 m/s for 40 seconds.

16S rRNA gene and ITS1 region amplification and massively parallel sequencing

To amplify the 16SrRNA gene region of bacterial species, an aliquot of 0.5 µl DNA from each sample and DNA isolation method was PCR amplified in a total reaction volume of 25 µl using barcoded primers spanning the V6 variable region of the 16 S rRNA gene as previously described26. In brief, an equal mixture of AmpliTaq Gold (Applied Biosystems, Carlsbad, CA) and HotStart-IT FideliTaq DNA Polymerase (Affymetrix, Santa Clara, CA) was used. For all samples a unique 8-bp barcode was introduced to the PCR amplicons on the primers. Thermocycling conditions included an initial denaturation at 95 °C for 5 minutes, then 15 cycles at 95 °C for 1 minute, 55 °C for 1 minute, and 68 °C for 1 minute. This was followed by 15 cycles at 95 °C for 1 minute, 60 °C for 1 minute, and 68 °C for 1 minute; and a final extension for 10 minutes at 68 °C.

To amplify the ITS1 region of fungal species, 10 µl from each sample and DNA isolation method was PCR amplified in a total reaction volume of 25 µl using barcoded primers specific to the ITS1 region of the fungal ribosomal gene cluster44. In brief, 9.25 µl of dd H2O, 2.5 µl of USB 10X buffer with MgCl2 (10 mM Tris-HCl, pH 8.6, 50 mM KCl, 1.5 Mm MgCl2, Affymetrix, Santa Clara, CA), 1 µl of USB MgCl2 (25 mM), 0.5 µl of dNTP mix (10 mM each, Roche Basel, Switzerland), 0.25 µl AmpliTaq Gold, polymerase (5 U/µl, Applied Biosystems, Carlsbad, CA), 0.5 µl of HotStart-IT DNA FideliTaq Polymerase (2.5 U/µl, Affymetrix), and 1 µl (5 µM) of each primer (IDT, Coralville, IA). Thermocycling included an initial denaturation of 95 °C for 3 mins, followed by 35 cycles of 95 °C for 30 s, 55 °C for 30 s, 68 °C for 2 min, followed by a final extension of 68 °C for 10 min.

The 16 S rRNA and ITS1 PCR products each were pooled at approximately equal molar DNA concentrations and purified using the QIAquick Gel Extraction Kit (Qiagen). Following library preparation using TruSeq DNA Sample Prep Kits (Illumina, San Diego, CA), the pooled 16SrRNA DNA was sequenced on an Illumina HiSeq. 2500 using paired-end 150 bp reads, while the pooled ITS1 DNA was sequenced on an Illumina MiSeq using paired-end 300 bp reads, by the Epigenomics and Genomics Core Facility, Albert Einstein College of Medicine (Bronx, NY).


MiSeq reads were demultiplexed using novocraft’s novobarcode 1.0045 based on sample specific barcodes46. Reads were left and right trimmed with PrinSeq. 0.20.447 to remove bases that fell below the PHRED score of 25. Paired end reads were merged with PANDASEQ. 1.2048 using default settings.

For 16S rRNA gene reads, OTUs were clustered using closed reference selection with USEARCH using a custom in-house database that contains reference sequences from Green-Genes 13.849. Additionally reference sequences of an oral microbiome specific database, Human Oral Microbiome Database (HOMD)50, were retrieved in order to account for bacteria specific to the human oral cavity. Representative sequences were aligned using PyNAST51 and phylogenic analyses were performed using FastTree 2.052.

For fungal ITS1 reads, open reference OTU picking was employed with QIIME 1.953 open-reference OTU picking protocol as previously described44. The protocol was modified to use VSEARCH version 1.4.054, which allowed for higher throughput. The OTU clustering threshold was changed from 97% to 99% sequence identity to account for fungal diversity. Sequence dereplication and chimera removal was performed as part of the QIIME’s usearch quality control protocol prior to OTU picking with VSEARCH. Representative sequences for each OTU cluster were chosen based on sequence abundance. BLAST was used to assign the taxonomy55.

All data were processed in R version 3.2.156. QIIME outputs were imported into R using the phyloseq57. package and further processed with vegan58, coin59,60, and reshape260. Data visualization was performed using ggplot261. General community clustering was performed on the 20 most abundant OTUs (in terms of mean abundance across all samples) collapsed based on shared taxonomy at the species level using ward.D2 hierarchical clustering. β-diversity was assessed using Bray-Curtis distances and significance was calculated with PERMANOVA using the adonis function from the vegan package58. Statistical ellipses from the ggplot2 package were used to visualize the sample and method clusters on the NMDS plots. α-diversity was analyzed based on the Shannon’s alpha diversity and observed number of OTUs metrics and significance was determined using the Kruskal-Wallis test.