Introduction

The human oral microbiome (OM) is the second most diverse and densely populated microbiome of the human body1. It plays key roles in human digestion, protection against pathogen colonization and nitrate reduction2 and may have a role in human health including cardiovascular disease and cancer3,4. We recently reported that cigarette smoking5 and alcohol6 are important determinants of the oral microbiome. In particular, cigarette smoking is a cause of oral dysbiosis, affecting microbial diversity and its functional potential5,7.

Despite a global decline in tobacco consumption, tobacco use is still rising in African and Eastern Mediterranean countries, which is a significant public health concern8. Although cigarettes account for much of this increase, part of the increase is related to the popularization of alternative tobacco products common in Middle Eastern countries, such as dokha9 and shisha10. Dokha is a blend of tobacco leaves, barks, herbs, dried fruits and/or flowers and spices, which is smoked using a specialized pipe (midwakh) and is known to contain a much higher nicotine content than cigarettes11. Shisha is a fruit-flavored tobacco composed of shredded tobacco leaves, glycerol and other additives, which is smoked using a waterpipe12. Alternative tobacco products such as dokha and shisha are, like cigarettes, a source of nicotine and other toxic products; however, the effect of these different types of tobacco on the oral microbiome remains unclear.

We tested for the first time the hypothesis that the oral microbiome is differentially impacted by specific tobacco products commonly used in Middle Eastern countries. We compared the effects of cigarette, dokha and shisha use on community composition of the oral microbiome by high-throughput sequencing of the bacterial 16S Ribosomal RNA (16S rRNA) gene in 330 participants from the “UAE Healthy Future” (UAEHF) pilot study13.

Results

We studied 330 subjects, including 105 (31.8%) smokers and 225 nonsmokers (68.2%) (Fig. 1, Table 1). Smokers were more likely to be men (96.2%), but other health-related factors such as age, BMI and diabetic status (ascertained by HbA1c levels in blood), were not different between smokers and nonsmokers (p = 0.81, p = 0.11 and p = 0.13, respectively) with the exception of systolic blood pressure (p = 0.05). Among the 105 smokers, 39% smoked more than one tobacco product, with cigarettes most commonly used (67.6%), followed by dokha (42%) and shisha (34.3%). Participants who exclusively used cigarettes smoked an average of 9.2 cigarettes per day and exclusive dokha users smoked an average of 10.7 midwakh pipes per day. On the other hand, only 50% of the participants who used shisha exclusively smoked it on a daily basis, while 13.3% smoked it on a weekly basis (2–3 times per week) and 36.6% smoked it on a monthly basis.

Figure 1

Flow chart depicting the classification of participants from the UAEHF pilot study. UAEHF pilot study participants were Emirati nationals aged 18 and above. Study participants completed a self-administered questionnaire including information on smoking habits. During the physical exam, participants provided blood, urine and mouthwash samples. From 517 consented study participants, 363 subjects completed the smoking section of the baseline questionnaires and provided mouthwash samples. A cotinine test in urine was used to ascertain smoke exposure. These results were further used to validate the non-smoking self-reported data. We further excluded 33 subjects (11 had no cotinine data and 22 had self-reported as non-smokers but tested positive for cotinine). All individuals participating in the study read and signed an informed consent.

Table 1 Characterization of smoking habits in the Emirati cohort.

After data filtering, there were 16,132,922 high quality 16S rRNA sequence reads ready for analysis for these study subjects (mean per subject: 48,887.42; SD: 13,408.67). After low count filtering, the final data set was comprised of 13 phyla, 20 classes, 26 orders, 41 families, 57 genera, 26 species and 1,080 OTUs. We observed that amongst the 13 phyla, Firmicutes (50.0%), Bacteroidetes (21.7%), Proteobacteria (15.8%), Actinobacteria (6.7%) and Fusobacteria (4.7%) were the most abundant (Supplementary Table S1) and were present in all samples. Phyla such as Tenericutes, SR1 and Synergistetes, although in very low relative abundance, were present in more than 85% of the samples.

Overall oral microbiome community

We found that microbial diversity (Shannon entropy) was marginally greater in all smokers compared with nonsmokers (p = 0.04, Fig. 2A); however, this was not observed when comparing single tobacco type users to nonsmokers (Fig. 2B). Based on Unifrac distance matrices, controlling for age, sex and batch effects, we found that the oral microbiome overall structure significantly differed between smokers and nonsmokers (p = 0.001, Fig. 3A). This finding was confirmed in the comparison of cotinine positive and cotinine negative participants (p = 0.001, Supplementary Figure S1), which was independent of their self-reported status. When considering single tobacco products independently, the oral microbiome structure of exclusively cigarette smokers (p = 0.001, Fig. 3B) and exclusively dokha users (p = 0.042, Fig. 3C) were significantly different from that of nonsmokers. However, exclusive shisha smokers’ oral microbiome was not significantly different from that of nonsmokers (p = 0.62, Fig. 3D). In addition, no significant differences were observed when comparing the overall oral microbiome structure amongst those who exclusively used one type of tobacco (cigarette, dokha and shisha smokers, p = 0.2; Supplementary Figure S2).

Figure 2

Characterization of the α-diversity of the Emirati oral microbiome. Diversity comparisons between (A) smokers (n = 105) and nonsmokers (n = 225) and (B) between tobacco types, cigarettes (n = 33), dokha (n = 16) and shisha (n = 15) versus nonsmokers (n = 225). Diversity was significantly greater in smokers than nonsmokers, but not when comparing single tobacco type use to nonsmokers. Only significant p values from linear regression are shown.

Figure 3

Principal Coordinate Analysis (PCoA) of the bacterial communities according to smoking use and tobacco types derived from Unifrac weighted distances. Significant differences between (A) smokers (n = 105) and nonsmokers (n = 225) were observed (p = 0.001), (B) cigarette (n = 33, p = 0.001) and (C) dokha smokers (n = 16, p = 0.042). However, no significant differences were identified between (D) shisha smokers (n = 15, p = 0.620) and nonsmokers. All nonsmoker participants were colored orange and all smokers independently of tobacco use or type in blue.

Bacterial taxa abundance according to tobacco consumption and type

To determine the association of different tobacco products with oral bacterial taxa, we performed further detailed analyses.

Exclusively cigarette smokers (CS) vs. nonsmokers

Contrasts between CS (n = 33) and nonsmokers (n = 225) (Table 2, Supplementary Table S2, Fig. 4) showed that CS were depleted of the phylum Proteobacteria and in particular of its genera Neisseria, Eikenella, Aggregatibacter, Actinobacillus, Haemophilus and Lautropia, and the phylum Fusobacteria, represented at the genus level by Fusobacterium and Leptotrichia. Also, significantly depleted and not previously reported were the less abundant phyla SR1, GN02 and Cyanobacteria. In contrast, CS presented higher abundances at the phylum level of Spirochaetes, Synergistetes and Tenericutes being represented at the genus level by Treponema, TG5 and Mycoplasma, respectively. Furthermore, Firmicutes, Bacteroidetes and Actinobacteria were enriched at all lower taxonomical levels in CS, being characterized at the genus level among others by Megasphaera and Dialister (Firmicutes), Paludibacter, Porphyromonas and Prevotella (Bacteroidetes), and Atopobium (Actinobacteria).

Table 2 Differentially abundant taxa at selected taxonomical levels by type of tobacco use, compared to nonsmokers.

Figure 4

Log2 fold change of genera abundances in the oral microbiome relative to tobacco use. Heatmap of the genera that were in significantly different relative abundances when comparing nonsmokers (n = 225) to cigarette (n = 33), dokha (n = 16), shisha (n = 15) and multiple (n = 41) tobacco type smokers independently. All genera with q < 0.1 (indicated by stars) in at least one of the comparisons are shown. Heatmap displays log2 fold change when compared to nonsmokers.

Exclusively dokha smokers (DS) vs. nonsmokers

Consistent with the patterns observed in CS, taxa dynamics in DS (n = 16) differed significantly from nonsmokers with the depletion of the phylum Cyanobacteria observed in CS, and in the genera Actinobacillus, Lautropia (Proteobacteria), and Porphyromonas (Bacteroidetes), which were also depleted in CS (Table 2, Supplementary Table S2, Fig. 4). In contrast, DS were exclusively enriched in the genus Bifidobacterium (Actinobacteria).

Exclusively shisha smokers (SS) vs. nonsmokers

Consistent with overall microbial composition comparisons, only four taxa were identified as having a significantly different relative abundance between SS (n = 15) and nonsmokers (Table 2, Supplementary Table S2, Fig. 4). The phyla Cyanobacteria and SR1 and the classes Chloroplast (Cyanobacteria) and BD1-5 (GN02) were all significantly depleted in SS when compared to nonsmokers.

Multiple tobacco type smokers (MS) vs. nonsmokers

We also contrasted MS (n = 41) with nonsmokers (Table 2, Supplementary Table S2, Fig. 4). Depletion and enrichment patterns of taxa relative abundances for the most part mirrored those observed in the contrast between CS and nonsmokers, with some exceptions: in particular, the significant enrichment in MS of the genera Campylobacter and Enhydrobacter (Proteobacteria), and the depletion of the genus Vagococcus (Firmicutes); the two latter genera were not observed in other contrasts.

Contrasts between tobacco types

Comparisons between CS and DS revealed no significant differences at any taxonomical level in the taxa relative abundances of their oral microbiome (Supplementary Table S3). Comparisons between CS and SS however showed similar results to those observed in the contrast between CS and nonsmokers, with patterns for SS similar to that of nonsmokers. Only the depletion of the genus Actinobacillus was consistently observed in cigarette vs. shisha users and dokha vs. shisha smokers.

Discussion

This study encompasses the first characterization of the oral microbiome of the Emirati population and describes, for the first time, the specific effects on oral bacterial community structure of two regional products, dokha and shisha, with the latter experiencing increased worldwide usage in recent years14. We found that the Emirati population exhibited a diverse oral microbiome and that overall microbial diversity and composition were associated with use of tobacco products (Fig. 4). In particular, smoking in general, exclusive use of cigarettes and exclusive use of dokha were associated with significant alterations of oral microbiome structure and relative taxa abundances. Exclusive use of shisha was not associated with alterations in overall microbiome structure; however, depletion was noted in phyla Cyanobacteria and SR1, and classes Chloroplast and BD1-5.

The oral microbiome of the Emirati population presented a composition similar to that of other populations in the United States5,15, Japan and Korea16,17, China18, and among Amazonian Amerindians19,20, characterized by community dominance of phyla Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria and Fusobacteria and genera Streptococcus, Prevotella, Haemophilus, Veillonella and Neisseria. It would appear that the oral microbiome tends to have a generally similar community structure globally, despite there being wide differences in lifestyle and oral hygiene practices between populations. Of course, at a further detailed level of analysis between populations, there may be traits that tend to be more population-specific. For example, SR1 was reported as part of the core microbiome only in a Saudi Arabian population21. In the Emirati population SR1 was observed in 87% of the samples. This high carriage could be a characteristic of Middle Eastern populations, requiring further exploration. Further detailed analyses involving for example transcriptomics, metabolomics or proteomics could potentially reveal further population-specific biomarkers.

We found that tobacco use in general was marginally associated (p = 0.04) with greater diversity of the oral microbiome. Similar associations have been reported in other studies16,22,23. However, this is not consistently observed, with some studies reporting no change in diversity24,25,26. Exposure to tobacco smoke results in functional and structural changes in saliva and the oral environment27,28,29, which may reduce immune fitness and decrease the ability of the autochthonous bacteria to compete with transient taxa for nutrients; this may result in the increased diversity of the OM7,30 observed in smokers. Perhaps of clinical importance, periodontal disease and gingivitis, which are known to be associated with tobacco use, are also characterized by a higher diversity of the OM7,22,31,32,33, potentially indicating an early pathway to oral disease in smokers.

Consistent with previous studies, we found that smokers were significantly depleted of Proteobacteria (such as Neisseria, Haemophilus and Lautropia)5,23,30, and enriched in Bifidobacterium5 and TG534. In addition, we reported for the first time the depletion in all smokers of the less abundant phylum SR1 (Supplementary Table S4). Members of the SR1 phylum are predominantly uncultivated, non-respiring oral bacteria that possess an altered genetic code, where the usual UGA stop codon is reassigned to glycine35,36,37. This change in the genetic code of SR1 may limit synergistic relationships within the oral bacterial community36, which is potentially related to the depleted relative abundances we observed.

Exclusive cigarette use was also associated with differentials in specific oral taxa, including a wider range of taxa than that found for all smokers combined. CS were depleted when compared to nonsmokers in the genera Aggregatibacter (Proteobacteria), Capnocytophaga and Porphyromonas (Bacteroidetes) and in particular of the phylum Fusobacteria represented by significantly depleted Fusobacterium and Leptotrichia. We previously reported5 similar differentials and related the bacterial genes involved to xenobiotic metabolism of toluene, in agreement with Peralbo-Molina, et al.38 who observed that exhaled breath condensate of cigarette smokers contained lower levels of p-cresol, a toluene metabolite. We also found that CS were significantly enriched in Atopobium and Bifidobacterium (Actinobacteria), TG5 (Synergistetes), Treponema (Spirochaetes), Campylobacter and Eikenella (Proteobacteria) and Megasphaera (Firmicutes), consistent overall with previous results5,30,34,39,40.

We report for the first time on the impact of dokha and shisha on the oral microbiome. Dokha was associated with similar patterns of OM dysbiosis as found for cigarette use, although significant associations were found for fewer taxa, among them the depletion of the phylum Cyanobacteria and the genera Actinobacillus, Lautropia (Proteobacteria) and Porphyromonas (Bacteroidetes), and the enrichment of the genus Bifidobacterium (Actinobacteria). The lower number of significant associations of taxa differentials with the use of dokha is probably due to the low number of participants (n = 16) that exclusively smoked dokha. Dokha is commonly consumed in the UAE using the traditional midwakh pipe. Midwakh users in the UAE consume dokha on average 12 times per day, equivalent to smoking 6 grams of dokha per day9, which is reflected in its effects on the oral microbiome. Dokha toxicants and health effects have received limited study, although notably Shaikh et al. reported increased systolic blood pressure, heart rate and respiratory rate in users41, indicating that these exposures constitute a potentially significant, yet understudied threat to health in the Middle Eastern region.

Although we found that shisha use was not related to overall oral microbiome structural changes, taxa relative abundance analysis identified the phyla Cyanobacteria and SR1, and the classes Chloroplast and BD1-5 as significantly depleted in shisha smokers when compared to nonsmokers. The lack of significance observed for the majority of the taxa is likely due to both the low number of exclusive shisha users available (n = 15) and the infrequency of shisha use, rather than the absence of toxicants in this product12,42. Shisha is usually associated with social gatherings, and consumption is often on a weekly to monthly basis43. As the OM is resilient44 and smoking related changes may not be permanent if cessation occurs5, the frequency with which participants smoke shisha could partially explain the patterns observed. Shisha smoking has been associated with esophageal squamous cell carcinoma45, low birth weight of infants from smoking mothers46 and cardiovascular effects47, and hence warrants further study.

Users of multiple tobacco types in our study tended to show similar depletion/enrichment patterns of taxa relative abundances as cigarette smokers, largely because cigarette use was the most common tobacco use type in this group. Potentially of note, increased abundance of Enhydrobacter was related to joint use of cigarettes and dokha. This bacterium grows in the presence of ammonia48 which is believed to be common in both these products. In the case of cigarettes, ammonia is added to facilitate freeing of nicotine molecules by raising pH49.

This investigation is the largest study of the oral microbiome of an Arabic population. In contrast to most studies that rely exclusively on self-reported questionnaire data, we validated nonsmoking status by urinary cotinine measurement. Although associations were identified for the tobacco types commonly used in this region, larger studies, which would provide stronger statistical power, with more detailed information on tobacco use patterns and frequency will be needed to further delineate differentials in tobacco products and the oral microbiome. We are currently recruiting participants to the UAEHFS to address this and other health-related issues for the UAE population. Although amplicon pyrosequencing has major advantages for human microbiome studies, it also has some limitations, such as the possible overestimation of OTU richness due to homopolymer errors (repeated nucleotides), inaccuracies of taxonomic identification due to the short length of the sequences and the introduction of primer and sequencing related biases50,51,52. While our research using the 16S rRNA sequencing approach was appropriate for identifying taxonomies, future studies should investigate functional capacity of the microbiome, using full shotgun metagenomics sequencing and other methodologies.

In summary, we characterized the oral microbiome in the Emirati population and found that tobacco use had an important impact on the oral microbiome, particularly with regard to cigarette and dokha use. The abundance of multiple taxa and in particular that of 15 genera was significantly altered (enriched or depleted) in cigarette smokers; however, at the genus level, only the abundances of Actinobacillus, Lautropia, Porphyromonas and Bifidobacterium were significantly altered in users of dokha, and no genera were significantly altered in shisha smokers. Our results suggest that cigarettes and other local tobacco products alter the oral microbiome structure and specific taxa abundance in the Emirati population.

Methods

Study Population

UAEHF pilot study participants were recruited in a 5-month period between December 2014 and April 2015 at Zayed Military Primary Health Clinic (ZMH PHCC) and Abu Dhabi Blood Bank (ADBB), both of which are licensed for clinical research by the Health Authority of Abu Dhabi. Eligible Emirati nationals (aged 18 and above) completed a self-administered questionnaire including information on socio-demographic factors, lifestyle and medical history. Study participants completed physical and clinical exams, including measurements of anthropometry, body composition, and blood pressure13. During the physical exam, participants also provided blood, urine and mouthwash samples. From 517 consented study participants, 363 subjects completed baseline questionnaires and provided mouthwash samples. We further excluded 33 subjects who had inconsistent smoking data (see smoking definition below). Therefore, our analytic dataset was comprised of 330 subjects (Fig. 1). This study was approved by the Institutional Review Boards (IRB) of Sheikh Khalifa Medical City (SKMC), Zayed Military Hospital (ZMH), New York University Abu Dhabi (NYUAD) and NYU Langone Medical Center, New York. All individuals participating in the study read and signed an informed consent. All experiments were performed in accordance with relevant guidelines and regulations.

Measurements

Definition of smoking

Detailed information on cigarette smoking, including smoking status, tobacco type used and smoking history, was ascertained by questionnaire. We also measured cotinine in urine by COT rapid test cassette (International Biomedical Supplies), with a cut off concentration of 200 ng/ml for tobacco smoke exposure. We further excluded from analysis 22 subjects with positive cotinine test who had self-reported as nonsmokers and 11 subjects with missing cotinine data. Tobacco type groups are defined as follows: smokers (all participants that self-reported as smoker independently of cotinine results), exclusively cigarette smokers (those that only smoke cigarettes), exclusively dokha smokers (those that only smoke dokha, typically in the traditional midwakh pipe), exclusively shisha smokers (those that only smoke shisha), multiple smokers (those that smoke more than one type of tobacco product), nonsmokers (those that self-reported as nonsmokers, and were further validated by a cotinine negative result).
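The group definitions above can be read as a small decision rule. The following is a minimal sketch of that rule, not the code used in the study: the function name, argument names and group labels are hypothetical, and it assumes that a missing cotinine result leads to exclusion regardless of self-reported status.

```python
from typing import Optional, Set


def classify_participant(
    self_reported_smoker: bool,
    products: Set[str],
    cotinine_positive: Optional[bool],
) -> str:
    """Assign a participant to an analysis group (labels are illustrative)."""
    # Assumption: subjects with missing cotinine data are excluded outright.
    if cotinine_positive is None:
        return "excluded"
    if not self_reported_smoker:
        # Self-reported nonsmokers must be confirmed by a negative cotinine test.
        return "excluded" if cotinine_positive else "nonsmoker"
    # Self-reported smokers are kept independently of the cotinine result.
    if len(products) > 1:
        return "multiple"
    if len(products) == 1:
        return "exclusive_" + next(iter(products))
    # Hypothetical fallback for a smoker who listed no product.
    return "smoker_unspecified"


if __name__ == "__main__":
    print(classify_participant(True, {"dokha"}, True))
    print(classify_participant(False, set(), True))
```

Under this reading, a self-reported nonsmoker with a positive cotinine test falls into the excluded set, which corresponds to the 22 subjects described above.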

Mouthwash sample collection

Participants were given a 10 ml sample of pharmaceutical grade normal saline (0.9%) solution and asked to vigorously swish for 30 seconds and spit it out into a new sterile tube. Samples were stored initially at 4 °C for no more than 48 h. Samples were then vortexed for 20 seconds, pipetted up and down 10 times, aliquoted into 1 ml cryotubes and stored at −80 °C until further processing. To confirm that the saline solution used for collection of mouthwash samples contained no detectable levels of DNA, identical DNA extraction methods to those used in the study (see below) were applied to two separate saline solution samples alongside two mouthwash samples. Neither of the two saline solution samples yielded any DNA. Extracted DNA was viewed by gel electrophoresis and concentrations were quantified using the high sensitivity Qubit assay. Only mouthwash samples yielded measurable amounts of DNA (Supplementary Table S5).

Microbiome assay

Two 1 ml aliquots per sample were pooled for DNA extraction. Thawed samples were centrifuged at 6000 g for 3 min and then at 10000 g for 10 min in order to collect the cell pellet. DNA was extracted using the Mo Bio PowerSoil PowerLyzer kit following manufacturer’s instructions (Mo Bio Laboratory Inc, California, USA). Genomic DNA was visualized on a gel and quantified using the Qubit HS kit (Thermo Fisher Scientific). Amplification of DNA from the V4 region of 16S rDNA gene (515F: 5′-GTGCCAGCMGCCGCGGTAA-3′; 806R: 5′-GGACTACHVGGGTWTCTAAT-3′) was performed using specifically designed primers with Roche 454 FLK adaptor sequences and a 12 bp index (reverse primer only) added for posterior multiplexing. Amplification was carried out using the FastStart enzyme (Roche, IN). PCR products were visualized in an agarose gel, purified using Agencourt AMPure beads (Beckman Coulter Life Sciences, IN) and quantified using the Qubit BR kit (Thermo Fisher Scientific). Samples were then pooled for sequencing on an Illumina MiSeq platform.
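Both primers contain degenerate IUPAC bases (M, H, V and W), so each primer name stands for a small pool of concrete oligonucleotides. As an illustration only, the sketch below expands the two primer sequences quoted above into those pools; the function and variable names are hypothetical and the table covers just the IUPAC codes that occur in these primers.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide codes, restricted to those present in the primers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC",   # A or C
    "W": "AT",   # A or T
    "H": "ACT",  # not G
    "V": "ACG",  # not T
}

FWD_515F = "GTGCCAGCMGCCGCGGTAA"
REV_806R = "GGACTACHVGGGTWTCTAAT"


def expand_primer(primer: str) -> List[str]:
    """Return every non-degenerate sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in (("515F", FWD_515F), ("806R", REV_806R)):
        print(name, len(seq), "nt,", len(expand_primer(seq)), "variants")
```

The forward primer has a single two-fold degenerate position, whereas the reverse primer combines two three-fold and one two-fold position, which is why its pool is larger.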

Quality control

Samples were sequenced in two batches. In addition to study samples, each batch contained three quality control samples, each in triplicate (shown in Supplementary Table S6) and a negative control (blank sample for DNA extraction and PCR amplification). Quality control samples showed good reliability, with the coefficient of variability ranging from 1.65–2.32% for the Shannon entropy and 1.02–7.21% for specific phyla relative abundances.
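To make the reliability metric concrete, the sketch below computes Shannon entropy for each replicate of a quality control sample and then the coefficient of variability across replicates. It is an illustration under stated assumptions rather than the study's code: the natural logarithm is used (the default in the vegan library), the coefficient is taken as sample standard deviation over mean expressed as a percentage, and the replicate counts are invented.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def shannon_entropy(counts: Sequence[float]) -> float:
    """Shannon entropy H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


if __name__ == "__main__":
    # Hypothetical OTU counts for three replicates of one control sample.
    replicates = [
        [520, 310, 120, 50],
        [500, 330, 110, 60],
        [515, 300, 130, 55],
    ]
    entropies = [shannon_entropy(r) for r in replicates]
    print("Shannon entropy per replicate:", [round(h, 3) for h in entropies])
    print("CV (%):", round(coefficient_of_variation(entropies), 2))
```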

Statistical analysis

Sequence data processing and taxonomic assignment

Sequences were de-multiplexed and trimmed using the split_libraries_fastq.py QIIME script with default parameters53. Only sequences that passed quality control filters (average base score quality per read > 20, reads longer than 200 bp) were further processed. Taxonomical assignment was achieved using the pick_de_novo_otus.py workflow as implemented in QIIME53. Sequences were clustered into operational taxonomical units (OTU) using a 97% pairwise-identity cutoff, executing the UCLUST algorithm54. PyNAST55 and the Greengenes database were used for taxonomical assignment, followed by removal of chimeric sequences using ChimeraSlayer as implemented in the QIIME workflow51. Low count OTUs were filtered from the analyses if they were singletons and absent in more than 10% of the participants.
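The two read-level criteria can be illustrated with a short sketch. This is not the QIIME implementation; it only restates the stated filter, assuming that the quality cutoff means a mean Phred score above 20 and that the length criterion is strictly greater than 200 bp. The function name and default arguments are hypothetical.

```python
from typing import Sequence


def passes_quality_filter(
    sequence: str,
    phred_scores: Sequence[int],
    min_mean_quality: float = 20.0,  # assumed reading of the quality cutoff
    min_length: int = 200,           # reads must be longer than this
) -> bool:
    """Return True if a read satisfies both the length and quality criteria."""
    if len(sequence) != len(phred_scores):
        raise ValueError("sequence and quality strings differ in length")
    # Length check first, which also guards against empty reads.
    if len(sequence) <= min_length:
        return False
    mean_quality = sum(phred_scores) / len(phred_scores)
    return mean_quality > min_mean_quality


if __name__ == "__main__":
    print(passes_quality_filter("A" * 250, [30] * 250))
    print(passes_quality_filter("A" * 250, [15] * 250))
```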

Estimating α-diversity, β-diversity and taxa relative abundances

Oral microbiome richness and diversity were estimated from a rarefied dataset (16,738 sequence reads per sample), in order to eliminate possible biases introduced by differences of sampling effort. Estimation of richness (observed and Chao) and diversity (Shannon entropy and Simpson diversity index) were calculated using the vegan library in R56 (Supplementary Figure S3). To compare α-diversity between cases and controls we modeled richness and Shannon entropy in linear regression, adjusting for age, sex and batch effects. Because linear regression assumes a normal distribution of the outcome, Shannon entropy was log transformed before modeling. We conducted permutational multivariate analysis of variance (PERMANOVA) of weighted (taxa relative abundance) and unweighted (absence/presence) Unifrac distance matrices to compare overall oral microbial composition between tobacco users and nonusers and by tobacco type56. Matrices were calculated implementing the UniFrac function in the phyloseq library in R57,58. We then generated PCoA plots to visualize sample ordination using the first two principal coordinates. All PERMANOVA analyses were adjusted for age, sex, and batch effects and were performed using the adonis function in the vegan R library56. We used DESeq259 to explore differential taxa abundances between smokers and nonsmokers for all tobacco categories as well as for the cotinine data. All statistical tests were two-sided. A p-value < 0.05 was considered of nominal statistical significance. In order to limit false detection of significance due to multiple comparisons, we adjusted for the False Discovery Rate (FDR)60. We determined a q-value < 0.10 as significant after adjustment. All analyses were conducted using R version 3.3.261.
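The multiple-comparison step can be illustrated with a short sketch of an FDR adjustment. The text does not name the specific procedure, so the example assumes the Benjamini-Hochberg step-up method; the function name and the p-values are hypothetical, and the analyses themselves were run in R.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


if __name__ == "__main__":
    # Hypothetical p-values for four taxa.
    p = [0.01, 0.04, 0.03, 0.20]
    q = benjamini_hochberg(p)
    significant = [qi < 0.10 for qi in q]  # threshold used in this study
    print([round(qi, 4) for qi in q], significant)
```

With these invented inputs, three of the four taxa would pass the q < 0.10 threshold, while the taxon with p = 0.20 would not.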

Data availability

The datasets analyzed in this study are available in the Qiita database study ID - 11838.