Introduction

Windblown dust can carry microorganisms over great distances1,2,3,4,5,6. Airborne microbial communities transported by dust from various sources may significantly affect aquatic, atmospheric, and terrestrial ecosystems once the dust settles7,8,9,10,11,12,13. Bacteria constitute the major fraction of microorganisms emitted to the atmosphere, and once airborne, they can serve as condensation and ice nuclei14,15,16 and participate in cloud water chemistry17,18,19. In addition, transported pathogens may impact agriculture and human health13,20,21. Airborne communities transported from various sources carry distinct and diverse taxa22,23,24. Recent studies have suggested that the structure and composition of airborne microbial communities are strongly affected by seasonality25,26, air mass origin, PM10 concentration3,4,6, meteorological conditions, aerosol chemical composition27, and the size of the airborne particles6,28,29,30.

Most recent studies of the airborne microbiome have applied high-throughput sequencing of the 16S ribosomal RNA gene3,4,5 to characterize the phylogenetic and taxonomic diversity of the community present. Sequencing of the 16S ribosomal RNA itself may provide additional information on the phylogeny and taxonomy of the potentially active bacterial community31. Comparative analysis of the rRNA and the rRNA-encoding genes has recently provided meaningful ecological insights into how terrestrial and aquatic communities interact with their environment32,33,34,35,36. However, less is known about the functional capacity of microbial communities transported in the atmosphere, primarily due to technical limitations such as low biomass, challenges in acquiring high-quality genetic material, and the lack of advanced molecular tools37,38,39. A few studies have applied amplicon or metatranscriptome sequencing to ribosomal RNA or mRNA. They have revealed potentially active communities in the Arctic atmospheric ecosystem40, in the atmosphere, and in cloud water sampled at a mountaintop research station, where they likely influence the cycling of organic compounds18,41 and interfere with abiotic chemical processes42. From an ecological perspective, improving our knowledge of the aerial dispersal of potentially active bacterial communities is particularly important because dispersal is a key process that, along with selection, drift, and speciation, controls the diversity, abundance, and composition of species in microbial communities43 and thereby significantly affects the maintenance of global biodiversity.

The Eastern Mediterranean (EM) Basin is a crossroads for air masses of different origins. It experiences frequent dust storms in all seasons from the Sahara, the Arabian Peninsula, Iraq, and eastern Syrian deserts44. Air masses transported from Europe are characterized by lower concentrations of particulate matter with an aerodynamic diameter smaller than 10 µm (PM10)45. Thus, the EM Basin affords ample opportunities to identify and characterize the unique taxonomic signatures of these air masses and to obtain information about their potential environmental impacts on the receptor region. Global climate change is projected to increase heat and drying processes in the Eastern Mediterranean and the Middle East regions, resulting in elevated particulate matter concentrations and more frequent and intense dust events. These predictions further underscore the importance of studying the atmospheric transport of microbial communities on both global and regional scales46,47.

Furthermore, understanding the variability in the airborne microbial composition across dust size fractions is highly relevant to public health and the environment. Smaller airborne particles (<2.5 μm in diameter) have greater potential for long-range transport. They also pose a higher risk of respiratory infection because they are more likely to penetrate deeper into the lungs following inhalation48,49.

Here, we present a comprehensive and comparative study of airborne bacterial communities transported from different origins in the EM region, using size-resolved 16S ribosomal RNA and rRNA-gene high-throughput amplicon sequencing together with quantitative polymerase chain reaction (qPCR) measurements. Using these tools, we investigated the community composition of airborne bacteria and compared these communities between the different air masses, particle-size classes and library types (i.e., rRNA gene and rRNA).

Results

Air mass back trajectories and the estimated origin of the dust

During the sampling campaign, air mass back trajectories showed three main distinct transport patterns: from the northwest, the southwest, and the east (Supplementary Table 1), and were classified accordingly. The dust origins were estimated based on back trajectory analysis, integrated with dust column mass density maps (Supplementary Movies 1–12) obtained from MODIS observations, as was done in previous studies3,4,5,6,44. Areas with high concentrations of suspended particulate matter in the dust column maps that coincided with the air mass trajectories were assigned as the potential dust origin. Since we did not chemically characterize the particles in the aerosol samples, the contribution of locally suspended particles during the dust storms could not be discerned from chemical speciation.

The sampling date, air mass, and PM10 concentrations are presented in Table 1. The air masses are classified as Northwest (NW), Southwest (SW) and East. The SW and East classes are associated with the three primary dust sources in the Eastern Mediterranean: the Sahara Desert (SW), and Iraq, eastern Syria, and the Arabian Peninsula (East)44. Two of these sources, the Sahara Desert and the Arabian Peninsula, are among the world’s largest dust sources, accounting for more than 50% of global dust emissions50,51,52. The NW air masses are associated with low PM10 concentrations, with some contribution from neighboring dust sources such as the Sinai and Negev deserts, and thus represent mostly local airborne microorganisms and/or long-range transport from Europe3,4,6.

Table 1 Sampling date, air mass and particulate matter concentration (PM10).

Community-level multivariate comparison

PCA ordinations of Euclidean distance between samples based on center log-ratio transformed ASV counts of bacterial communities are presented in Supplementary Fig. 1. The community structures based on rRNA and rRNA-gene showed high similarity. In both communities, samples representing the same back trajectories tended to cluster together, but the influence of sampling date and PM10 concentration was evident.
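The ordination above rests on a centered log-ratio (clr) transform followed by Euclidean distance between samples. The following is a minimal sketch of that calculation, not the authors' pipeline: the pseudocount of 0.5 used to avoid log(0) and the example count vectors are assumptions made purely for illustration.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's ASV counts.

    The pseudocount is an illustrative assumption to avoid log(0);
    the study does not state how zeros were handled.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def clr_euclidean_distance(a, b, pseudocount=0.5):
    """Euclidean distance between two clr-transformed samples."""
    ca, cb = clr(a, pseudocount), clr(b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))

# Hypothetical ASV count vectors (illustrative values only).
sample_a = [120, 30, 0, 5]
sample_b = [240, 60, 0, 10]   # similar proportions to sample_a, twice the depth
sample_c = [5, 10, 200, 90]   # a very different composition

d_ab = clr_euclidean_distance(sample_a, sample_b)
d_ac = clr_euclidean_distance(sample_a, sample_c)
print(f"d(a, b) = {d_ab:.3f}, d(a, c) = {d_ac:.3f}")
```

Because the clr transform works on log-ratios, two samples with similar proportions but different sequencing depth end up close together, whereas samples with different composition end up far apart.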

According to the PERMANOVA test, the rRNA and rRNA-gene communities differed significantly, although only a small fraction of the total community variance was attributed to this parameter (R2 = 0.02, p < 0.001). When examining the effect of the air mass, PM10 concentrations, size class and sampling date on rRNA and rRNA-gene communities separately, we found that all of these parameters significantly affected the community composition. Sampling date (R2 = 0.35 and 0.22 for rRNA and rRNA-gene communities, respectively) and air mass (R2 = 0.20 and 0.11) were responsible for the greatest share of the total variance, followed by PM10 (R2 = 0.07 and 0.06) and size class (R2 = 0.04 and 0.08). The results are presented in Table 2.
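To make the R2 values concrete, the sketch below shows how a one-way PERMANOVA partitions the squared distances into within-group and total sums of squares and derives a permutation p value. It is a simplified illustration under stated assumptions: it handles a single grouping factor only (the study fitted several terms sequentially), and the distance matrix and the "NW"/"East" labels are toy values, not study data.

```python
import random

def sum_sq(dist, members):
    """Sum of squared pairwise distances among `members`, divided by their count."""
    n = len(members)
    total = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            total += dist[members[a]][members[b]] ** 2
    return total / n

def partition(dist, labels):
    """Return (SS_total, SS_within) for a one-way grouping."""
    ss_total = sum_sq(dist, list(range(len(labels))))
    ss_within = 0.0
    for group in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == group]
        ss_within += sum_sq(dist, members)
    return ss_total, ss_within

def permanova(dist, labels, n_perm=999, seed=0):
    """One-way PERMANOVA: returns (R2, pseudo-F, permutation p value)."""
    n, k = len(labels), len(set(labels))

    def pseudo_f(labs):
        ss_total, ss_within = partition(dist, labs)
        return ((ss_total - ss_within) / (k - 1)) / (ss_within / (n - k))

    ss_total, ss_within = partition(dist, labels)
    r2 = 1.0 - ss_within / ss_total          # share of variance explained by the factor
    f_obs = pseudo_f(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                # permute group labels across samples
        if pseudo_f(shuffled) >= f_obs:
            hits += 1
    return r2, f_obs, (hits + 1) / (n_perm + 1)

# Toy one-dimensional "communities" turned into a distance matrix (hypothetical).
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(p - q) for q in points] for p in points]
labels = ["NW", "NW", "NW", "East", "East", "East"]

r2, f_stat, p_value = permanova(dist, labels)
print(f"R2 = {r2:.3f}, pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

With only six samples the number of distinct label permutations is small, so the p value cannot become very small even when R2 is high; this is one reason why a factor can explain a large share of variance without reaching a low p value in small designs.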

Table 2 The results of PERMANOVA analysis on the terms: air mass, particle concentration, particle size, and sampling date, in this order.

Diversity and richness of the airborne bacterial communities

The composition of the airborne bacterial communities is discussed in Supplementary Results. Alpha-diversity analyses, based on richness (observed number of ASVs) and Shannon–Wiener diversity index, were conducted on rRNA and rRNA-gene communities by air mass. The results are presented in Fig. 1.
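For readers unfamiliar with the two alpha-diversity measures, a minimal sketch follows. It assumes the natural logarithm for the Shannon–Wiener index (the base is not stated in this section), and the count vectors are hypothetical.

```python
import math

def observed_richness(counts):
    """Number of ASVs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over ASVs with p_i > 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h

# Hypothetical samples: same richness, different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

print(observed_richness(even_sample), round(shannon_index(even_sample), 3))
print(observed_richness(skewed_sample), round(shannon_index(skewed_sample), 3))
```

The two samples have the same richness, but the evenly distributed one has the higher Shannon–Wiener index, which is why the two measures are reported side by side.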

Fig. 1: Diversity and richness of airborne bacterial communities.

Shannon–Wiener diversity indices (a), and observed richness (b) of rRNA and rRNA-gene communities of different air masses. Community type and particle-size classes are represented by different colors and symbol shapes, respectively. Box plot center lines represent the median values, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the lower and upper whiskers extend from the hinges to the lowest and largest values no further than 1.5 * IQR (interquartile range) from the hinge, and dots represent individual samples within each group. Lowercase letters indicate statistically significant differences (p < 0.05) between the air masses for rRNA and rRNA-gene communities separately. p values for the comparison between rRNA and rRNA-gene communities in each air mass are displayed over the horizontal bars in each plot.

The rRNA and rRNA-gene communities showed similar patterns of richness and diversity across the air masses. The air masses differed significantly in the diversity and richness of both the rRNA and rRNA-gene communities (Kruskal–Wallis, richness p < 0.001 and = 0.003, diversity p < 0.001 and <0.001, respectively). Specifically, according to a Wilcoxon signed-rank test, diversity and richness were significantly higher in the SW and East air masses than in the NW air masses in both rRNA and rRNA-gene communities. The diversity of the rRNA and rRNA-gene communities of the East air masses was significantly higher than that of the SW air masses (Supplementary Table 3).

We also compared the diversity and richness of the rRNA and rRNA-gene communities. According to the Kruskal–Wallis test results, the diversity of rRNA-gene communities differed significantly from that of rRNA communities (p = 0.030), but the observed richness did not (p = 0.586). Comparing the diversity of rRNA vs. that of rRNA-gene communities in each air mass source revealed no significant differences for the SW and East samples (Wilcoxon signed-rank test, p > 0.05); however, the diversity in rRNA community was significantly lower than that of rRNA-gene community in the NW air masses (p = 0.041), possibly due to small sample size.

When all the samples (i.e., rRNA and rRNA-gene communities) were divided according to the particle-size class (i.e., fine, intermediate, and coarse), the diversity and richness differed significantly (Kruskal–Wallis, p = 0.046 and 0.007). Specifically, the diversity and richness of the coarse particle-size class were significantly higher than those of the fine particle-size class (Wilcoxon signed-rank test, p = 0.049 and 0.007). All p values are presented in Supplementary Table 4.

Quantitative PCR

Figure 2 shows the number of 16S ribosomal RNA transcripts and gene copies per m³ of sampled air, based on qPCR results. Overall, we found that the rRNA transcripts were significantly more abundant than the rRNA gene copies. We also found significant differences in rRNA transcripts and gene copies between air masses and particle-size classes (Supplementary Table 5). The East and SW air masses showed a higher abundance of rRNA transcripts and gene copies than the NW (Wilcoxon signed-rank tests; East: p < 0.001 for both rRNA transcripts and gene copies; SW: p < 0.001 and p = 0.001 for rRNA transcripts and gene copies, respectively). The abundance of rRNA transcripts also differed significantly between the East and SW air masses (p = 0.015), with the East showing the higher rRNA abundance. Lastly, rRNA transcripts and rRNA gene copies were significantly more abundant in the coarse particle-size class than in the fine particle-size class (p = 0.021 and 0.002, respectively). The mean 16S rRNA transcripts and gene copies in each group are given in Supplementary Table 6.

Fig. 2: Abundance of airborne bacterial communities.

The rRNA transcripts and gene copies in different air masses (a), and particulate matter size classes (b). Box plot center lines represent the median values, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the lower and upper whiskers extend from the hinges to the lowest and largest values no further than 1.5 * IQR from the hinge, and dots represent individual samples within each group. Lowercase letters indicate significant differences (p < 0.05) between the air masses and particle-size classes for rRNA transcripts and gene copies, separately. p values for the comparison between rRNA transcripts and gene copies in each air mass and particle-size class are displayed over the horizontal bars in each plot.

Bacterial taxa associated with different air masses

We examined the taxa significantly associated with different air masses using a linear mixed model (MaAsLin2), as described in the “Methods” section, with the reference community defined as NW. The results are shown in Fig. 3. A total of 3458 unique ASVs were significantly associated with at least one of the air masses (p < 0.05, Benjamini–Hochberg correction for multiple comparisons). A total of 790 and 814 ASVs were associated with SW and East air masses compared to the NW, whereas 1311 and 1379 ASVs were associated with NW compared to the SW and East air masses, respectively. ASVs of Actinobacteriota, Bacteroidota, Firmicutes and Proteobacteria were most commonly associated with the air mass origin.
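The Benjamini–Hochberg correction mentioned above can be sketched in a few lines. This is a generic illustration of the procedure, not the MaAsLin2 implementation, and the p values in the example are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-ASV p values.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

Each raw p value is scaled by the number of tests divided by its rank, so the threshold becomes stricter as more ASVs are tested; with several thousand ASVs, only associations with small raw p values survive the correction.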

Fig. 3: Bacterial taxa significantly associated with different air masses.

The MaAsLin2 scores per class are presented for the association with the NW and the SW (a), and the NW and the East (b) trajectories. Each color represents a different phylum. Box plot center lines represent the median values, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the lower and upper whiskers extend from the hinges to the lowest and largest values no further than 1.5 * IQR from the hinge, and dots represent ASVs within each group. Bacterial classes with more than ten ASVs associated with one of the air masses are shown.

SW air masses were mainly characterized by elevated levels of Actinobacteria (202 ASVs of 816), Bacilli (165 ASVs of 515), Clostridia (207 ASVs of 784) and Negativicutes (9 ASVs of 22); East air masses by Actinobacteria (237 ASVs of 816), Chloroflexia (30 ASVs of 114), Gemmatimonadetes (8 ASVs of 43), Longimicrobia (8 ASVs of 36) and Rubrobacteria (11 ASVs of 29). Alphaproteobacteria (290 and 217 ASVs of 823), Bacteroidia (236 and 246 ASVs of 809) and Gammaproteobacteria (110 and 163 ASVs of 470) were, among many others, the classes whose ASVs were associated with NW compared to SW and East air masses, respectively. However, none of these dominating bacterial classes was associated exclusively with one of the air masses (i.e., the SW and East compared to the NW).

Bacterial taxa associated with different particle-size classes

We investigated the taxa associated with different particle-size classes, defining the fine particle-size class as the reference community. A total of 165 ASVs were significantly associated with at least one of the particle-size classes (p < 0.05, Benjamini–Hochberg correction for multiple comparisons). In total, 54 and 9 ASVs were associated with the coarse and intermediate particle-size classes, respectively, whereas 97 and 5 ASVs were associated with the fine particle-size class (compared to coarse and intermediate, respectively). Significant ASVs associated with the coarse vs the fine particle-size classes are shown in Fig. 4. We found that the families Rhodobacteraceae and Oxalobacteraceae, among others, were only associated with coarse particles. In contrast, Sphingomonadaceae, WD2101 soil group, Hymenobacteraceae, Pseudonocardiaceae, Rubrobacteriaceae and Streptomycetaceae were only associated with the fine particle-size class. Moreover, 13.4% (13 ASVs of 97) and 40.0% (2 ASVs of 5) of the ASVs associated with the fine particle-size class (vs the coarse and intermediate size classes, respectively) belonged to the genus Streptomyces.

Fig. 4: Bacterial taxa significantly associated with coarse particle-size class compared with fine particle-size class.

The MaAsLin2 scores per family are presented for the association with fine and coarse particle sizes. Each color represents a different phylum. Box plot center lines represent the median values, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the lower and upper whiskers extend from the hinges to the lowest and largest values no further than 1.5 * IQR from the hinge, and dots represent ASVs within each group. Bacterial families with more than two ASVs associated with one of the particle-size classes are shown.

Bacterial taxa associated with PM10 concentration

We identified a total of 1831 unique ASVs that were associated with PM10 concentrations (p < 0.05, Benjamini–Hochberg correction for multiple comparisons). Of these, 42.3% (774 ASVs of 1831) had a positive coefficient, indicating that the abundance of these taxa increased with increasing PM10 concentrations, whereas 57.7% (1057 ASVs of 1831) had a negative coefficient, indicating that the abundance of these taxa decreased with increasing PM10 concentrations. These results are shown in Fig. 5. Actinobacteria (27.0%, 220 ASVs of 816), Alphaproteobacteria (23.0%, 189 ASVs of 823), Chloroflexia (14.0%, 16 ASVs of 114) and Thermoleophilia (12.5%, 16 ASVs of 128) ASVs tended to increase in abundance with increasing PM10 concentrations, whereas Bacilli (21.2%, 109 ASVs of 515), Bacteroidia (23.5%, 190 ASVs of 809), Clostridia (40.7%, 319 ASVs of 784) and Cyanobacteriia (26.9%, 18 ASVs of 67) ASVs tended to decrease in abundance with increasing PM10 concentrations.

Fig. 5: Bacterial taxa significantly associated with PM10 concentration.

The MaAsLin2 scores per class are presented for the association with particulate matter concentration. Each color represents a different phylum. Box plot center lines represent the median values, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the lower and upper whiskers extend from the hinges to the lowest and largest values no further than 1.5 * IQR from the hinge, and dots represent ASVs within each group. Bacterial classes with more than ten ASVs associated with PM10 concentration are shown.

Bacterial taxa associated with rRNA and rRNA-gene communities

We examined the taxa associated with rRNA and rRNA-gene communities, as a proxy for ASVs with significantly higher and lower rRNA abundance than their rRNA-gene abundance. Results are presented in Fig. 6. Overall, 478 ASVs were identified with a significantly higher rRNA abundance, whereas 363 ASVs had a significantly lower rRNA abundance (p < 0.05, Benjamini–Hochberg correction for multiple comparisons), compared with rRNA-gene abundance. Alphaproteobacteria and Gammaproteobacteria were the two dominating classes, in which 19.0% (156 out of a total of 823 ASVs) and 11.0% (52 out of a total of 470 ASVs) of their total ASVs, respectively, displayed high rRNA abundance. A few ASVs of these classes displayed low rRNA abundance (0.01% and 0.04% for Alpha- and Gammaproteobacteria, respectively). Other dominating classes such as Actinobacteria, Chloroflexia and Thermoleophilia were more commonly associated with low rRNA abundance (11.0%, 90 ASVs of 816; 16.7%, 19 ASVs of 114; 25.8%, 33 ASVs of 128, respectively) than with high rRNA abundance (2.9%, 24 ASVs of 816; 0.0%, zero ASVs of 114; 4.7%, 6 ASVs of 128).

Fig. 6: Bacterial taxa significantly associated with rRNA vs. rRNA-gene communities.

a The MaAsLin2 scores per class are presented for the association with rRNA and rRNA-gene communities. Each color represents a different phylum. Box plot center lines represent the median values, the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the lower and upper whiskers extend from the hinges to the lowest and largest values no further than 1.5 * IQR from the hinge, and dots represent ASVs within each group. b ASV mean abundance (represented by clr-transformed counts) of rRNA (vertical axis) and rRNA-gene (horizontal axis). Each dot represents a different ASV. Significant results for ASVs with high rRNA abundance are depicted by colored diamonds, whereas ASVs with low rRNA abundance are shown as colored dots; each color represents a different phylum. The diagonal line represents the 1:1 ratio of rRNA to rRNA-gene abundance.

Furthermore, we investigated whether abundant (i.e., clr rRNA-gene abundance >0) or rare (i.e., clr rRNA-gene abundance <0) taxa could have a high rRNA abundance; these results are presented in Supplementary Results.

Discussion

Factors influencing the airborne bacterial community structure

The bacterial communities, as detected by 16S rRNA and rRNA-gene sequencing, varied in composition across all collected samples according to their air mass source. According to PERMANOVA analysis, the sampling date and air mass were the two key variables responsible for the observed variance across all samples and the main environmental drivers structuring the airborne community composition and assembly in both the rRNA and rRNA-gene samples. Particulate matter concentration and size class were the other two significant variables affecting the variance in airborne communities. These findings are consistent with those of previous studies3,4,6.

A higher proportion of total variance was explained by the measured environmental variables (i.e., sampling date, air mass, particulate matter size class, and concentrations) in rRNA communities (R2 = 0.66) than in the rRNA-gene communities (R2 = 0.47). This may be expected, since the number of ribosomal RNA gene copies per genome generally varies between 1 and 15 (ref. 53), whereas the number of transcripts synthesized per cell may reach thousands, depending on the cellular metabolic state and nutrient availability54,55,56.

The abundance (i.e., rRNA gene copies and rRNA transcripts), alpha diversity and richness of samples of the NW trajectories were significantly lower than those of all other samples, suggesting that the communities coming from the terrestrial sources are more abundant and diverse. This is consistent with the findings of previous studies at the same sampling location. These studies suggested that the NW air masses are characterized by lower PM10 concentrations and likely represent the local bacterial communities3,4,6. Moreover, the greater abundance, diversity and richness observed in aerosol samples of the coarse particles compared to the fine particles indicate that most of the airborne bacteria are attached to mineral dust or transported as cell aggregates.

To the best of our knowledge, this is the first comprehensive survey to characterize airborne bacteria in the atmosphere in a size-resolved manner using rRNA and rRNA-gene sequencing. Given the higher yield of rRNA and what it may imply about the recent activity of the sampled bacteria, rRNA sequencing may prove to be a more reliable method than rRNA-gene sequencing for airborne community analysis. The greater sensitivity of rRNA transcripts to environmental parameters such as air mass source and PM10 concentration suggests that the rRNA library is less affected by relic nucleic acids than the rRNA-gene library, possibly because rRNA degrades faster than the rRNA gene, albeit more slowly than mRNA.

Effect of air mass source on bacterial community composition

Animal and human commensal genera, such as Corynebacterium57 (aerobic in most species), strictly anaerobic, non-spore-forming Bacteroides58, aerotolerant-anaerobic or microaerophilic, non-spore-forming Lactobacillus59, and the obligately anaerobic Oscillospiraceae UCG-00560, Ruminococcus torques group61 and Christensenellaceae R-7 group62, were abundantly associated with SW air masses. So were Nocardiopsis63, an aerobic, spore-forming, halotolerant genus found in soil, marine and hypersaline environments, and the soil-dwelling, spore-forming Streptomyces64.

The genera abundantly associated with East air masses were Geodermatophilus65, which dwells on calcareous carbonaceous rocks (e.g., marble, limestone, and calcarenite) and resists desiccation, solar radiation, salt and metals; Adhaeribacter66, found in soil, water and air; Microvirga67, a free-living, aerobic, nitrogen-fixing genus of soil and the plant rhizosphere and phyllosphere; the soil-dwelling, spore-forming Streptomyces64; the soil-dwelling Nocardioides68; and the animal and human commensal Corynebacterium57 (aerobic in most species).

Some of the dominating genera associated with the NW air masses were the strictly aerobic, ice-nucleation active Sphingomonas8,69; the nitrate-reducing, extremophile soil genus Paracoccus70; the aerobic, non-spore-forming Pseudomonas71; Roseomonas72, found in soil, water and air; and the mesophilic or thermophilic, aerobic, ionizing-radiation resistant Deinococcus73. In previous studies, these genera were abundantly found in aquatic habitats74,75,76,77 and in aerosol samples collected in clear atmospheric conditions40,42.

Although the number of aerosol samples in this study was limited to three to five samples per air mass, we identified unique taxonomic signatures for each air mass that are representative of the potential source. As inferred from the 16S rRNA and rRNA-gene sequencing, the airborne bacterial community of the NW air masses was more affected by the marine ecosystem, though some contribution of terrestrial origins was evident, whereas the SW and East air masses were dominated by terrestrial origins. These results are consistent with the back trajectory analysis and dust column density maps. The NW air masses showed a mostly marine trajectory over the Mediterranean Sea with some terrestrial contribution from neighboring regions such as the Sinai and Negev deserts, whereas the SW and East air masses were characterized by high PM10 concentrations (except for three SW aerosol samples) that are typical of dust storms in the Eastern Mediterranean, often originating from the Sahara Desert with a possible contribution of the Sinai and Negev deserts (SW)78,79, or from the Arabian Peninsula, Iraq, eastern Syria and Jordan (East)44. Moreover, the majority of ASVs associated with both air mass (i.e., SW and East) and PM10 (SW: 83.1%, 158 ASVs of 190; East: 84.8%, 395 ASVs of 466) correlated positively with increasing PM10 concentrations, suggesting that the SW and East air masses are more likely to originate from deserts. We also tracked the potential environmental source of airborne bacteria using dbBact80. According to the results, ASVs that correlated negatively with increasing PM10 concentrations displayed enriched ontology terms associated with fecal bacteria from humans and farm animals, whereas ASVs that correlated positively with increasing PM10 concentrations showed enriched ontology terms associated with desert and soil bacteria. This indicates that dust plumes bring desert and soil-associated bacteria over great distances to the EM.

Some of the signature taxa, such as Microvirga, Geodermatophilus and Corynebacterium, were previously identified in East air masses sampled at the same location using a different aerosol sampling device (i.e., a microorifice uniform deposit impactor, MOUDI) and sequencing technique (i.e., based only on the rRNA gene)6. Other index taxa, such as Jeotgalicoccus, which was identified in East air masses in that previous study, were associated with both SW and (to a lesser extent) East air masses in this study. The observed taxonomic differences between the two studies at the same sampling location may result from natural variance or from the different sampling and sequencing techniques used.

The bacterial genera Acinetobacter, Corynebacterium, Lactobacillus, Methylobacterium, Pseudomonas and Sphingomonas have been reported as possible contaminants, especially in samples obtained from low-biomass environments81. We found a higher number of ASVs affiliated with these taxa in the collected dust samples (i.e., SW and East) than in the NW samples. This observation indicates that these bacteria were more likely to be transported by the dust.

Effect of particle-size class on bacterial community composition

Understanding the relationship between the ecology and lifestyle of specific taxa and particle-size class can provide meaningful insights into the aerial dispersal patterns of bacteria. For example, many ASVs of soil-dwelling bacterial families were significantly associated with the fine particle-size class, including the family Streptomycetaceae64 (31.3%, 15 ASVs of 48), specifically the genus Streptomyces (39.4%, 13 ASVs of 33); and the family Pseudonocardiaceae82 (10.1%, 9 ASVs of 89), including the genera Actinomycetospora, Actinophytocola, Amycolatopsis, Lechevalieria, Saccharothrix and Umezawaea. These two families alone comprise approximately 25% (24 ASVs of 97) of all fine particle-associated bacteria (compared to the coarse particles). These bacterial families exhibit a life cycle similar to that of many fungi, which involves the formation of hyphal filaments from the vegetative mycelium during colony growth, followed by sporulation, aerial segregation (as single cells, i.e., spores) and germination in a new environment64,82. Some of the families that were only associated with the fine particle-size class, such as Hymenobacteraceae (Adhaeribacter, Hymenobacter and Pontibacter), Rubrobacteriaceae (Rubrobacter), Sphingomonadaceae (Ellin6055, Novosphingobium and Sphingomonas) and WD2101 soil group, are known biofilm-dwelling bacteria in various environments83,84,85,86,87. This suggests that during emission or aerial transport, cell aggregates are detached and dispersed as individual cells.

Animal and human commensal families such as Lachnospiraceae (Ruminococcus torques group, CHKCI001, GCA-900066575 and UCG-010)88 and Bacteroidaceae (Bacteroides)58, the soil-dwelling families Geodermatophilaceae (Klenkia and Geodermatophilus)65 and Oxalobacteraceae (Noviherbaspirillum and Massilia)89, the air and human skin-residing Micrococcaceae (Kocuria, Nesterenkonia and Yaniella)90 and the nitrate-reducing, extremophile soil bacterial family Rhodobacteraceae (Paracoccus)70 were associated with the coarse particle-size class. Many of these genera form cell aggregates or produce extracellular polymeric substances91,92,93,94,95. Altogether, the results suggest that the ecology and specific lifestyle of bacteria in their source environment may have a strong influence on the observed aerial transport patterns (i.e., as single cells or cell aggregates).

Bacterial taxa with high rRNA abundance

Our results show that the rRNA:rRNA gene ratios (inferred by qPCR) were between 2.5 and 276.3 (M = 31.95, SD = 41.50) in all samples, and when grouped according to air mass and particle-size class, rRNA transcripts were significantly more abundant than rRNA gene copies. The 16S rRNA:rRNA gene ratios observed in our study were lower than those of a previous study in the Arctic atmospheric ecosystem, which reported an average of 585 ± 288 16S rRNA transcripts per cell (accounting for an average of 4.02 ribosomal operons per cell)40. This variation could result from the different sampling method (a high-volume impinger was operated for 5 h, and atmospheric samples were collected into an RNA fixative solution), the different nucleic acid extraction method (chemical and physical lysis were used together) or the different ecosystems studied.
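As a worked illustration of the ratio discussed above, the sketch below computes per-sample rRNA:rRNA gene ratios with their mean and standard deviation, and converts the per-cell figure reported for the Arctic study to a per-gene-copy figure. The qPCR values are hypothetical, and the conversion assumes that the per-cell figure equals the per-gene-copy ratio multiplied by the number of operons per cell, which is our reading of the text rather than a stated method.

```python
import statistics

def rrna_to_gene_ratios(transcripts, gene_copies):
    """Per-sample ratio of 16S rRNA transcripts to 16S rRNA gene copies."""
    return [t / g for t, g in zip(transcripts, gene_copies)]

# Hypothetical qPCR results (copies per cubic meter of air); not study data.
transcripts = [5.0e5, 2.4e6, 9.0e4]
gene_copies = [2.0e4, 4.0e4, 3.0e4]

ratios = rrna_to_gene_ratios(transcripts, gene_copies)
mean_ratio = statistics.mean(ratios)
sd_ratio = statistics.stdev(ratios)  # sample standard deviation

# Figures quoted in the text for the Arctic study.
arctic_transcripts_per_cell = 585.0
operons_per_cell = 4.02
# Assumption: transcripts per cell = transcripts per gene copy * operons per cell.
arctic_per_gene_copy = arctic_transcripts_per_cell / operons_per_cell

print(ratios, round(mean_ratio, 2), round(sd_ratio, 2))
print(round(arctic_per_gene_copy, 1))
```

Under that assumption, the Arctic figure corresponds to roughly 146 transcripts per gene copy, which is the quantity comparable to the mean ratio of 31.95 reported here.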

We observed a range of distributions at the class level for taxa with high and low rRNA abundance. For example, Alphaproteobacteria (19.0%, 156 ASVs of 823) and Gammaproteobacteria (11.0%, 52 ASVs of 470) mostly represent the high-rRNA abundance ASVs, whereas Actinobacteria (11.0%, 90 ASVs of 816), Chloroflexia (16.6%, 19 ASVs of 114) and Thermoleophilia (25.8%, 33 ASVs of 128) mostly represent low-rRNA abundance ASVs. Other dominating classes such as Bacteroidia, Bacilli, and Clostridia did not exhibit such discernable traits at the class level. Previous studies attributed the bacterial low- and high-rRNA abundance to a phylogenetically conserved trait related to cell size96 and habitat preference97,98. In this study, we did not find a substantial relationship between bacterial cell size99 and the rRNA content of specific taxa.

We identified 9.4% of total ASVs with a high-rRNA abundance (478 ASVs of 5,051). The dominating genera were the hydrocarbon-degrading, aerobic Acinetobacter100 (1.9%, 9 ASVs of 478); strictly anaerobic, human and animal commensal Bacteroides58 (2.7%, 13 ASVs of 478); facultative anaerobic, non-spore-forming Corynebacterium101 (1.9%, 9 ASVs of 478); aerobic Hymenobacter102,103 (2.1%, 10 ASVs of 478); aerotolerant-anaerobic or microaerophilic, non-spore-forming, human and animal commensal Lactobacillus59 (2.9%, 14 ASVs of 478); facultative methylotroph Methylobacterium-Methylorubrum104 (1.9%, 9 ASVs of 478); heterotrophic and nitrate-reducing Paracoccus70 (3.6%, 17 ASVs of 478); aerobic and non-spore-forming Pseudomonas71 (3.8%, 18 ASVs of 478); anaerobic, human commensal Ruminococcus torques group105 (2.9%, 14 ASVs of 478); and strictly aerobic, ice-nucleation active Sphingomonas8,69 (6.3%, 30 ASVs of 478). All of these genera are known to be widely distributed in nature and have been isolated from various terrestrial and aquatic ecosystems. Species belonging to the genera Acinetobacter (e.g., A. baumannii, A. nosocomialis and A. pittii)106, Bacteroides (e.g., B. distasonis, B. fragilis, B. ovatus, B. thetaiotaomicron and B. vulgatus)58, Corynebacterium (e.g., C. diphtheriae, C. kutscheri, C. pseudotuberculosis and C. ulcerans)107,108, Methylobacterium-Methylorubrum (e.g., M. extorquens, M. mesophilicum and M. zatmanii)109, Paracoccus (e.g., P. yeei)110, Pseudomonas (e.g., P. aeruginosa)111 and Sphingomonas (e.g., S. paucimobilis and S. suberifaciens)112,113,114,115,116 are reported to be human, animal or plant pathogens. The presence of these genera in the dust samples further emphasizes the potential ecological impact of aerial bacterial transport, because some of their members may be human, animal or plant pathogens.

Viable microorganisms represent the functional capacity to drive ecosystem processes. Microbial growth and activity are linked to the ribosomes since they are the sole protein biosynthesis machinery of the cell. Specifically, in bacteria, ribosome biosynthesis is tightly regulated by the ribosomal RNA transcription level in accordance with the current physiological state of the cell and in response to changing environmental conditions117,118,119. However, there is no absolute correlation between the abundance of rRNA transcripts and the activity and growth of a cell. In fact, this relationship is likely to vary between different bacteria due to different survival strategies31. In environmental samples, rRNA-based analyses, specifically 16 S rRNA:rRNA gene ratios, have been considered to indicate current activity and growth. A higher ratio was assumed to be associated with high metabolic activity41,42,96,120,121. However, this approach was criticized due to conflicting patterns between rRNA abundance and cell activity and growth in different species31. Moreover, rRNA transcripts can sometimes be detected for extended periods following cell death122,123,124,125,126,127,128, although extracellular rRNA molecules are less likely to maintain long-term stability in harsh atmospheric conditions. This variability nevertheless raises concerns about using rRNA as a reliable indicator of viability and activity. Therefore, although our results may indicate that some of the airborne bacteria have been recently viable in the source environment or during atmospheric transport, further studies using complementary methods (e.g., culture-based techniques, cell staining for microscopy or flow cytometry) are warranted to verify viability129 because of the uncertainties associated with the varying degradation time of rRNA in the environment122,123,124,125,126,127 and the lack of systematic studies that evaluate the presence of rRNA for viability assessment130.

The accurate quantification and identification of the taxa present in atmospheric samples rely on the sampling method used131. A potential limitation of this study was the use of impaction, which was previously shown to decrease the viability of the sampled microorganisms132,133, for a relatively long sampling period under desiccating conditions. It is thus likely that some microorganisms were poorly represented in this study. The integrity of rRNA transcripts after cell death depends on the environmental conditions122,123,124,125,126,127. Although the significantly higher abundance of rRNA transcripts (vs. rRNA gene copies) found in this study is consistent with the finding of a previous study in which aerosol samples from the Arctic atmospheric ecosystem were collected directly into a preservative solution40, bioaerosol samples collected on quartz microfiber filters over a sampling period of 6 h may be subject to some bias due to the potential degradation of rRNA transcripts during sampling.

Conclusions

It is often questioned whether the atmosphere is an active ecosystem capable of maintaining Earth’s biodiversity and supporting the transport of viable bacteria and human, animal, and plant pathogens. Previous studies have suggested a continuous exchange of microorganisms between remote ecosystems, thus maintaining biodiversity and a healthy global ecosystem22,134,135,136. Our results show that the composition of the airborne bacterial communities from different locations varies with diverse ecological significance. The air masses coming from terrestrial sources have significantly higher abundance, diversity and richness of bacteria than marine-influenced air masses in the Eastern Mediterranean. We found that the ecology and the specific lifestyle of the bacteria in the source environment may affect the aerial transport patterns (i.e., single cells or cell aggregates). The significantly higher abundance of bacteria, along with higher bacterial diversity and richness, in the coarse particle-size class suggests that most of the airborne bacteria are attached to mineral dust or transported as cell aggregates.

The greater sensitivity displayed by rRNA transcripts to environmental parameters suggests that the rRNA library is less affected by relic nucleic acids than the rRNA-gene library. This highlights the importance of using a complementary approach (i.e., rRNA and rRNA-gene sequencing) in future studies to acquire more robust and reliable data on the dynamic factors affecting airborne community composition and assembly.

Although community composition and qPCR data may indicate the aerial transport of viable bacteria, further evidence based on cell integrity and cellular metabolism assays may help confirm this assumption. Some of the dominating bacteria that we identified with high-rRNA abundance are reported to be harmful to human, animal, and plant health. Thus, the potential atmospheric transport of pathogens may pose public and environmental health risks both regionally and globally. Future studies will focus on the transport of pathogens to determine their potential effect.

Materials and methods

Sample collection

Atmospheric PM10 was collected on quartz microfiber filters (Whatman Sigma-Aldrich, Saint Louis, MO, USA, 203 mm × 254 mm for the backup stage, and Tisch Environmental, Inc., Cleves, OH, USA, TE-230-QZ Slotted Quartz Fiber for a five-stage high volume cascade impactor); all the filters were pre-baked at 450 °C for 5 h before sampling to rid them of all organic matter. Sampling was performed using a high-volume air sampler (Tisch Environmental, Inc., TE-6070X) at a flow rate of 67.96 m3 hr−1 for six hours. Operating blanks were obtained following the same procedure but were placed in the sampler for only 5 min of operation.
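The sampled air volume follows directly from the flow rate and the sampling duration stated above. The short sketch below shows this arithmetic for a six-hour sample and a 5-min operating blank; the function and variable names are illustrative assumptions.

```python
# Flow rate of the high-volume sampler, as stated in the text (m^3 per hour).
FLOW_RATE_M3_PER_H = 67.96


def sampled_volume_m3(duration_h, flow_rate_m3_per_h=FLOW_RATE_M3_PER_H):
    """Return the air volume (m^3) drawn through the filter."""
    if duration_h < 0:
        raise ValueError("duration must be non-negative")
    return flow_rate_m3_per_h * duration_h


sample_volume = sampled_volume_m3(6.0)       # six-hour sample, about 407.76 m^3
blank_volume = sampled_volume_m3(5.0 / 60)   # 5-min operating blank, about 5.66 m^3
```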

A five-stage high-volume cascade impactor (Tisch Environmental, Inc., TE-235) was used to procure size-segregated samples (<0.49 µm, 0.49–0.95 µm, 0.95–1.5 µm, 1.5–3.0 µm, 3.0–7.2 µm, 7.2–10.0 µm). An equal air volume was sampled on each stage since the high-volume cascade impactor operates at an equal flow rate (i.e., 67.96 m3 hr−1) per stage137. All the samples were collected on the roof of a four-story building at the Weizmann Institute of Science, Rehovot, Israel (31.9070 N, 34.8102 E; 80 m AMSL). Sampling was designed to capture particulate matter from various sources (e.g., Sahara, Arabia, Syria) under varying concentrations (i.e., low and high PM10)138. Saharan dust storms frequently affect the EM Basin during winter and spring through cold-core mid-latitude Mediterranean and Sharav cyclones139. In contrast, dust storms that originate from eastern sources, such as Iraq, eastern Syria, Jordan, and Saudi Arabia, occur mainly during the fall and are triggered by the main synoptic circulation system known as the Red Sea Trough140. Thus, a specific period, from October to December, was chosen for collecting aerosol samples representing dust storms of different origins, while four aerosol samples representing clear conditions (i.e., low PM10 Northwest air masses) were collected from both dusty (i.e., winter to spring) and non-dusty seasons.

To achieve this sampling design, we followed various atmospheric forecast platforms (https://www.windy.com/; https://forecast.uoa.gr/en/forecast-maps/dust/europe; and https://dust.aemet.es/forecast/nmmb-bsc-dust-forecast-sconc) to predict and prepare for upcoming dust storms. These predictions were verified by the online PM10 data of the Israeli Ministry of Environmental Protection database, Rehovot Air Monitoring station (https://www.svivaaqm.net/). Aerosol samples representing 12 different sampling days and PM10 concentrations in a size-resolved manner (i.e., six different cutoff sizes) were obtained. At the end of each sampling event, the filters were cut using a sterile surgical scalpel and forceps and immediately submerged in RNA fixative solution (280 g L−1 ammonium sulfate dissolved in 25 mM sodium sulfate solution with 10 mM EDTA, pH 5.2). All prepared batches of the RNA fixation solution were poured into several sterile Petri dishes in a biological hood and UV sterilized twice for 15 min each to eliminate potential nucleic acid contaminants. The samples were kept at −20 °C (for 3–5 days) until downstream processing. Since the direct extraction of nucleic acids from multiple samples following the long hours of nighttime sampling during the dust storms is challenging, we kept aerosol samples in an RNA fixation solution until the nucleic acids were extracted. This method was previously applied to bioaerosol samples, resulting in a successful microbial community analysis141. In a previous study, the nucleic acid extraction yields and the microbial community analyses of ultra-low biomass aerosol samples showed no significant differences between frozen samples and samples extracted directly after sampling142. However, storing the samples at −20 °C before the extraction may lead to the loss of nucleic acids due to cell lysis, potentially leading to some bias in the community analysis depending on the cell type.

Particulate matter concentration data, backward trajectory analyses, and dust column mass density maps

Particulate matter concentration data were obtained from the Rehovot Air Monitoring station, located approximately 1 km from the sampling site. This station is part of the Israeli Ministry of Environmental Protection network. PM10 concentration data were obtained in 5-min time intervals and were used to calculate the mean concentration for each sampling period.
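The averaging step can be sketched as follows: 5-min readings that fall inside the sampling window are averaged, and missing readings are skipped. This is only an illustrative sketch; the record layout, the six-hour default window, the handling of missing values, and the example readings are assumptions rather than details given for the monitoring station data.

```python
from datetime import datetime, timedelta


def mean_pm10(records, start, duration_h=6.0):
    """Mean PM10 over [start, start + duration_h) from (timestamp, value) records.

    Readings whose value is None (assumed to mark missing data) are ignored.
    """
    end = start + timedelta(hours=duration_h)
    window = [value for timestamp, value in records
              if start <= timestamp < end and value is not None]
    if not window:
        raise ValueError("no valid PM10 readings in the sampling window")
    return sum(window) / len(window)


# Hypothetical 5-min readings covering seven hours: 40 for the first three
# hours, 80 afterwards. Only the first six hours fall inside the window.
t0 = datetime(2000, 1, 1, 18, 0)
records = [(t0 + timedelta(minutes=5 * i), 40.0 if i < 36 else 80.0)
           for i in range(84)]
example_mean = mean_pm10(records, t0)
```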

To identify the origin of the sampled air mass, back trajectories were calculated using the hybrid single-particle Lagrangian integrated trajectory model (HYSPLIT)143,144 via the web interface (READY, http://ready.arl.noaa.gov/HYSPLIT_traj.php). Each back trajectory was calculated for 72 h at 3 different altitudes (0, 50, and 100 meters above ground level). The results are presented in Supplementary Table 1.

To identify the potential dust sources, time-averaged maps of dust column mass density (hourly 0.5° × 0.625°) and reanalysis meteorological data were acquired from the Modern-Era Retrospective analysis for Research and Applications (MERRA-2) for each sampling event. Each retrospective analysis consisted of time-averaged hourly frames (i.e., maps of the given region) of dust column density maps for 72 h before the sampled event (including the sampling period) that were animated into short video clips for each sampling date (Supplementary Movies 1–12). Analyses and visualizations used in this study were produced with the Giovanni online data system developed and maintained by the NASA GES DISC145.

Nucleic acid extraction and cDNA synthesis

DNA and RNA were coextracted from the filters using a PowerWater DNA isolation kit (Qiagen, Dresden, Germany) following the manufacturer’s protocol, with the following adjustments. One piece of each filter (1 × 12 cm) was removed from the RNA-preserving solution and placed into separate bead tubes. One milliliter of PW1 solution was added to each tube, and the tubes were then vortexed horizontally for 5 min, followed by centrifugation at 2700 × g for 2 min. The supernatant from each bead tube was split into two aliquots and placed in 2 ml collection tubes. Then, we followed the manufacturer’s protocol up to the last step (elution), which we repeated twice, using 50 μl of PW6 solution each time; the spin filter was soaked at room temperature for 5 min before centrifugation at 13,000 × g for 1 min. 100 μl of DNA/RNA solution per tube was obtained. Next, the DNA/RNA solution was split into two 50 μl aliquots. One of these aliquots was kept without further processing and was used in amplification and sequencing steps as a DNA sample.

The other 50 μl aliquot was treated with 1 μl DNase (DNase I, RNase-free, Thermo Fisher Scientific, Roskilde, Denmark) and incubated at 37 °C for 30 min to digest all the DNA. This reaction was repeated twice consecutively to remove any residues of DNA. This aliquot was used as an RNA sample for downstream reverse transcription and sequencing.

cDNA was synthesized from the RNA extracts using a High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Life Technologies, CA, USA) with random primers. To ensure that all the DNA was digested in the previous step, we ran a control reaction for each sample to which no reverse transcriptase enzyme was added. These controls were treated according to the same protocol as the real reaction samples. No amplicons were evident after the quantitative polymerase chain reaction (qPCR), indicating that all DNA was digested.

The ribosomal RNA transcripts and gene copies in the aerosol samples were determined by qPCR (StepOnePlus Real-Time PCR, Applied Biosystems, Life Technologies) using RNA (RNA transcripts that were converted to cDNA as described above) and DNA extracts, respectively. qPCR in this study used universal bacterial primers, 331 F (TCCTACGGGAGGCAGCAGT) and 518 R (ATTACCGCGGCTGCTGG)146, targeting a fragment of the small subunit of the bacterial ribosome (SSU), also referred to as the 16 S gene. qPCR reactions were performed in triplicate on each cDNA and DNA sample, including reverse transcriptase negative control reactions and non-template controls. Each 20 μl reaction mixture was prepared as follows: 10 μl of SensiFAST SYBR mix (Bioline, London, UK), 1 μM of each primer, 4 μl of molecular grade H2O, and 2 μl of template gDNA or cDNA. The following thermal cycling conditions were used for amplification: 3 min at 95 °C followed by 35 cycles of 5 s at 95 °C and 20 s at 60 °C. A logarithmic calibration curve of known pNORM1 plasmid concentrations (designed by Christophe Merlin (LCPME, Nancy, France), synthesized by Eurofins Scientific) was used to quantify the nucleic acids as previously described3. Total ribosomal RNA transcripts and gene copies were normalized to the total volume of sampled air. A total of 129 samples representing rRNA and rRNA-gene of three air mass origins in 6 particle-size classes were analyzed, as follows: 24 and 10 samples of northwest; 30 and 30 of southwest; 17 and 18 of the east (rRNA and rRNA-gene, respectively).
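Quantification against a logarithmic calibration curve can be sketched as below: threshold-cycle (Ct) values of plasmid standards are regressed on log10 copy number, and the fitted line is inverted for unknown samples before normalizing to the sampled air volume. This is a minimal sketch under stated assumptions. The standard concentrations, Ct values, and the `scale_to_sample` factor (which would account for template, eluate, and filter fractions) are hypothetical, and the instrument software used in the study may compute these quantities differently.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the calibration line to get copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_m3(copies_per_reaction, scale_to_sample, air_volume_m3):
    """Scale to the whole sample (hypothetical factor) and normalize to air volume."""
    return copies_per_reaction * scale_to_sample / air_volume_m3


# Hypothetical plasmid standards (copies per reaction) and their Ct values.
standards = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [28.1, 24.8, 21.5, 18.2, 14.9]
slope, intercept = fit_standard_curve(standards, standard_ct)

# Hypothetical technical triplicate of one unknown sample.
triplicate_ct = [21.4, 21.5, 21.6]
unknown = copies_from_ct(sum(triplicate_ct) / 3, slope, intercept)
per_m3 = copies_per_m3(unknown, scale_to_sample=50.0, air_volume_m3=407.76)
```

Averaging the triplicate Ct values before inverting the curve is one common choice; converting each replicate to copies first and averaging afterwards is an equally plausible alternative.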

A significant limitation in bioaerosol studies is the ultra-low biomass content of air samples142. Previous studies at the same location showed that aerosol samples, especially those with low PM10 mass, contain very low amounts of genomic DNA (i.e., 16 S rRNA-gene copies)3,4,5. However, when the abundances of different nucleic acids (i.e., 16 S rRNA transcripts and gene copies) were compared by qPCR in a low biomass environment (i.e., the Arctic atmospheric ecosystem), the amount of RNA (i.e., 16 S rRNA transcripts) was always higher than that of DNA134. These results are in line with our initial nucleic acid extraction experiments in this study. Therefore, we used a DNA isolation kit for DNA/RNA co-extraction to specifically maximize the quantity and quality of the extracted DNA and thereby obtain a representative spectrum of airborne community DNA. Although using a DNA isolation kit for DNA/RNA co-extraction may introduce recovery bias, comparing community diversity and richness of rRNA and rRNA-gene shows a good correlation for observed richness (Spearman’s ρ = 0.92, p < 0.001) and diversity (Spearman’s ρ = 0.87, p < 0.001), suggesting that this bias is small. However, the co-extraction of DNA and RNA will be re-examined in future studies, and a method that better suits both nucleic acids should be implemented.

Amplification and sequencing

The rRNA and rRNA-gene communities in the aerosol samples were analyzed by high-throughput amplicon sequencing (250×2 cycles) of 16 S ribosomal RNA and RNA-gene; RNA (RNA transcripts that were converted to cDNA) and DNA extracts were used as templates, respectively. The genomic sequencing was conducted at the DNA Sequencing Facility (DNAS) at the University of Illinois at Chicago (UIC) using an Illumina MiSeq instrument. The target sequence was a segment of the V4 region in the 16 S rRNA gene, amplified using tagged bacterial/archaeal primers as follows: CS1_515F (ACACTGACGACATGGTTCTACAGTGCCAGCMGCCGCGGTAA) and CS2_806R (TACGGTAGCAGAGACTTGGTCTGGACTACHVGGGTWTCTAAT)147. Short read length sequencing of the rRNA gene (i.e., V4 region in the 16 S rRNA gene) may reduce the number of ASVs taxonomically classified148. This may introduce some bias in the results.

Biostatistical microbiome data analysis

The sequencing data were analyzed using the R packages DADA2 (version 1.16.0)149 and Phyloseq (version 1.36.0)150. Sequences (i.e., a total of 4,129,758) were quality-trimmed and filtered. Paired reads were merged, and chimeras were removed to produce amplicon sequence variants (ASVs). A total of 2,998,398 reads of bacterial 16 S rRNA sequences (i.e., rRNA and rRNA-gene), including aerosol samples and blank filter controls (i.e., one for each particle-size class), were then assigned to 13,444 ASVs. Due to the low RNA and DNA quantities, not all samples were fit for biostatistical analysis. A total of 125 samples representing rRNA and rRNA-gene of three air mass origins in 6 particle-size classes were analyzed, as follows: 24 and 6 samples of northwest; 30 and 30 of southwest; 17 and 18 of the east (rRNA and rRNA-gene, respectively).

Taxonomic classification of the obtained ASVs was performed using the SSURef SILVA database (v.138)151. The contaminating sequences were identified using the decontam package relying on sequence frequencies152. Identified contaminant sequences were subsequently removed from the dataset. We also removed sequences that corresponded to Chloroplasts and Mitochondria as well as ambiguous and unclassified phylum annotations. The rRNA and rRNA-gene samples from each sampling date were treated as separate communities for all subsequent processing. ASVs with a read count lower than 5 in each sample and with a prevalence (i.e., number of samples each ASV occurs in) lower than 3 in the entire dataset were removed to avoid a small mean and trivially large coefficient of variation. Phantom taxa, defined as ASVs only observed in rRNA and not in rRNA-gene communities42, were also removed from the entire sampling event (Supplementary Methods). A total of 5,051 ASVs passed all filtration steps and were used in the subsequent biostatistical and community composition analyses. To account for data compositionality153,154, zero counts were replaced using the cmultRepl function in the zCompositions (version 1.3.4) package based on geometric Bayesian multiplicative replacement followed by centered log-ratio (clr) transformation with the clr function in the compositions package155. All statistical and model-based analyses were run using clr-transformed ASV counts unless otherwise noted.
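The centered log-ratio transformation itself is simple to state: each count is log-transformed and the mean of the logs within the sample is subtracted. The sketch below illustrates it for a single sample. Note that it replaces zeros with a fixed pseudocount, which is a deliberately simpler stand-in for the geometric Bayesian multiplicative replacement used in the study; the pseudocount value and example counts are assumptions.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's ASV counts.

    Zeros are replaced by a fixed pseudocount (a simplification; the study
    used Bayesian multiplicative replacement instead).
    """
    if not counts:
        raise ValueError("counts must not be empty")
    adjusted = [c if c > 0 else pseudocount for c in counts]
    logs = [math.log(v) for v in adjusted]
    mean_log = sum(logs) / len(logs)   # log of the geometric mean
    return [v - mean_log for v in logs]


# Hypothetical counts for three ASVs in one sample.
transformed = clr([10, 0, 90])
```

By construction, the clr values of a sample sum to zero, which is why Euclidean distances on clr-transformed data are a common choice for compositional analyses.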

The different particle size fractions were combined into four size classes: <0.49 µm (backup stage); 0.49–1.5 µm (stages 4–5; fine fraction); 1.5–3.0 µm (stage 3; intermediate fraction); and 3.0–10.0 µm (stages 1–2; coarse fraction). These subgroups correspond to bacterial cell debris (backup stage), single cells (fine fraction), and cell aggregates (coarse fraction). The intermediate size class is more likely to represent cell aggregates than single cells, as the diameter of a single bacterial cell is typically between 0.7 and 1.5 µm (approximately 1 µm)156. Due to the accumulation of cell debris and the bouncing effect on impactor stages157, we excluded the backup stage (<0.49 µm) from further analysis (in qPCR and sequencing).
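The grouping described above amounts to a lookup from impactor stage to size class, with the backup stage dropped before analysis. A minimal sketch follows; the stage labels used as dictionary keys are an assumed encoding, not identifiers taken from the study's data files.

```python
# Impactor stage -> size class, following the grouping described in the text.
SIZE_CLASS_BY_STAGE = {
    "1": "coarse",        # 7.2-10.0 um
    "2": "coarse",        # 3.0-7.2 um
    "3": "intermediate",  # 1.5-3.0 um
    "4": "fine",          # 0.95-1.5 um
    "5": "fine",          # 0.49-0.95 um
    "backup": "backup",   # <0.49 um
}


def size_class(stage):
    """Return the size class for an impactor stage label."""
    try:
        return SIZE_CLASS_BY_STAGE[str(stage)]
    except KeyError:
        raise ValueError(f"unknown impactor stage: {stage!r}") from None


def keep_for_analysis(stage):
    """The backup stage is excluded from qPCR and sequencing analyses."""
    return size_class(stage) != "backup"


stages = ["1", "2", "3", "4", "5", "backup"]
analyzed = [s for s in stages if keep_for_analysis(s)]
```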

Variance analysis (PERMANOVA) was conducted using the adonis2 function (1000 permutations) in the vegan package158 based on the Euclidean dissimilarity matrix. PERMANOVA models were run separately using the following designs: (1) we examined the variance between rRNA and rRNA-gene communities over the entire dataset, and (2) we examined the variance explained by the variables air mass, particle concentration (PM10), particle size, and sampling date, in this order, on rRNA and rRNA-gene communities. For both cases, blocks were defined as sample dates to account for the non-independence of different particle sizes sampled at the same time159,160.
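To make the idea behind PERMANOVA concrete, the sketch below computes the pseudo-F statistic from squared Euclidean distances for a single grouping factor and obtains a p value by permuting the group labels. It is a simplified, one-factor illustration with unrestricted permutations. It does not reproduce the sequential multi-term models or the blocking by sample date that adonis2 was used for in this study, and the example points and labels are hypothetical.

```python
import random


def _squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pseudo_f(points, labels):
    """One-factor pseudo-F from pairwise squared Euclidean distances."""
    n = len(points)
    groups = sorted(set(labels))
    ss_total = sum(_squared_distance(points[i], points[j])
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for group in groups:
        members = [i for i, lab in enumerate(labels) if lab == group]
        ss_within += sum(_squared_distance(points[i], points[j])
                         for k, i in enumerate(members)
                         for j in members[k + 1:]) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(points, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p value)."""
    rng = random.Random(seed)
    observed = pseudo_f(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(points, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical clr-like coordinates for two well-separated groups of samples.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["NW"] * 4 + ["SW"] * 4
f_stat, p_value = permanova(points, labels)
```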

Richness (observed number of ASVs) and Shannon–Wiener diversity indices were calculated with the Phyloseq package150. The Kruskal–Wallis rank-sum test and Wilcoxon signed-rank test were implemented for community richness, diversity, and qPCR analyses using the stats R package. In the Wilcoxon signed-rank test, p values were adjusted for multiple testing using the Benjamini–Hochberg method161.
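For reference, the quantities named above can be written out in a few lines. The sketch below computes observed richness, the Shannon–Wiener index, and Benjamini–Hochberg adjusted p values. The use of the natural logarithm for the Shannon index is an assumption, and the example counts and p values are hypothetical.

```python
import math


def observed_richness(counts):
    """Number of ASVs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln(p_i)) over ASVs with p_i > 0."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # rank of this p value in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical inputs.
sample_counts = [10, 10, 10, 10, 0]
richness = observed_richness(sample_counts)
shannon = shannon_index(sample_counts)
adjusted_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```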

ASVs that were significantly more abundant in the rRNA community than in the rRNA-gene community (1), in the SW and East air masses than the NW air mass (2), in the coarse and intermediate particle-size classes than the fine particle-size class (3) and with the increase in PM10 concentrations (4) were identified using a linear mixed model of the MaAsLin2 package162. The model was run separately for sample type (i.e., rRNA to rRNA-gene communities) (1), air-mass (2), particle-size class (3) and PM10 concentration (4), with the following designs to examine significant associations between specific ASVs and each variable: (1) fixed effect = c (sample type), random effect = (sample date, particle size), (2;3;4) fixed effect = c (air-mass; particle-size; PM10 concentration), random effect = (sample type, sample date). The reference communities were defined as the rRNA-gene (1), the NW air mass (2), and fine size class (3) in the models. All reported p values were corrected using the Benjamini–Hochberg method161, with adjusted p < 0.05 results considered significant. All modeling results are presented in Supplementary Table 2. Ontology-based enrichment analysis was conducted using dbBact database’s differential abundance query mode to track the potential environmental source of the airborne bacteria80.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.