Taxonomic composition and seasonal dynamics of the air microbiome in West Siberia

Here, we describe taxonomical composition, as well as seasonal and diel dynamics of airborne microbial communities in West Siberia. A total of 78 airborne biomass samples from 39 time intervals were analysed, within a temperature range of 48 °C (26 °C to − 22 °C). We observed a 5–170-fold decrease in DNA yield extracted from the airborne biomass in winter compared to summer, nevertheless, yielding sufficient material for metagenomic analysis. The airborne microbial communities included Actinobacteria and Proteobacteria, Ascomycota and Basidiomycota fungi as major components, as well as some Streptophyta plants. In summer, bacterial and fungal plant pathogens, and wood-rotting saprophytes were predominant. In winter, Ascomycota moulds and cold-related or stress environment bacterial species were enriched, while the fraction of wood-rotting and mushroom-forming Basidiomycota fungi was largely reduced. As recently reported for the tropical climate, the airborne microbial communities performed a diel cycle in summer, however, in winter diel dynamics were not observed.

The field of bioaerosol research studies the taxonomy and community composition of airborne microbial organisms, also referred to as air microbiome. A recent series of technological and analytical advancements include high-volumetric air samplers, an ultra-low biomass processing pipeline, low-input DNA sequencing libraries, as well as high-throughput sequencing technologies. Applied in unison, these methods have enabled comprehensive and meaningful characterization of the airborne microbial organismal dynamics found in the near-surface atmosphere 1 . Previous studies investigating bioaerosols using amplicon sequencing predominantly focussed on the bacterial fraction of the air microbiome, while fungal and plant pollen fractions frequently remained understudied [2][3][4][5][6][7][8][9] . Recent reports show the effect of airborne microbial communities on human health, driven by respiratory pathogens [10][11][12] and agents that trigger allergies [13][14][15] . Airborne microbial organisms also impact agricultural productivity, as bacterial and fungal species distributed by air movement act as plant blights 16 . Furthermore, atmospheric processes, such as cloud condensation and ice nucleation events were shown to depend on airborne microbial particles 17 . Therefore, understanding the dynamics of microbial organisms in air is crucial for insights into the atmosphere as an ecosystem, but also will inform on respiratory health aspects and human wellbeing.
In seasonal climate settings, assemblages of airborne microorganisms are dynamic and vary across seasons 3,18,19 and from day-to-day 4 . Compositions of the airborne communities may also vary between locations 20 , however, meteorological factors appear to have a stronger effect on microorganisms than other factors of the local environment 4, 9 . Recently, we have shown that in a tropical climate, airborne microbial communities are stable across days, weeks and months for samples taken at the same time of the day, while significant fluctuations in microbiome composition occur across different time points within a day. These diel air microbiome oscillations are correlated with atmospheric temperature, relative humidity and CO 2 daily profiles 1 . In seasonal climate settings, however, seasonal and diel changes of airborne communities are currently understudied.
We sampled airborne biomass in West Siberia (Yurga, 55.711 N, 84.937 E), an extensive geographical region of Russia with a continental climate. The recorded average temperatures range from 6 to 24 °C in summer (June-August), and from − 21 to − 6 °C in winter (November-March). In most years, snow accumulation occurs  (Fig. S1, S2) represent typical summer and winter settings of the region. In this report, we provide a metagenomic airborne community analysis of three time-series surveys of the near-surface atmosphere in two seasonal settings. This culture-independent study of airborne microorganisms identifies the bacterial, fungal and plant genetic material with a high taxonomic and temporal resolution in a single sampling set-up. Hence, the relative abundance of the entire airborne community can be represented on the same scale. The analysis presented here demonstrates the impact of seasonal and diel changes in the community composition of airborne microorganisms and reveals temperature as the key driver for the observed ecosystem dynamics.

Results
We conducted three time-series surveys collecting airborne biomass during 2 h time intervals 3-4 times per day for 2 to 5 days in summer and in winter 2017 and 2018 (see Methods). In total, 78 airborne biomass samples from 39 time-points were collected (Table S1). The amount of biomass collected by our air sampling approach was assessed by quantification of extracted DNA. The metagenomic analysis of the extracted DNA was performed using the RAPSearch aligner 21,22 and MEGAN 23,24 as a taxonomic classifier on Illumina whole genome shotgun sequencing data.
Large differences in biomass quantity were observed between summer and winter air samples (Fig. 1a). Total DNA yield extracted from the airborne biomass samples in summer ranged from 48.0 to 237.6 ng, while in winter only 1.4 to 9.6 ng of DNA could be obtained. Our sampling methods were performed in winter at temperatures as low as − 22 °C. Despite up to 170-fold differences between summer and winter samples (39.3fold on average, Fig. 1a), sampling intervals of 2 h yielded sufficient biomass for DNA metagenomic sequencing and deep metagenomic analysis, resulting in species level taxonomic identification. Independent of the season, a 3-5-fold fluctuation in the extracted DNA yield was observed within 24 h (Fig. S3, upper panel). Regardless of season, the majority of sequencing reads were assigned to bacterial taxa (~ 2-38%), followed by eukaryotes (~ 4-16%, mostly fungal), and < 1% to archaea, with traces of phages (Fig. S3, middle panel). Despite temperature differences of 48 °C across our sampling series, a consistently high species diversity was noted, as detected by our in-depth taxonomical analysis (Fig. 1b,c). Number of assignable taxa per 2 h sampling interval ranged from 200 to 374 (11.8-42.2% of all reads) (Fig. S3, middle and lower panels), using a cut-off value of 25 reads per taxon (see Methods).
The prevailing microbial taxa were Proteobacteria and Actinobacteria, Ascomycota and Basidiomycota fungi, and Streptophyta plants (Fig. 1c). When intersecting all time-series samples, 433 species represented the core of the microbial communities, both in summer and in winter (Fig. 1d). The majority were bacteria (315 species), followed by fungi (89 species) and species belonging to other taxonomic groups (29 species) (Fig. S4). Despite a substantial fraction of species occurrence was overlapping between summer and winter ( Fig. 1d), principle coordinate analysis clearly segregates both groups (Fig. 1e), driven by the Bray-Curtis dissimilarity distances of the most abundant organisms.
The relative abundance (Fig. 1f) and richness of the bacterial community ( Fig. 1b) reached their maximum values in the evening hours (21:00-23:00), and significantly decreased at night (1:00-3:00). The fungal community showed complementary behaviour with the highest relative abundance and richness observed at night (1:00-3:00), while decreasing during day. The day/night airborne microbial community structure was significantly distinct (multivariate linear modelling analysis in the Summer 2018 time-series: p Multivariate_GLMvalue = 0.036). In this regard, certain groups of microbial taxa, mostly Basidiomycota fungi, were significantly more abundant at night (p Multivariate_GLM -value < 0.05, Fig. S7). We also detected a cross-correlation between the atmospheric temperature daily oscillation profiles and relative abundances of some taxa (Fig. S8). This crosscorrelation with temperature was specifically strong for Basidiomycota fungi (r = − 0.58) and Streptophyta plants (r = 0.82). Moreover, we observed the effects of a rapid drop in temperature on the airborne microbial community dynamics and structure (Fig. 1g). In this regard, in the last day of the Summer 2018 time-series, northern winds caused a drop of the recorded temperature to 10 °C and accompanying light rain, resulting in a shifted air microbiome composition that was prominently decreased in bacterial taxa, with elevated abundances of Ascomycota and Basidiomycota fungi (Fig. 1g).
Winter air microbiomes. Winter  www.nature.com/scientificreports/  (Table S2). Also, Aspergillus moulds were more prolific in colder temperature settings, in contrast to woodrotting and mushroom-forming Basidiomycota fungi (Fig. 1c,f, Table S2, Fig. S9). Notably, the Gram-positive bacterium Saccharopolyspora rectivirgula was among the most abundant airborne bacteria in winter settings (Table S2, Fig. S9). This bacterium is a causative agent of farmer's lung disease, a type of hypersensitivity pneumonitis in immunocompromised patients 25 .
Despite the lack of perennial plant materials during winter in Siberia, we observed airborne biomass of cereal plants (Poaceae) (e.g., Aegilops tauschii, Triticum Urartu, Oryza sativa, Hordeum vulgare) (Table S2). This is likely due to the fact that the stalks of cereals can rise above snow level, thus allowing the spikes to continuously spread their biological material through the air, even after the growth season has ended.
Throughout the winter time-series, no diel variation was observed for any of the five taxonomic groups (Ascomycota, Basidiomycota, Streptophyta, Actinobacteria and Proteobacteria) (Fig. 1f).

Discussion
Sampling of airborne biomass is technically challenging due to low amounts of collectable material of a given size (0.5-10 µm). Therefore, research into the dynamics of airborne microbial communities is often restricted to long sampling periods. In this study, we established a sampling and analysis regime that enabled the successful sequencing of ultra-low biomass samples, collected during 2 h time-intervals and at temperatures that were as low as − 22 °C during the Siberian winter (Fig. 1h).
Our study revealed striking seasonal differences with up to 170-fold higher DNA yield extracted from the collected biomass in summer. In winter, due to low temperatures and snow cover, the potential sources and sinks of the near surface atmosphere harbour distinctly altered airborne microbial communities. This is largely due to the inaccessibility of perennial plant material and is documented by the strong reductions in Streptophyta and Basidiomycota (Fig. 1c,f), which include wood-rotting fungi and mushroom fruiting bodies. In contrast, the fraction of the mould-forming Ascomycota is strongly increased in winter, while in summer, Ascomycota are less abundant and represented mostly by plant pathogens (Fig. 1c,h, Table S2). For the bacterial domain, the most striking seasonal change is noted for richness and relative abundance of Firmicutes, with relatively lower differences observed for Proteobacteria (Fig. 1h, Fig. S10, Table S3). Despite this observation, the overall richness of the winter microbial community was comparable to summer communities (Fig. 1b).
The temporal resolution of our sampling approach enabled the diel dynamics of the air microbiome to be documented for continental climates. In summer, when the day/night atmospheric temperature differences become more prominent, members of the microbial community follow a diel cycle (Fig. 1f, summer), similar to the one previously reported for tropical settings 1 . These diel oscillations are the most pronounced in Actinobacteria and Proteobacteria, but also in Basidiomycota fungi and Streptophyta plants (Fig. 1f). Changes in temperature, and the interconnected relative humidity are likely the most important drivers of these diel oscillations, while the duration of daylight appears less important. Interruptions in the diel temperature cycle by a sudden decrease of the atmospheric temperature instantly affected community composition (Fig. 1g). In winter, when the temperature and relative humidity differences are low, the dynamics of the diel air microbiome is diminished (Fig. 1f, winter). Further factors impacting microbial community compositions are emissions as such reported during seasonal air pollution events in Siberia 26 , which potentially may impact the summer and winter dynamics of the local air microbiome.
Our study comprehensively characterizes airborne microbial communities in West Siberia and provides a direct comparison between bacteria, fungi, as well as other organisms that release biological matter into the environment. In contrast to earlier studies of bioaerosols in West Siberia 18,19,27,28 , our approach is cultivation free and not based on nucleic acid amplification. Therefore, direct comparison of the microbial community structures between this study and previously published ones is challenging. Nevertheless, the observation of the seasonal dynamics of airborne bacteria and fungi in West Siberia is concordant with previous studies 18,19,27 . Furthermore, the cultivation-based identification of the airborne bacterial 28 and fungal taxa 19 confirms on higher taxonomic level the results from our study, which in contrast allows for a species level analysis (Table S2). The observation of the diel cycle occuring also in the West Siberian air ecosystem, however, could only be made using the technological advances first demonstrated in the Gusareva et al. 1 publication and in this study, as the previous methods did not allow for a sufficient temporal resolution.

Conclusion.
The airborne biomass consisting of microbial communities was studied in West Siberia by metagenomic analysis for the first time. A large difference in biomass, assessed by DNA analysis, existed for the summer and winter seasons. Despite an up to 170-fold reduction in microbial and plant material, the deployed sampling protocol yielded quality DNA libraries with high complexity. In summer, when the atmospheric temperature and relative humidity fluctuate in a day/night pattern, the airborne microbial communities followed a diel cycle, as previously described for the tropical climate. This diel fluctuation was absent during consistently cold temperatures encountered in the winter months. www.nature.com/scientificreports/ ing three time periods (1:00-3:00, 9:00-11:00, and 15:00-17:00) on 26 and 28 July 2017; the second set was also collected during three time periods (9:00-11:00, 15:00-17:00, and 21:00-23:00) within consecutive days from 2 to 5 December 2017, the third set was collected during four time periods (1:00-3:00, 9:00-11:00, 15:00-17:00, and 21:00-23:00) within consecutive days from 27 August to 2 September, 2018. In total, 78 samples in 39 time intervals were collected and used for preparation of 62 sequencing libraries (Table S1). High volumetric, filter-based air samplers (SASS3100, Research International, USA) were used in this study, with SASS bioaerosol electret filters (6 cm diameter, expected 50% efficiency for 0.5 µm particle size, Research International, USA) as the filter medium. Sampling was performed at 300 L/min air flowrate for 2 h. After sampling, the SASS filters were stored at − 20 °C. During transport from Siberia to Singapore, the samples were hand-carried with cooling.

Methods
Sample blanks. In each sampling set, blanks were also collected as controls. The blanks consisted of 12 filter blank samples (FB) and three reagent blank samples (RB). The filter blank samples were collected by installing a new filter on the air sampler at the sampling location for about 5 s. The filter was then collected and analysed with the same protocol as the time-series samples. Reagent blank samples involved extractions performed with extraction reagents without any filter.
Details on metagenomic analysis for blanks are provided in the Supplementary section (Fig. S11).
DNA extraction. Technical replicates were isolated separately. For processing, the SASS filter was first transferred into a sterile 5 mL tube. Phosphate buffered saline (pH 7.2) with 0.1% (v/v) Triton X-100 (2 mL, PBS-T) was added to the 5 mL tube as the wash buffer. Using tweezers, the SASS filter in the tube was moved up and down a few times to let the PBS-T penetrate the filter. The tube was then sonicated for 1 min in a sonication bath without heating to dislodge the biomass from the filter. After sonication, the filter was squeezed with tweezers and the PBS-T with suspended particles was transferred into a sterile 50 mL conical tube to complete the first washing step. This washing step was repeated three times for each filter sample, using fresh 2 mL PBS-T for each repeat. At the end of the second and third repeats, the filter was transferred into the barrel of a 10 mL syringe, placed in the same 50 mL conical tube containing the wash liquid. The 50 mL tube with the syringe and SASS filter was then centrifuged at 5000×g for 2 min to remove any leftover PBS-T. The expected total recovered supernatant volume from the three washes for each sample was 6 mL, which contained the captured airborne particles. Upon completion of the wash steps, the supernatant was subsequently filtered through a 0.02 µm Anodisc filter (Whatman, UK) using a vacuum manifold (DHI, Denmark). The Anodisc was finally transferred into a 5 mL bead tube provided in the DNeasy PowerWater Kit (Qiagen, Germany) for DNA extraction.
DNA extraction from the Anodisc was mostly performed following the standard protocol of the DNeasy Power Water Kit with the following modifications to increase DNA yield. Briefly, 0.1 mg/mL (final) Proteinase K was added to the lysis buffer (solution PW1) prior to the initial 55 °C incubation. The initial incubation time at 55 °C was also prolonged from the recommended 10 min to overnight incubation. After initial incubation, the sample tubes were vortexed for 3 min and subsequently placed into an ultrasonic bath (Elmasonic, USA) for sonication at 65 °C for 30 min 29 , followed by another 5 min vortex. The remaining extraction steps were completed as instructed in the manufacturer's protocol.
In the first and second time series (SUMMER 2017 and WINTER 2017), the DNA isolated from the technical replicates was pooled to provide sufficient material for sequencing.
Metagenomic sequencing. For the metagenomic sequencing and NGS data processing, we used standardised procedures and pipelines described in detail elsewhere 1 . Extracted air DNA samples were quantitated on a Qubit 2.0 fluorometer, using the Qubit dsDNA HS (High Sensitivity) Assay Kit (Invitrogen). Immediately prior to library preparation, sample quantitation was repeated on a Promega QuantiFluor fluorometer, using Invitrogen's Picogreen assay.
Next-generation sequencing libraries were prepared with Swift Biosciences' Accel-NGS 2S Plus DNA Library Kit, following the instructions provided in the kit. With the exception of samples that had a concentration of < 0.25 ng/µL, the starting amount of DNA for library preparation was normalized to 5 ng. DNA shearing was performed on a Covaris E220 focused-ultrasonicator with the following settings: peak power: 175, duty factor: 5.0, cycles/burst: 200, run time: 90 s. All libraries were dual-barcoded, using Swift Biosciences' 2S Combinatorial Dual Indexing Kit. For PCR amplification, which selectively enriches for library fragments that have adapters ligated on both ends, the PCR cycles were normalized to eight for all libraries with a starting amount of 4-5 ng of DNA. For samples with less than 4 ng of DNA, amplification cycles were adjusted as follows: 3.0-3.9 ng: 9 cycles, 2.0-2.9 ng: 11 cycles, 1.0-1.9 ng: 13 cycles, < 1 ng: 15 cycles. Size-selection was omitted for all libraries.
Library quantitation was performed using Invitrogen's Picogreen assay and the average library size was determined by running the libraries on a Bioanalyzer DNA 7500 chip (Agilent). Library concentrations were normalized to 4 nM and the concentration was validated by qPCR on a ViiA-7 real-time thermocycler (Applied Biosystems), using Kapa Biosystem's Library Quantification Kit for Illumina Platforms. Libraries were then pooled at equal volumes and sequenced on Illumina HiSeq2500 rapid runs at a final concentration of 10-16 pM and a read-length of 251 bp paired-end (Illumina V2 Rapid sequencing reagents).
High-throughput sequencing data processing and analysis. Metagenomic data generated for the air samples were processed for adaptor removal and quality trimming with a Phred quality score threshold of Q20 using Cutadapt v. 1.8.1 30 . Two million reads (250 bp) were randomly selected from each sample as a representative set and aligned against the NCBI non-redundant (NR) protein database downloaded on 7/08/2017 using the alignment tool RAPSearch v. 2.15 21 www.nature.com/scientificreports/ Resulting alignments were imported into MEGAN v.5.11.3, which assigns taxon IDs based on the NCBI taxonomy 23,24 . To achieve the desired taxonomic specificity, we used the following filtering parameters: min score = 100 (bit score), max expected = 0.01 (e-value), top percent = 10 (top 10% of highest bit score), min support = 25 (minimum number of reads required for taxonomic assignment), LCA percent = 100 (naive), min complexity = 0.33 (sequence complexity). Lowest common ancestry (LCA) for each read on the NCBI taxonomy is assigned using MEGAN's LCA algorithm. In instances where all of the above filtering criteria have been fulfilled, reads are assigned to levels of taxonomic classification ranging from domain to species. In our study, species-level classification is only reached if at least 25 reads uniquely align to a single species in the database with a 100% match on the protein level over at least 50% of the 250 bp read. Due to limits of existing public sequence databases, some sequencing reads did not result in meaningful alignments and were assigned to the 'no-hits' category. Unassigned reads are sequencing reads for which low-complexity, repetitive DNA sequences or multiple alignments beyond domain-level are encountered.
Statistical analysis. Seasonal difference in richness, evenness and relative abundances of the microbial taxa was assessed by generalized regression modelling in R v. 3.3.3 31 . Multivariate linear modelling analysis was performed in mvabund package in R v. 3.3.3. To visualize multivariate patterns in microbial communities, Bray-Curtis dissimilarity distances among centroids for each sample series were calculated in vegan package in R v. 3.3.3. Principal Coordinates (PCo) were used as an ordination method. Alfa diversity indices chao1 and Simpson E were calculated in QIIME v. 1.8.0 32 . Cross-correlation analysis was performed in R v.3.3.3.
Meteorological data. The retrospective meteorological data of the local weather station were downloaded from the open source service "Weather and Climate" (http://www.pogod aikli mat.ru/, weather archive information downloaded on 2.10.2018). Meteorological characteristics (temperature, relative humidity, and wind direction) during the time series are represented in Fig. S1 and Fig. S2.