Benchmarking laboratory processes to characterise low-biomass respiratory microbiota

Hasrat, Raiza; Kool, Jolanda; de Steenhuijsen Piters, Wouter A. A.; Chu, Mei Ling J. N.; Kuiling, Sjoerd; Groot, James A.; van Logchem, Elske M.; Fuentes, Susana; Franz, Eelco; Bogaert, Debby; Bosch, Thijs

doi:10.1038/s41598-021-96556-5

Article
Open access
Published: 25 August 2021

Scientific Reports volume 11, Article number: 17148 (2021)

Abstract

The low biomass of respiratory samples makes it difficult to accurately characterise the microbial community composition. PCR conditions and contaminating microbial DNA can alter the biological profile. The objective of this study was to benchmark the currently available laboratory protocols to accurately analyse the microbial community of low biomass samples. To study the effect of PCR conditions on the microbial community profile, we amplified the 16S rRNA gene of respiratory samples using various bacterial loads and different numbers of PCR cycles. Libraries were purified by gel electrophoresis or AMPure XP and sequenced on an Illumina MiSeq using V2 or V3 reagent kits. The positive control was diluted in different solvents. PCR conditions had no significant influence on the microbial community profile of low biomass samples. Purification methods and MiSeq reagent kits provided highly similar microbiota profiles (paired Bray–Curtis dissimilarity median: 0.03 and 0.05, respectively). Profiles of positive controls were significantly influenced by the type of dilution solvent; the theoretical profile of the Zymo mock was most accurately reproduced when the Zymo mock was diluted in elution buffer (difference compared to the theoretical Zymo mock: 21.6% for elution buffer, 29.2% for Milli-Q, and 79.6% for DNA/RNA shield). Microbiota profiles of DNA blanks formed a distinct cluster compared to low biomass samples, demonstrating that low biomass samples can accurately be distinguished from DNA blanks. In summary, to accurately characterise the microbial community composition we recommend (1) amplifying the obtained microbial DNA with 30 PCR cycles, (2) purifying amplicon pools by two consecutive AMPure XP steps and (3) sequencing the pooled amplicons with the V3 MiSeq reagent kit. The benchmarked standardized laboratory workflow presented here ensures comparability of results within and between low biomass microbiome studies.

Introduction

The human microbiome consists of interacting networks of microorganisms, such as bacteria, archaea and fungi. The microbial community composition varies between individuals and body sites^1,2,3. To date, the gut microbiota is the most well-studied niche, and has been shown to play a vital role in human health^4,5,6,7,8. However, evidence is accumulating that the microbiota in other niches such as the respiratory tract might impact human health in a similar manner^1,5,9,10,11. The respiratory bacterial community is suggested to play an important role in the protection against acquisition and overgrowth of new pathogens, as well as maturation and modulation of the immune system. Additionally, there are strong indications it promotes the epithelial integrity, thereby inhibiting bacterial translocation^5,12.

Complex microbial communities are more accurately characterised by culture-independent techniques. Next-generation sequencing techniques in particular are commonly used for analysis of the gut microbiota, which is a high biomass environment^{1,2,3,5,9,10,11,12,13,14}. In contrast to the gut, the respiratory tract is less densely colonized^{12,15,16,17,18,19}, which makes its microbiota more difficult to characterise reliably. In particular, contaminating microbial DNA from the environment and from laboratory reagents can strongly skew bacterial profiles in low biomass materials^20,21,22,23. Consequently, positive and negative controls are extremely important when working with low-biomass samples to correct for contamination and control for the laboratory workflow²¹. Furthermore, differences in standard operating procedures, including bacterial load and the number of PCR amplification cycles, have been shown to affect the results significantly, making comparisons between studies more difficult^{24,25,26,27,28}.

Therefore, a consistent workflow including suitable controls should be applied to ensure reliable microbiota analyses of low biomass materials. Here we describe the optimization of the complete laboratory process for 16S rRNA gene MiSeq library preparation protocols^2,29. We report the effects of bacterial input, and the number of PCR cycles applied, library clean-up methods and MiSeq reagent kit chemistry on low biomass microbiota characterisation. We focus in particular on the microbial community composition of respiratory materials, which are typical low biomass samples. This study benchmarks laboratory processes to accurately characterise the microbiota of low biomass samples.

Methods

Study population/data collection

For the optimization experiments, we used 218 random samples collected from the nasopharynx (n = 214), oropharynx (n = 2) and saliva (n = 2) from healthy individuals (Table 1) obtained from a Dutch cross-sectional population-wide study, named Pienter-3³⁰. All procedures performed were in accordance with the ethical standards of the institutional and/or national research committee. Ethical approval was granted by the national ethics committee in the Netherlands, METC Noord-Holland (METC Number: M015–022). Written informed consent was obtained from all adult participants, and parents or legal guardians of minors included in the study³⁰. Following collection, saliva samples were stored in a tube containing 50% glycerol, and the upper respiratory tract samples, nasopharyngeal (NP) and oropharyngeal (OP) swabs, were stored in 1 ml of liquid Amies medium. Samples were directly frozen on dry-ice and stored at − 80 °C until further processing³⁰. We used the ZymoBIOMICS microbial community standard (Zymo mock; Zymo Research, Irvine, CA, USA) and the ZymoBIOMICS microbial community DNA standard (DNA mock; Zymo Research) as positive controls.

Table 1 Samples and statistical method per experiment. NP = Nasopharynx, OP = Oropharynx.

DNA extraction

DNA was extracted from NP swabs, OP swabs and saliva using an Agowa Mag DNA extraction kit (LGC genomics, Berlin, Germany) as previously described^29,31, with slight modifications shown to ensure robustness for low biomass DNA extractions²⁹. In each isolation run, one 200 µl aliquot of 10³ diluted Zymo mock was included as positive control, plus two negative controls containing lysis buffer only (referred to as DNA blanks). Samples were thawed on ice and vortexed for 10 s. Per sample, 600 µl of lysis buffer with zirconium beads (diameter 0.1 mm, Biospec Products, Bartlesville, OK, USA) and 550 µl phenol (VWR International, Amsterdam, the Netherlands) was added in a conical 1.5 ml screw-cap Eppendorf tube. Samples were mechanically disrupted twice for 2 min at 3500 oscillations/minute by bead beating (Mini-Beadbeater-24, Biospec Products) and transferred on ice for 2 min after each bead-beating step. The tubes were centrifuged for 10 min at 4500 × g. The clear aqueous phase was added to a 2 ml Eppendorf tube containing 1.3 ml binding buffer and 10 µl magnetic beads. After shaking for 30 min, the tubes were put in a magnetic separation rack. The supernatant was discarded, the magnetic beads were washed with wash buffer 1 and 2 and air-dried for 15 min at 55 °C. DNA was eluted in either 35 µl or 50 µl elution buffer, depending on the starting material, by shaking for 15 min at 55 °C. Supernatant was transferred to a 1.5 ml Eppendorf LoBind tube and stored at − 20 °C.

ZymoBIOMICS microbial community standard

The Zymo mock was received from the manufacturer dissolved in DNA/RNA shield. To test the effect of dilution solvent on the generated Zymo mock profile, we prepared dilutions (10¹–10³) in DNA/RNA shield, elution buffer (Qiagen, Hilden, Germany) and Milli-Q water, mimicking the DNA concentration of low biomass samples. Unless otherwise stated, we used a 10³ diluted Zymo mock for our analyses.

Bacterial DNA quantification

The bacterial load was quantified by quantitative PCR (StepOnePlus Real-Time PCR System, Thermo Fisher Scientific, the Netherlands) with universal primers and probe targeting the 16S rRNA gene, containing forward primer 16S-F1 (5′-CGA AAG CGT GGG GAG CAA A-3′), reverse primer 16S-R1 (5′-GTT CGT ACT CCC CAG GCG G-3′) and probe 16S-P1 (FAM-ATT AGA TAC CCT GGT AGT CCA-ZEN) (IDT, Leuven, Belgium)^15,29. To optimize qPCR reproducibility and to allow comparisons of DNA concentrations reliably, we developed a standard curve by using a synthesized fragment of the 16S rRNA gene (gBlocks Gene Fragment, IDT, 5′-CGG TGC GAA AGC GTG GGG AGC AAA CAG GAT TAG ATA CCC TGG TAG TCC ACG CCG TAA ACG ATG TCT ACT AGC TGT TCG TGG TCT TGT ACT GTG AGT AGC GCA GCT AAC GCA CTA AGT AGA CCG CCT GGG GAG TAC GAA CGC AAG-3′).
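
A standard curve of this kind is usually a straight-line fit of the quantification cycle (Cq) against the log10 of the known template quantity, which is then inverted to estimate unknown samples. The sketch below illustrates that calculation only. The dilution series, the Cq values and the function names are invented placeholders, not data or code from this study.

```python
import math


def fit_standard_curve(quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, slope, intercept):
    """Invert the fitted curve to estimate the template quantity of an unknown."""
    return 10 ** ((cq - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical tenfold dilution series of the synthetic fragment (pg/µl)
# and hypothetical Cq values; replace with measured standards.
standards = [0.01, 0.1, 1.0, 10.0, 100.0]
standard_cq = [33.2, 29.9, 26.6, 23.3, 20.0]

slope, intercept = fit_standard_curve(standards, standard_cq)
unknown = quantity_from_cq(28.0, slope, intercept)
print(f"slope={slope:.2f}, efficiency={amplification_efficiency(slope):.2f}")
print(f"estimated concentration at Cq 28.0: {unknown:.3f} pg/µl")
```

A slope near -3.3 corresponds to roughly 100% amplification efficiency, which is a common sanity check before trusting the interpolated concentrations.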

MiSeq library preparation and sequencing

The V4 region of the 16S rRNA gene was amplified by PCR using the 515F (5′-GTG CCA GCM GCC GCG GTA A-3′) and 806R (5′-GGA CTA CHV GGG TWT CTA AT-3′) primers including the Illumina adapters and sample specific barcodes^2,32,33. Each 25 µl PCR reaction consisted of 0.5 µl Phusion Hot Start II High-Fidelity DNA Polymerase, 5 µl 5 × Phusion HF Buffer (Thermo Fisher Scientific), 7 µl HPLC grade water (Instruchemie, Delfzijl, the Netherlands), 2.5 µl of 2 mM dNTP mix (Roche, Mannheim, Germany), 5 µl of 5 µM barcoded primer 515F, 5 µl of 5 µM barcoded primer 806R and 5 µl template DNA. PCR reactions were executed using the following successive steps; 98 °C for 30 s; 30 cycles at 98 °C for 10 s, 55 °C for 30 s and 72 °C for 30 s and a final hold of 5 min at 72 °C. Samples with a 16S rRNA gene DNA concentration of < 20 pg/µl (< 100 pg input DNA) were used undiluted, samples with a higher concentration were diluted in HPLC grade water, accordingly. To study the effect of PCR conditions on the microbiota profile, 16, 125 and 1000 pg of bacterial load from two OP and two saliva samples were amplified using 30 cycles. The input DNA of 125 pg was additionally, separately, amplified by 25 and 35 PCR cycles, respectively. DNA blanks, no template controls (NTC), Zymo mocks and DNA mocks were included in each PCR plate and sequenced alongside the samples. The fragment size of the amplicon was assessed using agarose gel electrophoresis and quantified by Quant-iT PicoGreen dsDNA Assay Kit (Thermo Fisher Scientific). Barcoded amplicons were pooled in equimolar ratios. To study the optimal purification method, we purified the pool with two different cleaning methods; 1. agarose gel purification, extracting the DNA using GeneJET Gel Extraction and DNA Cleanup Micro Kit (Thermo Fisher Scientific), and subsequent purification by 0.9 × AMPure XP magnetic beads (Beckman Coulter, the Netherlands), or 2. by two consecutive purifications using 0.9 × AMPure XP. 
The library was prepared as recommended by Illumina and sequenced using the MiSeq reagent kit V2 or V3 (paired end, 500 bp) on an Illumina MiSeq instrument (Illumina Inc., San Diego, CA, US).
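
The template rule described above (undiluted below 20 pg/µl, diluted otherwise, with 5 µl of template per reaction) can be expressed as a small helper. This is a sketch under one explicit assumption: the text only says that more concentrated samples were diluted "accordingly", so diluting down to the 20 pg/µl limit is a hypothetical choice, as is the function name.

```python
TEMPLATE_VOLUME_UL = 5.0          # template volume per PCR reaction (from the text)
UNDILUTED_LIMIT_PG_PER_UL = 20.0  # below this, samples were used undiluted (from the text)


def plan_template(concentration_pg_per_ul, target_pg_per_ul=UNDILUTED_LIMIT_PG_PER_UL):
    """Return (dilution_factor, input_pg) for one sample.

    Assumption: samples at or above the limit are diluted in water down to
    target_pg_per_ul; the source does not state the exact target.
    """
    if concentration_pg_per_ul < UNDILUTED_LIMIT_PG_PER_UL:
        factor = 1.0
    else:
        factor = concentration_pg_per_ul / target_pg_per_ul
    input_pg = concentration_pg_per_ul / factor * TEMPLATE_VOLUME_UL
    return factor, input_pg


# Hypothetical concentrations in pg/µl.
for name, conc in {"sample_low": 4.0, "sample_high": 80.0}.items():
    factor, input_pg = plan_template(conc)
    print(f"{name}: dilute {factor:g}x, {input_pg:g} pg into the reaction")
```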

Data analysis

All sample libraries were simultaneously processed using an in-house bioinformatics pipeline^1,3,11,34,35. First, we performed adaptive window trimming with a quality threshold of Q30, retaining those reads with a minimum length of 150 nucleotides (Sickle, version 1.33)³⁶. Sequencing errors were reduced by an error correction algorithm (BayesHammer, SPAdes genome assembler toolkit, version 3.12.0). Paired-end reads were assembled into contigs using PANDAseq (version 2.10) and demultiplexed using QIIME (version 1.9.1)^38,39. Singleton sequences and chimeras were removed (UCHIME; implemented in the VSEARCH toolkit v2.0.3). VSEARCH abundance-based greedy clustering was performed to pick operational taxonomic units (OTUs) with a 97% identity threshold⁴⁰. OTUs were taxonomically annotated by the Naïve Bayesian RDP classifier using the SILVA 119 release reference database^41,42. Each OTU was assigned a rank number based on its abundance across the total dataset.
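
The rank numbers explain the labels used in the Results, such as Neisseria (21). A minimal sketch of such a ranking is given below, assuming that "abundance across the total dataset" means the summed read count over all samples; the source does not specify this, and the counts and function name are invented for illustration.

```python
def rank_otus(count_table):
    """Rank OTUs by total reads across all samples (1 = most abundant).

    count_table maps sample name -> {OTU name -> read count}.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    totals = {}
    for counts in count_table.values():
        for otu, reads in counts.items():
            totals[otu] = totals.get(otu, 0) + reads
    ordered = sorted(totals, key=lambda otu: (-totals[otu], otu))
    return {otu: rank for rank, otu in enumerate(ordered, start=1)}


# Invented counts for two samples (not data from this study).
table = {
    "sample_1": {"Moraxella": 500, "Neisseria": 20},
    "sample_2": {"Moraxella": 300, "Streptococcus": 150, "Neisseria": 40},
}
ranks = rank_otus(table)
labels = [f"{otu} ({rank})" for otu, rank in sorted(ranks.items(), key=lambda kv: kv[1])]
print(labels)
```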

Analyses were performed in R version 4.0.2 within RStudio version 1.4.623. OTU read counts were normalised using total sum scaling, resulting in relative abundances of OTUs. Microbiota profiles were visualized using stacked bar charts/boxplots. Lollipop plots were used to visualize the differences in relative abundance of each OTU between sequenced diluted Zymo mocks and the theoretical Zymo mock. To assess overall differences in microbial community composition, including low and high abundant OTUs, between (pairs of) samples we used a Bray–Curtis dissimilarity matrix, where zero indicates an identical composition between pairs. Non-metric multidimensional scaling (NMDS) based on the Bray–Curtis dissimilarity matrix was used to visualize differences in microbial profiles between low-biomass samples and DNA blanks¹. We investigated the minimal DNA concentration for reliable microbiome analyses by comparing the microbial profiles of DNA blanks and low-density samples using an unsupervised hierarchical clustering approach based on the Bray–Curtis dissimilarity matrix, which was illustrated in a dendrogram. Silhouette and Calinski-Harabasz indices were used to determine the optimal number of clusters¹. To assess the impact of MiSeq reagent kits/purification methods, we determined the Pearson correlation of log₁₀(x + 1)-transformed relative abundances of OTUs with > 0.1% abundance in at least 20 samples. To test for significant differences in Zymo mock composition with different dilution solvents, we used an ANOVA with Tukey's post hoc test to determine statistical significance between specific groups. Linear models were used to calculate the statistical significance between the number of reads per sample sequenced by V2 and V3 reagent kits. A p-value < 0.05 was considered significant.
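
Two of the calculations named above, total sum scaling and the Bray–Curtis dissimilarity, are simple enough to show directly. The analyses in this study were run in R; the Python sketch below only illustrates the arithmetic on invented read counts and is not the authors' code.

```python
def total_sum_scale(counts):
    """Convert raw OTU read counts for one sample into relative abundances."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: reads / total for otu, reads in counts.items()}


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity: 0 = identical composition, 1 = no shared OTUs."""
    otus = set(profile_a) | set(profile_b)
    difference = sum(abs(profile_a.get(o, 0.0) - profile_b.get(o, 0.0)) for o in otus)
    total = sum(profile_a.get(o, 0.0) + profile_b.get(o, 0.0) for o in otus)
    return difference / total


# Invented read counts for two samples (not data from this study).
sample_a = total_sum_scale({"OTU_1": 600, "OTU_2": 400})
sample_b = total_sum_scale({"OTU_1": 300, "OTU_3": 700})
distance = bray_curtis(sample_a, sample_b)
print(round(distance, 3))
```

For these two toy samples the dissimilarity is 0.7: only 30% of the community (the shared part of OTU_1) overlaps.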

Results

DNA extraction

Zymo mock dilution optimization

To mimic the concentration of low-biomass samples, a Zymo mock dilution series (10¹–10³ ×) was prepared. Zymo mocks were diluted in DNA/RNA shield (n = 6), elution buffer (n = 5) and Milli-Q (n = 5). Dilution in DNA/RNA shield resulted in a significantly different microbiota profile in comparison to elution buffer and Milli-Q across dilutions (Fig. 1a and b), which also deviated most from the theoretical Zymo mock profile. We observed an overrepresentation of Bacillus subtilis (11), Enterobacter (8), Escherichia coli (10) and Pseudomonas aeruginosa (15) and an underrepresentation of Staphylococcus epidermidis (2), Lactobacillus fermentum (22) and Enterococcus faecium (29) in Zymo mocks diluted in DNA/RNA shield compared to elution buffer (Fig. 1b). In contrast, when comparing dilution in Milli-Q versus elution buffer, we observed a significant difference in Lactobacillus fermentum (22) abundance (median 7.9% vs 10.0%, respectively, p-value < 0.001). The Zymo mock diluted in elution buffer most closely resembled the theoretical Zymo mock composition (Fig. 1c). Therefore, for further experiments, we continued with elution buffer as dilution solvent.
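
The comparison against the theoretical mock amounts to subtracting the expected relative abundance of each taxon from the observed one, which is what a lollipop plot displays. The sketch below shows that calculation with placeholder taxa and invented percentages. The summary used here (sum of absolute differences) is an assumption for illustration; the text does not define how the reported overall difference was calculated.

```python
def deviation_from_expected(observed, expected):
    """Per-taxon difference (observed - expected) and the summed absolute difference.

    Both inputs map taxon name -> relative abundance in percent.
    Taxa missing from one profile are treated as 0%.
    """
    taxa = sorted(set(observed) | set(expected))
    per_taxon = {t: observed.get(t, 0.0) - expected.get(t, 0.0) for t in taxa}
    total_abs = sum(abs(d) for d in per_taxon.values())
    return per_taxon, total_abs


# Placeholder composition, NOT the real Zymo mock values.
expected = {"taxon_a": 40.0, "taxon_b": 35.0, "taxon_c": 25.0}
observed = {"taxon_a": 46.0, "taxon_b": 30.0, "taxon_c": 22.0, "contaminant": 2.0}

per_taxon, total_abs = deviation_from_expected(observed, expected)
for taxon, diff in per_taxon.items():
    print(f"{taxon}: {diff:+.1f} percentage points")
print(f"summed absolute difference: {total_abs:.1f}")
```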

Library preparation

Influence of PCR amplification cycles and bacterial density on the microbiota profile

Next, we tested the effect of the number of PCR amplification cycles on the microbial community profile. To this end, 125 pg of microbial DNA from 2 OP and 2 saliva samples was amplified using 25, 30 and 35 PCR cycles. We observed that a higher number of PCR cycles resulted in minor increases in relative abundance, especially of high abundant OTUs. The abundance of Neisseria (21) (8.6/13.9%, 10.0/16.3% and 10.9/19.2% for 25, 30 and 35 cycles, respectively) increased in both saliva samples with increasing PCR cycles (Fig. 2a). One OP sample showed a higher relative abundance of Prevotella melaninogenica (37) (17.0%, 18.4% and 22.9% for 25, 30 and 35 cycles, respectively) and Leptotrichia (74) (16.8%, 17.3% and 22.6% for 25, 30 and 35 cycles, respectively) with increasing PCR cycles. However, a higher number of PCR cycles also resulted in an increased amplification of DNA in blanks (Fig. 3). Given the increased risk of contamination bias when using 35 PCR cycles on the one hand, and the higher rate of amplification failures when using 25 PCR cycles on the other hand (data not shown), we recommend 30 PCR amplification cycles.

To assess the effects of bacterial load on microbial community profiles, we tested three different quantities of bacterial input DNA (16, 125 and 1000 pg) of 2 saliva and 2 OP samples. We noticed that increasing DNA concentrations modestly affected the relative abundance of high abundant OTUs (Fig. 2b). In 3 of the 4 samples, we observed a modest increase in the relative abundance of Neisseria (21) (9.4/8.7/14.5%, 10.8/10.0/16.3% and 11.1/11.0/17.4% for 16, 125 and 1000 pg, respectively). Another OP sample showed a modestly increased relative abundance of Leptotrichia (74) with increasing template input (16.8%, 17.3% and 17.9% for 16, 125 and 1000 pg, respectively). Despite these minor differences, we propose to standardise to a bacterial load of 125 pg as input DNA for MiSeq PCR in case of low-biomass samples, given that many low biomass samples do not meet a 1000 pg yield threshold.

Concordance between library clean-up methods

To further optimize our workflow, we studied the influence of the gel-based purification and the AMPure XP clean-up on the eventual microbiota profile, by purifying an amplicon pool containing 214 samples using both procedures (Table 1). The obtained microbiota profiles per sample were highly similar between methods (paired Bray–Curtis dissimilarity median: 0.03; range: 0.0–0.06), indicating a high concordance between both clean-up methods (Fig. 4a). Furthermore, we compared the relative abundances of the top 8 OTUs per sample and observed a correlation and regression coefficient of ~ 1.0 for all OTU abundances observed by both methods (Fig. 4b), indicating a near perfect concordance, and thus negligible differences between the tested library clean-up methods. We therefore chose to continue with the AMPure XP purification method, as it is faster than gel-based purification.
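
The concordance measures used here, a Pearson correlation and a regression coefficient on log10(x + 1)-transformed relative abundances, can be sketched for a single OTU measured in the same samples by two methods. The abundances below are invented, and the helper names are hypothetical; values of about 1.0 for both statistics indicate that the two methods report the same abundance.

```python
import math


def log_transform(values):
    """log10(x + 1) transform, as described in the Data analysis section."""
    return [math.log10(v + 1.0) for v in values]


def pearson_and_slope(x, y):
    """Pearson correlation and least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy), sxy / sxx


# Invented relative abundances (%) of one OTU in four paired samples.
gel_purified = [1.0, 12.5, 30.0, 55.0]
bead_purified = [1.2, 12.0, 31.0, 54.0]

r, slope = pearson_and_slope(log_transform(gel_purified), log_transform(bead_purified))
print(f"Pearson r = {r:.3f}, regression coefficient = {slope:.3f}")
```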

MiSeq sequencing

Concordance between the V2 and V3 MiSeq reagent kits

To study the concordance between the V2 and V3 MiSeq reagent kits, we used the same set of samples as described when validating the library clean-up methods (Table 1). The mean number of reads per sample purified by AMPure XP was significantly different between the V2 and V3 kit (p-value < 0.001), with 20,060 (range: 2,123–39,486) versus 36,981 reads per sample (range: 3,781–72,469), respectively (Fig. 5a). The overall microbial community profile only marginally differed between both sequencing methods, as indicated by the very high similarity observed between paired samples (Bray–Curtis dissimilarity median: 0.05; range: 0.0–0.1) when compared to unpaired samples (Bray–Curtis dissimilarity median: 0.8; range: 0.03–1.0) (Fig. 4a). Additionally, we compared the relative abundances of the top 8 OTUs and observed a correlation coefficient of ~ 1.0 for all those OTUs and a regression coefficient of ~ 1.0 for 7 of those OTUs (Fig. 5b), with Streptococcus (7) slightly underrepresented in the V2 kit (regression coefficient: 0.9). For less prevalent OTUs, the variance in the data was too large to draw reliable conclusions about similarity. Given the high concordance between MiSeq reagent kits, we prefer the more recent V3 MiSeq kit, as it yields a higher number of reads per sample.

Microbiota profiles of low biomass samples compared to DNA isolation blanks

Finally, we tested whether the microbial community profiles of very low biomass samples could be distinguished from procedural blanks, using a range of low biomass samples. When comparing the microbiota profiles of 140 NP samples (range: 0.06–1.00 pg/µl) and 8 DNA blanks (0.02–0.07 pg/µl) (Table 1), we found that the blanks still clustered away from the NP samples (Fig. 6a). Using an unsupervised hierarchical clustering of both samples and blanks, we identified 8 different clusters: 7 clusters containing exclusively NP samples and one cluster containing DNA blanks and 2 NP samples (Fig. 6b). These 2 NP samples had a concentration lower than 0.10 pg/µl, while the other 2 NP samples with < 0.10 pg/µl clustered with all other NP samples containing > 0.10 pg/µl. Therefore, we advise using only samples with a minimum concentration of 0.10 pg/µl for DNA amplification and sequencing, or a threshold slightly above the blanks in case local signals observed in DNA blanks are higher. Although low biomass samples may still contain contaminating DNA, these samples can be clearly distinguished from DNA blanks and are more likely to still yield sufficient reads after subsequent bioinformatic clean-up.
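
The inclusion rule advised above can be written as a small filter. In this sketch the 0.10 pg/µl minimum comes from the text, while the 10% margin used to express "slightly above the blanks" is a hypothetical value chosen for illustration, as are the function names and the concentrations.

```python
DEFAULT_MIN_PG_PER_UL = 0.10  # minimum concentration advised in the text


def sequencing_threshold(blank_concentrations, margin=1.1):
    """Return the cut-off in pg/µl.

    Uses the default minimum unless the local DNA blanks are higher, in which
    case the cut-off is the highest blank times `margin` (an assumed factor).
    """
    highest_blank = max(blank_concentrations, default=0.0)
    return max(DEFAULT_MIN_PG_PER_UL, highest_blank * margin)


def eligible_samples(sample_concentrations, blank_concentrations):
    """Keep only the samples at or above the cut-off."""
    threshold = sequencing_threshold(blank_concentrations)
    return {s: c for s, c in sample_concentrations.items() if c >= threshold}


# Invented concentrations in pg/µl.
blanks = [0.02, 0.07]
samples = {"np_1": 0.06, "np_2": 0.25, "np_3": 1.00}
kept = eligible_samples(samples, blanks)
print(sequencing_threshold(blanks), sorted(kept))
```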

Discussion

To study high biomass fecal microbiota, Costea et al. recommended the use of a standardized protocol to ensure reproducibility and comparability among studies⁴³. Here, we show the importance of a standardized DNA extraction and sequencing protocol for low biomass samples like respiratory materials as well. The samples used for this project consist of a large number of NP samples (n = 214) with a range of (low biomass) bacterial loads. Positive and negative controls were included during DNA extraction, MiSeq PCR, sequencing and in the bioinformatic pipeline. In this way, we could study the accurate processing of DNA for 16S rRNA gene sequencing and the limitations of working with low biomass samples. Notably, the library clean-up methods (gel-based purification or AMPure XP) and the MiSeq reagent kits (V2 or V3 chemistry) resulted in modest to no effects on overall microbial community profiles.

We compared the labour-intensive gel-based size selection with a bead-based clean-up method (AMPure XP), which can select for DNA size in a fast and effective manner^44,45,46,47. A ratio of 0.9 × AMPure XP leads to minimal loss of library DNA concentration and complete removal of primer dimers⁴⁸. Microbial community profiles of samples processed using each of these two methods in parallel showed high similarity. Furthermore, we observed a near perfect concordance between relative abundances of the 8 most abundant OTUs in paired analyses. Since the different cleaning procedures yielded highly similar microbial profiles, we propose to use AMPure XP for fast library clean-up.

In a whole genome sequencing study, the microbiota data obtained by whole genome shotgun sequencing using the V2 (2 × 150 bp) and V3 (2 × 300 bp) MiSeq reagent kits were already shown to be highly similar⁴⁹. We are the first to compare the microbiota data of a 16S rRNA gene pool sequenced with the same settings using the V2 and V3 reagent kits (2 × 250 bp). We observed a very high concordance between the V2 and V3 kit; the modest underrepresentation of Streptococcus in the V2 kit is likely a result of differences in the number of freeze–thaw cycles (one cycle difference) of the library in our study, rather than differences in the kits used⁵⁰. To understand the ecology of the respiratory microbiome, it is critical to study the whole microbiome including the low abundant bacteria⁴⁹, which underlines the importance of sufficient sequencing depth. Here, we noticed that the sequencing depth per sample almost doubled when we sequenced using the V3 kit. Furthermore, the V3 MiSeq reagent kit offers an increased cluster density, higher read length and improved quality scores, and is therefore preferable to the V2 kit.

The inclusion of negative controls is vital to accurately study the microbiota^20,21,22,23. Contaminants can have a significant impact on the microbial data of low biomass samples²¹. Though not a primary research question in our study, we confirmed that samples with a concentration as low as 0.1 pg/µl can be consistently amplified and show a microbiota composition that is distinguishable from the DNA extraction blanks, even without removing the contaminating OTUs in the bioinformatic pipeline. Discrimination between samples and blanks should further improve when using dedicated bioinformatic tools^51,52 such as the decontam R-package, which allows for the identification and removal of contaminating OTUs, ideally based on a large number of negative controls⁵¹. DNA extraction blanks and ‘no template’ controls will therefore not only help to identify limits within laboratory protocols, but also help to control for contaminating DNA in downstream analyses.

We demonstrated that the bacterial profile of the Zymo mock, when diluted, can be influenced by the solvent used (DNA/RNA shield, MilliQ and elution buffer). Sample storage should therefore also be optimised for the positive controls. Dilution of Zymo mock in elution buffer most closely resembled the bacterial profile of the theoretical mock, and therefore seems preferable.

Several studies have described the effect of PCR conditions on the microbial community profile. A higher number of PCR cycles has been shown to lead to an increased concentration of contaminating DNA, point mutation artifacts and chimera formation^{21,24,25,27,28}. An increased number of PCR cycles will also lead to a higher concentration of contaminating DNA in blanks and low biomass samples. Given our focus on low biomass samples, we find 30 PCR cycles to be optimal, allowing for sufficient amplicon yield, yet still limiting the impact of contaminating DNA. An initial bacterial input of 16 pg is feasible for most of the NP samples used in this study, though more samples would have to be diluted, resulting in a higher amplification of contaminating DNA and biased microbial profiles²¹. We demonstrated here that varying template DNA concentration and PCR cycles resulted in minor differences in the microbiota profile. Overall, 30 amplification cycles with a bacterial DNA input of 125 pg resulted in sufficient amplicon concentrations for MiSeq sequencing and low background contamination.

This study has several strengths. We improved the laboratory processes by optimizing several components of our workflow, e.g. clean-up methods and PCR conditions. This resulted in an optimized MiSeq protocol for analysis of low-biomass samples. We used diluted positive controls to mimic the concentration of low biomass samples and studied the influence of dilution solvents on the bacterial profiles of these positive controls. To characterise the influence of potential reagent and environmental contamination, we included appropriate negative controls, which are extremely important when studying low biomass samples. We compared the libraries sequenced by different MiSeq reagent kits (V2 and V3) with the same MiSeq settings (2 × 250 bp). Our study also has some limitations. Despite the advantages of the Zymo mock as a positive control, it only contains few respiratory bacteria and represents low microbial diversity. Preferably, we would like to use a mock which mimics the microbiota composition of NP samples, has a more diverse profile and consists of different ratios of bacteria. This is something to consider for individual laboratories to introduce when they decide to focus on low biomass samples. Furthermore, we did not include a sufficient number of Zymo mocks to test whether different PCR conditions and different MiSeq reagent kits have an influence on the Zymo mock profile. In addition, there are other laboratory factors that may also impact microbiota results, including different types of polymerases^53,54, which were outside the scope of the current study.

Conclusion

In this study, we demonstrated the reliability of our DNA extraction and 16S rRNA gene MiSeq library preparation protocol for low biomass samples. Template concentration and number of PCR cycles had a modest influence on the microbiota profiles, while the PCR purification method and MiSeq sequencing kit had no significant effects on the microbial profiles. Therefore, we propose to use samples with a DNA concentration of 0.1–20 pg/µl which can be amplified with 30 PCR cycles. After pooling, the library can be purified by two consecutive 0.9 × AMPure XP purification steps and sequenced with the V3 MiSeq reagent kit. We confirmed that even extremely low biomass samples can be distinguished from DNA blanks. We here present a benchmarked standardized laboratory workflow that, when consistently and more widely used, ensures comparability of results within and between studies. In addition, the workflow could be useful to study the microbiota of other low biomass samples, e.g. lung, skin, blood, but also environmental samples in a standardized way.

Data availability

Sequence data that support the findings of this study have been deposited in the National Center for Biotechnology Information Sequence Read Archive database with BioProject ID PRJNA718293.

References

Bosch, A. et al. Maturation of the infant respiratory microbiota, environmental drivers, and health consequences. A prospective cohort study. Am. J. Respir. Crit. Care Med. 196, 1582–1590. https://doi.org/10.1164/rccm.201703-0554OC (2017).
Article PubMed Google Scholar
Bosch, A. et al. Development of upper respiratory tract microbiota in infancy is affected by mode of delivery. EBioMedicine 9, 336–345. https://doi.org/10.1016/j.ebiom.2016.05.031 (2016).
Article PubMed PubMed Central Google Scholar
Reyman, M. et al. Impact of delivery mode-associated gut microbiota dynamics on health in the first year of life. Nat. Commun. 10, 4997. https://doi.org/10.1038/s41467-019-13014-7 (2019).
Article ADS CAS PubMed PubMed Central Google Scholar
Cho, I. & Blaser, M. J. The human microbiome: At the interface of health and disease. Nat. Rev. Genet. 13, 260–270. https://doi.org/10.1038/nrg3182 (2012).
Article CAS PubMed PubMed Central Google Scholar
de Steenhuijsen Piters, W. A., Sanders, E. A. & Bogaert, D. The role of the local microbial ecosystem in respiratory health and disease. Philos. Trans. R. Soc. Lond. B Biol. Sci. 370, 294. https://doi.org/10.1098/rstb.2014.0294 (2015).
Article CAS Google Scholar
Human Microbiome Project, C. Structure, function and diversity of the healthy human microbiome. Nature 486, 207–214. https://doi.org/10.1038/nature11234 (2012).
Article ADS CAS Google Scholar
Kamada, N., Chen, G. Y., Inohara, N. & Nunez, G. Control of pathogens and pathobionts by the gut microbiota. Nat. Immunol. 14, 685–690. https://doi.org/10.1038/ni.2608 (2013).
Article CAS PubMed PubMed Central Google Scholar
O’Hara, A. M. & Shanahan, F. The gut flora as a forgotten organ. EMBO Rep. 7, 688–693. https://doi.org/10.1038/sj.embor.7400731 (2006).
Article CAS PubMed PubMed Central Google Scholar
Biesbroek, G. et al. Early respiratory microbiota composition determines bacterial succession patterns and respiratory health in children. Am. J. Respir. Crit. Care Med. 190, 1283–1292. https://doi.org/10.1164/rccm.201407-1240OC (2014).
de Steenhuijsen Piters, W. A. A., Binkowska, J. & Bogaert, D. Early life microbiota and respiratory tract infections. Cell Host. Microbe 28, 223–232. https://doi.org/10.1016/j.chom.2020.07.004 (2020).
Man, W. H. et al. Respiratory microbiota predicts clinical disease course of acute otorrhea in children with tympanostomy tubes. Pediatr. Infect. Dis. J. 38, e116–e125. https://doi.org/10.1097/inf.0000000000002215 (2019).
Man, W. H., de Steenhuijsen Piters, W. A. & Bogaert, D. The microbiota of the respiratory tract: Gatekeeper to respiratory health. Nat. Rev. Microbiol. 15, 259–270. https://doi.org/10.1038/nrmicro.2017.14 (2017).
Biesbroek, G. et al. The impact of breastfeeding on nasopharyngeal microbial communities in infants. Am. J. Respir. Crit. Care Med. 190, 298–308. https://doi.org/10.1164/rccm.201401-0073OC (2014).
de Steenhuijsen Piters, W. A. et al. Dysbiosis of upper respiratory tract microbiota in elderly pneumonia patients. ISME J. 10, 97–108. https://doi.org/10.1038/ismej.2015.99 (2016).
Bogaert, D. et al. Variability and diversity of nasopharyngeal microbiota in children: A metagenomic analysis. PLoS ONE 6, e17035. https://doi.org/10.1371/journal.pone.0017035 (2011).
Salonen, A. et al. Comparative analysis of fecal DNA extraction methods with phylogenetic microarray: Effective recovery of bacterial and archaeal DNA using mechanical cell lysis. J. Microbiol. Methods 81, 127–134. https://doi.org/10.1016/j.mimet.2010.02.007 (2010).
Claassen-Weitz, S. et al. Optimizing 16S rRNA gene profile analysis from low biomass nasopharyngeal and induced sputum specimens. BMC Microbiol. 20, 113. https://doi.org/10.1186/s12866-020-01795-7 (2020).
Hilty, M. et al. Disordered microbial communities in asthmatic airways. PLoS ONE 5, e8578. https://doi.org/10.1371/journal.pone.0008578 (2010).
Prevaes, S. M. et al. Concordance between upper and lower airway microbiota in infants with cystic fibrosis. Eur. Respir. J. https://doi.org/10.1183/13993003.02235-2016 (2017).
Ducarmon, Q. R., Hornung, B. V. H., Geelen, A. R., Kuijper, E. J. & Zwittink, R. D. Toward standards in clinical microbiota studies: Comparison of three DNA extraction methods and two bioinformatic pipelines. mSystems https://doi.org/10.1128/mSystems.00547-19 (2020).
Salter, S. J. et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol. 12, 87. https://doi.org/10.1186/s12915-014-0087-z (2014).
Eisenhofer, R. et al. Contamination in low microbial biomass microbiome studies: Issues and recommendations. Trends Microbiol. 27, 105–117. https://doi.org/10.1016/j.tim.2018.11.003 (2019).
Douglas, C. A. et al. DNA extraction approaches substantially influence the assessment of the human breast milk microbiome. Sci. Rep. 10, 123. https://doi.org/10.1038/s41598-019-55568-y (2020).
Wu, J. Y. et al. Effects of polymerase, template dilution and cycle number on PCR based 16 S rRNA diversity analysis using the deep sequencing method. BMC Microbiol. 10, 255. https://doi.org/10.1186/1471-2180-10-255 (2010).
Polz, M. F. & Cavanaugh, C. M. Bias in template-to-product ratios in multitemplate PCR. Appl. Environ. Microbiol. 64, 3724–3730 (1998).
Haas, B. J. et al. Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Res. 21, 494–504. https://doi.org/10.1101/gr.112730.110 (2011).
de Muinck, E. J., Trosvik, P., Gilfillan, G. D., Hov, J. R. & Sundaram, A. Y. M. A novel ultra high-throughput 16S rRNA gene amplicon sequencing library preparation method for the Illumina HiSeq platform. Microbiome 5, 68. https://doi.org/10.1186/s40168-017-0279-1 (2017).
Kennedy, K., Hall, M. W., Lynch, M. D., Moreno-Hagelsieb, G. & Neufeld, J. D. Evaluating bias of illumina-based bacterial 16S rRNA gene profiles. Appl. Environ. Microbiol. 80, 5717–5722. https://doi.org/10.1128/AEM.01451-14 (2014).
Biesbroek, G. et al. Deep sequencing analyses of low density microbial communities: Working at the boundary of accurate microbiota detection. PLoS ONE 7, e32942. https://doi.org/10.1371/journal.pone.0032942 (2012).
Verberk, J. D. M. et al. Third national biobank for population-based seroprevalence studies in the Netherlands, including the Caribbean Netherlands. BMC Infect. Dis. 19, 470. https://doi.org/10.1186/s12879-019-4019-y (2019).
Wyllie, A. L. et al. Streptococcus pneumoniae in saliva of Dutch primary school children. PLoS ONE 9, e102045. https://doi.org/10.1371/journal.pone.0102045 (2014).
Kozich, J. J., Westcott, S. L., Baxter, N. T., Highlander, S. K. & Schloss, P. D. Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Appl. Environ. Microbiol. 79, 5112–5120. https://doi.org/10.1128/AEM.01043-13 (2013).
Caporaso, J. G. et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. U S A 108(Suppl 1), 4516–4522. https://doi.org/10.1073/pnas.1000080107 (2011).
Reyman, M., van Houten, M. A., Arp, K., Sanders, E. A. M. & Bogaert, D. Rectal swabs are a reliable proxy for faecal samples in infant gut microbiota research based on 16S-rRNA sequencing. Sci. Rep. 9, 16072. https://doi.org/10.1038/s41598-019-52549-z (2019).
de Koff, E. M. et al. Microbial and clinical factors are related to recurrence of symptoms after childhood lower respiratory tract infection. ERJ Open Res. https://doi.org/10.1183/23120541.00939-2020 (2021).
Joshi, N. A. & Fass, J. N. Sickle: A sliding-window, adaptive, quality-based trimming tool for FastQ files (Version 1.33) (2011).
Nikolenko, S. I., Korobeynikov, A. I. & Alekseyev, M. A. BayesHammer: Bayesian clustering for error correction in single-cell sequencing. BMC Genomics 14(Suppl 1), S7. https://doi.org/10.1186/1471-2164-14-S1-S7 (2013).
Masella, A. P., Bartram, A. K., Truszkowski, J. M., Brown, D. G. & Neufeld, J. D. PANDAseq: Paired-end assembler for illumina sequences. BMC Bioinf. 13, 31. https://doi.org/10.1186/1471-2105-13-31 (2012).
Caporaso, J. G. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336. https://doi.org/10.1038/nmeth.f.303 (2010).
Rognes, T., Flouri, T., Nichols, B., Quince, C. & Mahe, F. VSEARCH: A versatile open source tool for metagenomics. PeerJ 4, e2584. https://doi.org/10.7717/peerj.2584 (2016).
Wang, Q., Garrity, G. M., Tiedje, J. M. & Cole, J. R. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl. Environ. Microbiol. 73, 5261–5267. https://doi.org/10.1128/AEM.00062-07 (2007).
Quast, C. et al. The SILVA ribosomal RNA gene database project: Improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596. https://doi.org/10.1093/nar/gks1219 (2013).
Costea, P. I. et al. Towards standards for human fecal sample processing in metagenomic studies. Nat. Biotechnol. 35, 1069–1076. https://doi.org/10.1038/nbt.3960 (2017).
Borgstrom, E., Lundin, S. & Lundeberg, J. Large scale library generation for high throughput sequencing. PLoS ONE 6, e19119. https://doi.org/10.1371/journal.pone.0019119 (2011).
Hawkins, T. L., O’Connor-Morin, T., Roy, A. & Santillan, C. DNA purification and isolation using a solid-phase. Nucleic Acids Res. 22, 4543–4544. https://doi.org/10.1093/nar/22.21.4543 (1994).
Westen, A. A., van der Gaag, K. J., de Knijff, P. & Sijen, T. Improved analysis of long STR amplicons from degraded single source and mixed DNA. Int. J. Legal Med. 127, 741–747. https://doi.org/10.1007/s00414-012-0816-1 (2013).
DeAngelis, M. M., Wang, D. G. & Hawkins, T. L. Solid-phase reversible immobilization for the isolation of PCR products. Nucleic Acids Res. 23, 4742–4743. https://doi.org/10.1093/nar/23.22.4742 (1995).
McElhoe, J. A. et al. Development and assessment of an optimized next-generation DNA sequencing approach for the mtgenome using the Illumina MiSeq. Forensic Sci. Int. Genet. 13, 20–29. https://doi.org/10.1016/j.fsigen.2014.05.007 (2014).
Ranjan, R., Rani, A., Metwally, A., McGee, H. S. & Perkins, D. L. Analysis of the microbiome: Advantages of whole genome shotgun versus 16S amplicon sequencing. Biochem. Biophys. Res. Commun. 469, 967–977. https://doi.org/10.1016/j.bbrc.2015.12.083 (2016).
Shao, W., Khin, S. & Kopp, W. C. Characterization of effect of repeated freeze and thaw cycles on stability of genomic DNA using pulsed field gel electrophoresis. Biopreserv. Biobank. 10, 4–11. https://doi.org/10.1089/bio.2011.0016 (2012).
Davis, N. M., Proctor, D. M., Holmes, S. P., Relman, D. A. & Callahan, B. J. Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics data. Microbiome 6, 226. https://doi.org/10.1186/s40168-018-0605-2 (2018).
Proctor, D. M. et al. A spatial gradient of bacterial diversity in the human oral cavity shaped by salivary flow. Nat. Commun. 9, 681. https://doi.org/10.1038/s41467-018-02900-1 (2018).
Ahn, J. H., Kim, B. Y., Song, J. & Weon, H. Y. Effects of PCR cycle number and DNA polymerase type on the 16S rRNA gene pyrosequencing analysis of bacterial communities. J. Microbiol. 50, 1071–1074. https://doi.org/10.1007/s12275-012-2642-z (2012).
Sze, M. A. & Schloss, P. D. The impact of DNA polymerase and number of rounds of amplification in PCR on 16S rRNA gene sequence data. mSphere https://doi.org/10.1128/mSphere.00163-19 (2019).

Acknowledgements

The serosurveys in the Netherlands (PIENTER-3) and in the Caribbean Netherlands (HSCN) are conducted by the National Institute for Public Health and the Environment (RIVM), in close collaboration with the local Public Health Services (GGD) and Statistics Netherlands (CBS). We would like to thank all volunteers who participated in this study. This work (salaries R.H., W.A.A.d.S.P.) was also supported by The Netherlands Organisation for Scientific Research (NWO-VIDI; Grant 91715359).

Author information

These authors contributed equally: Debby Bogaert and Thijs Bosch.

Authors and Affiliations

Department of Paediatric Immunology and Infectious Diseases, Wilhelmina Children’s Hospital/University Medical Center Utrecht, 3508 AB, Utrecht, The Netherlands
Raiza Hasrat, Wouter A. A. de Steenhuijsen Piters, Mei Ling J. N. Chu & Debby Bogaert
Centre for Infectious Disease Control, National Institute for Public Health and the Environment, 3720 BA, Bilthoven, The Netherlands
Raiza Hasrat, Jolanda Kool, Wouter A. A. de Steenhuijsen Piters, Mei Ling J. N. Chu, Sjoerd Kuiling, James A. Groot, Elske M. van Logchem, Susana Fuentes, Eelco Franz, Debby Bogaert & Thijs Bosch
University of Edinburgh Centre for Inflammation Research, Queen’s Medical Research Institute, University of Edinburgh, Edinburgh, EH16 4TJ, UK
Debby Bogaert

Contributions

S.F., E.F., T.B. and D.B. conceived and designed the experiments. M.L.J.N.C., J.G., E.v.L., S.K. and J.K. were responsible for the execution and quality control of the laboratory work. R.H., W.A.A.d.S.P., D.B. and T.B. analysed the data. R.H., D.B. and T.B. wrote the paper. All authors contributed significantly to the interpretation of the results, critically revised the manuscript for important intellectual content and approved the final manuscript.

Corresponding author

Correspondence to Thijs Bosch.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hasrat, R., Kool, J., de Steenhuijsen Piters, W.A.A. et al. Benchmarking laboratory processes to characterise low-biomass respiratory microbiota. Sci Rep 11, 17148 (2021). https://doi.org/10.1038/s41598-021-96556-5

Received: 23 April 2021
Accepted: 11 August 2021
Published: 25 August 2021
DOI: https://doi.org/10.1038/s41598-021-96556-5
