Introduction

Next-generation sequencing technologies have generated renewed interest in culture-independent 16S rRNA gene-based community profiling (Tringe and Hugenholtz, 2008). 16S amplicon pyrosequencing permits a much deeper sampling of microbial communities by providing orders of magnitude more sequence information than the more traditional Sanger sequencing of PCR-clone libraries. Moreover, bar-coded 16S amplicons from multiple samples can be analyzed in parallel and provide greater sensitivity than do PCR-clone libraries (Sogin et al., 2006; Parameswaran et al., 2007). This makes it possible to examine the effects of several variables on community composition estimates, such as biases due to DNA extraction or PCR conditions.

In this study, we investigate two technical issues, namely the effects of primer choice and amplicon length on assessments of bacterial species richness and evenness. The termite P3 hindgut lumen community of Nasutitermes corniger was chosen to assess these factors because this community has been characterized extensively by PCR-clone library analysis (1700 near full-length sequenced clones) and has potentially tractable diversity (Warnecke et al., 2007). As part of this investigation, we explored two hypotheses. First, we anticipated that shorter amplicons produce higher richness estimates. Given that amplicons can compete with primers for binding sites during PCR, shorter amplicons may accumulate and inhibit their own production in earlier cycles, allowing rarer templates to amplify in later cycles and thereby increasing the apparent richness. Second, we hypothesized that primer choice would have a marked effect on species evenness because of variable priming specificities and annealing kinetics (Suzuki and Giovannoni, 1996).

Materials and methods

DNA extraction

To obtain termite hindgut community DNA, the gut tracts of 25 N. corniger worker specimens were extracted from the exoskeleton using sterile forceps. A hemi-transverse incision of the P3 hindgut compartment was made using a needle, and 2 μl of 100 mM phosphate-buffered saline was mixed with luminal contents squeezed out of the P3 compartment. The samples were pooled, maintained on ice, and DNA was isolated using aluminum ammonium sulfate added to cetyl trimethylammonium bromide, followed by a polyethylene glycol precipitation (Wrighton et al., 2008).

PCR and 454 sequencing

Overall, 11 amplicons ranging from 352 to 1443 nucleotides (nt) in length (Figure 1) were produced using combinations of the following broadly conserved 16S primers (454 adaptor sequences and bar codes are not shown): 27F (5′-agagtttgatcMtggctcag-3′), 357F (5′-ctcctacgggaggcagcag-3′), 530F (5′-gtgccagcMgccgcgg-3′), 803F (5′-attagataccctggtagtc-3′), 926F (5′-aaactYaaaKgaattgacgg-3′), 1114F (5′-gcaacgagcgcaaccc-3′), 342R (5′-ctgctgcSYcccgtag-3′), 519R (5′-gWattaccgcggcKgctg-3′), 787R (5′-ctaccagggtatctaat-3′), 907R (5′-ccgtcaattcMtttRagttt-3′), 1100R (5′-gggttgcgctcgttg-3′) and 1392R (5′-acgggcggtgtgtRc-3′).

Figure 1

Experimental design showing amplified regions of the 16S rRNA gene. Amplicon names to the left of the figure denote amplicon length including primers and the orientation of sequencing, forward (F) or reverse (R). Representations of amplicons show the region sequenced (blue) and forward (F) or reverse (R) primers (gray) used to produce the amplicon. Universal primers are represented in red typeface and domain-level primers are in black.

To multiplex amplicons for inclusion on a single sequencing run, the common primer in each reaction (27F or 1392R) was bar coded on the 5′ end with five unique bases between the 454 A-adaptor sequence (5′-gcctccctcgcgccatcag-3′) and the conserved 16S rRNA primer sequence. Rules for bar coding were established to reduce the likelihood of ambiguities due to potential homopolymeric errors: (1) bar codes cannot start with the same nt as the 454 adaptor ends; (2) bar codes cannot end with the same nt as the first nt in the 16S primer; (3) there can be no more than two successive occurrences of the same nt within the bar code; and (4) each bar code must differ from other bar codes by at least two bases. The other primer in each pair was not bar coded but did incorporate the 454 B-adaptor (5′-gccttgccagcccgctca-3′) at its 5′ end.
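The four bar-coding rules can be checked mechanically. The following is a minimal sketch rather than the script used in this study; the function names (`barcode_ok`, `barcode_set_ok`) and the example bar codes are hypothetical, and only the A-adaptor and primer sequences are taken from the text.

```python
from itertools import combinations

ADAPTOR_A = "gcctccctcgcgccatcag"  # 454 A-adaptor sequence given in the text


def hamming(a, b):
    """Number of positions at which two equal-length bar codes differ."""
    return sum(x != y for x, y in zip(a, b))


def has_long_run(seq, max_run=2):
    """True if any nucleotide occurs more than max_run times in succession."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return True
    return False


def barcode_ok(barcode, primer, adaptor=ADAPTOR_A):
    """Check rules 1-3 for a single bar code placed between adaptor and primer."""
    barcode, primer, adaptor = barcode.lower(), primer.lower(), adaptor.lower()
    if barcode[0] == adaptor[-1]:   # rule 1: no shared nt at the adaptor junction
        return False
    if barcode[-1] == primer[0]:    # rule 2: no shared nt at the primer junction
        return False
    if has_long_run(barcode):       # rule 3: at most two identical nt in a row
        return False
    return True


def barcode_set_ok(barcodes, primer, min_diff=2):
    """Check rules 1-3 for every bar code and rule 4 for every pair."""
    if not all(barcode_ok(b, primer) for b in barcodes):
        return False
    return all(hamming(a, b) >= min_diff for a, b in combinations(barcodes, 2))
```

For example, with the 27F primer, the hypothetical bar code `tcagt` passes rules 1 to 3, whereas `gcact` fails rule 1 because it starts with the final `g` of the A-adaptor.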

For each primer pair, PCR was performed in triplicate and pooled to minimize random PCR bias. Each 20 μl reaction consisted of 0.5 Units Taq (GE Healthcare, Waukesha, WI, USA), 2 μl of supplied 10 × buffer, 0.4 μl of 10 mM dNTP mix (MBI Fermentas, Burlington, Ontario, Canada), 0.6 μl of 10 mg ml−1 bovine serum albumin (New England Biolabs, Ipswich, MA, USA), 0.2 μl of each 10 μM primer and 10 ng of template DNA. Each reaction proceeded under the following conditions: 95 °C for 3 min; 25 cycles of 95 °C for 30 s, 55 °C for 45 s and 72 °C for 90 s; followed by a final extension at 72 °C for 10 min. Amplification products were purified on Qiagen MinElute PCR columns (Qiagen, Valencia, CA, USA) according to the manufacturer's instructions and quantified using a Qubit fluorometer (Invitrogen, Carlsbad, CA, USA). To obtain a similar number of reads from each sample, amplicons were mixed in equal concentrations. Emulsion PCR and sequencing were performed using a GS FLX emPCR amplicon kit according to the manufacturer's protocols (454 Life Sciences, Branford, CT, USA).
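The final concentrations implied by the volumes above follow from simple dilution (stock concentration × volume added ÷ reaction volume). The sketch below is only a convenience calculation; the helper name `final_concentration` is hypothetical, and the results are derived from the stated volumes rather than reported in the study.

```python
def final_concentration(stock_conc, volume_added_ul, reaction_volume_ul=20.0):
    """Dilution: final concentration in the same unit as stock_conc."""
    return stock_conc * volume_added_ul / reaction_volume_ul


# Values taken from the reaction mix described above (20 ul reactions).
dntp_mM = final_concentration(10.0, 0.4)      # 10 mM dNTP mix, 0.4 ul
primer_uM = final_concentration(10.0, 0.2)    # 10 uM primer, 0.2 ul each
bsa_mg_ml = final_concentration(10.0, 0.6)    # 10 mg/ml BSA, 0.6 ul

print(f"dNTPs: {dntp_mM:.2f} mM, primer: {primer_uM:.2f} uM, BSA: {bsa_mg_ml:.2f} mg/ml")
```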

Informatics analysis

Pyrosequencing flowgrams were converted to sequence reads using the standard software provided by 454 Life Sciences. Reads were processed using the computational pipeline described in the study by Kunin et al. (2010). Briefly, the reads were end trimmed with LUCY (Li and Chou, 2004) using an accuracy threshold of 0.5% per base error probability, and then the bar code and primer sequences were removed from the 5′ end of the read. Reads lacking exact matches to a bar code and primer were discarded. All remaining reads were uniformly truncated to 220 nt on the basis of the length histogram of the quality-trimmed reads (not shown). Reads shorter than 220 nt were excluded from further analyses. Identical 220-nt reads were removed and unique sequences compared by blastn using a word length of 25. The blast output was filtered to remove all pairwise matches with similarities <97% across the entire read length and clustered using the Markov Cluster algorithm with default parameters (van Dongen, 2000). The resulting operational taxonomic units defined at 97% identity (97% OTUs) were classified taxonomically by blastn comparisons against the greengenes database (DeSantis et al., 2006) using a word length of 25. Pass rates for each step of the processing pipeline were recorded.
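The truncation and dereplication steps can be illustrated as follows. This is a minimal sketch and not the pipeline of Kunin et al. (2010); the function name is hypothetical, and keeping a count for each unique sequence is an assumption made here so that read abundances could be recovered after clustering.

```python
from collections import Counter


def truncate_and_dereplicate(reads, length=220):
    """Drop reads shorter than `length`, truncate the rest to `length`,
    and collapse identical truncated reads into a sequence -> count table."""
    counts = Counter()
    for read in reads:
        if len(read) < length:
            continue  # reads shorter than the cut-off are excluded
        counts[read[:length]] += 1
    return counts
```

The keys of the returned table correspond to the unique sequences that would be passed to the all-against-all blastn comparison.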

To assess richness, rarefaction curves and bootstrapping were performed using an in-house script that plots the number of 97% OTUs as a function of the number of randomly sampled clustered reads. To assess evenness, rank abundance curves were prepared using 97% OTUs with >0.5% relative abundance, averaged for each region. Simpson's measure of evenness ($E_{1/D}$; $D=\sum (n/N)^2$, where n is the number of organisms for a given species and N the number of organisms for all species) was calculated for each amplicon using the statistical program R (RDC Team, 2008). This metric is insensitive to taxon richness and ranges from 0 to 1, with 0 representing complete dominance and 1 representing an evenly structured community. We statistically compared differences in evenness estimates between the primer regions using a two-sample t-test (Minitab Inc., State College, PA, USA). For the F_573 amplicon with anomalous OTU evenness (see Supplementary Figure S2), we estimated the presence of a mismatch in the 519R primer for each OTU on the basis of the closest matched full-length greengenes sequence. Relative abundances of phyla were calculated using the greengenes classifications of the OTUs. Analysis of Similarity (ANOSIM) in the statistical package Primer V (Plymouth Marine Laboratory, Plymouth, UK), an analog to the standard univariate one-way ANOVA (analysis of variance) designed for ecological data, was performed on phylum-level pyrosequencing data to statistically assess assemblage differences between primer pairs (Clarke, 1993). For statistical testing, we applied either an ANOSIM or two-sample t-test, and considered a probability (P-value) less than 0.05 to denote significance.
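As an illustration of the evenness metric, the sketch below computes Simpson's evenness from a list of OTU read counts. It assumes the standard definition in which the reciprocal of D is divided by the number of OTUs observed (S), a normalization that is not spelled out in the text above. The calculation in this study was done in R; this Python function and its name are hypothetical.

```python
def simpson_evenness(counts):
    """Simpson's evenness: (1 / D) / S, with D = sum((n / N) ** 2).

    counts: reads per OTU; zero counts are ignored.
    Assumes the standard normalization by S, the number of OTUs observed.
    """
    counts = [c for c in counts if c > 0]
    if not counts:
        raise ValueError("at least one OTU with a non-zero count is required")
    total = sum(counts)
    d = sum((c / total) ** 2 for c in counts)
    return (1.0 / d) / len(counts)
```

A perfectly even community such as `[10, 10, 10, 10]` gives a value of 1, whereas a community dominated by a single OTU gives a value closer to 1/S.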

Data submission

454 GS FLX flowgrams (sff files) were submitted to the Short Read Archive database at NCBI (accession no. SRA009438).

Results and Discussion

To determine the effect of primer pair and amplicon length on 16S rRNA-based community composition estimates, we assessed OTU richness (number of OTUs) and evenness (relative abundance of OTUs) for a range of amplicons obtained from a termite hindgut community using bar-coded pyrosequencing (Sogin et al., 2006; Parameswaran et al., 2007). 16S rRNA genes were PCR amplified using a combination of broad-specificity (domain or universal) primers with 454 FLX adaptor sequences. A total of 11 amplicons, ranging in size from 352 to 1443 bp, were prepared with primer sets that spanned either the V1 and V2 (sequenced forward from 27F) or the V8 regions (sequenced reverse from 1392R) of the 16S rRNA gene (Figure 1). The technical replicates of four amplicons (namely F_394, F_573, R_352 and R_544; Figure 1) were used to compare differences between amplicon data sets with the variation inherent in the method.

Amplicon pyrosequencing data processing and statistics

A total of 680 744 pyrosequence reads (termed ‘pyrotags’) were produced from the amplicons (Supplementary Table S1). Quality-based trimming resulted in the loss of 18.6% of data, and a further 17.6% of data were lost because of short sequence length or no identifiable bar code. In total, 36.2% of reads were excluded after these quality-filtering steps (Supplementary Table S1); however, this loss was not uniform as longer amplicons contributed disproportionately to those eliminated (Supplementary Figure S1). Amplicons >1 kb in length had failure rates >90% and were not included in subsequent analyses. However, it is worth noting that amplicons as long as 963 bp had pass rates >50% (Supplementary Table S1), despite recommendations by manufacturers to limit amplicons to <500 bp.

Sequences that passed quality filtering were trimmed to a uniform length of 220 bases to facilitate comparative analyses. Trimmed reads were grouped into clusters with a 97% identity threshold producing a total of 2269 OTUs of which 1617 and 652 were from the forward (V1–V2) and reverse (V8) regions, respectively. The applied quality-filtering and clustering parameters were previously shown to minimize the effect of pyrosequencing errors on microbial diversity estimates (Kunin et al., 2010).

Species richness

The first hypothesis was that, for a given number of reads, shorter 16S rRNA gene amplicons yield greater species richness than do longer amplicons. To estimate species richness, rarefaction curves were generated by randomly sampling reads and plotting the number of novel 97% OTUs against the number of reads sampled (Figure 2). Most noticeably, forward amplicons all produced markedly (approximately threefold) higher OTU richness estimates than did reverse amplicons (Figure 2). This is due to higher sequence variability in the V1–V2 region than the reverse V8 region as has recently been observed by Youssef et al. (2009). Therefore, OTU richness estimates provided by 16S pyrotags can vary according to the particular region surveyed, and absolute richness estimates based on different portions of the 16S rRNA gene should not be compared directly.
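The rarefaction procedure can be sketched as follows. This is not the in-house script used in this study; the function name, the `step` parameter and the fixed random seed are assumptions for illustration. Each read is represented by the label of the OTU it was clustered into.

```python
import random


def rarefaction_curve(otu_labels, step=100, seed=0):
    """Randomly order the reads and record the number of distinct OTUs
    seen after every `step` reads (and after the final read).

    otu_labels: one OTU label per read.
    Returns a list of (reads_sampled, otus_observed) points.
    """
    rng = random.Random(seed)
    shuffled = list(otu_labels)
    rng.shuffle(shuffled)
    seen = set()
    curve = []
    for i, otu in enumerate(shuffled, start=1):
        seen.add(otu)
        if i % step == 0 or i == len(shuffled):
            curve.append((i, len(seen)))
    return curve
```

Repeating the calculation with different seeds gives a distribution of curves from which intervals such as those in Figure 2 could be approximated.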

Figure 2

Rarefaction curves of the 97% OTUs for different length amplicons from forward (V1 and V2) and reverse (V8) regions of the 16S rRNA molecule. Colored hatching represents 95% confidence intervals.

Within-region comparisons showed that the shortest amplicons produced higher richness estimates than did longer amplicons (Figure 2), although this trend does not seem to hold beyond 400-bp fragments (for example, F_963 seems to indicate higher richness than does F_839). The apparently much lower richness estimate of F_573 compared with other forward amplicons is likely due to a mispriming effect biasing against many phylotypes (see below). It should also be noted that technical replicates of some of the shorter amplicons resulted in different richness estimates, suggesting that rare populations were not reproducibly sampled despite the relative simplicity of the community studied. A similar relationship between amplicon length and estimated richness was observed by Huber et al. (2009) using 16S rRNA gene clone library data from two hydrothermal vent fluid samples. In this case, 100-bp amplicons produced significantly higher estimates of richness than did 400- or 1000-bp products.

Species evenness

Our second hypothesis posited that primer choice affects the relative abundance of OTUs (that is, species evenness). Whereas many primers designed to amplify 16S rRNA genes are broadly conserved, no primer pair is truly universal because of base-pairing exceptions present in one or more lineages targeted by broadly conserved primers (Hugenholtz and Goebel, 2001). However, the extent of this problem on PCR-based community profiling has not been systematically addressed.

Rank abundance curves of the dominant OTUs (>0.5% relative abundance in at least one primer set) were prepared by averaging OTU abundances across amplicons (excluding F_573, see below). This was performed separately for each region, as forward and reverse data are not directly comparable. The relative abundance of the dominant reverse OTUs was greater than that of the dominant forward OTUs (Figure 3), because higher sequence conservation in the V8 region (Youssef et al., 2009) results in larger clusters of reverse reads than of forward reads at the 97% identity threshold. To corroborate our interpretation of the curves, we calculated Simpson's measure of evenness ($E_{1/D}$) for each amplicon (Supplementary Table S2). A two-sample t-test performed on the $E_{1/D}$ values for each amplicon confirmed that the 16S rRNA regions (V1–V2 vs V8) evaluated in this study resulted in statistically different estimates of evenness (P=0.012).

Figure 3

Rank abundance curves of the top 97% OTUs in the forward V1–V2 (red diamonds) amplicons and reverse V8 (blue diamonds) amplicons. Standard errors are shown.

Remarkably, all of the primer pairs within either the forward or the reverse region (with the exception of F_573) produced very similar estimates of species evenness for the dominant OTUs, as evidenced by the low standard error for most OTUs (Figure 3). The single outlier to this trend, amplicon F_573, produced markedly different OTU abundances (Supplementary Figure S2) attributable to a C:A mismatch between the 519R primer and 16S rRNA gene templates at Escherichia coli position 534, three bases from the 5′ end of the primer sequence. Such mismatches are customarily considered as having little or no impact on PCR because extension occurs from the 3′ end (Bru et al., 2008). However, the addition of the 18-bp 454 B adaptor to the 5′ end seems to have sufficiently destabilized the binding of this primer to mismatched templates, thereby favoring the amplification of perfectly matched templates. This resulted in a consistent overrepresentation of perfectly matched (T:A) templates coupled with an underrepresentation of C:A mismatched templates (Supplementary Figure S2). Therefore, primer selection can significantly affect species evenness if base variations in templates are not accounted for by degeneracies in the primer sequence. However, when these variations are addressed, evenness of dominant OTUs is highly reproducible between different primer pairs targeting the same region.
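Whether a template site is covered by the degeneracies of a primer can be checked with the IUPAC nucleotide codes. The sketch below is a hypothetical helper, not the procedure used in this study; the two example sites are invented for illustration and are written in the same 5′ to 3′ orientation as the primer (that is, as the reverse complement of the strand to which the primer anneals).

```python
# IUPAC nucleotide codes and the bases each one covers.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}
COMPLEMENT = str.maketrans("acgtrykmswbdhvn", "tgcayrmkswvhdbn")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) nucleotide sequence."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def mismatch_positions(primer, site):
    """0-based positions, counted from the primer 5' end, at which the site
    base is not covered by the primer code. `site` must be unambiguous and
    given in the same orientation as the primer."""
    primer, site = primer.lower(), site.lower()
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return [i for i, (p, s) in enumerate(zip(primer, site)) if s not in IUPAC[p]]


PRIMER_519R = "gWattaccgcggcKgctg"          # from the primer list in the text
matched_site = "gaattaccgcggctgctg"         # hypothetical perfectly matched site
mismatched_site = "gagttaccgcggctgctg"      # hypothetical site differing at the third base

print(mismatch_positions(PRIMER_519R, matched_site))
print(mismatch_positions(PRIMER_519R, mismatched_site))
```

A site taken from the template strand can be converted to primer orientation with `reverse_complement` before the comparison.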

To compare the phylogenetic diversity uncovered in pyrosequence data from each region to previous estimates of the community structure in the termite hindgut, we classified all OTUs by blastn against the greengenes database (DeSantis et al., 2006) and then amalgamated the OTUs at phylum level (Figure 4). With the exception of F_573, estimates of the Nasutitermes hindgut community structure from the forward and reverse amplicons were not significantly different (P>0.05; P=0.10, R=0.556), despite the difference in OTU granularity between regions (Figure 3). The dominant phylum, the Spirochaetes, comprises 67–71% of the reads in each amplicon data set, followed by the Fibrobacteres (16–25%) and a handful of other phyla each representing >1% of reads, including Proteobacteria, Firmicutes, Bacteroidetes, Acidobacteria and candidate phylum ZB3 (Figure 4). These results are consistent with previous PCR-based (Warnecke et al., 2007) and fluorescence in situ hybridization-based (Hongoh et al., 2006) profiles of Nasutitermes spp. hindgut communities. In contrast, Spirochaetes were significantly underrepresented and Fibrobacteres significantly overrepresented in the F_573 sample because of the aforementioned C:A mismatch in most Spirochaetes and T:A match in most Fibrobacteres.

Figure 4

Relative abundance of bacterial phyla in the termite hindgut for each amplicon. Data from each technical replicate pair were averaged.

Conclusions

This study tested the hypotheses that shorter pyrotag amplicons produce higher richness estimates and that primer choice affects species evenness. Our results show that the shortest amplicons tested (<400 bp) produce higher richness estimates than do longer amplicons. However, regional variation in the 16S rRNA molecule has a much greater effect on apparent richness. Within a common region, primer choice had little effect on evenness of dominant OTUs (>0.5% abundance), provided that template mismatches are accommodated by degeneracies in the primer. This surprising reproducibility may have been facilitated by the use of a common primer. However, pronounced differences in evenness were observed between the two regions of 16S rRNA tested because of differences in sequence conservation. Despite the observed inconsistencies in both richness and evenness estimates between variable regions, the inferred community structure at higher taxonomic ranks (phylum) was consistent between amplicons and regions and with previous estimates of community structure from the termite hindgut. We conclude that species (97% OTUs) evenness and richness should not be directly compared between different regions of the 16S rRNA molecule. However, species evenness estimated using different primer pairs targeting the same region may be reliably compared.