Challenging the Wigglesworthia, Sodalis, Wolbachia symbiosis dogma in tsetse flies: Spiroplasma is present in both laboratory and natural populations

Profiling of wild and laboratory tsetse populations using 16S rRNA gene amplicon sequencing allowed us to examine whether the “Wigglesworthia-Sodalis-Wolbachia dogma” operates across species and populations. The most abundant taxa, in wild and laboratory populations, were Wigglesworthia (the primary endosymbiont), Sodalis and Wolbachia as previously characterized. The species richness of the microbiota was greater in wild than laboratory populations. Spiroplasma was identified as a new symbiont exclusively in Glossina fuscipes fuscipes and G. tachinoides, members of the palpalis sub-group, and the infection prevalence in several laboratory and natural populations was surveyed. Multilocus sequence typing (MLST) analysis identified two strains of tsetse-associated Spiroplasma, present in G. f. fuscipes and G. tachinoides. Spiroplasma density in G. f. fuscipes larva guts was significantly higher than in guts from teneral and 15-day old male and female adults. In gonads of teneral and 15-day old insects, Spiroplasma density was higher in testes than ovaries, and was significantly higher in live versus prematurely deceased females, indicating a potentially mutualistic association. Higher Spiroplasma density in testes than in ovaries was also detected by fluorescent in situ hybridization in G. f. fuscipes.


Results
16S rRNA gene amplicon sequencing reveals novel interspecific diversity in natural populations of tsetse flies. Microbial community composition and diversity of thirty-two whole insects from G. medicorum (Gmed), G. morsitans submorsitans (Gms), G. p. gambiensis (Gpg), and G. tachinoides (Gt) collected in Folonzo, Burkina Faso were investigated by 16S rRNA gene amplicon sequencing, producing 5,761,899 reads after quality filtering. These reads were combined with a total of 8,300,515 quality-filtered reads generated from 124 whole guts of Gff, Gmm, Gpal from a previous study 18 , which used an identical technical approach for amplicon generation and sequencing. Including the data from the above mentioned study 18 provided additional Wigglesworthia/co-divergence context to our dataset due to the increased host diversity. This approach enabled us to characterize low-frequency, high-abundance taxa. Whole insect samples from Gpg, and Gt were the most bacterial species-rich samples containing higher numbers of unique OTUs (Supplementary Table 1).
The primary nutritional endosymbiont of tsetse flies, Wigglesworthia glossinidia, was the most abundant taxon in all samples, and constituted between 71 and 99% of the total community in each individual. Variation in the relative abundance of W. glossinidia was due to the heterogeneous distribution of secondary taxa, which varied in infection frequency and abundance between individuals in both an intra- and inter-specific fashion (Fig. 1a). Secondary taxa included the facultative symbionts S. glossinidius and Wolbachia, alongside Spiroplasma, which has not previously been reported in tsetse flies. The relative abundance of secondary taxa was highly variable (from <0.01% to 28%) depending upon the genus of the bacterium and the species of Glossina (Fig. 1a). This contributed to the variation in bacterial community composition between Glossina species. Clustering by species is illustrated in Fig. 1b, where Principal Component 1 and Principal Component 2 describe 58.53% and 10.84% of the variance respectively. Clustering can be partly attributed to the co-diversification of Wigglesworthia, which is the main component of the community, with its tsetse host 56 . For this reason, outliers are conspicuous, as is observed with the two individuals infected with Spiroplasma and Rickettsia at 13.15% and 23.72% relative abundance respectively (Fig. 1b).
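The relative abundances quoted above are, in essence, per-taxon read counts divided by the total reads of one individual. A minimal sketch of that arithmetic follows; the function name and the read counts are hypothetical and are not taken from the study's data.

```python
def relative_abundance(read_counts):
    """Convert per-taxon read counts for one fly into percentages of the total."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {taxon: 100.0 * n / total for taxon, n in read_counts.items()}


# Hypothetical read counts for a single whole-fly sample (illustration only).
fly = {"Wigglesworthia": 8600, "Spiroplasma": 1320, "Sodalis": 50, "Wolbachia": 30}

for taxon, pct in relative_abundance(fly).items():
    print(f"{taxon}: {pct:.2f}%")
```

Because the percentages within a sample must sum to 100, a heavy secondary infection necessarily lowers the share attributed to Wigglesworthia, which is why such individuals appear as outliers in the ordination.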
Sodalis was found at higher frequency and relative abundance in whole Gmed and Gpal guts (Figs 2 and Supplementary Figure 1). In the other Glossina species it was detected at a much lower relative abundance of 0.5% or less, with Gms exhibiting the lowest abundance. Wolbachia infections were found infrequently and at low relative abundance, up to 0.04% in any wild sample, with Gmed exhibiting the highest infection prevalence (Supplementary Figure 1).
In addition, several other taxa previously associated with tsetse flies were detected including multiple members of the Enterobacteriaceae, such as Klebsiella, Erwinia, Trabulsiella, Pantoea, and Serratia. These infections occurred at low relative abundance, excluding those with Klebsiella, which was found to be dominant in one Gpg and one Gms whole fly at a relative abundance of 24.3% and 15.3% respectively (Fig. 1a). Amplicon profiling was also able to detect taxa that had not previously been associated with the tsetse fly. Several wild individuals of Gff and Gt, which belong to the palpalis subgroup of the Glossina genus, were infected with Spiroplasma. Relative abundances were generally low (<1%) (Supplementary Table 2), but were found to be as high as 13.2% in one Gt whole fly from Burkina Faso (Fig. 1a).
16S rRNA gene amplicon sequencing of laboratory reared tsetse flies. Gff, Gmm, and Gpal tissue samples from three developmental stages were sequenced, producing 2,445,369 reads after quality filtering. Similarly to wild populations, the three known taxa (Wigglesworthia, Sodalis and Wolbachia) were found in the laboratory flies. However, additional bacterial species were also detected, with members of Flavobacterium, Propionibacterium, Brevundimonas, Aeromonas, and Rhodospirillales identified in Gmm, Gff, and Gpal. Sequences related to Acinetobacter and Pantoea were identified in Gmm and Gpal. Additionally, sequences related to Streptococcus were found in Gmm and Gff, while sequences related to Shewanella and Pedobacter were discovered only in Gmm. Relative abundance was influenced by tissue sample type, with gut tissues being enriched for Wigglesworthia while reproductive tissues were characterized by the presence of Wolbachia and Sodalis.
For Gpal, the most bacterial species-rich samples were those associated with gonads of teneral flies while gut samples were less species-rich based on both Chao1 and ACE indices (Supplementary Table 3). Gut samples of teneral males and females displayed lower species richness (Supplementary Table 3). The same trend was observed for Gff. Gut samples of teneral flies exhibited the lowest species diversity and richness indices, which increased over time (Supplementary Table 3). Conversely, gonads presented a higher diversity and richness index in teneral flies, which decreased in aged flies. This pattern was not observed in Gmm. Finally, the natural populations exhibited a significantly higher species richness index (Chao1) than the laboratory populations (p < 0.016).
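For readers unfamiliar with the Chao1 index used above, the following sketch shows how it estimates richness from the number of OTUs seen once (singletons) and twice (doubletons). It uses the bias-corrected form of the estimator, which is an assumption: the text does not state which variant or software setting was applied, and the OTU counts below are hypothetical.

```python
def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate for one sample.

    otu_counts: iterable of read counts, one entry per OTU.
    """
    observed = [c for c in otu_counts if c > 0]
    s_obs = len(observed)                      # OTUs actually detected
    f1 = sum(1 for c in observed if c == 1)    # singletons
    f2 = sum(1 for c in observed if c == 2)    # doubletons
    # Unseen richness is inferred from the rare OTUs.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical OTU table row: one dominant OTU plus several rare ones.
sample = [950, 30, 12, 2, 2, 1, 1, 1, 0]
print(chao1(sample))
```

A sample with many singletons relative to doubletons receives a larger upward correction, so the index rewards communities with a long tail of rare taxa.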
We observed variation in the frequency and relative abundance of Wolbachia in lab populations. The mean relative abundance of Wolbachia was significantly higher in Gmm flies compared with those from the Gff or Gpal populations (ANOVA, p ≤ 0.01) (Supplementary Table 2). This was due to increased relative abundance of Wolbachia in reproductive tissues compared to larval or gut tissues within the Gmm population (ANOVA, p ≤ 0.01).
Bacterial communities were strongly clustered according to the tissue of origin separating the bacterial communities from guts from those from reproductive tissues (Fig. 3a). This factor explained 81.3% of the total variance. Canonical analysis of principal coordinates (CAP), revealed distinct clustering within the gonadal tissue (Fig. 3b). The bacterial communities associated with the gonadal tissue also seem to be statistically affected by

Spiroplasma infection status assessed by PCR screening of natural and laboratory tsetse populations.
We used PCR-based screening methods to assay for the presence of four insect reproductive parasites: Spiroplasma, Arsenophonus, Rickettsia, and Cardinium, in four Glossina species from the laboratory, Gmm (n = 19), Gff (n = 76), Gpal (n = 20), and Gpg (n = 19) and wild Gff (n = 98). Of the four examined Glossina species, Spiroplasma infections were found only in Gff, with an infection prevalence ranging from 6.7 to 80% (Table 1), while none of the four tsetse species examined were infected with Arsenophonus, Rickettsia or Cardinium.
To examine the distribution of Spiroplasma, six additional Glossina species were PCR-screened for Spiroplasma infection. Only Gt and Gpp were positive for Spiroplasma, and showed an infection rate of 26.7% and 12.5% respectively (Table 1). The PCR screening for Spiroplasma infection was further extended to 327 historical and contemporary samples from wild and laboratory colonies representing 10 species of tsetse fly (Table 1). Only members of the palpalis subgroup were found infected with Spiroplasma, including Gff, Gpp and Gt, with a prevalence ranging from 6% to 80%. Notably, the prevalence was higher in laboratory colonies than natural populations, and some populations demonstrated a disparity in infection between sexes (Table 1).
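Prevalence figures of this kind are simple tallies of PCR-positive individuals over individuals tested within each species. The sketch below illustrates the calculation; the record layout and function name are hypothetical, and the counts are invented so that the percentages resemble those quoted (the actual sample sizes are in Table 1).

```python
from collections import defaultdict


def prevalence_by_group(records):
    """Return percent PCR-positive per group from (group, is_positive) records."""
    tested = defaultdict(int)
    positive = defaultdict(int)
    for group, is_positive in records:
        tested[group] += 1
        if is_positive:
            positive[group] += 1
    return {group: 100.0 * positive[group] / tested[group] for group in tested}


# Hypothetical screening results, not the study's data.
records = (
    [("Gt", True)] * 4 + [("Gt", False)] * 11
    + [("Gpp", True)] * 1 + [("Gpp", False)] * 7
    + [("Gmm", False)] * 5
)

for group, pct in prevalence_by_group(records).items():
    print(f"{group}: {pct:.1f}% positive")
```

With small numbers of flies per species, a single positive individual shifts the percentage substantially, which is worth keeping in mind when comparing prevalence between populations.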
Genotyping of Spiroplasma strains. Spiroplasma strains from Gff flies of both sexes from laboratory colonies, a natural population from Uganda and from one natural population of Gt flies from Burkina Faso were genotyped by MLST analysis. Four laboratory and one field sample of Gff harbour Spiroplasma strains with identical sequences for all loci studied (Supplementary Table 4). Interestingly, the Spiroplasma strain present in Gt is distinct from the Gff Spiroplasma strain, with sequence polymorphisms detected in all loci examined. Eight polymorphisms were observed in fruR, seven in the region 16S rRNA-23S rRNA-5S rRNA, four in 16S rRNA, three in dnaA, two in ftsZ, and one each in rpoB and parE. Both strains belong to the citri clade, which is mostly composed of plant pathogens (Fig. 4 and Supplementary Figures 2-7). Most of the pathogenic Spiroplasma species belong to the citri clade 57 with prominent examples including S. kunkelii that causes the corn stunt disease 21 , S. phoeniceum that infects periwinkle 58 , and S. penaei that infects Pacific white shrimp 42 . The closest relatives of the tsetse Spiroplasma strains are S. insolitum and S. atrichopogonis, which were isolated from a fall flower and a biting midge (Diptera: Ceratopogonidae) respectively 59,60 . Neither S. insolitum nor S. atrichopogonis has been reported to be pathogenic to plants or midges.
Spiroplasma density across developmental stages. qPCR was used to assess the density of the Spiroplasma infection in larval guts, and in guts and gonads of males and females collected at two developmental stages: (a) teneral and (b) 15-day-old adults. Spiroplasma infection levels were significantly higher in larval guts compared to the guts of teneral or 15-day-old adults (Fig. 5a). There was no significant difference in the infection levels between testes of teneral and 15-day-old adults (Supplementary Figure 8). In a similar way no significant difference was observed between ovaries of teneral and 15-day-old adults (Supplementary Figure 9). However, there was a significant difference in Spiroplasma infection level between testes and ovaries from teneral flies (Fig. 5b).
Spiroplasma density was also examined in a mass-rearing colony where mortality was high and the colony was on the verge of collapse. Examination of live and dead insects indicated that in males Spiroplasma density was similar, whereas in females density was higher in live insects than in those that had recently perished (Fig. 6a and b). When we examined exclusively females carrying a larva, we found that the live females with a larva had a higher titre of Spiroplasma than gravid females that died prematurely (Fig. 6c). The prevalence of Wolbachia, Arsenophonus, Cardinium, and Rickettsia was also examined in whole tsetse flies from the collapsing colony. None of the 34 individuals tested were found to harbour any of the above mentioned symbionts.
In situ hybridization of Spiroplasma. Dissected ovaries and testes of teneral adults from a Gff laboratory colony were subjected to FISH using a Spiroplasma specific probe. Spiroplasma detection was sparse and sporadic in ovaries (Fig. 7a), while in testes it was observed at high densities (Fig. 7b).

Discussion
The present study showed that the bacterial communities associated with tsetse flies are more complex than previously reported, thus challenging the Wigglesworthia-Sodalis-Wolbachia dogma 3, 61, 62 . Using 16S rRNA gene-based sequencing approaches, several additional bacterial genera with broad phylogenetic origins were discovered to be associated with the tsetse fly including Klebsiella, Rickettsia and Spiroplasma. The prevalence and infection levels observed in some tsetse species, particularly those of Spiroplasma, were similar to those seen for Sodalis, suggesting that they may play an important role in the biology and ecology of tsetse flies. The question is where these symbionts come from, and what factors determine the structure of the symbiotic communities of tsetse flies.
Previous studies have shown that the microbiota of tsetse flies is characterized by the presence of Wigglesworthia, Sodalis and Wolbachia. All three symbionts are maternally transmitted, while Sodalis can also be transmitted paternally, and colonize during the early juvenile stages: Wigglesworthia and Sodalis through milk gland secretions as larvae, and Wolbachia through the germ line during embryogenesis 3, 63, 64 . As larvae are intrauterine, the only bacteria that they encounter prior to pupation originate from within the adult female tsetse fly. Due to the obligate requirement of Wigglesworthia, there is high fidelity in vertical transmission from mother to offspring 65 . This makes it difficult for other bacteria to invade, as microbes occupy many of the available niches within the host from the early stages of development. Conversely, this also means that the tsetse immune system has evolved to accommodate bacteria, which could facilitate colonization by environmental microbes able to exploit deficits in the immune system. Due to the unique biology of tsetse flies, there is only a short time window for colonization between larval deposition and pupation in the soil. In addition, the colonizers would have to survive metamorphosis in order to persist.
Until recently, there was the notion that tsetse flies feed exclusively on blood, which is mostly sterile and therefore should not serve as a source of microbes. There is now evidence that Gpg flies deprived of a blood meal can feed on water or sugar water, and that sugar residues are detectable in wild-caught flies 66 . Therefore, it is possible that these previously unrecognized feeding habits could be a source of environmental microbes, and could be the origin of the low-frequency high-abundance infections observed in multiple individuals in this study.
Spiroplasma was detected in members of the palpalis sub-group (Gff, Gpp and Gt), whereas Sodalis was significantly more prevalent in Gmed (fusca group). Previous studies have also shown that Sodalis infection is more prevalent in G. brevipalpis (fusca group) than in Gmm and Gpal (both morsitans group) 67 . However, the relationship of Spiroplasma with the palpalis subgroup seems to be more exclusive than that of Sodalis, since the latter has previously been identified in individuals belonging to all tsetse sub-groups 18,67,68 .
A key approach to detecting invasive taxa is to sample whole insects rather than individual tissues such as the gut, where Wigglesworthia is dominant and will therefore obscure the detection of lower-abundance taxa. A broad phylogenetic range of host species is important to encompass the available diversity, as there seems to be variation between sub-groups, species, and even individuals within the same species.
For example, Rickettsia was discovered at high abundance in just one individual, despite the profiling of hundreds of insects by amplicon sequencing and PCR screening. Rickettsia has also been identified in a previous study using an amplicon sequencing approach 18 , and in G. morsitans from Senegal during a PCR screen 69 .
Spiroplasma infection was more prevalent in laboratory colonies with both males and females harbouring Spiroplasma, whereas in natural populations prevalence was lower and only females were infected. The lack of infection in wild individuals may be due to insufficient sampling effort, or could be due to the differences in population dynamics between laboratory-reared and wild-caught flies. It has been reported, for example, that some symbionts may be present in such low abundances that they are undetectable by conventional PCR screens 70 . MLST indicated that the strain found in wild Gff from Uganda was identical, based on the loci examined, to that in the colonized flies (originating from the Central African Republic), suggesting the association between Spiroplasma and Gff may be ancient. Although there have been no direct studies on the relative transmission rate of tsetse symbionts in the laboratory and field, paternal transmission during mating can occur for the secondary symbiont Sodalis 64 . While this study only detected Spiroplasma infection in palpalis group flies, screening more specimens from the morsitans and fusca groups should provide more detailed information on the dynamics and spread of Spiroplasma infection in natural populations.
Figure 6. Quantification of Spiroplasma titre as Spiroplasma dnaA gene copy number normalized to the tsetse β-tubulin gene. (a) Gff whole insects from healthy/live males and prematurely dead males from the mass-rearing facility in Ethiopia (n = 6), (b) Gff whole insects from healthy/live females and prematurely dead females from the mass-rearing facility in Ethiopia (n = 9), p < 0.05. (c) Gff whole insects from healthy/live females carrying a larva and prematurely dead females carrying a larva from the mass-rearing facility in Ethiopia (n = 6), p < 0.05. (An ANOVA test was performed; statistically significant differences are indicated with an asterisk *).
Scientific Reports | 7: 4699 | DOI:10.1038/s41598-017-04740-3
Another potential explanation for the absence of Spiroplasma in the morsitans and fusca groups is their frequent infection with Wolbachia 12, 71 . In the morsitans group the prevalence of Wolbachia can vary between 9.5 and 100%, while in the fusca group it can vary from 0 to 15.6% 12,71 . An existing Wolbachia infection may have led to the development of competitive exclusion with Spiroplasma, though it is not yet clear whether they share an ecological niche within the host, and whether co-occurrence could create evolutionary pressure strong enough to drive competitive exclusion 72 . In D. melanogaster, coinfections between Wolbachia and Spiroplasma were asymmetrical: Spiroplasma negatively affected the titre of Wolbachia, whereas Wolbachia density did not affect Spiroplasma titre 73 . Similarly to Spiroplasma in Gff, tissue tropism was observed in D. melanogaster infected with Spiroplasma, with the ovaries showing the highest density 73 . Competitive inter- and intraspecific microbial interactions have also been observed in mosquito vector species where mutual exclusion between Asaia and Wolbachia has been observed in the reproductive organs while native gut microbiota seems to prevent the vertical transmission of Wolbachia in Anopheles mosquitoes 74,75 . Gff has previously been shown to harbour Wolbachia, though prevalence in natural populations is very heterogeneous, with an average infection rate of 44.3% 76 . Spiroplasma, on the other hand, is found at much lower frequency in natural populations, but is found at higher density per individual when compared with Wolbachia.
MLST analysis indicated that the Spiroplasma strains detected in Gff and Gt populations, albeit different, both belong to the citri clade. Prominent examples of taxa from this clade include S. kunkelii, S. phoeniceum, and S. citri, all of which are plant pathogens 21,58,77 . S. poulsonii, which has been shown to have a protective effect against parasitic wasps in D. melanogaster, is also a member of this clade 20 .
When examining gut tissues, Spiroplasma titre was highest in larvae, and gradually decreased in both males and females over the course of adulthood. High larval titre indicates vertical transmission from mother to offspring, possibly via the milk gland; a mechanism already exploited by Wigglesworthia and Sodalis. High larval density is an abnormal trait in the context of other insect-associated Spiroplasma species. Multiple strains of Spiroplasma infect a number of species of Drosophila and are able to induce a variety of phenotypes in their insect host ranging from parasitic reproductive manipulators to protective symbionts 20,24,78 . In D. hydei and D. melanogaster, Spiroplasma titre steadily increases during larval and adult development with no differentiation between males and females 73,79 . Interestingly, Drosophila male killing Spiroplasma strains exhibit a very high titre in the haemolymph 78 , a pattern not observed in the Gff Spiroplasma strain (data not shown). In addition, Spiroplasma titre in Gff is much lower than that described for Drosophila male killing strains 29,78 . Wolbachia is the only other maternally inherited endosymbiont found in Drosophila, and is also found in tsetse flies. Wolbachia confers density-dependent protection against insect viruses at different developmental stages in several Drosophila species 80-83 . Based on the above, it is possible that high Spiroplasma density may also play a role in larval fitness. This warrants further study, as protection against viral or bacterial pathogens during intrauterine larval development would constitute a rare phenotype for a bacterial endosymbiont. Recent studies in D. melanogaster showed that Wolbachia and Spiroplasma can affect immune signalling pathways in the presence of both insect pathogenic and non-pathogenic bacteria 84 .
Gut infection was maintained into adulthood, particularly in males. This suggests that Spiroplasma is either able to maintain infection during metamorphosis, possibly due to extracellular proliferation 73 , or that it can rapidly re-colonize upon reformation of the gut. Spiroplasma density was also significantly higher in the testes of teneral males than in the ovaries of teneral females. Localization to the testes suggests that Spiroplasma may be sexually transmitted from males to females, as has already been observed with Sodalis in tsetse flies, and Asaia in Anopheles stephensi 64,85 . The above properties can be exploited in paratransgenic approaches in a similar way to those currently being explored for Sodalis 64,86 and Asaia 87 .
In a collapsing colony of Gff flies, live females had a higher Spiroplasma density than prematurely dead females. This was true of both gravid and non-gravid females, and indicates that Spiroplasma may contribute to adult female fitness. It is therefore possible that Spiroplasma could play a protective role, as has been observed in other facultative strains of Spiroplasma 20, 34, 88 and/or a nutritional role.

Materials and Methods
Insect specimen collection and DNA isolation. All natural populations of Glossina specimens were collected in four countries, Burkina Faso, Uganda, United Republic of Tanzania, and South Africa (Table 1 and Supplementary Table 5). All wild flies were collected using biconical traps and collection intervals were four hours. Upon collection, flies were transferred to the main collection point and were placed in 100% acetone and stored at room temperature. Upon arrival in the lab, DNA was extracted immediately using the CTAB method (Cetyl trimethylammonium bromide) 89 . Laboratory populations were also analysed in a similar way. Samples of Gff suffering high mortality were collected from the mass rearing facility in Kality, Ethiopia. For a detailed description of the analysis performed see Supplementary Information.
Multiplex Illumina MiSeq Sequencing, data, and statistical analysis. The V4 region of the 16S rRNA gene was amplified using fusion primers F515 (5′-GTGCCAGCMGCCGCGGTAA-3′), and 805R (5′-GACTACCAGGGTATCTAAT-3′) from individual wild flies of G. medicorum (Gmed), G. m. submorsitans (Gms), G. p. gambiensis (Gpg), and G. tachinoides (Gt) collected in Burkina Faso. Data generated from the wild flies were combined with the data generated from 124 whole guts of Gff, Gmm, Gpal from a previous study 18 , which used an identical technical approach for amplicon generation and sequencing.
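The F515 primer given above contains the IUPAC degenerate base M (A or C), so it represents a small pool of concrete oligonucleotides. The sketch below expands a degenerate primer into its variants and computes the reverse complement of the 805R primer, which is the form in which its binding site appears on the forward strand. The helper names are hypothetical and the code is only an illustration, not part of the study's pipeline.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def expand_primer(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


def reverse_complement(seq):
    """Reverse complement of a non-degenerate DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


print(expand_primer("GTGCCAGCMGCCGCGGTAA"))
print(reverse_complement("GACTACCAGGGTATCTAAT"))
```

Expanding the degenerate position makes explicit that F515 as written covers two template variants at that site.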
For a detailed description of the PCR conditions please see Supplementary Information. The gene sequences reported in this study have been deposited in NCBI under Bioproject numbers PRJNA345319 and PRJNA345350-52. Statistical analyses were performed using Unifrac distances, PCoA analyses, CAP, ANOVA and Tukey-Kramer post-hoc tests as described in the Supplementary Information.
PCR screening and Spiroplasma multilocus genotyping. Gmm, Gff, Gpg, and Gpal were assayed for the presence of Spiroplasma, Arsenophonus, Cardinium, and Rickettsia symbionts by PCR. An additional six species of Glossina (G. austeni (Ga), G. brevipalpis (Gb), G. m. centralis (Gmc), Gms, G. p. palpalis (Gpp) and Gt) were screened for Spiroplasma only. The primer sequences used to detect each symbiont along with their target genes, product sizes, conditions, and annealing temperatures are listed in the Supplementary Information.
The Spiroplasma strains present in Glossina species were genotyped with a multilocus sequence typing (MLST) approach using five marker genes (rpoB, parE, dnaA, ftsZ and fruR) and a 4,702 bp region spanning 16S rRNA-23S rRNA-5S rRNA. Details of the conditions used are presented in the Supplementary Information. Sequencing was performed as described previously 90 . All gene sequences generated in this study have been deposited in GenBank under accession numbers KX159363-KX159393.
Phylogenetic analysis. All nucleotide sequences were manually edited with Geneious 7.1.2. Multiple alignments were generated by MUSCLE 91 and ClustalW 92 in Geneious 7.1.2, and adjusted by eye. Phylogenetic analyses were conducted for all analysed Spiroplasma sequences (16S rRNA, rpoB, dnaA, parE, ftsZ and fruR genes, and the 16S rRNA-23S rRNA-5S rRNA region) separately by two methods: Bayesian Inference (BI) and Maximum Likelihood (for a detailed description see Supplementary Information).

Quantitative Real Time-PCR and Fluorescent in situ Hybridization (FISH). Spiroplasma density was quantified by qPCR using the dnaA Spiroplasma specific primers FqdnaA/RqdnaADoud for 35 cycles at 56 °C and normalized to the host β-tubulin gene. The primers and a detailed description of the qPCR experiments are presented in Supplementary Table 6. qPCR data were analysed using a one-way ANOVA, as described previously 93 , using the XLSTAT program.
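The normalization and test described above can be sketched as follows: density is expressed as Spiroplasma dnaA copies per host β-tubulin copy, and groups are compared with a one-way ANOVA F statistic. This is an illustration under stated assumptions, not the XLSTAT procedure used in the study: the inputs are assumed to be copy numbers already derived from the qPCR run, the function names are hypothetical, and the values are invented.

```python
from statistics import mean


def normalized_density(dnaA_copies, tubulin_copies):
    """Spiroplasma dnaA copies per host beta-tubulin copy."""
    if tubulin_copies <= 0:
        raise ValueError("host reference copy number must be positive")
    return dnaA_copies / tubulin_copies


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA across lists of measurements."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical (dnaA, beta-tubulin) copy numbers per fly, illustration only.
live = [normalized_density(d, t) for d, t in [(900, 1000), (1100, 1000), (1000, 1000)]]
dead = [normalized_density(d, t) for d, t in [(300, 1000), (500, 1000), (400, 1000)]]
print(one_way_anova_f([live, dead]))
```

Dividing by the host reference gene corrects for differences in the amount of tissue or DNA recovered per sample, so that densities from guts, gonads and whole insects can be compared on the same scale within each tissue type.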
Gff specimens from the Seibersdorf laboratory colony were used for FISH. Teneral male and female flies were dissected in PBS 2-3 days after eclosion. Dissected tissues were dried on poly-L-lysine-coated glass slides (Sigma, UK) for 20 min at 65 °C and kept at 4 °C until further use. Tissue samples were fixed in freshly prepared 4% paraformaldehyde solution for 30 min at 4 °C. A detailed description of tissue processing and image capture is included in the Supplementary Information.