Substantial rearrangements, single nucleotide frameshift deletion and low diversity in mitogenome of Wolbachia-infected strepsipteran endoparasitoid in comparison to its tephritid hosts

Insect mitogenome organisation is highly conserved, yet, some insects, especially with parasitic life cycles, have rearranged mitogenomes. Furthermore, intraspecific mitochondrial diversity can be reduced by fitness-affecting bacterial endosymbionts like Wolbachia due to their maternal coinheritance with mitochondria. We have sequenced mitogenomes of the Wolbachia-infected endoparasitoid Dipterophagus daci (Strepsiptera: Halictophagidae) and four of its 22 known tephritid fruit fly host species using total genomic extracts of parasitised flies collected across > 700 km in Australia. This halictophagid mitogenome revealed extensive rearrangements relative to the four fly mitogenomes which exhibited the ancestral insect mitogenome pattern. Compared to the only four available other strepsipteran mitogenomes, the D. daci mitogenome had additional transpositions of one rRNA and two tRNA genes, and a single nucleotide frameshift deletion in nad5 requiring translational frameshifting or, alternatively, resulting in a large protein truncation. Dipterophagus daci displays an almost completely endoparasitic life cycle when compared to Strepsiptera that have maintained the ancestral state of free-living adults. Our results support the hypothesis that the transition to extreme endoparasitism evolved together with increased levels of mitogenome changes. Furthermore, intraspecific mitogenome diversity was substantially smaller in D. daci than the parasitised flies suggesting Wolbachia reduced mitochondrial diversity because of a role in D. daci fitness.

have not yet been sequenced and characterised.
In this study, we obtained six mitogenome variants of D. daci and nine mitogenome variants of four of its 22 tephritid host species, B. frauenfeldi, B. neohumeralis, B. tryoni and Z. strigifinis by WGS of DNA libraries obtained from parasitised individual hosts. We then compared the arrangement, nucleotide composition and codon usage of these mitogenomes together with the previously sequenced mitogenomes of four other strepsipterans, four species of closely related insect orders and the host fruit flies. As mitogenome rearrangements have previously been detected in other strepsipterans 18,[42][43][44] , we expected that the D. daci mitogenome would also differ from the ancestral mitogenome arrangement of insects and the fruit fly mitogenomes. Furthermore, we anticipated that D. daci mitogenomes contain more rearrangements compared to the mitogenomes of Mengenilla (with free-living adults) but share some of these differences with the mitogenomes of Xenos (with more extreme endoparasitic life cycles). Furthermore, we compared the intraspecific mitogenome diversity between D. daci and the fruit flies from which the D. daci mitogenomes were obtained. Due to the previously described association of D. daci with Wolbachia 47 , we expected that intraspecific mitogenome diversity would be less in D. daci than the fruit fly species.

Results
Genome sequencing and assembly. Whole genome sequencing was performed on genomic extracts of nine individuals of four tephritid fruit fly species that were parasitised with Wolbachia-infected D. daci and were collected across a range of > 700 km in Australia (Table 1). Of these, six sequence libraries produced a good coverage (≥ 26.7-fold) of D. daci mitogenomes; three other sequence libraries contained D. daci mitogenomic sequences but not of sufficient coverage to assemble mitogenomes (Table 1). However, all nine sequence libraries included mitogenomes of the four fruit fly species. The D. daci and fruit fly mitogenomes were first filtered from the contig list of Bfra485 which had the highest read number. Its D. daci mitogenome comprised two contigs of approximately 12 kb and 3.2 kb while the fly mitogenome comprised one contig of approximately 15.9 kb. Then, iterative mapping using Bfra485 reads resulted in an almost complete D. daci mitogenome with a minimum estimated length of 16,255 bp and a complete circular B. frauenfeldi mitogenome of 15,935 bp ( Fig. 1, Table S1). These two mitogenomes were used to filter the D. daci and fruit fly mitogenomes from the other sequence libraries. Dipterophagus daci mitogenomes were successfully assembled from six libraries (Bfra485, Bn171, Bn342, Bt194, Bt210 and Zst503). Albeit detectable, D. daci coverage in the three remaining libraries (Bn135, Bn240 and Bn244) was too low (< fivefold) for mitogenome assembly but was sufficient in Bn240 for the calling of single nucleotide polymorphisms (SNPs) at informative sites. The fruit fly mitogenomes were successfully assembled from all nine libraries (Bfra485, Bn135, Bn171, Bn240, Bn244, Bn342, Bt194, Bt210 and Zst503) ( Table 1). The size of the mitogenomes ranged from 16,243 to 16,255 bp for D. daci, and from 15,858 to 15,935 bp for the fruit flies (Table S1).
Mitogenome base composition. The nucleotide composition of D. daci mitogenomes was AT-biased (approximately 84%) and this was similar to the mitogenomes of the other strepsipterans. The fruit fly mitogenomes were less AT-biased (approximately 72%) (Fig. 2, Table S1) and their AT contents were similar except for B. frauenfeldi 485 and Z. strigifinis 503, which had AT contents of 74.1% and 73.4% respectively (Fig. 2, Table S1). Comparative mitogenome analyses of D. daci and their fruit fly hosts revealed a clear bias in nucleotide composition with positive AT-skews and negative GC-skews (Table S1). This was also noted for X. vesparum while M. australiensis and Mengenilla moldryzki had a negative AT skew (Table S1). All the insect taxa had a negative GC skew (Table S1).
Mitochondrial protein coding genes. The total length of the 13 PCGs of the D. daci mitogenomes was on average 10,696 bp and was relatively shorter than the total length of the PCGs of the fruit fly mitogenomes with an average length of 11,187 bp (Table S1). The start codons ATT, ATA and ATG were used in both D. daci and fruit fly PCGs, except the fruit fly atp8 gene which started with GTG (Table S2). In D. daci PCGs nad1, nad2, nad3 and nad4L started with ATA, cox2, atp8, nad5 and nad6 with ATT, atp6, cox3, nad4 and cob with ATG, and cox1 with CAA (Table S2). Furthermore, the D. daci PCGs nad2, atp8, nad6, cox3, nad4L and nad1 ended with TAA, while it is assumed that the remaining PCGs that ended with T are completed by adding 3' A nucleotides to the mRNA (Table S2). The fruit fly PCGs nad2, nad3, nad5 and nad6 started with ATT, cox2, atp6, cox3, nad4, nad4L and cob with ATG, atp8 with GTG, nad1 with ATA, and cox1 with TCG (Table S2). Seven fruit fly PCGs stopped with TAA, while nad3 and nad4 stopped with TAG; nad5, cob and nad1 that ended with T, (and cox1 ending with TA) are presumably completed by adding 3' A nucleotides to the mRNA (Table S2). Comparative analyses of the relative synonymous codon usage (RSCU) revealed that across D. daci, the fruit fly and the other insect species, codons ending with A or T prevailed. Amino acids Ala, Gly, Leu, Pro, Arg, Ser, Thr and Val were commonly used, and Leu had the highest RSCU in all insect species (Table S3).
Surprisingly, the nad5 gene contained an unusual deletion of one nucleotide (nucleotide position 291) in all six D. daci mitogenomes which introduced an in-frame stop codon (TAA) at amino acid position 98 (Fig. 3); the remainder of nad5 further downstream, however, still constituted an open reading frame but started from a different position. The unexpected finding of a single nucleotide -1 frameshift deletion was further verified by Sanger sequencing of the nad5 region of D. daci from five samples in addition to those used for WGS; these samples did not undergo REPLI-g amplification which was used for the WGS samples prior to library preparation (Table S4). All nad5 gene Sanger sequences were identical to the assembled mitogenomes and confirmed this nucleotide deletion. Subsequently, the domain architecture of the nad5 gene was checked using CDART (NCBI) 54 . This revealed that, similar to other nad5 genes, the second part of the D. daci nad5 gene downstream  Table S2). Their average total length was 1424 bp in D. daci and 1468 bp in fruit fly mitogenomes (Table S1). Both 16S rRNA and 12S rRNA genes (rrnL and rrnS respectively), had a total length of 2074 bp in the D. daci mitogenomes, while both combined ranged from 2081 to 2110 bp in the fruit fly mitogenomes (Table S1). Across the six D. daci mitogenomes, MITOS2 could only identify one part (688 bp 3' section adjacent to the nad1 gene) of the 16S rRNA gene because the 5' section flanked by trnV was highly diverged. However, the entire coding sequence was confirmed by sequence alignment with 16S rRNA genes of the reference strepsipteran mitogenomes obtained from GenBank and by BLASTn. In fruit fly mitogenomes the 16S rRNA gene was flanked by trnL 1 and trnV and the 12S rRNA gene was flanked by trnV and the AT-rich region ( Fig. 1, Table S2).
Mitochondrial gene arrangement. Significant gene rearrangements were observed in the D. daci mitogenomes relative to the ancestral insect mitogenome, while the gene arrangement of the fruit fly mitogenomes were identical to the ancestral insect mitogenome pattern (Fig. 4A,B). Gene rearrangements in the D. daci mitogenomes were observed in two regions: the first region involved the transposition of trnA, trnS 1 and trnF; and the second region involved the transposition of trnS 2 , trnL 1 and rrnS (Fig. 4A), resulting in a different rRNA gene order when compared to all other mitogenomes. The D. daci mitogenome arrangement was also compared with the mitogenomes of the four other strepsipteran species, one representative species each of four closely related insect orders (Coleoptera, Megaloptera, Neuroptera, Rhaphidioptera), B. frauenfeldi 485 and a reference B. tryoni (GenBank accession NC014611) (Fig. 4B). Generally, most genes in the D. daci mitogenome had a conserved gene arrangement position (Fig. 4B). However, comparisons revealed that D. daci contained more mitogenome rearrangements (6 transpositions) compared to Xenos cf. moutoni, X. vesparum, M. moldryzki and M. australiensis that contained 4, 3, 2 and 1 transpositions, respectively (Fig. 5). The transposition of trnS 1 observed in D. daci was also observed in the four strepsipteran species, and the transposition of trnA and trnL 1 was also found in X. cf. moutoni and X. vesparum (Fig. 5). The transposition of trnF, trnS 2 and rrnS were unique to D. daci, while the transposition of trnM (from I-Q-M in ancestral arrangement to M-I-Q) was unique to X. cf. moutoni and not seen in D. daci (Fig. 5). Mitogenomes of the fruit flies as well as the three representative species of Coleoptera, Megaloptera and Rhaphidioptera were arranged according to the ancestral insect mitogenome pattern while Dendroleon pantherinus (Neuroptera) exhibited a C-W-Y (W-C-Y in ancestral) gene arrangement (Fig. 4B).

Discussion
We have analysed the mitogenome of D. daci as the first sequenced mitogenome of Halictophagidae, the largest strepsipteran family, together with the mitogenomes of four of its 22 tephritid fruit fly host species, B. frauenfeldi, B. neohumeralis, B. tryoni and Z. strigifinis. We obtained these sequences from fly individuals with concealed D. daci parasitisation. Mitogenome analyses revealed extensive mitogenome rearrangements in D. daci relative to the inferred ancestral holometaboloan mitogenome arrangement and the fruit fly mitogenomes. Furthermore, in comparison to the other strepsipteran mitogenomes, D. daci has, with six gene transpositions, the most rearranged strepsipteran mitogenome characterised so far. While it shared some of the mitogenome rearrangements with other Strepsiptera, D. daci contained additional and unique mitogenome differences. These included a single nucleotide -1 frameshift deletion in the coding region of the nad5 gene possibly requiring translational frameshifting 16,17 , other unknown compensation mechanisms, or, alternatively, leads to a significant truncation of the gene product. Another unusual feature was a different order of the rRNA genes because of the transposition of the rrnS gene. Our findings also revealed that D. daci mitogenomes have shorter PCGs than mitogenomes of other insects which is typical for strepsipterans 18,43 . Despite the low sample number but whole-mitogenomic representation and similar sampling effort for D. daci and fruit fly species across a geographic range of > 700 km, covering a large part of known D. daci distribution 45 , we observed substantially (15-33x) lower genetic diversity  Table 3. Nucleotide diversity of the mitochondrial PCGs of Dipterophagus daci (n = 6), Bactrocera neohumeralis (n = 5) and Bactrocera tryoni (n = 3), showing the number of single nucleotide polymorphisms (SNPs). The 5' part of the D. daci nad5 gene with the stop codon is listed separately as nad5_5'. www.nature.com/scientificreports/ in the D. daci mitochondrial PCGs relative to their host fruit fly species, suggesting that Wolbachia may be the cause for the loss of mitogenome diversity in D. daci. Insect mitogenomes have a fairly conserved gene order, however, gene rearrangements occur in several insect taxa 55 . In the current study, we found extensive gene rearrangements in D. daci mitogenomes relative to the ancestral holometabolan mitogenome pattern. Mitochondrial gene rearrangements are usually characterised by either transposition, inversion or inverse transposition 56 , and more frequently involve tRNA genes than PCGs and rRNA genes 57 . In D. daci, rearrangements involved six gene transpositions (five tRNA genes and one rRNA gene). These were more mitogenomic transpositions in D. daci than in any other strepsipterans further suggesting that the D. daci lineage has experienced accelerated structural mitogenome rearrangements. The transpositions of trnF, trnS 2 and rrnS were unique to D. daci, however, the transpositions of trnA and trnL 1 were also observed in X. cf. moutoni and X. vesparum, while the transposition of trnS 1 was common to the five strepsipteran species.

Total number of nucleotides SNPs Total number of nucleotides SNPs Total number of nucleotides SNPs
We also found that nad5 of D. daci had one single nucleotide -1 frameshift deletion that resulted in the introduction of a stop codon at amino acid position 98. However, the downstream part of the gene still had an open reading frame but starting with another nucleotide position. This could be indicative that D. daci experiences translational frameshifting, similar to the translational editing mechanism proposed to overcome the issues of single nucleotide insertion and deletions found in PCGs of some animal mitogenomes 58 . Previously, single nucleotide insertions have been observed in cob of ants 16 and nad3 of some bird and turtle species 17 . It is noteworthy that our finding is, to our knowledge, the first example of -1 frameshift deletion found in an invertebrate mitogenome. So far single nucleotide deletions in mitochondrial PCGs have only been found in a few turtle species 58 , and, overall, -1 frameshift deletions appear to be rarer than +1 frameshift insertions 59 . Alternatively, the single nucleotide deletion in nad5 of D. daci could result in the expression of a truncated but still functional nad5 gene product because it still contained the proton-conducting transporter domain similar to nad5 genes in other species 60 , however, this scenario may be less likely because it would constitute a substantial truncation. Yet another scenario could be compensation of the frame shift mutation by an unknown mechanism other than translational frameshifting, via the D. daci nuclear genome, Wolbachia or the fruit fly mitochondrial or nuclear genomes. There are several examples of intracellular endosymbionts with degraded gene functions that are compensated by other endosymbionts 61 or their hosts 62 .
It has previously been hypothesised that mitogenome rearrangements arose with the evolution of parasitic life cycles. This is because a transition to a parasitic life cycle in a lineage may come in hand with a relaxation of selective constraints acting on mitogenomes and their functions 40 . Based on our findings we can now add single nucleotide frameshift mutations that may also arise in lineages that have evolved parasitic life cycles. There is evidence for the association between mitogenome changes and evolution of parasitic life cycles, because mitogenomes of parasitic lineages of Hymenoptera are highly rearranged when compared to the conserved mitogenome arrangement patterns in the more basal lineages of Hymenoptera which are not parasitic 31 . Mitogenome rearrangements were also reported for the two egg parasitoids, Trichogramma japonicum and Trichogramma ostriniae 55 as well as a parasitoid of Drosophila larvae, Leptopilina boulardi 63 . Similarly, rearrangements have been observed in three parasitoid wasp species of the genus Psyttalia which parasitise Bactrocera oleae 64 . Furthermore, the numbers of mitogenome rearrangements in Strepsiptera correlated with the transition from moderate to extreme levels of parasitism. More gene rearrangements were observed in the mitogenomes of the Stylopidia species D. daci, X. cf. moutoni and X. vesparum compared to the Mengenillidia species M. australiensis and M. moldryzki. The largest number of differences when compared to the ancestral insect mitogenome arrangement were observed in D. daci, and the single nucleotide frameshift deletion in nad5 and the transposition of rrnS were unique, and possibly associated with the more extreme endoparasitism displayed by D. daci and its different host utilisation (i.e. Diptera). Rearrangements involving ribosomal RNA genes have been found in other insects, such as thrips 65 . It is unclear how the nad5 frameshift deletion could have occurred, but its effect may not be as severe in an endoparasitic insect 36 . Flight muscles rely heavily on mitochondrial function 66,67 , and an insect with limited flight function may be able to cope with a less efficient mitochondrial function.
The overall length of the D. daci and fruit fly mitogenomes were within the expected length of 15-18 kb 3 . Both D. daci and the fruit fly mitogenomes contained the 37 genes and the AT-rich region usually found in animal mitogenomes 1,2 . The conserved location for AT-rich region is between rrnS and trnI, however in the D. daci mitogenome the AT-rich region is located between trnV and trnS 2 , which is similar to its position in a gnat bug, Stenopirates sp. 68 , while it is located in the conserved location in M. moldrzyki 42 and X. cf. moutoni; however, incomplete information is available for X. cf. moutoni 44 . The D. daci mitogenome assembly contained a gap in this region and hence the full length of the AT-rich region could not be estimated. Attempts to close the mitogenome by iterative mapping with short reads proved impossible. This region could either be too long and repetitive to be closed with bioinformatics approaches, or have secondary folding structures resulting in sequencing difficulties, as also found for M. australiensis, X. cf. moutoni and X. vesparum 18,43,44 . Our study revealed that the mitochondrial PCGs of D. daci are shorter relative to the PCGs of their host fruit flies, and this could be associated with the evolution of the strepsipteran life cycle, as also suggested for M. australiensis, X. cf. moutoni and X. vesparum 18,43,44 . Similar to other parasitic insects 3,55 , the nucleotide composition of the D. daci mitogenomes were more AT-biased compared to fruit fly mitogenomes. The high AT bias observed in D. daci was found to be similar to the other Strepsiptera 18,43,44 . Furthermore, the D. daci mitogenome had a positive AT skew and a negative GC skew indicating that its genes contain more A than T, and more C than G, as also reported for other insects 69 .
Low intraspecific mitogenome diversity is generally attributed to founder events 70,71 , or can be due to Wolbachia endosymbionts which manipulate host reproduction or provide a fitness benefit to hosts 9,12 . Maternal coinheritance of mitogenomes and Wolbachia may facilitate Wolbachia-driven selective sweeps of the infected mitochondrial haplotype resulting in low mitochondrial genetic diversity 10

Methods
Insect specimens and WGS. This study analysed WGS libraries of nine males of four tephritid fruit fly species (B. frauenfeldi, B. neohumeralis, B. tryoni and Z. strigifinis) representing field populations across a region from Mackay to Cairns (> 700 km distance) in Queensland, Australia (Table 1). These specimens formed part of a previous survey of Wolbachia in 24 Australian tephritid fruit fly species and were collected using traps with male attractants as previously described 48,49 . DNA was extracted from fly abdomens and tested for Wolbachia using Wolbachia surface protein (wsp) and 16S rRNA gene primers; furthermore, two strains of Wolbachia-positive flies were characterised using multi-locus sequence typing (MLST) as ST-285 and ST-289 48,49 (Table 1), with later assignment of these strains to their actual host D. daci as wDdac1 and wDdac2 47 . DNA extracts of 14 Wolbachia-positive flies were selected and amplified by multiple displacement using the REPLI-g mini kit (Qiagen) previously used to amplify DNA of mitochondrial and bacterial chromosomes at higher coverage than eukaryotic chromosomes 11 and submitted for library construction and WGS using the Illumina Hiseq2500 platform as previously described 47 . Nine of these 14 WGS libraries produced sufficient mitogenome coverage and were used for analyses in the current study ( Table 1). The remaining five WGS libraries were of low quality and excluded from the analyses. Furthermore, all nine samples were PCR positive for Wolbachia and D. daci 47 (Table 1).
Genome assembly. Sequence quality control and de novo assembly were performed in CLC Genomics Workbench as previously described 47 . Sequence identification and extraction was achieved by querying the reference genomes against the WGS library contig lists. First, BLASTn using the M. australiensis partial mitogenome (GenBank GU188852) was performed to filter the D. daci mitogenome from the contig list of Bactrocera frauenfeldi Bfra485 (Table 1). Contigs with the best hit were concatenated and manually gap-filled by iterative mapping of the trimmed reads at 90% similarity and 60-80% read length. The final D. daci draft mitogenome consensus sequence of this library was verified by mapping reads at 99% similarity. The final D. daci mitogenome extracted from the Bfra485 contig list was then used as a reference for the identification and filtration of D. daci mitogenomes from the other five libraries (Table 1). Similarly, BLASTn using the Ceratitis capitata mitogenome (GenBank AJ242872) was performed to identify and extract the fruit fly mitogenomes from the libraries, and the contigs with the best hit in each library were then assembled by iterative mapping as described earlier. The extracted D. daci and fruit fly mitogenomes were manually aligned and inspected in Geneious v10.0.9 73 .
PCR amplification and Sanger sequencing of nad5. The D. daci mitogenome assembly revealed an unusual deletion of one nucleotide in nad5. This genomic dataset was obtained from WGS libraries which underwent REPLI-g amplification prior to library preparation 11 . To verify that this mutation was not due to a rare amplification error, PCR primers were designed to specifically amplify nad5 of D. daci to confirm the WGS results, using Primer-BLAST (NCBI); Dd_nad5F: 5' GAA ACT GGA GTT GGA GCA GC 3' and Dd_nad5R: 5' ATA GCG TGT GAT AAG TTA AAT CGT T 3' with an expected amplicon size of 396 bp. MyTaq™ Mix (Bioline) PCR reagents were used according to the manufacturer's instructions. PCR cycling conditions began with an initial denaturation for 3 min at 94 °C, followed by 35  Mitogenome annotation and analysis. Annotation of the D. daci and fruit fly mitogenomes was performed using MITOS2 with "RefSeq 63 Metazoa" provided by MITOS2 and the invertebrate genetic code 74 , followed by manual verification of the coding regions and comparison with published mitochondrial sequences in Geneious v10.0.9 and NCBI BLASTn. The tRNA genes predicted by MITOS2 were confirmed using tRNAscan-SE 75 and ARWEN 76