Mitochondrial genomes of two Australian fishflies with an evolutionary timescale of Chauliodinae

Fishflies (Corydalidae: Chauliodinae) with a total of ca. 130 extant species are one of the major groups of the holometabolous insect order Megaloptera. As a group which originated during the Mesozoic, the phylogeny and historical biogeography of fishflies are of high interest. The previous hypothesis on the evolutionary history of fishflies was based primarily on morphological data. To further test the existing phylogenetic relationships and to understand the divergence pattern of fishflies, we conducted a molecule-based study. We determined the complete mitochondrial (mt) genomes of two Australian fishfly species, Archichauliodes deceptor Kimmins, 1954 and Protochauliodes biconicus Kimmins, 1954, both members of a major subgroup of Chauliodinae with high phylogenetic significance. A phylogenomic analysis was carried out based on 13 mt protein coding genes (PCGs) and two rRNAs genes from the megalopteran species with determined mt genomes. Both maximum likelihood and Bayesian inference analyses recovered the Dysmicohermes clade as the sister group of the Archichauliodes clade + the Protochauliodes clade, which is consistent with the previous morphology-based hypothesis. The divergence time estimation suggested that the divergence among the three major subgroups of fishflies occurred during the Late Jurassic and Early Cretaceous when the supercontinent Pangaea was undergoing sequential breakup.

. Mitochondrial map of Archichauliodes deceptor. Circular maps were drawn with CGView 25 . The arrows indicate the orientation of gene transcription. The tRNAs are denoted by the color blocks and are labelled according to the IUPACIUB single-letter amino acid codes (L1: UUR; L2: CNU; S1: AGN; S2: UCN). The GC content was plotted using a black sliding window, as the deviation from the average GC content of the entire sequence. GC-skew was plotted as the deviation from the average GC-skew of the entire sequence. The inner cycle indicates the location of the genes in the mt genome.
Scientific RepoRts | 7: 4481 | DOI: 10.1038/s41598-017-04799-y The gene order is in accordance with the gene order of Drosophila yakuba 6 , and no gene rearrangement was found. The published mt genomes of all 12 species of Megaloptera exhibit a highly conserved gene order 7 . However, the gene order of some reported Neuroptera mt genomes differs slightly from the conserved gene order in the translocation of tRNA Cys , which is located upstream of tRNA Trp but not at its traditional downstream location of tRNA Trp . Among all 37 genes in the A. deceptor mt genome, 14 genes (4 PCGs, 2 rRNAs and 8 tRNAs) are encoded on the minority strand (N-strand), and 23 genes (9 PCGs and 14 tRNAs) are on the majority strand (J-strand). Among 33 genes in the partial P. biconicus mt genome, 12 genes (4 PCGs, 1 rRNA and 7 tRNAs) are encoded on the minority strand (N-strand), and 21 genes (9 PCGs and 12 tRNAs) are on the majority strand (J-strand). Gene overlaps were found at 12 and 16 gene junctions in the mt genomes of A. deceptor and P. biconicus, respectively. Furthermore, ATP6 and ATP8 overlap 7 nucleotides (i.e., "ATGATAA"), and this phenomenon is also reported in the mt genome of some related species (e.g., Neochauliodes bowringi (McLachlan), Neochauliodes punctatolosus Liu & Yang, Protohermes concolorus Yang & Yang). Similarly, ND4L-ND4 also had a 7 bp overlap (i.e., "TTAACAT"), but the overlapped sequences between ND4L-ND4 were not always the same in insect mt genomes, such as "TTAACAC" in N. bowringi and "ATGTTAA" in N. punctatolosus. In addition, there were 13 intergenic regions in the mt genome including 65 nucleotides and ranging from 1 to 16 bp in the A. deceptor mt genome. In the P. biconicus mt genome, 12 intergenic regions were found, including 96 nucleotides which ranged was a stem structure with an independent nucleotide in the middle. This phenomenon is common in sequenced Neuropterida mt genomes. The DHU and TψC stems are more variable and range in size from 3 bp to 10 bp. Based on the secondary structure, the amount of mismatched base pairs in tRNAs of A. deceptor and P. biconicus was 21 and 19, respectively, this includes G-U pairs, A-A mismatches, U-U mismatches and U-C mismatch. These mismatches were mostly found in the 3′ end of acceptor stem and DHU arm.
Ribosomal RNAs. The lrRNA was assumed to fill up the blanks between tRNA Leu(CUN) and tRNA Val , while the srRNA was flanked by tRNA Val and the control region. In A. deceptor the length of lrRNA and srRNA was determined to be 1319 bp and 786 bp with A + T content as 82% and 79.4%, respectively. The lrRNA of P. biconicus is 1318 bp in length, while the size of srRNA we sequenced is 610 bp. The GC content was plotted using a black sliding window, as the deviation from the average GC content of the entire sequence. GC-skew was plotted as the deviation from the average GC-skew of the entire sequence. The inner cycle indicates the location of the genes in the mt genome.
We inferred the secondary structure of lrRNA and srRNA of A. deceptor using the published rRNA structure of Neoneuromus tonkinensis 8 and Agriosphodrus dohrni 9 as models. There were 49 helices in lrRNA in five structural domains (I-II, IV-VI), domain III is absent as in other arthropods (Fig. 6) 10 . The multiple alignments of  Table 2. Organization of the Protochauliodes biconicus mt genome. IGN: Intergenic nucleotide, minus sign indicates overlapping between genes. tRNA X : where X is the abbreviation of the corresponding amino acid. Chauliodinae and Megaloptera indicate that conserved nucleotides were distributed unevenly throughout the lrRNA secondary structure. In addition, most invariable positions were found within domain IV, while the lower conserved positions were in domains I-II. The secondary structure of srRNA contains three domains (Fig. 7), and it is less conservative than lrRNA. The H7 region within srRNA is highly variable and difficult to predict among different insects.
The estimation of divergence times among the sampled Megaloptera taxa showed a Mesozoic diversification for the extant families and subfamilies of Megaloptera, as well as for the genera of Corydalidae (Fig. 8, Table 3). In Chauliodinae, the divergence between Dysmicohermes and the other genera of fishflies was dated to be in the Late Jurassic ca. 159 MA (95% HPD 121.01-169.64 MA), which is slightly earlier than the estimate in Wang et al. 11 (ca. 140 MA/95%HPD 67.96-234.89 MA); however, in consideration of the 95% credibility interval, the current estimate falls within the range of the previous study 11 . Protochauliodes was estimated to have diverged from Archichauliodes + Neochauliodes in the Early Cretaceous, ca. 117 MA. Divergence between Archichauliodes and Neochauliodes was also dated in the Early Cretaceous, ca. 102 MA. In Corydalinae, the divergence times among the five dobsonfly genera were estimated to be in the Cretaceous, which corresponds to results in Winterton et al. 12 and Wang et al. 11 .

Discussion
Phylogenetic considerations. The species of fishflies investigated here from Archichauliodes and Protochauliodes are significant for inferring the phylogeny of Chauliodinae, since they represent two major subgroups of fishflies, i.e. the Archichauliodes clade and the Protochauliodes clade, as proposed by a recent morphology-based phylogeny of fishflies 4 . Aside from these two clades, the remaining major subgroup of Chauliodinae is the Dysmicohermes clade. Our phylogenetic analysis is the first to use molecular data to test the relationships of these three major fishfly subgroups. Our results are generally consistent with the previous morphology-based phylogeny, in which the Dysmicohermes clade was the sister group of the Archichauliodes clade + the Protochauliodes clade, although we could not corroborate the monophyly of each clade due to lack of many genera.
Dysmicohermes, together with its sister genus Orohermes, are basal fishflies and possess a number of plesiomorphic morphological characters, such as the moderately developed male gonocoxites and gonostyli 9, and the feebly produced spiracles on the larval abdominal segment 8 2, 4 . Furthermore, the wing venation and larval morphology of Dysmicohermes and Orohermes largely resemble that in Jurochauliodes which is one of the most ancestral fishfly genera currently known from the Middle Jurassic of China. Thus, the Dysmicohermes clade should be considered as the basal most subgroup among extant Chauliodinae on the basis of morphological and molecular evidence. The divergence time estimation indicated that the evolutionary history of this subgroup is considerably long and dates to the earliest Late Jurassic (ca. 159 MA).
The Protochauliodes clade currently includes five genera, four of which possess a distinct wing character (i.e., anterior branch of 2 A partially fused with stem of 1 A) which supports the autapomorphic nature of these genera 4 . Furthermore, additional genital characters indicate autapomorphies of the Protochauliodes clade 2 . This

Biogeographic considerations. Previous study on the phylogeny and historical biogeography of
Chauliodinae suggested a Pangaean origin and a global distribution of the subfamily before the Middle Jurassic, while the earliest diversification of fishflies might have occurred before the initial split of Pangaea 4 . Moreover, Liu et al. (2012) considered that the divergence among the three major fishfly clades might have taken place when Pangaea was not yet separated. However, our estimate for divergence between the Dysmicohermes clade and the Archichauliodes clade + the Protochauliodes clade is slightly after the initial breakup of Pangaea, which led to the formation of Laurasia and Gondwana during an interval of 180-160 MA 13 . Since the Archichauliodes and Protochauliodes clades were considered to have originated from Gondwana, the divergence of the Dysmicohermes clade, which is endemic to western North America, could be correlated to the geographic vicariance formed by the separation of Laurasia and Gondwana.
The Archichauliodes and Protochauliodes clades include many austral endemic genera, which were thought to have diverged in connection with the sequential breakup of Gondwana 4 . As mentioned above, the molecule-based result of the divergence time between these two clades is much later than that inferred from the morphological data, with the molecular estimate being ca. 117 MA (95% HPD 84.50-138.87 MA) in the Early Cretaceous. The sequential breakup of Gondwana continued into the Early Jurassic and Late Cretaceous. By ca. 120 MA, Gondwana had split into several landmasses, e.g. Africa + northern South America, Madagascar + India, and a landmass including Antarctica, Australia, southern South America, etc 13 . In Chauliodinae, both Archichauliodes and Protochauliodes clades include extant genera distributed in the areas belonging to at least two of the above main Gondwanan landmasses. If the initial divergence between the Archichauliodes and Protochauliodes clades took place after 120 MA, it would be difficult to infer any correlation of the intergeneric divergence within either of these two major clades to Gondwanan plate drifting. Moreover, the disjunct distribution of the austral endemic genera in the Archichauliodes and Protochauliodes clades seems to be insufficiently explained since fishflies possess a relatively weak capacity for long-distance dispersal. Therefore, the initial divergence between the Archichauliodes and Protochauliodes clades might have been much earlier than 120 MA. Based on the range of their divergence time presently estimated (95% HPD 84.50-138.87 MA), it is plausible to assume that these two

Conclusions
The present study is the first to present a phylogenetic analysis based on mt genomic data to infer relationships among the major subgroups of Chauliodinae. Similar to the previous morphology-based intergeneric phylogeny of fishflies, the present results indicate that the Dysmicohermes clade is the sister group of the Archichauliodes clade + the Protochauliodes clade. However, the timescale we estimated for the divergence among the three major subgroups is much later than that hypothesized from the morphology-based phylogeny 4 , suggesting these major divergences were possibly infected by the sequential breakup of Pangaea during the Late Jurassic and Early Cretaceous. Unfortunately, it is still hard to reveal any clear divergence pattern of the whole subfamily due to lack of many genera, particularly those endemic to certain austral landmasses. Future study should focus on a total-evidence analysis with comprehensive sampling to elucidate the evolutionary history of fishflies.

Material and Methods
Specimens PCR amplification and sequencing. The mt genomes of A. deceptor and P. biconicus were generated by amplification of overlapping PCR fragments. PCR primers we used included universal and specifically designed primers (Tables S6-S7). All PCRs were performed using NEB Long Taq DNA polymerase (New England Biolabs, Ipswich, MA) under the following amplification conditions: 95 °C for 30 s, 40 cycles of denaturation at 95 °C for 10 s, annealing at 43-55 °C (depending on the primer pair used) for 50 s, elongation at 65 °C for 1 kb/min (depending on the size of amplicon), and the final elongation step at 65 °C for 10 min. The quality of PCR products was assessed through electrophoresis in a 1% agarose gel and staining with Gold View. All PCR products were sequenced in both directions using the BigDye Terminator Sequencing Kit (Applied Bio Systems) and the ABI 3730XL Genetic Analyzer (PE Applied Biosystems, San Francisco, California USA) with two vector-specific primers and internal primers for primer walking.
Bioinformatic analysis. The complete mt genomes of A. deceptor and P. biconicus are deposited in GenBank with accession numbers KU925864 and KY230493, respectively. Sequence assembly was done using ContigExpress. As for the sequence analysis, the tRNAs were identified by tRNAscan-SE Search Server v. 1.21 14 , while for those tRNAs which could not be detected by this program we compared them with the corresponding tRNAs gene sequence of Neochauliodes punctatolosus Liu & Yang 5 to determine the position and sequence. The annotations of PCGs and rRNA genes were verified by hand alignment with closely-related species of Chauliodinae. The control region was identified afterwards by the boundary of the rRNA genes and compared with other insect mt genomes. Nucleotide substitution rates, base composition and codon usage were analyzed with MEGA 5.0 15 The GC and AT skews were measured using the following formulae: AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C) 16 .
Phylogenetic analysis. Nine species of Megaloptera with determined mt genomes were included in the ingroup taxa and the outgroup taxa comprised two species of Neuroptera, namely Thyridosmylus langii (McLachlan) (Neuroptera: Osmylidae) and Rapisma xizangense (Neuroptera: Ithonidae) ( Table 4). Sequences of 13 PCGs and two rRNAs were used in the present phylogenetic analysis. The PCGs were aligned based on the amino acid alignment using ClustalW in MEGA 5.0 15 Table 4. Taxa used in the phylogenetic analysis.
SequenceMatrix v1.7.8 17 . We performed maximum likelihood (ML) and Bayesian inference (BI) using the best-fit partitioning schemes recommended by PartitionFinder 18 . For the ML analysis, we ran 1,000 bootstrap replicates and used the rapid bootstrap feature (random seed value 12345) in RAxML 19 . MrBayes 3.2.2 20 was used to conduct the BI analysis with the GTR + I + G model as the optimal model selected by PartitionFinder. Two simultaneous runs of 2 million generations were conducted for the dataset, the tree samples were outputted every 1,000 generations with a burnin of 25%.
Divergence time estimation. Estimation of divergence times was conducted with all mt genome data using BEAST version 1.5.3 21 . The taxa and data partitioning we used were consistent with the previous phylogenetic analysis using the GTR + I + G model, estimated base frequencies and Yule process of speciation. Minimum node constrains were assigned a normal prior distribution with standard deviations equal to 12 Ma. Due to the difficulty of fossilization in habitats associated with fast-flowing water 22 , there are scarce fossil records of Megaloptera. We set two fossil calibrations in our analysis (1) the mean age of Corydalidae + Sialidae was set at 185 MA with the 95% credibility interval around the mean spanning the period from 204.7 to 165.3 MA, reflecting the minimum age of these two families, which is based on oldest known fossil of Sialidae (Dobbertinia reticulata Handlirsch) from the Lower Jurassic of Dobbertin, Germany (~185 MA) 23 ; (2) the mean age of Chauliodinae + Corydalinae was set at 165 MA with the 95% credibility interval around the mean spanning the period from 184.7 to 145.3 Ma based on the fossil evidence of an adult fishfly (Jurochauliodes ponomarenkoi Wang & Zhang) from the Middle Jurassic of Inner Mongolia, China (~165 MA) reported in Liu et al. 4 . Two independent MCMC analyses were run for 5 million generations under the uncorrelated lognormal relaxed clock model and sampled every 1000 generations. We combined tree files of both runs using LogCombiner 1.5.3, with the first 25% of the generations from each run discarded as burnin. Finally, we used TreeAnnotator 1.5.3 21 to calculate divergence time from a combined tree file. The phylogenic tree was viewed and edited using FigTree 1.3.1 24 .