Comparative analysis of the complete mitochondrial genomes in two limpets from Lottiidae (Gastropoda: Patellogastropoda): rare irregular gene rearrangement within Gastropoda

To improve the systematics and taxonomy of Patellogastropoda within the evolution of gastropods, we determined the complete mitochondrial genome sequences of Lottia goshimai and Nipponacmea fuscoviridis in the family Lottiidae, which presented sizes of 18,192 bp and 18,720 bp, respectively. In addition to 37 common genes among metazoa, we observed duplication of the trnM gene in L. goshimai and the trnM and trnW genes in N. fuscoviridis. The highest A + T contents of the two species were found within protein-coding genes (59.95% and 54.55%), followed by rRNAs (56.50% and 52.44%) and tRNAs (56.42% and 52.41%). trnS1 and trnS2 could not form the canonical cloverleaf secondary structure due to the lack of a dihydrouracil arm in both species. The gene arrangements in all Patellogastropoda compared with those of ancestral gastropods showed different levels of gene rearrangement, including the shuffling, translocation and inversion of single genes or gene fragments. This kind of irregular rearrangement is particularly obvious in the Lottiidae family. The results of phylogenetic and gene rearrangement analyses showed that L. goshimai and Lottia digitalis clustered into one group, which in turn clustered with N. fuscoviridis in Patellogastropoda. This study demonstrates the significance of complete mitogenomes for phylogenetic analysis and enhances our understanding of the evolution of Patellogastropoda.

Within Patellogastropoda, mitochondrial genomes have been relatively conservative, but those of Lottiidae differ to some extent. The comparison of the two newly sequenced mitogenomes with a reported mitogenome from Lottiidae revealed the rearrangement of gene positions and structures. The complete mitochondrial genome sequences of L. goshimai and N. fuscoviridis were 18,192 bp and 18,720 bp, respectively (GenBank accessions MT248298 and MK395167) (Fig. 1, Table 1). Both circular mitochondrial genomes contained 13 PCGs, 2 rRNA genes (12S rRNA and 16S rRNA), 22 putative tRNA genes and a control region (CR). Compared to the previously published genome fragment, we found an additional trnM gene in both species and an additional trnW gene in N. fuscoviridis.
Overlapping and noncoding regions. Most of the genes identified in N. fuscoviridis are located on the heavy strand, except for three PCGs and seven tRNAs. In L. goshimai, fourteen genes (seven PCGs and seven tRNA genes) are located on the light strand, and the remaining genes are located on the heavy strand (Fig. 1 and Tables 2, 3). The mitochondrial genome of L. goshimai contains intergenic spacers with lengths ranging from 1 to 178 bp, and two genes show overlapping nucleotides (6 and 20 bp). The longest intergenic spacer is located between trnY and nad5 (Table 2). The mitochondrial genome of N. fuscoviridis exhibits intergenic spacers with lengths ranging from 2 to 380 bp, and two genes show overlapping nucleotides (4 and 11 bp). The longest intergenic spacer is located between trnY and nad3 (Table 3). Overall, the intergenic spacers and overlapping nucleotides differ considerably between the two species, and these limpets also show large variations compared with other families (e.g., Nacellidae, Acmaeidae and Patellidae) [24][25][26][27][28].
The control region (CR) is the largest non-coding region; it usually presents a high AT content and is therefore also known as the A + T rich region 29. It is an essential element involved in mitochondrial genome replication and transcription initiation 30. The mitogenomes of L. goshimai and N. fuscoviridis each contain one CR, and both CRs show relatively high AT contents of 61.61% and 53.43%, respectively. The CR is located between trnR and atp8 in L. goshimai, with a length of 1722 bp. In N. fuscoviridis, it is located between nad5 and atp8, with a length of 1561 bp. It also contains a replication origin for light-strand synthesis (OL), which is 21 bp (CCC TCC CCC CCA GGG GGA GGG) in length and folds into a hairpin secondary structure.
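The hairpin potential of the reported OL sequence can be checked with a short script. The following is a minimal sketch that pairs bases from the two ends of the sequence inwards using Watson-Crick rules only; the function name `hairpin_stem_and_loop` is hypothetical, and this simple end-to-end pairing is an assumption made for illustration, not the folding method used in the study.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def hairpin_stem_and_loop(seq):
    """Pair bases from both ends inwards.

    Returns (stem_length, loop_sequence). Hypothetical helper: it only
    tests perfect complementarity of the two arms and ignores thermodynamics.
    """
    seq = seq.replace(" ", "").upper()
    left, right = 0, len(seq) - 1
    stem = 0
    # Extend the stem while the outermost unpaired bases are complementary.
    while left < right and COMPLEMENT[seq[left]] == seq[right]:
        stem += 1
        left += 1
        right -= 1
    return stem, seq[left:right + 1]


# OL sequence as reported in the text (21 bp).
OL = "CCC TCC CCC CCA GGG GGA GGG"
stem_length, loop = hairpin_stem_and_loop(OL)
print(len(OL.replace(" ", "")), stem_length, loop)
```

Under these assumptions, the 21 bp sequence resolves into two perfectly complementary 9 bp arms flanking a 3 nt loop (CCA), which is consistent with the hairpin structure described above.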
Base composition of mitogenomes. The A + T content of the whole mitogenome is 60.17% for L. goshimai (28.18% A, 32.00% T, 24.11% G and 15.71% C), and 54.15% for N. fuscoviridis (23.83% A, 30.32% T, 25.39% G and 20.46% C) (Table 4). The A + T contents of all PCGs in L. goshimai range from 55.65% (atp8) to 62.64% (cytb), and those in N. fuscoviridis range from 52.07% (nad4) to 57.25% (cox1) (Table 4). We observed the highest A + T contents of the two species in PCGs (59.95% and 54.55%), followed by rRNAs (56.50% and 52.44%) and tRNAs (56.42% and 52.41%) (Table 4). The AT skew of the total PCGs is negative, and the GC skew is positive in both species, indicating that they contain a slightly higher percentage of T and G bases than A and C bases. For the individual PCGs of the two Lottiidae species, most of the AT skew values are negative, with the cox2 gene of L. goshimai being an exception.
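The skew statistics can be illustrated with the whole-genome base percentages quoted above. This is a minimal sketch assuming the conventional definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C); the per-PCG values of Table 4 are not reproduced here, so the whole-mitogenome composition is used instead, and the function names are hypothetical.

```python
def at_skew(a, t):
    """AT skew under the conventional definition (A - T) / (A + T)."""
    return (a - t) / (a + t)


def gc_skew(g, c):
    """GC skew under the conventional definition (G - C) / (G + C)."""
    return (g - c) / (g + c)


# Whole-mitogenome base percentages as reported in the text.
composition = {
    "L. goshimai": {"A": 28.18, "T": 32.00, "G": 24.11, "C": 15.71},
    "N. fuscoviridis": {"A": 23.83, "T": 30.32, "G": 25.39, "C": 20.46},
}

for species, base in composition.items():
    at = at_skew(base["A"], base["T"])
    gc = gc_skew(base["G"], base["C"])
    print(f"{species}: AT skew = {at:.3f}, GC skew = {gc:.3f}")
```

For both species the whole-genome AT skew comes out negative and the GC skew positive, the same direction as reported for the total PCGs.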
Protein-coding genes and codon usage. The total length of all PCGs is 11,238 bp in L. goshimai and 11,154 bp in N. fuscoviridis, accounting for 61.77% and 59.58% of the whole genome, respectively (Table 4). The comparison of the initiation and termination codons of all PCGs showed that most of the PCGs of the two Lottiidae species are initiated with an ATN codon and terminated with TAN. Only the cox1 gene of L. goshimai and nad3 of N. fuscoviridis start with GTG (Tables 2, 3). The cox2, cox3 and cytb genes of N. fuscoviridis use an incomplete T stop codon, which is common in invertebrate mitogenomes.
Transfer RNA genes. We identified 23 tRNA genes in the mitochondrial genome of L. goshimai, including one more trnM gene than is common in invertebrates, with lengths ranging from 65 (trnS2) to 72 bp (trnI). N. fuscoviridis exhibited one more trnW gene than L. goshimai, and 24 tRNA genes ranging from 64 (trnM1) to 72 bp (trnI) in length were identified. In both Lottiidae species, trnS1 and trnS2 cannot form the canonical secondary structure because they lack dihydrouracil (DHU) arms, while the other tRNAs are capable of folding into a typical clover-leaf secondary structure. The comparison of the tRNA genes of the two species showed that each corresponding amino acid is served by the same anticodon, with the exception of the trnW1 gene of N. fuscoviridis, which uses a different anticodon (CCA). Moreover, methionine is served by two tRNAs with the same anticodon (CAT) (Tables 2, 3 and Figs. 3, 4).
Nonsynonymous and synonymous substitutions. We calculated the selection pressure (estimated by using Ka/Ks) on 13 PCGs in the two Lottiidae species (Fig. 5). Most of the Ka/Ks ratios are below 1 for these PCGs, indicating that they evolved under purifying selection. The remaining nad2, nad5, nad6 and cytb genes, with high Ka/Ks ratios, may have been affected by positive selection during evolution. Under positive selection, the external environment drives the self-regulation and transformation of genes, the elimination of genes that do not adapt to the environment, and the production of genes that can effectively adapt to the environment 31. Therefore, advantageous genes are retained after non-synonymous mutations. The substitution saturation index was analysed on the basis of the combined dataset of all PCGs of 60 Gastropoda mitogenomes, and the observed Iss value (Iss = 0.651) was significantly lower than the critical value (Iss.cSym = 0.859, p = 0.0000) (Fig. 6), indicating that sequence substitution is unsaturated; thus, the combined data are suitable for phylogenetic analysis.
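The two decision rules used in this paragraph can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical, the Ka and Ks inputs are made-up placeholders (the per-gene values of Fig. 5 are not reproduced here), the 0.05 significance threshold is an assumption, and only the Iss values are taken from the text.

```python
def classify_ka_ks(ka, ks):
    """Interpret a Ka/Ks ratio: < 1 purifying, > 1 positive, = 1 neutral."""
    if ks == 0:
        return "undefined"  # ratio cannot be computed without synonymous change
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


def little_saturation(iss, iss_c, p_value, alpha=0.05):
    """Substitution is treated as unsaturated when Iss is significantly below Iss.c."""
    return iss < iss_c and p_value < alpha


# Placeholder Ka and Ks values, NOT taken from the study.
print(classify_ka_ks(ka=0.02, ks=0.50))
print(classify_ka_ks(ka=0.60, ks=0.40))

# Saturation test with the values reported in the text.
print(little_saturation(iss=0.651, iss_c=0.859, p_value=0.0000))
```

With the reported values, the saturation check returns True, matching the conclusion that the combined PCG dataset is suitable for phylogenetic analysis.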
Phylogenetic analysis. We used the Bayesian inference (BI) and maximum likelihood (ML) methods to reconstruct a phylogenetic tree based on 13 PCGs from the two new Lottiidae species and 58 other species within Gastropoda (i.e., 8 Patellogastropoda species, 11 Caenogastropoda species, 3 Neomphalina species, 17 Vetigastropoda species, 7 Neritimorpha species, and 12 Heterobranchia species), using two Mopaliidae species as outgroups.
In the BI analysis, the Lottiidae species, which have a high rearrangement rate, exhibited long branches compared to other species of Patellogastropoda, and we encountered a long-branch attraction (LBA) artefact in the process of constructing phylogenetic trees. This is a common systematic error in phylogenetic reconstruction resulting from the clustering of fast-evolving taxa in the tree, instead of revealing their genuine relationships 32,33. Specifically, the three species of the Lottiidae family and Heterobranchia erroneously formed a clade, but this situation did not appear in the ML analysis. Finally, we combined the two methods and, with reference to previous research on the phylogeny of gastropods [34][35][36][37], obtained a largely consistent evolutionary tree (Fig. 7). The results showed a stable evolutionary tree topology in which each subclass formed a monophyletic clade. Most of the recovered clades were highly supported (Bayesian posterior probability (BPP) = 1, and Bootstrap (BS) = 100). The higher-level relationships among the clades were recovered as: (((Neomphalina + Vetigastropoda) + Neritimorpha) + Caenogastropoda) + (Patellogastropoda + Heterobranchia). Patellogastropoda and Heterobranchia clustered together in the same clade, which was located on the outermost branch of the six subclasses. Lottiidae formed an independent branch as (N. fuscoviridis + (L. goshimai + Lottia digitalis)) within Patellogastropoda. L. goshimai was shown to be the closest extant relative of Lottia digitalis, and this clade clustered with N. fuscoviridis.
The significance of Lottiidae species in the evolution and development of gastropods was confirmed through this study. Further mitogenome sequencing is needed to provide more comprehensive taxon sampling in the future, thus improving the understanding of Lottiidae phylogeny and evolution within Gastropoda.
to the hypothetical ancestral gastropod gene order 38 (Fig. 8). Among these subclasses, the fewest gene rearrangements are observed in Bathyacmaea nipponica of the Acmaeidae family, and only certain tRNA sequences exhibit shuffling (trnY and trnM), translocation (trnF, trnQ, trnF, trnC) and inversion (trnE) 39. Its gene order is closest to that of the family Nacellidae, with six tRNAs (trnT, trnR, trnN, trnA, trnK, trnI) and one PCG (nad3) exhibiting translocation. Recent studies of Nacellidae mitogenomes suggest that genome arrangements are relatively conserved in this group 11. The phylogenetic analyses showed that Nacellidae is the sister group of Acmaeidae, which supports the view that gene rearrangement may be helpful for phylogenetic analysis. Compared with the above two families, the gene order in Patellidae differs substantially, but the fragment from cytb to atp8 has been retained, with only a portion of this fragment exhibiting local inversion. However, the genome organization is almost the same in Patella ferruginea and Patella vulgata, indicating that gene order is conserved within the family Patellidae. The most noteworthy finding was that gene arrangement differs fundamentally among species of the family Lottiidae, although they share the common characteristic of rrnL and rrnS gene inversion. The mitogenomes of the Lottiidae family have retained a fraction of the clusters found in ancestral gastropods 31. For instance, Lottia digitalis has retained nad4-nad4L, and L. goshimai has retained nad5-nad4-nad4L, with the nad4 and nad4L fragments inverted in both cases. In addition, an extremely high rate of gene rearrangement is found in N. fuscoviridis, and the irregular ordering may be caused by a high rate of sequence evolution 40. More research on the family will be needed to verify this.
Table 4. Base composition of the mitochondrial genome of the two limpets.

Conclusion
In this study, the complete mitochondrial genome sequences of two limpets, L. goshimai and N. fuscoviridis, belonging to Lottiidae, were characterized and compared. Duplications of tRNA genes are found in both species (trnM or trnW). In their tRNA secondary structures, both trnS1 and trnS2 are missing DHU stems, which is also observed in other species of the family. The phylogenetic relationships with other members of Gastropoda were analysed based on 13 mitochondrial PCGs. The results showed that the phylogeny was consistent with morphological observations and previous reports. In addition, a highly irregular rearrangement of mitochondrial genes was found within Lottiidae. Since mitogenomes are currently available for only a few species in the family, it is impossible to determine whether this situation is associated with a single species or occurs throughout the family, which is worthy of further study.
41,42. The samples were preserved in absolute ethyl alcohol before DNA extraction. Total genomic DNA was extracted from the operculum using the salting-out method 43 and was then stored at − 20 °C before sequencing.
Mitochondrial genome sequencing, assembly and annotation. The whole mitogenomes of the two limpets were sequenced using the Illumina HiSeq X Ten platform (Shanghai Origingene Bio-pharm Technology Co., Ltd. China). An Illumina PE library with an insert size of 400 bp was generated. The original sequencing data have been stored in the sequence read archive (SRA, https://trace.ncbi.nlm.nih.gov/Traces/sra/) of the National Center for Biotechnology Information (NCBI). NOVOPlasty software (https://github.com/ndierckx/NOVOPlasty) was used for the de novo assembly of the clean data without sequencing adapters to obtain the optimal assembly result 44. The two newly assembled mitochondrial genomes were annotated on the MITOS web server (https://mitos2.bioinf.uni-leipzig.de/index.py) using the invertebrate genetic code, and start and stop codons were confirmed by comparing the obtained nucleotide sequences with those from closely related limpets 24,45,46.
Preparation of datasets, model selection, phylogenetic analyses. For the phylogenetic analysis, DAMBE 5.3.19 was used to adjust the nucleotide sequences of 13 protein-coding genes (PCGs) of each species, and the nucleotide substitution saturation was analysed to determine whether these sequences were suitable for constructing phylogenetic trees 53. Sixty published mitochondrial genomes were downloaded from NCBI as references, including those of 58 other marine gastropods and two outgroups (Cryptochiton stelleri and Katharina tunicata of Polyplacophora), and were analysed along with the mitogenome sequences of the two new Lottiidae species (Table 1). Then, the sequences of all 62 species were aligned using ClustalW with the default parameters in MEGA 7.0. The phylogenetic analyses incorporated Bayesian inference (BI) using the program MrBayes v3.2 and maximum likelihood (ML) using IQ-TREE 54,55.
MrMTgui was used to combine the results of PAUP 4.0, Modeltest 3.7 and MrModeltest 2.3 to find the best substitution models (GTR + I + G) with the AIC for Bayesian inference (BI) [56][57][58] . BI analyses were conducted with two Markov chain Monte Carlo (MCMC) runs, each with four chains (three heated and one cold) run for 2,000,000 generations, with tree sampling every 1000 steps and a burn-in of 25%. ML analysis was performed with the best-fit substitution model automatically selected by ModelFinder, and the number of bootstrap replicates was set to 1000 in ultrafast likelihood bootstrapping to reconstruct a consensus tree 59 . The phylogenetic trees were visualized and edited using FigTree v1.4.3 60 .
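The BI sampling scheme described above implies a fixed number of retained trees, which can be worked out directly. The sketch below is simple arithmetic under the stated settings (two runs, 2,000,000 generations, sampling every 1000 steps, 25% burn-in); the function name is hypothetical, and it assumes that the burn-in is applied as a fraction of the sampled trees and ignores any extra sample the software may record for the starting state.

```python
def retained_samples(generations, sample_every, burn_in_fraction, runs):
    """Return (sampled per run, discarded per run, kept per run, kept in total)."""
    sampled = generations // sample_every
    # Assumption: burn-in is a fraction of the sampled trees in each run.
    discarded = int(sampled * burn_in_fraction)
    kept = sampled - discarded
    return sampled, discarded, kept, kept * runs


# Settings reported in the text for the Bayesian analysis.
sampled, discarded, kept, total = retained_samples(
    generations=2_000_000, sample_every=1000, burn_in_fraction=0.25, runs=2
)
print(sampled, discarded, kept, total)
```

Under these assumptions, each run yields 2000 sampled trees, of which 500 are discarded and 1500 retained, giving 3000 trees across the two runs for summarizing posterior probabilities.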

Data availability
The mitochondrial genome data has been submitted to NCBI GenBank under the following accession numbers: Lottia goshimai (MT248298), Nipponacmea fuscoviridis (MK395167).