Characterization of four mitochondrial genomes of family Neritidae (Gastropoda: Neritimorpha) and insight into its phylogenetic relationships

Neritidae is one of the most diverse families of Neritimorpha and possesses euryhaline properties. Members of this family usually live on tropical and subtropical coasts and are mainly gregarious. The phylogenetic relationships between several subclasses of Gastropoda have been controversial for many years. With an increase in the number of described species of Neritidae, the knowledge of the evolutionary relationships in this family has improved. In the present study, we sequenced four complete mitochondrial genomes from two genera (Clithon and Nerita) and compared them with available complete mitochondrial genomes of Neritidae. Gene order exhibited a highly conserved pattern among three genera in the Neritidae family. Our results improved the phylogenetic resolution within Neritidae, and more comprehensive taxonomic sampling of subclass Neritimorpha was proposed. Furthermore, we reconstructed the divergence among the main lineages of 19 Neritimorpha taxa under an uncorrelated relaxed molecular clock.

PCGs, tRNA genes, rRNA genes and codon usage. The AT contents of PCGs (− 0.2014 to − 0.0577) and tRNAs (− 0.0365 to − 0.0044) in the 14 Neritidae species had the same base bias as the entire genome (Table 4); however, the AT skew of the rRNAs (0.0614 to 0.0970) was slightly positive. All AT skew values were negative, while most GC skew values were positive. The AT content values of PCGs ranged from 60.43% to 65.64% in the 14 Neritidae species, indicating strong AT bias. All PCGs in the four mitogenomes started with the conventional initiation codon ATG or ATT and stopped with TAA or TAG.
The most frequently utilized amino acids in the four species were Leu2, Lys, Phe, Ser1 and Val (with frequencies ranging from 6.17% and 7.60%) (Fig. 2). The least common amino acid was Arg (all frequencies less than 2%), which is similar to the pattern previously reported in two Neritidae species (N. undata and N. balteata) 30 . Relative synonymous codon usage (RSCU) values for the 13 PCGs showed that UUA (Leu2) and CCU (Pro) were the two most frequent codons in the Clithon species (Fig. 3), and the most frequent codons in the Nerita species were CCU (Pro) and GCU (Ala). The 13 PCGs ranged in size from 165 bp (atp8 of all Neritidae) to 1717 bp (nad5 of C. sowerbianum). It is noteworthy that the atp8 gene is the smallest PCG in all currently described neritids. These comparative analyses showed that codon usage patterns are conserved among Neritidae species.
The lengths of the tRNA genes were almost identical among the four Neritidae species, ranging from 57 (trnL1 of N. chamaeleon) to 74 bp (trnN of two Nerita species). The AT contents of tRNA genes ranged from 62.06% to 63.93% in the 14 Neritidae species (Table 4). The rrnL genes of the four Neritidae species were 1318 to 1334 bp in length, while the rrnS genes were 863 to 870 bp. In general, the A and T contents were greater than the G and C contents in the two rRNA genes (Table 3).

Selective pressure analysis.
To investigate the evolutionary relationships among and selective pressure on 16 Neritimorpha species, we used the nonsynonymous to synonymous substitution (Ka/ Ks) ratio. The result showed that the average Ka/Ks ratio ranged from 0.060 for cox1 to 0.766 for nad4. This result indicated that the 13 PCGs of all Neritimorpha mitogenomes evolved under purifying selection (Fig. 4) www.nature.com/scientificreports/ lowest Ka/Ks ratio among studied genes and little change in amino acids; hence, it is widely used as a molecular marker for species identification and phylogenetic analysis 36,37 . The substitution saturation index value for the combined dataset of the 13 PCGs in all species (Iss = 0.685) was significantly lower than the critical values (Iss. cSym = 0.859 or Iss.cAsym = 0.847, p = 0.000) (Fig. 5). Thus, the combined sequence substitution was unsaturated, making the sequences suitable for phylogenetic analysis.
Phylogenetic relationships. Phylogenetic analyses were conducted on the concatenated alignment of 13 PCGs covering 88 gastropod species from thirty families of six subclasses (Vetigastropoda, Neomphaliones, Caenogastropoda, Neritimorpha, Patellogastropoda and Heterobranchia). We selected two Veneridae species (Bivalvia) as the outgroup. Maximum likelihood (ML) and Bayesian inference (BI) analyses produced almost identical topologies, with strong bootstrap and posterior probability values. However, family Lottidae of Patellogastropoda exhibited potential long-branch attraction (LBA) when we construct a Bayesian tree. Due to the large difference in branch length between members of this family and other related species, systematic errors occurred, and the true placements of these Lottidae taxa were not revealed 38,39 . This is the same as the result previously reported for the mitogenome of two limpets 40 . Finally, we combined the two methods to obtain a consistent evolutionary tree (Fig. 6). Our phylogenetic analysis indicated that all species representing subclass Neritimorpha clustered on the same branch; meanwhile, all posterior probability values were 1, and the bootstraps values were greater than 80. Within the Gastropoda class, the six subclasses exhibited the following phylogenetic relationships: ((((Vetigastropoda + Neomphaliones) + Caenogastropoda) + Neritimorpha) + Patellogastropoda) + Heterobranchia. Neritimorpha is closely related to Caenogastropoda and Patellogastropoda. Strikingly, we found that the branching orders of Neritimorpha and Caenogastroopoda were slightly different due to the increasing abundance of Neritimorpha species.
In Neritimorpha, whole mitogenomes are available for only two families, and Helicinidae forms an independent branch. The main evolutionary pattern in the Neritimorpha was the division of Neritidae into three genera,  . 7), in agreement with the finding of a previous study suggesting that Neritimorpha appeared in the Triassic period 30 . The Triassic was the first period of the Mesozoic, which was the transitional period of the formation of the modern biota after the disappearance of the Paleozoic biota. Great changes have taken place in marine invertebrate groups 41 . In Neritidae, the differentiation time between Nerita and the other three genera was the earliest (97.65 Mya). However, the estimate provided by this analysis was slightly older than the origin of the Neritidae estimated in our previous analyses (76.17-83.25 Mya) 30 . This is probably due to misidentification in the fossil record, which is determined by various taxonomic methods and influenced by different levels of experience and  30 . Most Neritidae species differentiation was concentrated in the Cenozoic Paleogene (approximately 2.4-65 Mya). This is the period when continental transgression was rapidly reduced and marine sediments appeared in the marginal areas of China. On the other branch, the differentiation time of  Most PCGs were initiated with the ATG codon and terminated with the TAA codon. The Ka/Ks ratio indicated that these Neritimorpha species were subjected to purifying selection. Phylogenetic trees contributed to the scientific classification of Neritimorpha species. This study provides information on the genetic characteristics, phylogenetic relationships and evolution of neritids as well as a basis for resource management and selective breeding in aquaculture. These four species differentiated in the late Paleogene and early Neogene, and their evolution may be related to the geological events that changed their living environments.  44 , and we consulted taxonomists from the marine biology museum of Zhejiang Ocean University. Genomic DNA was extracted from small pieces of foot tissue taken below the operculum using the salting-out method and was stored at − 20 °C before sequencing. Only one specimen of each species was used for sequencing. All animal experiments were conducted in accordance with the guidelines and approval of the Animal Research and Ethics Committees of Zhejiang Ocean University.
DNA sequencing and genome assembly. The mitogenomes of four Neritidae species were submitted to Origingene Bio-pharm Technology Co., Ltd. (Shanghai, China), for Illumina PE library construction and high-throughput sequencing by the Illumina HiSeq X Ten platform. Sequencing libraries with average insert sizes of approximately 400 bp were prepared. Each library generated approximately 5 Gb of raw data. Removing the low-quality and contaminated reads resulted in higher 'N' ratio sequences and adapters. The clean reads of the four species were de novo assembled separately using NOVOPlasty software (https:// github. com/ ndier ckx/ NOVOP lasty) 45 .
Gene annotation and sequence analysis. Four newly assembled mitogenomes were annotated with the MITOS web server (http:// mitos2. bioinf. uni-leipz ig. de/ index. py) based on the invertebrate genetic code 46 . Start and stop codons were confirmed using previously published Neritidae mitogenomes as references 29,30 . The circular genomes of the four Neritidae species were visualized with the CGView Server (http:// stoth ard. afns. Table 4. Summary of the base composition of the mitogenomes from 14 species in family Neritidae of the Neritimorpha.

Species (Neritidae) Length (bp)
Entire Genome  47 (Table 1). Phylogenetic trees were reconstructed using BI and ML methods. The nucleotide sequences for each PCG were adjusted by DAMBE 5.3.19 51 , and substitution saturation was tested for using the GTR substitution model. Sequences for each PCG were aligned using ClustalW of MEGA 7.0 48 . Phylogenetic analyses incorporated both the maximum likelihood (ML) method using IQ-TREE 52 and Bayesian inference (BI) using MrBayes v3.2 53 . The best-fitting model (GTR + F + R7) selected by the BIC criteria implemented in ModelFinder 54 was used for the ML analyses. In ultrafast likelihood bootstrapping, 1000 bootstrap replicates were applied to reconstruct a consensus tree. The MrBayes settings for the best substitution model (GTR + I + G) were determined by MrModeltest 2.3 55 under the AIC. The BI analyses involved two Markov chain Monte Carlo (MCMC) runs with 2,000,000 generations, sampling every 1000 generations and a discarded burn-in of 25%. The estimates of divergence times among subclass Neritimorpha species were based only on nucleotide level (12 PCGs, with cox3 excluded due to this gene being incomplete in some species) and obtained using a Bayesian framework with an uncorrelated relaxed clock and lognormal relaxed molecular clock model in BEAST v1.8.4 56 . The Yule process of speciation was used for the tree prior. For divergence time calibration, two calibration points were used as the prior for the corresponding split divergence time. Priors for fossil ages were drawn from normal distributions, and the root Pleuropoma jana was constrained between 235 and 223 million years ago (MYA) 57 . The 80 Ma point calibration was set as the root rate of Nerita based on the fossil of Nerita melanotragus (95-80 MYA) 58 . The final Markov chain was run twice for 100 million generations, with sampling every 1000 generations and 10% of samples discarded as a burn-in by TreeAnnotator v1.8.4 software (in the BEAST package). Then, using Tracer v. 1.6 59 Supplementary Table S1.