The first mitochondrial genome of the genus Exhippolysmata (Decapoda: Caridea: Lysmatidae), with gene rearrangements and phylogenetic associations in Caridea

The complete mitochondrial genome (mitogenome) of animals can provide useful information for evolutionary and phylogenetic analyses. Here, the mitogenome of a member of the genus Exhippolysmata (Exhippolysmata ensirostris) was sequenced and annotated for the first time, and its phylogenetic relationship with selected members of the infraorder Caridea was investigated. The 16,350 bp mitogenome contains the entire set of 37 common genes. The mitogenome composition was highly A + T biased at 64.43%, with a positive AT skew (0.009) and a negative GC skew (−0.199). All tRNA genes in the E. ensirostris mitogenome had a typical cloverleaf secondary structure, except for trnS1 (AGN), which appeared to lack the dihydrouridine arm. The gene order in the E. ensirostris mitogenome was rearranged compared with that of ancestral decapod taxa: the gene order trnL2-cox2 changed to cox2-trnL2. The tandem duplication-random loss model is the most likely mechanism for the observed gene rearrangement of E. ensirostris. The ML and BI phylogenetic analyses place all Caridea species into one group with strong bootstrap support. The family Lysmatidae is most closely related to Alpheidae and Palaemonidae. These results will help to better understand the gene rearrangements and evolutionary position of E. ensirostris and lay a foundation for further phylogenetic studies of Caridea.

www.nature.com/scientificreports/
The species Exhippolysmata ensirostris (Kemp 1914), which is widely distributed in the Pacific region, extends from the coast of the East China Sea and South China Sea to the Indo-West Pacific. It is an important and commercially exploited species in the East China Sea and the South China Sea. However, research on the genus Exhippolysmata has been limited to species surveys and morphological descriptions. Most of the research in Lysmatidae has focused on the genus Lysmata, including its mitochondrial genes and evolutionary relationships [11][12][13][14][15][16]. Consequently, research on the mitochondrial genes of the genus Exhippolysmata has rarely been reported.
The complete mitochondrial genome (mitogenome) is typically extrachromosomal and is characterized by maternal inheritance and a high evolutionary rate 17. A complete mitogenome is a powerful tool for analysing the evolutionary history and phylogeny of species 18. The mitogenome can also provide direct molecular clues for gene rearrangement processes, which can reveal important information for phylogenetic analyses 19. The mitogenome of most metazoans is a double-stranded, closed circular molecule approximately 11-20 kb in length. It typically contains 37 genes, including 13 protein coding genes (PCGs), two ribosomal RNA genes (16S rRNA and 12S rRNA) and 22 transfer RNA genes 20.
In this study, the complete mitogenome of the genus Exhippolysmata was described for the first time. We determined the complete mitogenome sequence of E. ensirostris using Illumina sequencing technology. We also analysed the nucleotide composition, codon usage profiles of protein coding genes (PCGs), Ka/Ks ratios of the 13 PCGs, tRNA secondary structures and gene order, and investigated the evolutionary relationships within Caridea. The purpose of this study was to understand the characteristics of the E. ensirostris mitogenome and to clarify the evolutionary relationships within Caridea based on mitogenome data.

Results and discussion
Genome organization and base composition. The complete mitogenome of E. ensirostris was found to be a typical circular molecule of 16,350 bp (Fig. 1), and the sequence was deposited in GenBank under accession number MK681888. The data that support the findings of this study are openly available in Microsoft OneDrive at (https://1drv.ms/w/s!Ag1aKdaw8CT3iHxX9f98FCkZvQ3n?e=BaRfdq). The newly sequenced mitogenome contains 13 PCGs, 22 tRNA genes, two rRNA genes and a large noncoding or control region (CR). Of the 37 genes, 23 were encoded on the heavy strand, and the other 14 were encoded on the light strand (Fig. 1, Table 1). The longest noncoding region was located between trnL2 and trnK, and the largest gene junction was located between trnL1 and 12S rRNA. The base compositions (Table 2) showed a high A + T content in the complete mitogenome (64.43%), PCGs (62.6%), tRNAs (66.04%), rRNAs (66.62%) and the CR (69.33%). The relative order of the nucleotide composition was A > T > C > G. The complete sequence had a positive AT skew (0.009) and a negative GC skew (−0.199). As in other invertebrate mtDNAs, there were overlapping and noncoding bases between some genes.
[...] (Table 1). Three genes (nad6, cox1 and cox3) were found to start with ATA, a further three (nad5, nad4 and nad4l) with ATT, and the other seven with ATG. Eleven PCGs were found to end with the typical stop codon TAA, whereas cox1 and nad4 were found to end with TAG. Codon numbers and relative synonymous codon usage in the E. ensirostris mitochondrial genome are listed in Table 3. The patterns of codon usage of the 13 PCGs are shown in Fig. 2A. The abundance of codon families and the relative synonymous codon usage (RSCU) in the PCGs were investigated for all available E. ensirostris mtDNAs, and the results are shown in Fig. 2B. The most frequently used codon was UUR (trnL2). There were 22 non-coding regions and eight overlaps of neighbouring genes in the mitochondrial genome of E. ensirostris.
The largest non-coding region of E. ensirostris was identified as a putative control region. In addition, the largest gene overlap (23 bp) was between trnL1 and 16S rRNA. To analyse the selection pressure on mitochondrial PCGs of the caridean shrimps, the ratio of the nonsynonymous and synonymous substitution rates (Ka/Ks) for the 13 PCGs from six caridean species (E. ensirostris, Alpheus japonicus, Alvinocaris longirostris, Halocaridina rubra, Heterocarpus ensifer and Macrobrachium lanchesteri) was calculated. We found that the Ka/Ks values for all PCGs were lower than one (between 0.187 and 0.959), indicating that they are evolving under purifying selection (Fig. 3). Among the 13 caridean protein-coding genes, the average Ka/Ks of nad1 was the highest (0.959), and nad2 (0.941) and nad5 (0.927) also had very high average Ka/Ks values, indicating that these genes are under weaker purifying selection than the other mitochondrial protein-coding genes.
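As a reading aid for the Ka/Ks values above, the following minimal sketch shows how a ratio is formed from Ka and Ks and how it is conventionally read against the threshold of one. The function names are hypothetical and are not part of DnaSP or any other tool used in this study; the three ratios are those reported in the text. The comparison is a plain threshold check and does not test whether a ratio differs significantly from one.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        # The ratio is undefined without synonymous substitutions.
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks


def selection_regime(ratio: float) -> str:
    """Conventional reading of a Ka/Ks ratio (threshold only, no statistics)."""
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# Average Ka/Ks values reported in the text for the three highest genes.
reported = {"nad1": 0.959, "nad2": 0.941, "nad5": 0.927}

regimes = {gene: selection_regime(r) for gene, r in reported.items()}
ranked = sorted(reported, key=reported.get, reverse=True)

print(regimes)
print(ranked)
```

Values close to one, such as that of nad1, fall on the purifying side of the threshold but indicate a comparatively relaxed constraint.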
Transfer and ribosomal RNA genes. The E. ensirostris mitochondrial genome encodes 22 tRNA genes, which ranged in size from 64 bp (trnC) to 70 bp (trnV) (Table 1). All of them were predicted to fold into a clover-leaf secondary structure except trnS1, whose DHU arm lacked any secondary structure (Fig. 4). The total length of the 22 tRNA genes in the E. ensirostris mitochondrial genome was 1446 bp. The overall A + T content of the tRNA genes was 66.04%, which is similar to that of other carideans (Table 2) 21. The mt tRNAs had a weakly positive AT skew (0.012) and a positive GC skew (0.104). Fourteen tRNA genes (trnL2, trnK, trnD, trnG, trnA, trnR, trnN, trnS1, trnE, trnT, trnS2, trnI, trnM and trnW) were present on the heavy strand, and eight tRNA genes (trnF, trnH, trnP, trnL1, trnV, trnQ, trnC and trnY) were present on the light strand.
The 12S rRNA gene lay between trnL1 (CUN) and trnV, while the 16S rRNA gene lay between trnV and the putative control region, and both rRNA genes were encoded by the β-strand. As typically seen in other shrimp mitogenomes, the 16S rRNA and 12S rRNA genes of the E. ensirostris mitogenome were 1368 bp and 818 bp in length, respectively. The location and orientation of the rRNA genes were identical to the original arrangement of ancestral Caridea 22. The A + T content of the two rRNA genes was 66.62%, and they had a negative AT skew (−0.068, Table 2).
Gene rearrangement. Gene rearrangement commonly occurs in Decapoda mitogenomes and can be a tool to study phylogenetic relationships. Tan et al. 19 gave an overview of mitochondrial gene orders (MGOs) of Decapoda, which revealed a large number of MGOs deviating from the ancestral arthropod ground pattern and unevenly distributed among infraorders. Here, we compared the MGOs of the Caridea mitogenomes with those of ancestral Decapoda and Caridea (Fig. 5). Among them, the MGOs in the mitogenomes of the families Pandalidae, Atyidae, and Alvinocarididae were identical to those of the ancestral Decapoda. However, fourteen carideans from the families Lysmatidae, Alpheidae and Palaemonidae displayed gene rearrangements. This is in contrast with previous views that the gene order in Caridea is conserved [23][24][25][26]. Compared with the gene order of the ancestral Decapoda, E. ensirostris has a translocation, for which the gene order is cox2-trnL2 instead of trnL2-cox2 (Fig. 5C). Alpheus distinguendus, Alpheus hoplocheles, Alpheus inopinatus, Alpheus bellulus, Alpheus randalli and Alpheus japonicus in Alpheidae have also undergone gene rearrangement, in which trnE is translocated and reversed with trnP 27 (Fig. 5D). Alpheus lobidens has an extra duplication of trnQ located downstream of nad4l 28 (Fig. 5E).
In addition, the translocation of two tRNA genes was found in the mitochondrial genomes of Exopalaemon carinicauda, Palaemon annandalei, Palaemon capensis and Palaemon gravieri in Palaemonidae, wherein trnP or trnT were translocated, while the arrangement of other genes was identical 29 (Fig. 5F). Palaemon sinensis in Palaemonidae has an extra translocation between trnG and trnE (Fig. 5G). The mitochondrial genome of Hymenocera picta in Palaemonidae bears a novel gene order: the gene block (nad1-trnL1-16S rRNA-trnV-12S rRNA-CR-trnI-trnQ) was moved from downstream of trnS2 to downstream of nad4l (Fig. 5H). These data indicate that gene order is not conserved among caridean shrimp and could be useful for inferring phylogenetic relationships within Caridea when more mitochondrial data from Caridea become available. Several mechanisms have been proposed to explain the rearrangement of genes in animal mitogenomes, including the tandem duplication/random loss model (TDRL) 30, the tandem duplication/non-random loss model (TDNL) 31, and recombination 32. TDRL is one of the most widely accepted mechanisms of mitochondrial gene rearrangement; it involves tandem duplication of gene regions caused by downstream chain mismatch during replication, followed by the random loss of one copy of each duplicated gene. TDNL attributes gene rearrangement to clustering by common polarity. The intramitochondrial recombination mechanism involves the breaking and reconnecting of DNA double strands, leading to gene rearrangement and gene inversion 33. Here, we propose that TDRL best explains the translocation of cox2 and trnL2 in the E. ensirostris mitochondrial genome.
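The TDRL explanation proposed above can be illustrated with a minimal sketch. The function names are hypothetical, the flanking genes cox1 and trnK are included only for orientation, and the choice of which duplicated copies are lost is an assumption made for illustration; the study itself only names the model as the most likely mechanism.

```python
def tandem_duplicate(order, start, end):
    """Return a copy of `order` in which the block order[start:end] is duplicated in tandem."""
    block = order[start:end]
    return order[:end] + block + order[end:]


def random_loss(order, lost_positions):
    """Return `order` without the gene copies at the given positions."""
    lost = set(lost_positions)
    return [gene for i, gene in enumerate(order) if i not in lost]


# Ancestral decapod arrangement around the rearranged pair (flanks are illustrative).
ancestral = ["cox1", "trnL2", "cox2", "trnK"]

# Step 1: tandem duplication of the trnL2-cox2 block.
duplicated = tandem_duplicate(ancestral, 1, 3)

# Step 2: loss of the first trnL2 copy and the second cox2 copy (assumed choice).
derived = random_loss(duplicated, [1, 4])

print(duplicated)  # ['cox1', 'trnL2', 'cox2', 'trnL2', 'cox2', 'trnK']
print(derived)     # ['cox1', 'cox2', 'trnL2', 'trnK']
```

Under these assumptions a single duplication followed by two losses is enough to convert trnL2-cox2 into cox2-trnL2 without any inversion, which is why TDRL is a parsimonious candidate for this kind of local swap.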
Phylogenetic relationships. Many studies on the classification and evolutionary history of the Decapoda relied on morphological characteristics, which led to conflicting phylogenetic relationships. Under the best-fit model, ML and BI analyses of two data sets, based on the nucleotide sequences of the 13 PCGs from 53 species (51 Caridea species and two outgroup species), were used to reconstruct the phylogenetic relationships among them. The BI and ML methods produced a consistent topology; therefore, only one phylogenetic tree with both support values is presented (Fig. 6). Our results indicate that the mitochondrial genome sequence is robust for the inference of the relationships between shrimps. In addition, both ML and BI analyses of the two data sets show high branch support values. The phylogenetic tree based on the mitogenomes indicates that Palaemonidae and Alpheidae form a monophyletic group with strong statistical support at the family level. Our complete mitogenome data suggest phylogenetic relationships among the major lineages of Caridea as ((((Alpheidae + Palaemonidae) + Lysmatidae) + Pandalidae) + Atyidae) + Alvinocarididae. Although the main phylogenetic structure of our tree was consistent with previous results, some controversial findings were observed. Here, the families Alpheidae, Pandalidae, Lysmatidae and Palaemonidae clustered together as sister groups and were distantly related to Alvinocarididae, which supports the previous finding based on five nuclear genes (18S, Enolase, H3, NaK and PEPCK) in Li et al. 8. However, Li et al. 8 also recovered Atyidae as a basal lineage within Caridea, which conflicts with our results. Based on both mitochondrial and nuclear genes (16S and 18S), Bracken et al. also found that Atyidae represents a basal lineage within Caridea 7. Meanwhile, in the recent study by Sun et al., the phylogenetic relationship among Caridea was ((((Alpheidae + Palaemonidae) + Pandalidae) + Alvinocarididae) + Atyidae), which also placed Atyidae as distantly related to the other four families 34. Furthermore, our result does not agree with Tan et al. 35 and Wang et al. 28, who reported that Atyidae was the sister clade to Alvinocarididae. In our phylogenetic tree, most of the unstable and conflicting clades might have resulted from limited taxon sampling. The mitogenome sequenced and assembled here will support further mitochondrial genome sequencing, and increased taxon sampling will lead to a more comprehensive understanding of the phylogenetic relationships within Caridea and help to resolve its classification.

Conclusions
Using next-generation sequencing methods, the mitogenome of E. ensirostris was determined to be a circular molecule of 16,350 bp. Compared with typical Decapoda mitogenomes, the gene order of this species had undergone a rearrangement, wherein the order trnL2-cox2 changed to cox2-trnL2. The gene rearrangement event in the E. ensirostris mitogenome can be explained by the TDRL model. The Ka/Ks ratios of the PCGs in six caridean shrimp mitogenomes indicated that these genes were evolving under purifying selection. Phylogenetic analyses recovered the Caridea clades as monophyletic groups with strong bootstrap support. The family Lysmatidae is most closely related to Alpheidae and Palaemonidae. However, the lack of complete mitogenomes of other species of the Lysmatidae has limited the understanding of the evolution of this group at the genome level. Therefore, further studies are required to elucidate the phylogenetic status of species belonging to this group and their relationships.

Materials and methods
Sampling, identification and DNA extraction. A single specimen of E. ensirostris was collected from Zhoushan, Zhejiang Province, China (30° 09′ 41″ N, 122° 35′ 10″ E) during bottom trawl fishery resource monitoring in November 2018. The specimen was identified morphologically and preserved in absolute ethanol. The total genomic DNA was extracted from muscle tissue of the specimen by the salt-extraction procedure with a slight modification 36. Once extracted, the DNA was stored in 1 × TAE buffer at 4 °C. The extracted DNA was checked by 1.5% agarose gel electrophoresis and stored at −20 °C.
Sequencing and assembly. The mitogenome of E. ensirostris was sequenced using next-generation sequencing by Origin Gene Co. Ltd., Shanghai, China. A library with an insert size of 400 bp was generated from the total genomic DNA and sequenced on an Illumina HiSeq X Ten platform. Then, the raw image data were converted into sequence data by base calling. A total of 5,515,049,137 bp of clean data and 37,141,698 clean reads were retrieved. Raw sequencing data were deposited into the Sequence Read Archive (SRA) database (SRR12199494) (http://www.ncbi.nlm.nih.gov/Traces/sra). De novo assembly of clean data without sequencing adapters was conducted using NOVOPlasty software (https://github.com/ndierckx/NOVOPlasty) 37.
[...] were determined based on the locations of adjacent tRNA genes and by comparisons with other shrimp. Strand asymmetry was calculated using the formulae AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C) 41. The graphical map of the circular E. ensirostris mitogenome was drawn using the online mitochondrial visualization tool CGView Server 42. In addition, we estimated the rates of synonymous (Ks) and nonsynonymous (Ka) substitutions in the 13 mitochondrial PCGs using DnaSP 5.1.0 43. A Ka/Ks ratio that is significantly less than one indicates negative (purifying) selective pressure, and a Ka/Ks ratio that is significantly greater than one indicates positive selection 44.
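The two strand-asymmetry formulae given above translate directly into code. The sketch below is illustrative only: the function name and the toy sequence are assumptions, and the toy sequence is not part of the E. ensirostris mitogenome.

```python
from collections import Counter


def base_skews(seq: str) -> dict:
    """Compute A + T content, AT-skew and GC-skew of a nucleotide sequence.

    AT-skew = (A - T) / (A + T), GC-skew = (G - C) / (G + C).
    Ambiguous bases such as N are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    if a + t == 0 or g + c == 0:
        raise ValueError("sequence needs at least one A/T and one G/C base")
    return {
        "at_content": (a + t) / (a + t + g + c),
        "at_skew": (a - t) / (a + t),
        "gc_skew": (g - c) / (g + c),
    }


# Toy example: 3 A, 2 T, 1 G, 4 C.
toy = base_skews("AAATTGCCCC")
print(toy)
```

A positive AT-skew means that A is more frequent than T on the analysed strand, and a negative GC-skew means that C is more frequent than G, which matches the A > T > C > G ordering reported for the complete mitogenome.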
Phylogenetic analysis. A total of 51 caridean shrimp mitogenomes were downloaded from GenBank (https://www.ncbi.nlm.nih.gov/genbank/) for phylogenetic analysis (Table 4). The outgroup taxa were two Stenopodidea species: Stenopus hispidus and Spongiocaris panglao. We used the nucleotide sequences of the 13 protein coding genes (PCGs) to construct ML and BI phylogenetic trees. The 13 mitochondrial PCGs were aligned with MAFFT using default settings 45, and the resulting alignments were imported into Gblocks v. 0.91b (http://molevol.cmima.csic.es/castresana/Gblocks_server.html) to select the conserved regions 46. A substitution saturation analysis was performed in DAMBE v. 5.3.15 to test whether the dataset was suitable for constructing trees 47. ML analysis was conducted using IQ-TREE v1.4.1 48 with the best-fit substitution model automatically selected by ModelFinder 49 in the IQ-TREE package. GTR + I + G was selected as the best-fit model for the nucleotide datasets under the Akaike Information Criterion (AIC) by MrModeltest 2.3 50, and then BI analysis was carried out using MrBayes 3.2.6 51. BI analysis was performed using default settings over four independent runs for 2 million generations, sampled every 100 generations. The average standard deviation of split frequencies was < 0.01, the effective sample size was > 200 and the potential scale reduction factor approached 1.0. The first 25% of samples were discarded as burn-in, and the remaining trees were used to calculate the Bayesian posterior probabilities for a 50% majority-rule consensus tree. All parameters were checked with Tracer v.
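The sampling scheme of the BI analysis described above implies a fixed number of retained trees, which the following minimal sketch works out. The function name is hypothetical, and the count ignores the starting state of each chain, which some programs also record as a sample.

```python
def mcmc_sample_counts(generations: int, sample_every: int,
                       burnin_fraction: float, runs: int) -> dict:
    """Count sampled, discarded and retained MCMC samples for a set of runs."""
    per_run = generations // sample_every
    discarded = int(per_run * burnin_fraction)
    kept = per_run - discarded
    return {
        "per_run": per_run,
        "discarded_per_run": discarded,
        "kept_per_run": kept,
        "kept_total": kept * runs,
    }


# Settings stated in the text: 2 million generations, sampled every 100,
# 25% burn-in, four independent runs.
counts = mcmc_sample_counts(2_000_000, 100, 0.25, 4)
print(counts)
```

With these settings each run contributes 15,000 post-burn-in samples, so the consensus tree summarizes 60,000 samples across the four runs.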

Data availability
The mitochondrial genome data have been submitted to NCBI GenBank under accession number MK681888.