Introduction

Mitochondrial DNA (mtDNA) is typically a closed circular molecule of approximately 14 to 18 kb. It contains 13 protein-coding genes (PCGs), 2 ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes, and a control region (CR)1, 2. mtDNA is characterized by maternal inheritance, simple structure, small genome size, conserved gene content and organization, a high mutation rate, and an accelerated rate of nucleotide substitution3,4,5,6,7. Because of their rapid evolutionary rate and lack of genetic recombination, animal mitogenomes can provide important information on gene rearrangement patterns and phylogenetic relationships1. Complete animal mitogenomes are increasingly used for phylogenetic reconstruction8,9,10. Partial DNA sequences are often too short to contain sufficient phylogenetic information11, and combining mitochondrial and nuclear genomes makes model selection difficult12. Further, the addition of rRNA genes makes alignment ambiguous13.

The infraorder Brachyura contains about 7000 described species in 98 families14. Clistocoeloma sinensis is one of the most important brachyuran species and is used as an indicator of environmental change and water pollution in China15. Although C. sinensis was described over 80 years ago16, it remains very poorly understood. Earlier studies placed C. sinensis in Grapsidae, Sesarminae17. More recently, some researchers have placed it in Grapsoidea, Sesarmidae18. Gene rearrangements in mitogenomes are useful for reconstructing brachyuran phylogeny19. In the present study, we sequenced the complete mitogenome of C. sinensis to clarify its evolutionary status and gene rearrangements by comparison with the complete brachyuran mitogenomes available to date20, 21. This information may provide insights into mitochondrial gene rearrangement and support phylogenetic analysis of Brachyura.

Methods

Sample and DNA Extraction

Adult specimens of C. sinensis were captured in Yancheng, Jiangsu Province, China. Total genomic DNA was isolated from individual specimens using the Aidlab Genomic DNA Extraction Kit (Beijing, China) following the manufacturer’s instructions. The complete mitogenome was amplified from the DNA of a single C. sinensis individual.

PCR Amplification and Sequencing

The complete mitogenome was obtained using a combination of conventional PCR and long PCR to amplify overlapping fragments spanning the whole mitogenome. Universal and specific primers were designed based on conserved nucleotide sequences of known brachyuran mitochondrial sequences (Table 1) and synthesized by Beijing Sunbiotech22,23,24,25,26. The fragments were amplified using Aidlab Red Taq (Beijing, China) according to the manufacturer’s instructions. All amplifications were performed on an Eppendorf Mastercycler and Mastercycler gradient in 50 µl reaction volumes containing 5 µl of 10× Taq Buffer (Mg2+) (Aidlab), 4 µl of dNTPs (2.5 mM, Aidlab), 2 µl of each primer (10 µM), 2 µl of DNA template (~30 ng), 34.5 µl of ddH2O, and 0.5 µl of Red Taq DNA polymerase (5 U, Aidlab). PCR was performed using the following program: 94 °C for 3 min; 40 cycles of denaturation at 94 °C for 30 s, annealing at 48–56 °C for 35 s (depending on the primer combination), and elongation at 72 °C for 30 s to 4 min (depending on the fragment length); and a final extension at 72 °C for 10 min. The PCR products were separated by agarose gel electrophoresis (1% w/v) and purified using a DNA gel extraction kit (Transgen, Beijing, China). The purified products were then ligated into the T-vector (Sangon, Shanghai, China) and sequenced.
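As a quick arithmetic check on the reaction mix described above, the listed component volumes can be summed and scaled to a master mix. This is a minimal sketch, not part of the original protocol: it assumes two primers per reaction (the text says only "each primer"), and the function names and the 10% pipetting overage are hypothetical choices for illustration.

```python
# Volumes in microlitres for one reaction, as listed in the text.
# Assumption: "each primer" means one forward and one reverse primer.
components = {
    "10x Taq buffer (Mg2+)": 5.0,
    "dNTPs (2.5 mM)": 4.0,
    "forward primer (10 uM)": 2.0,
    "reverse primer (10 uM)": 2.0,
    "DNA template (~30 ng)": 2.0,
    "ddH2O": 34.5,
    "Red Taq DNA polymerase": 0.5,
}


def total_volume(mix):
    """Return the summed volume of all components in microlitres."""
    return sum(mix.values())


def scale_mix(mix, n_reactions, overage=0.1):
    """Scale each component for n reactions plus a fractional overage.

    The overage (default 10%) is an illustrative allowance for pipetting
    loss; it is not specified in the source.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in mix.items()}


if __name__ == "__main__":
    print(f"Total per reaction: {total_volume(components)} ul")
    for name, vol in scale_mix(components, n_reactions=10).items():
        print(f"{name}: {vol} ul")
```

With two primers per reaction, the listed volumes add up to the stated 50 µl.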

Table 1 Primers used in this study.

Complete Mitogenome Analysis

The graphical map of the complete mitogenome was drawn using the online mitochondrial visualization tool mtviz27. The cloverleaf secondary structures and anticodons of the transfer RNAs were identified using the tRNAscan-SE web server28. Codon usage and the nucleotide composition of the mitogenome were determined using MEGA6. The sequences of 29 Brachyura species and Alpheus distinguendus were aligned using MAFFT29.

Phylogenetic Analysis

Twenty-eight complete Brachyura mitogenomes were downloaded from GenBank (https://www.ncbi.nlm.nih.gov/genbank/). In addition, the mitogenome of A. distinguendus was downloaded from GenBank and used as an outgroup taxon. GenBank sequence information is shown in Table 2.

Table 2 List of Brachyura species analysed in this study with their GenBank accession numbers.

The sequences were aligned with the mitochondrial sequences of closely related species. To remove gaps, poorly aligned positions and divergent regions were removed using Gblocks25. The FASTA sequences were then converted to NEXUS and PHYLIP formats for Bayesian inference (BI) and maximum likelihood (ML) analyses, respectively, using online software (http://sequenceconversion.bugaco.com/converter/biology/sequences/fasta_to_phylip.php). We used DAMBE to assess substitution saturation in the sequences30.
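The format conversion step can also be done locally. The following is a minimal sketch of a FASTA to relaxed PHYLIP conversion in plain Python; it is not the online tool used in this study, and the function names and the toy input are hypothetical. It assumes the input is already aligned, so all sequences have the same length.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples.

    Only the first whitespace-delimited token of each header is kept.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_relaxed_phylip(records):
    """Format aligned records as sequential relaxed PHYLIP text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    # Pad names to a common width, leaving two spaces before the sequence.
    width = max(len(name) for name, _ in records) + 2
    lines = [f"{len(records)} {lengths.pop()}"]
    for name, seq in records:
        lines.append(name.ljust(width) + seq)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    toy = ">taxon_A\nATGC\nAT\n>taxon_B\nATGCAA\n"
    print(to_relaxed_phylip(parse_fasta(toy)))
```

The relaxed variant is used here because it does not truncate taxon names to ten characters, which matters for long species names.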

We determined the taxonomic status of C. sinensis within Brachyura by reconstructing the phylogenetic tree. The nucleotide sequences of the PCGs from the 30 mitogenomes were combined. The dataset was analysed using two inference methods, BI and ML. BI was performed using MrBayes v3.2.131, while ML analysis was performed using raxmlGUI32. The nucleotide substitution model was selected using the Akaike information criterion implemented in MrModeltest v2.333, 34. GTR+I+G was selected as the best-fitting model for the nucleotide phylogenetic analysis. BI and ML analyses were performed under the GTRCAT model on the nucleotide alignment (NT dataset) of the 13 mitochondrial PCGs. ML analyses were performed with 1000 bootstrap replicates. The BI analysis was run with 4 simultaneous MCMC chains for 10,000,000 generations, sampled every 100 generations, with a burn-in of 5000 generations. The average standard deviation of split frequencies was less than 0.01, and the effective sample size determined using Tracer v1.6 exceeded 200. These two results indicate that the runs had converged. The resulting phylogenetic trees were visualized using FigTree v1.4.2.

Results and Discussion

Genome Structure and Organization

The mitogenome of C. sinensis is 15,706 bp long, and its gene content is the same as that of most known Brachyura: 13 PCGs, 2 rRNA genes, and 22 tRNA genes plus the CR (Table 3 and Fig. 1). Twenty-three genes are encoded on the J strand and the remaining 14 genes are encoded on the N strand. The sequence has been deposited in GenBank under accession number KU589292. The genome composition (A: 37.1%, T: 38.6%, C: 14.9%, G: 9.4%) shows a strong A+T bias, with A+T accounting for 75.7% of the bases, and exhibits a negative AT skew ([A − T]/[A+T] = −0.020) and a negative GC skew ([G − C]/[G+C] = −0.228). The AT skew of previously sequenced Brachyura mitogenomes ranges from −0.080 (Pachygrapsus crassipes) to 0.040 (Homologenus malayensis), while the GC skew ranges from −0.341 (Austinograea rodriguezensis, Geothelphusa dehaani) to −0.219 (Portunus pelagicus) (Table 4). However, A+T content differs among regions. The CR had the highest A+T content (82.9%), whereas the PCG region had the lowest (74.2%) (Table 5).
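The skew formulas given above are simple to compute. The sketch below implements them in Python; the function names are hypothetical, and the inputs in the example are the rounded percentages from the text rather than raw base counts, so the results can differ from the published values in the third decimal place.

```python
from collections import Counter


def at_skew(a, t):
    """AT skew = (A - T) / (A + T); inputs may be counts or percentages."""
    return (a - t) / (a + t)


def gc_skew(g, c):
    """GC skew = (G - C) / (G + C); inputs may be counts or percentages."""
    return (g - c) / (g + c)


def skews_from_sequence(seq):
    """Compute A+T fraction, AT skew and GC skew from a nucleotide string."""
    counts = Counter(seq.upper())
    a, t, g, c = (counts[b] for b in "ATGC")
    total = a + t + g + c
    return {
        "AT_content": (a + t) / total,
        "AT_skew": at_skew(a, t),
        "GC_skew": gc_skew(g, c),
    }


if __name__ == "__main__":
    # Rounded whole-genome composition reported for C. sinensis (percent).
    print(round(at_skew(37.1, 38.6), 3))
    print(round(gc_skew(9.4, 14.9), 3))
```

From the rounded percentages, the AT skew comes out at about −0.020, matching the text, while the GC skew comes out at about −0.226 rather than the reported −0.228. The small difference may reflect rounding of the composition values.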

Table 3 Summary of Clistocoeloma sinensis mitogenome.
Figure 1

Graphical map of the mitogenome of Clistocoeloma sinensis. Protein-coding and ribosomal RNA genes are shown using standard abbreviations. Genes for transfer RNAs are abbreviated using a single letter. S1 = AGN, S2 = UCN, L1 = CUN, L2 = UUR. CR = control region. The 13 protein-coding genes are yellow, tRNAs are green, rRNAs are red, and CRs are dark red.

Table 4 Composition and skewness of mitogenome in 29 Brachyura species.
Table 5 Composition and skewness of Clistocoeloma sinensis mitogenome.

Protein-Coding Genes

Among the 13 PCGs, 9 (nad2, cox1, cox2, atp8, atp6, cox3, nad3, nad6, and cob) were encoded on the J strand, while the remaining 4 (nad5, nad4, nad4L, and nad1) were encoded on the N strand. The 13 PCGs ranged in size from 159 to 1731 bp (Table 3). Their A+T content was 74.2% and their AT skew was −0.026 (Table 5). The relative synonymous codon usage for C. sinensis at the third position is shown in Fig. 2. The usage of both two- and four-fold degenerate codons was biased toward codons rich in A or T (Table 6), which is consistent with other Brachyura species35,36,37.
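Relative synonymous codon usage (RSCU) is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below shows that calculation for a single codon family. The function name and the codon counts are hypothetical and are not taken from Table 6.

```python
def rscu(codon_counts, family):
    """Return RSCU values for the codons in one synonymous family.

    RSCU = observed count / (family total / number of codons in family).
    A value above 1 means the codon is used more often than expected
    under equal usage; below 1 means it is used less often.
    """
    total = sum(codon_counts.get(codon, 0) for codon in family)
    if total == 0:
        return {codon: 0.0 for codon in family}
    expected = total / len(family)
    return {codon: codon_counts.get(codon, 0) / expected for codon in family}


if __name__ == "__main__":
    # Hypothetical counts for the two-fold Leu (UUR) family, for illustration.
    counts = {"TTA": 30, "TTG": 10}
    print(rscu(counts, ["TTA", "TTG"]))
```

In this toy example the A-ending codon has an RSCU of 1.5 and the G-ending codon 0.5, which is the kind of A/T-biased pattern described in the text.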

Figure 2

Relative synonymous codon usage in the Clistocoeloma sinensis mitogenome. (A) Codon families are shown on the x axis. (B) Nucleotide composition.

Table 6 The codon number and relative synonymous codon usage in Clistocoeloma sinensis mitochondrial protein coding genes.

Transfer RNAs, Ribosomal RNAs, and A+T-Rich Region

Like most Brachyura mtDNA, the C. sinensis mitogenome contains a set of 22 tRNA genes (Fig. 3), although this feature is not very well conserved in animal mtDNA. The tRNAs ranged in size from 64 to 73 bp and showed a strong A+T bias (76.2%). Further, they exhibited a negative AT skew (−0.010) (Table 5). Fourteen tRNA genes were present on the J strand and eight were on the N strand. All the tRNA genes had the typical cloverleaf structure, except for the trnS1 gene, whose dihydrouridine arm was replaced by a simple loop (Fig. 3). These features are common in most Brachyura mitogenomes35,36,37. The cloverleaf secondary structures of 18 tRNAs were identified using tRNAscan-SE; the 4 tRNAs not detected by tRNAscan-SE were found in the unannotated regions by sequence similarity to the tRNAs of other crabs. The 2 rRNA genes, with a total A+T content of 80.2% and a positive AT skew (0.007) (Table 5), were located between trnL1 and trnV and between trnV and the CR. rrnL is 1336 bp long, while rrnS is 832 bp long. The CR, located between rrnS and trnQ, spans 684 bp. This region contains 82.9% A+T, with a positive AT skew (0.047) and a negative GC skew (−0.228) (Table 5).

Figure 3

Secondary structures of the 22 transfer RNA genes of Clistocoeloma sinensis. The tRNAs are labelled with the abbreviations of their corresponding amino acids. Dashes (−) indicate Watson-Crick pairing.

Gene Arrangement

Gene order within the complete mitogenome of C. sinensis is similar to the pancrustacean ground pattern38,39,40 (Fig. 4A), except for the translocation of trnH. In the pancrustacean ground pattern, the trnH gene is typically located between the nad4 and nad5 genes, but in C. sinensis it is located between the trnE and trnF genes (Fig. 4B). This translocation was also observed in the mitogenomes of the Brachyura crabs available in GenBank that were compared with the C. sinensis mitogenome. In addition, in the pancrustacean ground pattern, the tRNA gene order between the CR and nad2 is trnI-trnQ-trnM, whereas in C. sinensis it is trnQ-trnI-trnM (Fig. 4B). Such tRNA rearrangements are generally considered to be a consequence of tandem duplication of part of the mitogenome41. Similar non-coding sequences are present at the position of trnI originally occupied by the transposed trnQ in C. sinensis. Because these intergenic sequences are similar in length to typical tRNA genes, they were presumed to be remnants of the trnQ gene and its boundary sequences42. The gene order of C. sinensis is identical to that of S. sinensis (Fig. 4B), which indicates that C. sinensis may belong to the family Sesarmidae of the superfamily Grapsoidea and that C. sinensis and S. sinensis are probably sister taxa.

Figure 4

Linear representation of gene rearrangements of Brachyura mitogenomes. All genes are transcribed from left to right. tRNA genes are represented by the corresponding single-letter amino acid code. S1 = AGN, S2 = UCN, L1 = CUN, L2 = UUR. CR = control region. rrnL and rrnS are the large and small ribosomal RNA subunits.

The gene orders of the Varunidae species (Eriocheir japonica sinensis, E. j. hepuensis, E. j. japonica, and Helice latimera) are identical (Fig. 4C). As shown in Fig. 4D, the order and orientation of genes in 7 families are uniform. The gene order in C. sinensis differs from that in the mitogenomes of these 7 families because of the rearrangement of two tRNA genes between the CR and trnM: the arrangement in C. sinensis is CR-trnQ-trnI-trnM, while that in the 7 families is CR-trnI-trnQ-trnM. In this case, the most likely mechanism for the rearrangement is tandem duplication of the gene region that includes trnI and trnQ, followed by loss of the supernumerary genes43, 44. Under this model, slipped-strand mispairing occurs first, followed by gene deletion45. Some PCGs, tRNAs, and rRNAs of Damithrax spinosissimus, G. dehaani, and Xenograpsus testudinatus appear to be rearranged compared with C. sinensis (Fig. 4E–G).
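The comparison of gene orders described above can be expressed as a simple list comparison. The following sketch encodes only the CR to trnM segment given in the text and checks whether two orders differ by a swap of two adjacent genes. The function names are hypothetical, and the check says nothing about the underlying mechanism (such as tandem duplication followed by random loss).

```python
# Gene order between the control region and trnM, as stated in the text.
seven_families = ["CR", "trnI", "trnQ", "trnM"]
c_sinensis = ["CR", "trnQ", "trnI", "trnM"]


def differing_positions(order_a, order_b):
    """Return the indices at which two equal-length gene orders differ."""
    return [i for i, (x, y) in enumerate(zip(order_a, order_b)) if x != y]


def is_adjacent_swap(order_a, order_b):
    """True if order_b equals order_a with exactly two adjacent genes swapped."""
    if len(order_a) != len(order_b):
        return False
    diff = differing_positions(order_a, order_b)
    return (
        len(diff) == 2
        and diff[1] == diff[0] + 1
        and order_a[diff[0]] == order_b[diff[1]]
        and order_a[diff[1]] == order_b[diff[0]]
    )


if __name__ == "__main__":
    print(differing_positions(seven_families, c_sinensis))
    print(is_adjacent_swap(seven_families, c_sinensis))
```

For this segment, the two orders differ only at the positions of trnI and trnQ, consistent with the description of a two-gene rearrangement.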

Phylogenetic analysis

Our analyses were based on the NT dataset of mitogenomes from 29 Brachyura species belonging to 12 families (Varunidae, Xenograpsidae, Homolidae, Menippidae, Mithracidae, Potamidae, Portunidae, Raninidae, Bythograeidae, Sesarmidae, Grapsidae, and Dotillidae). The data matrix (15,706 bp in all) was analysed using the model-based evolutionary methods of BI and ML (Fig. 5). The ML and BI analyses of the dataset gave the same tree topology. C. sinensis and S. sinensis clustered on one branch of the phylogenetic tree with high nodal support values (Fig. 5), indicating that C. sinensis and S. sinensis have a sister group relationship. This result supports the placement of C. sinensis in Grapsoidea, Sesarmidae. In the phylogenetic tree, X. testudinatus and the two Sesarmidae species formed a group, indicating a close relationship. X. testudinatus, which was originally placed in Varunidae, has been transferred to its own family (Xenograpsidae)21, 46. Analysis of the nucleotide sequences of the 13 mitochondrial PCGs using BI and ML showed that E. j. sinensis, E. j. hepuensis, E. j. japonica, and H. latimera clustered together with high statistical support, showing that these taxa are closely related and belong to Grapsoidea, Varunidae. Our phylogenetic analysis indicated that Sesarmidae, Xenograpsidae, and Varunidae species are closely related47. In addition, P. crassipes belongs to Grapsoidea, Grapsidae48.

Figure 5

Inferred phylogenetic relationships among Brachyura based on nucleotide sequence of 13 mitochondrial PCGs using maximum likelihood (ML) and Bayesian inference (BI). Alpheus distinguendus was used as the outgroup. The bootstrap value (BP) and Bayesian posterior probability (BPP) of each node are shown as BP based on the NT dataset/BPP based on the NT dataset, 100/1.00.

The phylogenetic position of Ilyoplax deschampsi is always recovered within Grapsoidea21, 47, 49, 50, although I. deschampsi belongs to the family Dotillidae, Ocypodoidea. The true phylogenetic position of I. deschampsi is therefore likely to be close to the Grapsoidea species, as shown in Fig. 5. Recent studies on the genus Ucides have shown a similar classification51, 52. G. dehaani belongs to Potamidae, Potamoidea53. However, the phylogenetic tree showed that Potamidae is closely associated with Varunidae, Grapsidae, Sesarmidae, Dotillidae, and Xenograpsidae. This result agrees with that inferred from 23 brachyuran crabs, in which the authors used the two mitogenomes21. The phylogenetic relationships between I. deschampsi, G. dehaani, and the Grapsoidea species need to be reconsidered by integrating more mitogenomic data. More mitogenomic data will also lead to a better overall understanding of the phylogenetic relationships among brachyuran crabs.

Availability of data and materials

The data set supporting the results of this article is available at NCBI (KU589292).