Comparative mitogenomics and phylogenetics of the family Carangidae with special emphasis on the mitogenome of the Indian Scad Decapterus russelli

Carangids are abundant and commercially important marine fish that contribute to a significant portion of the fisheries in many parts of the world. In the present study, we characterized the complete mitogenome of the Indian scad, Decapterus russelli and performed a comprehensive comparative mitogenomic analysis of the family Carangidae. The comparative mitogenomics provided valuable insights into the structure, variability, and features of the coding and non-coding regions that evolved across species over millions of years. The structural features of tRNAs revealed changes in the frequency of mismatched and wobble base pairs, which is reflected in the base composition of H and L strands. The highly conserved sequence motif of the mTERF binding site in carangids over the ~ 400 MYA of their divergence demonstrated the functional importance of these sites. The control region of carangids was characterized by the presence of discontinuous repeat units with a high rate of sequence divergence in the form of base substitutions, insertions, and deletions. The maintenance of secondary structures in the control region independent of the rapid evolution of primary structure suggested the effect of selective constraints on their maintenance. Maximum likelihood (ML) and Bayesian inference (BI) phylogeny revealed a similar topology consistent with previous taxonomic studies. The extant carangids diverged through the evolutionary events experienced during the Cretaceous, Paleogene, and Neogene periods.

www.nature.com/scientificreports/ of selective constraints for their maintenance. Although Zhu et al. 25 analyzed the phylogenetic relationships of carangids using the control region, there is a gap in knowledge regarding the mutational pattern and the nonrandom distribution of these mutations between the conserved motifs and their flanking regions. There is also a lack of knowledge about the structure and variability of tRNA genes in carangids and the frequency of occurrence of G-U wobble base pairs, the fundamental building block of RNA structure critical for RNA function in various biological systems. The phylogenetic relatedness between the carangids has been extensively studied using comparative morphological 4,[26][27][28][29][30][31] and molecular tools [32][33][34][35][36][37] . Although there has been a discourse on monophyly 4,28-31 vs. paraphyly 36,37 of carangids, the congruent topologies of phylogenetic trees constructed using DNA sequences suggested paraphyly of carangids if Echeneoids were not included [35][36][37][38] . Carangids can be considered as promising candidates for evolutionary studies, as they have a wide diversity in terms of habitat (marine, brackish water, freshwater), morphology (having variable body size; less than 20 cm to ~ 2 m), and shape (deep-bodied, streamlined). Some species of remoras live even as commensals on large pelagic vertebrates 39 . The phylogeographic study inferred from nuclear gene supermatrix suggested a terminal cretaceous crown age for carangids with major lineages diversifying during the Paleogene 36 . A late Eocene age has also been proposed for the crown carangids 35 . But densely sampled molecular studies indicated a Cenomanian stem age of the Carangoids which had undergone a significant diversification during the late cretaceous 37,38 . The subfamily Naucratinae originated early in the Paleocene 40 , Trachinotinae date back to the early Oligocene and the remaining Carangoid lineages including the morphologically heterogeneous group, Caranginae comprising the genus Decapterus appear to have originated during the middle Eocene 38 . The genus Decapterus originated in the upper Miocene 41 . The genetic distance between the genus Decapterus was significantly greater than that between Trachurus 42 , suggesting that Decapterus arose long before its sister group Trachurus, making them the most divergent species group. The morphology of the genus Decapterus includes multiple forms such as long-bodied, cylindrical, mackerel-like with a small mouth, reduced teeth and reduced scutes on the straight part of the lateral line, suggesting adaptations to the wide ranging off-shore life 43 .
In the present study, we report the complete mitogenome of D. russelli for the first time elucidating the special characteristics. Although recent studies have examined the evolutionary history of carangids 44 , we reanalyzed the patterns by including mitogenomes from 37 species and describing the relationships and evolutionary patterns among species with a robust phylogenetic tree. We also performed a comprehensive comparative analysis of carangid mitogenomes especially the mutational patterns of coding and non-coding regions, control region secondary structure, and structural features of tRNA to gain insights into the evolutionary dynamics of mitochondrial DNA.

Results and discussion
Mitochondrial genome organization and composition. The total length of the mitochondrial genome of D. russelli was 16,542 bp. The genome organization is congruent with that observed in other teleost fishes with a standard set of 13 putative PCGs, 22 tRNA genes, 2 rRNA genes, and a control region. All genes were encoded on the heavy strand (H) except for the ND6 and 8 tRNA genes which were encoded on the light strand (L) (Supplementary Fig. S1). A similar gene arrangement was observed in other Carangidae species studied (Supplementary Table S1). The overall nucleotide composition of D. russelli mitochondrial genome was 27.5% A, 30.2% C, 25.4% T, and 16.9% G. The AT and GC skew values of the whole mitogenome were 0.039 and − 0.028 respectively.
The results suggested that A and C are predominant base constituents than T and G, with the overall nucleotide composition being severely C-skewed. These results agree well with their respective counterparts in other Decapterus spp. [45][46][47] The bias against G was presented following the results reported previously in other teleosts 48,49 . The AT-/GC-skew values of other Carangidae species were also determined (Supplementary Table S2). The sequence of the complete mitogenome of D. russelli has been deposited with NCBI, GenBank with the accession number MN711693.
Overlapping and Intergenic spacer regions. Seven gene overlapping regions ranging from 1-7 bp and 11 spacer regions varying in length from 1 to 38 bp were detected in the D. russelli mitogenome with the longest intergenic spacer region observed between tRNA Asn and tRNA Cys. Significant differences in the number and length of both the intergenic spacer region and the overlapping gene regions were observed in carangids (Supplementary Table S3). The number of spacer and overlapping gene regions varied between 10-12 and 6-15, respectively, between species. The intergenic spacer is distributed over 11 regions in almost all carangid species studied so far. However, an additional spacer region was observed in C. armatus (tRNA-His-tRNA-Ser (GCT)), E. bipinnulata (ND6-tRNA-Glu), G. speciosus (tRNA-Val-16SrRNA), P. niger (16SrRNA-tRNA-Leu (TAA)), S. crumenophthalmus (ND6-tRNA Glu) and S. dumerili (ND6-tRNA Glu). In addition, the spacer region found between tRNA-Arg and ND4L in all species was absent in Seriola spp, S. nigrofasciata, T. carolinus and T. ovatus. Besides, C. equula and P. dentex lacked a spacer region found between ND1 and tRNA-Ile which was present in all other taxa. A comparative analysis among 37 carangids revealed that the longest spacer (42 bp) occurred between tRNA-Asn and tRNA-Cys of T. carolinus. Seven regions of overlapping gene regions were observed in most carangids, with the longest overlapping region being between ATP8 and ATP6 in some species and between ND4L and ND4 in others. www.nature.com/scientificreports/ We observed that the value of Ka was significantly smaller than Ks in all 13 PCGs on analyzing the evolutionary rate of PCGs. This indicates that there is a much lesser chance for a mutation to be different between species that can modify the amino acid sequences of a protein sufficiently to change its function than one which is silent 52 Table S5). The fundamental features of genome evolution were largely determined by the influence of two non-adaptive forces, mutation pressure and genetic drift 53 . Nevertheless, the deleterious effect of mutation is very difficult to establish under purifying selection 54 . In PCGs, the average Ka/Ks ratio varied from 0.008 in COI to 0.469 in ATP8. Altogether the results indicated that the functional genes of carangids have undergone strong purifying selection to eliminate deleterious mutations thus maintaining the protein structure.

Protein Coding Genes (PCGs
We estimated the degree of mitochondrial gene conservation by assessing the total p-genetic distance between 37 Carangidae species ( Supplementary Fig. S3). In our analysis, we obtained the highest overall p genetic distance for the ATP8 gene (0.301) and the lowest value for the COI gene (0.019) based on the data of the first and second nucleotide position of codons. The same trend was also observed in the values discerned from the full-length comparison of each gene, with ATP8 and COI represented by a total p-genetic distance value of 0.390 and 0.134, respectively. Based on these results it could be inferred that gene, ATP8 has the fastest evolutionary rate among Carangidae and COI the slowest. Besides, all genes had overall high p-genetic distance values for wobble nucleotide position, ranging from 0.393 to 0.564. This result was consistent with previous reports in other fish species that most of the differences occurred at the 3 rd codon position of PCGs in mtDNA 55 . A comparative analysis showed that the codons ending in A or C occurred most frequently in amino acids with sixfold and fourfold degenerate third positions in Carangidae. In a two-fold degenerate codon, C was used more frequently than T. Consistent with the overall bias against G 56 , it was the least common wobble position nucleotide in all categories. However, analysis of relative synonymous codon usage (RSCU) showed that the codons UCC for Ser, CUC and CUA for Leu and GCC for Ala occurred most frequently, while UUG for Leu, AGU for Ser and GCG for Ala were rare in the mitogenome of Carangidae ( Supplementary Fig. S4B). It was also inferred from the RSCU value that the synonymous codons NNA and NNC were in the majority than the codons NNT and NNG (Supplementary Table S6).

Codon usage analysis.
Transfer RNAs and ribosomal RNAs. The mitogenome of D. russelli contained 22 tRNA genes typical of the metazoan mitogenome. In total, the tRNA genes contributed 1555 bp to the entire genome, which ranged from 67 bp (tRNA-Cys) to 74 bp (tRNA-Lys (TAA)). The secondary structure of tRNA genes predicted by ARWEN 57 makes it clear that all tRNAs could be folded into a canonical cloverleaf structure, with the exception of tRNA-Ser (GCT), in which no recognizable dihydrouridine (DHU) stem was found ( Supplementary Fig. S5). Comparative analysis with other carangids revealed that the total length of tRNAs varied from 1548 (S. rivoliana and E. bipinnulata) to 1564 (T. blochii). The overall A + T content was estimated at 54.2%. Base skew analysis of 22 concatenated tRNA genes (Supplementary Table S2 revealed a positive AT skew (0.125) and a negative GC skew (− 0.098). The comparisons showed that the AT skew varied from 0.105 (S. nigrofasciata) to 0.224 (T. ovatus) and the GC skew varied from -0.120 (S. rivoliana) to − 0.091 (T. carolinus). The anticodons of these 22 tRNA genes were identical to those reported in other carangids (Supplementary Table S7). Despite the general oneto-one codon-anticodon correspondence, tRNA genes for Ser (TGA and GCT) and Leu (TAA and TAG) have been determined by two types of anticodons. Furthermore, G-U mismatches were observed for 16 of the tRNAs in their respective secondary structures ( Supplementary Fig. S5). Five of the tRNA genes appeared to have A-C mismatches (tRNA-Val (TAC), tRNA-Leu (TAA), tRNA-Ile (GAT), tRNA-Arg (TCG), tRNA-His (GTG) in their respective amino acid acceptor stems. Similarly, two A-A mismatches, each for tRNA-Phe (GAA) and tRNA-Ser (GCT) and two C-C mismatches each for tRNA-Leu (TAG) and tRNA-Thr (TGT) were also detected. The presence of unmatched base pairs in mitochondrial tRNAs appeared to be a general trend that can later be repaired by post-transcriptional editing mechanisms prevalent in vertebrate mtDNA 58 .
Two rRNA genes, a small (12 s rRNA) and a large (16 s RNA) subunit were located in the H strand of D. russelli. The total length of two concatenated rRNA genes was estimated to be 2677 bp with an A + T content of 53.7% (Supplementary Table S2). The AT content of the 12S rRNA gene was 52.2% with a positive AT skew of 0.172 and a negative GC skew of − 0.079. Similarly, the AT content of the 16S rRNA gene was 54.4% with a positive AT skew (0.191) and a negative GC skew (− 0.116). Moreover, the presence of a higher proportion of A and C bases in rRNA genes indicates an AC-rich trend, consistent with previous reports from other teleosts 53,59,60 . Consistent with previously reported results from other mitochondrial genomes 61-63 , these two rRNA genes were positioned between tRNA-Phe and tRNA-Leu  Table S2).
Features of tRNAs. Secondary structure. The comparative analysis of 37 carangids revealed that most of the tRNAs are structurally different in both the stem and the loop. However, 2 tRNA genes, tRNA-Trp, and tRNA-Leu (TAG) exhibited similarities in their secondary structure (Fig. 1). As can be seen from www.nature.com/scientificreports/ variation was detected in all four stems. The anticodon stem (AC stem) of all the tRNAs was 5 bp, except for tRNA-Ser (GCT), which displayed length variation (5 or 6 bp). Though the pseudouridine (TΨU) stem of most tRNAs was 5 bp, tRNA-Phe (4 bp), tRNA-Met (4 or 5 bp), tRNA-Lys (4 bp), and tRNA-Ser (GCT) (6 bp) indicated length variation. In the acceptor stem (AA stem), tRNA-Ser (TGA) displayed a length of 6 bp, and the rest of the tRNAs showed more of a length variation of either 7 bp or a range of 7 to 8 bp. Although the DHU stem was mostly 4 bp in length with a range of 3 to 5 bp, tRNA-Ser (GCT) of all carangids examined had only a small loop with no stem. Of the four loops, the length of the AC loop was fixed (4 nucleotides), whereas the remaining three loops showed considerable variation in length. Of these, the DHU loop was extremely variable, ranging from 3-13 Wobble base pairs in the stem regions. The wobble base pair is a unique feature of RNA structure and their often high evolutionary conservation 64 underscores their functional and structural importance. Due to its thermodynamic stability, the G-U often substitutes G-C or A-U base pairs and it differs from Watson-Crick and other mismatched base pairs owing to its structural, conformational, and functional properties 65 . Diverse proteins and other ligands specifically targeting tRNA exploit these features 66 and thus the comparative analysis of tRNAs is very crucial for understanding the structural and functional features of the mitogenome. Supplementary Table S8 illustrates the frequencies of Watson-Crick and Wobble pairing at each of the 25 base pair sites of each tRNA in 37 carangids (8 in AA stem + 5 in DHU stem + 6 in AC stem + 6 in TΨU stem). Seven sites from each stem (1-87, 2-86 and 8-80 base pair sites in the AA stem, 12-32 and 15-30 in the DHU stem, 40-48 in the AC stem, and 64-74 in the TΨU stem) revealed a high rate of (100%) Watson-Crick and Wobble base pairing (hydrogen bonding) and these sites may play a pivotal role in maintaining stem structure. Among the tRNAs, tRNA-Ala, tRNA-Asn, tRNA-Cys, tRNA-Ser (TGA), tRNA-Asp, and tRNA-Pro had a higher frequency (≥ 98% overall average) of hydrogen bonding than other tRNAs. Moreover, L strand-encoding tRNAs exhibited, on average, a higher frequency of hydrogen bonding (97.48%) than H strand encoding tRNAs (94.78%). Table 3 shows the number of G-U wobble base pairs in each stem of 22 tRNAs for the 37 carangid species. Individual stems and tRNAs rather displayed considerable variations in the frequency of GU base pairs, 6% of the base pairs comprised of wobble base pairs on average in the four stems of mitochondrial tRNA with a large difference between stems (4.57% in the TΨU stem and 12.26% in the DHU stem). A substantial difference was also discerned between the tRNAs (0.72% in tRNA-Ile versus 19% in tRNA-Glu) with no wobble base pairs observed in the case of tRNA-Lys and tRNA-Leu (TAG). Furthermore, the tRNAs encoded by different strands exhibited a significant difference in the frequency of wobble base pairs, as the L-strand encoded tRNAs had a higher frequency of wobble base pairs (13.18%) than those of H-strand encoded tRNAs (3%) (Mann-Whitney U test, p < 0.01).
Mitochondrial transcription termination factor (mTERF) binding site. We compared tRNA-Leu (TAA) which has been reported to contain the target binding site for mTERF in humans 67 to identify its presence in Carangidae. An important tridecameric sequence (5'TGG CAG AGC CCG G3') reported in the human mitochondrial tRNA-Leu (TAA) gene 48,49 was also identified in Carangidae in the same region as the corresponding part of www.nature.com/scientificreports/ their DHU arm. The mTERF binds to this site with high affinity and regulates the level of transcription from the rRNA genes into the remaining genes of the major coding strand 50 . All bases in the 13 nucleotide positions identified in 37 carangid taxa were completely identical to those in humans ( Supplementary Fig. S7), suggesting that this site in carangids may follow the same mechanism of function as proposed for humans.  Table S2). However, the control region of A. mate, A. djedaba, C. ignobilis, G. speciosus, and T. blochii possess a slightly higher percentage of adenine (A) than that of D. russelli. Besides, the composition of thymine (T) is higher than adenine (A) in C. armatus, C. malabaricus, C. tille, C. melampygus, T. japonicas, S. crumenophthalmus, S. dumerili, and S. lalandi. The GC skew of all the carangids was negative. It was found that the maximum length variation for the control region was observed in the range of 840 bp (D. russelli) to 904 bp (S. crumenophthalmus) in the carangids examined. Among the carangids, S. crumenophthalmus possessed the longest mitogenome, and S. rivoliana and S. dumerili were determined to have the shortest mitogenome. It was evident that these species do not exhibit any notable variation with respect to their PCGs, tRNA and rRNA ( Supplementary  Fig. S6), hence the 61-62 bp variation in the control region alone contributed to the difference in their overall mitogenome length. The majority of the primary sequences of the control region are presumed to be not involved in regulatory function as this region shows great variation even among closely related species 68 .
Comparative mitogenomics of carangids revealed high variability in the mitochondrial control region due to length variations and the accumulation of base substitutions and indels. Consistent with the recognition sites in many marine fish 69   www.nature.com/scientificreports/ and ' ATGTA' were found in multiple copies distributed towards the 5' end of the control region [ Fig. 2; Supplementary Table S9] and are believed to act as the termination site of heavy strand replication 69 . The segment corresponding to Extended Termination Associated Sequence (ETAS) previously reported in Scomberoides commersonianus, a member of the Carangidae 25 , could not be aligned to the corresponding segment in the control region of either species. No tandem repeat sequences were detected in the control region of Carangidae, but the T homopolymer of 5-12 nucleotides between CSB-D and CSB-1 was found in all taxa. Along with these discontinuous multiple interval repeats of ATA TTA and TAT AAT were also observed in the control region of Carangidae.
All species except D. maruadsi displayed substantial differences from the key sequences of CSB-E 25 and all were substitutions ranging from 1 to 9 nucleotides. Furthermore, M. cordyla, P. dentex, and Caranx spp. lacked a 'GTGGG'-box which is a typical feature of CSB-E in teleosts 69 . The CSB-D was highly conserved among carangids and exhibited high sequence similarity between species (Table 4). Although the functions of the CSBs remain unclear, CSB-D is highly conserved in fish families and is involved in regulation of H-strand replication and initiation of D-loop structure 70 and could play a potential role in mitochondrial metabolism 66 . Conserved sequence blocks (CSB-1,-2, and -3) exhibit a higher level of divergence in carangids with interspecies substitutions, insertions, and deletions (Table 4). Except for P. dentex, S. nigrofasciata, S. leptolepis and Seriola spp. other species showed a characteristic sequence at the 3' end (Table 4). These variations imply rapid evolution of the primary structure in the Carangidae control region, which can provide information for understanding their structural and functional relationship. Furthermore, these conserved regions are involved in the formation of secondary structures in the control region (Fig. 2) that serve as the target sites for proteins or enzymes 69 . These conserved sequence motifs formed similar secondary structures in Carangidae although their primary sequences are not conserved. Therefore, the data reinforce that although there is sequence divergence in the conserved sequence motif, their structural elements and locations appeared to be conserved in Carangidae and some selective constraints on these regions acted to maintain their structure and function.

Pd absent
Ci **************** Ct **************** Cme **************** Tt **************** Tj **************** Sc *****C*********T** www.nature.com/scientificreports/ Phylogenetic analysis. The phylogenetic position of D. russelli is shown in Fig. 3 relative to 36 species of carangids based on 1380 bp sequences. Analogous topologies were obtained for both maximum likelihood analysis (ML) and Bayesian inference (BI) with almost 100% bootstrap support and strong posterior probability values, respectively (Fig. 3). The result implied that 37 species formed 3 lineages corresponding to the subfamilies Caranginae, Naucratinae, and Trachinotinae. The reconstructed phylogenetic tree places D. russelli in the same branch with other Decapterus spp. and forms a sister group to the genus Trachurus and the findings support the previous classification of carangids 44,74 . The phylogenetic tree recovered the three carangid subfamilies as monophyletic and it also supported the groupings (Trachinotinae + (Naucratinae + Caranginae)). This result confirmed the phylogenetic relationship of carangids described by previous findings based on morphological synapomorphic characters 4,31 and partial sequences of mitochondrial genes 32,75 . Within the Naucratinae, the monotypic E. bipinnulata branched individually early from Seriolina and Seriola spp. After diverging from Seriolina, all Seriola species formed a monophyletic clade. Furthermore, both phylogenetic trees indicated two wellseparated lineages in Caranginae. The first lineage included the genus Selar, Trachurus, and Decapterus. This lineage also encompassed the species C. equula on the same branch with P. dentex in contrast to other Carangoides Spp. which were grouped into a different branch (Fig. 3). This result confirms the findings of previous studies 44,76 and thus provides compelling evidence for the assignment of C. equula to its previously described genus, Kaiwarinus. The second lineage was the most taxonomically diverse group of the family Carangidae delineated in the present study as it consisted of 10 genera divided into two larger groups. The first lineage comprised the genera: Caranx + Megalaspis + Gnathandon as well as Atule + Alepes. The latter also included the monotypic S. leptolepis which individually branched off from this lineage early. The second lineage included Uraspis + Parastromateus and Alectis + Carangoides. In addition, a paraphyletic relationship between Caranx + M. cordyla, Alepes + A. mate, and Uraspis + P. niger was also observed.
Divergence time estimation. The RelTime ML analysis revealed that the differentiation of D. russelli could be dated back to about 4.69 Mya in the early Pliocene. The family Carangidae had diverged from other lineages when the Cretaceous-Paleogene extinction event occurred and it was estimated to be around 79 Mya towards the end of the cretaceous (Supplementary Fig. S10). It was observed that the separation of the two groups, Trachinotinae and (Caranginae + Naucratinae) occurred at 74 Mya at the end of the Cretaceous. Further, the split between the subfamilies Caranginae and Naucratinae occurred at 62.20 Mya (Early Paleocene). Although the diversification of most species after the Cretaceous mass extinction occurred in the Paleogene 75 , several more recent Neogene radiations can also be observed (Fig. 4).The most recent split was observed at 0.34 Mya between the two Uraspis species. Our analyzes of the diversification of the genus Trachurus (18.40 Mya, early Miocene) were consistent with previous studies based on the data of morphological characters and ecological aspects 77 and cyt b gene sequence analysis 78 . The split between Seriola + Seriolina and Elgatis occurred during the early Paleocene, consistent with Swart et al. 40 but much younger than Le et al. 44 . The radiation of Seriola appears to be 20.87 Mya which is in direct agreement with Le et al. 44 , but slightly predates the estimates by Santini and Carnevale 38 .

Conclusion
The www.nature.com/scientificreports/ conservation of its sequence suggest the functional significance of these sites. The high rate of sequence divergence in the flanking regions of conserved sequence motifs indicated the absence of typical mitochondrial control region coding constraints. The presence of evolutionarily conserved secondary structures of the conserved sequence motifs despite sequence divergence revealed the effect of selective constraints on their maintenance. The length variations in protein-coding, tRNA, and rRNA genes among Carangidae were negligible. The phylogeny inferred from 13 gene concatenated data set resolved the monophyletic lineage of each subfamily and robustly supported the grouping (Trachinotinae + (Caranginae + Naucrautinae)) with the caveat that the subfamily Scomberoidinae could not be included as the mitogenome sequences were not available. Further, the speciation of the genus Decapterus occurred before the separation of Tethys Sea and D. russelli diverged before the mid-Pliocene warm period. Future studies involving the mitogenomic data of Scomberoidinae are required for a better understanding of the phylogenetic and evolutionary relationship between major groups www.nature.com/scientificreports/ within the Carangidae. Mitogenomic phylogeny is also influenced by the habitat characteristics of each group, as habitat-specific selection and adaptation in mitochondrial coding genes has recently been reported 79 . The subfamilies Trachinotinae, Naucratinae, and Caranginae consist of species with differential habitat preferences (pelagic, benthopelagic, and reef-associated) and schooling behavior. The mutational patterns in the proteincoding regions of these fish need to be further analyzed for adaptive variation and selection and correlated with habitat characteristics to understand the forces of selection, if any.

Materials and methods
Sampling. The wild specimen of D. russelli was collected from Kochi, Kerala (9°56′19"N 76°15′45" E) from trawl and purse seines. The species was confirmed based on morphological characters. Tissue samples were preserved in 95% ethanol for DNA extraction and further analysis. The fish sample used in this study was handled according to the guidelines for the care and use of fish in research by De Tolla et al. 80 . The protocols were approved by the ethical committee of the ICAR-Central Marine Research Institute, Kochi. These methods are also reported following ARRIVE guidelines (http:// arriv eguid elines. org).
DNA extraction and PCR amplification. Genomic DNA was isolated as previously described 81 . The extracted DNA was then eluted in T.E buffer (1X) and stored at − 20 °C for later use as a PCR template. The quality and quantity (concentration) of the isolated total DNA was checked by gel electrophoresis and NanoDrop One Spectrophotometer (Thermo Scientific, USA), respectively. The complete mitochondrial genome of D. russelli was determined by combining the sequences of contiguous and overlapping fragments amplified by PCR using both universal as well as custom-designed primers. Initial amplifications were performed using mitochondrial universal primer sequences (FishF1 + FishR1, 16SaRL + 16SbRH, 12S1 + 12S2, Dloop-Thr-F + Dloop-Phe-R, ATP 8.2 L8331 + CO3.2 H9236) [82][83][84][85][86] . The sequenced products of this reaction then served as a template for subsequent amplification. Fifteen sets of new primer pairs were designed based on the aligned complete mitogenome sequences of 36 species of carangids using Primer 3 software (http:// frodo. wi. mit. edu/ prime r3/) 87 . The above primers were screened for potential secondary targets using Primer Blast (https:// www. ncbi. nlm. nih. gov/ tools/ primer-blast/) 88 . PCR amplifications were performed in a Biorad T100 thermal cycler (Bio-Rad, USA). The reaction mixture included 15.6 µl sterile deionized water, 2 µl buffer (10X), 0.4 µl dNTP mix (10 mM each), 0.4 µl of each oligonucleotide (10 µM), 1 µl of template DNA (50 ng/µl) and 0.2 µl (1 unit) of Taq DNA polymerase with a total reaction volume of 20 µl. For confirmation, PCR products were electrophoresed on 1.5% agarose gel (1X TBE) and visualized using a gel documentation system (Bio-Rad, USA). The products were then sequenced completely in both directions using sanger sequencing.
Sequence analysis. Target sequences were assembled with the software MEGA X 89 . The identity of the D.
russelli mitogenome was verified with other carangid mitogenomes (Supplementary Table S2) by BLAST search. These mitogenome sequences served as the reference for assembly and manual annotations. The organization and the order of the genes were initially determined using the MitoFish database 90 and later verified manually by comparison with the reference sequences. The 13 PCGs and their codon usage were summarized with MEGA X 89 . The rate of Ka/Ks substitutions in the mitogenome of Carangidae was calculated by DnaSP 6.12.03 91 . tRNA genes were first annotated by using tRNA scan SE 92 with default parameters coupled with ARWEN software 57 . The secondary structure of conserved sequence motifs of the control region was inferred using Mfold web server 93 . The complete mitogenome map of the species was graphically visualized by OrganellarGenomeDRAW 94 . The 37 mitogenomes of Carangidae were compared by using CG view comparison tool (CCT) 95 with D. russelli as the reference sequence. Genes were assigned by Clusters of Orthologous Groups (COG) using the CG view comparison tool (CCT) and BLAST was used to align other genomes to that of D. russelli. Phylogenetic analysis. All available mitogenome sequences of other carangid species were retrieved from GenBank to find out the relative position of D. russelli among carangids and to delineate its phylogenetic relationship. L. crocea (GenBank accession number NC011710) was taken as an outgroup. The 13 PCGs were aligned using Clustal W and translated into amino acid sequences in MEGA X 57 . For the subsequent phylogenetic analysis, multiple alignments were generated for the 13 concatenated protein-coding sequences of D. russelli and 36 other Carangidae species. The phylogenetic relationship was inferred by constructing a tree based on Bayesian inference (BI) and maximum likelihood (ML) methods. The best model, GTR + G + I, was selected using Akaike Information Criterion (AIC). Bayesian phylogenetic analysis was performed using MrBayes 96 with the Markov Chain Monte Karlo Method (MCMC). MCMC chains were run for 10 million generations and obtained a standard deviation of split frequency less than 0.01. The ML phylogenetic tree was inferred using RaxML 97 and a consensus tree was obtained with 1000 bootstrap replicates using default parameters. The final phylogenetic tree was displayed, annotated, and embellished in FigTree software (http:// tree. bio. ed. ac. uk/ softw are/ figtr ee/) 98 .
Divergence time estimation. Molecular dating of 37 Carangidae species was performed with the Rel-TimeML method in MEGA X 99 using GTR + G + I modeling. The phylogenetic tree inferred from Bayesian analysis was imported to MEGA X to estimate divergence times. The time tree web resource (http:// www. timet ree. org/) 100 was used to infer calibration boundaries for a pair of taxa (minimum and maximum) and the same database was used to establish the time tree of 27 genera of carangids.

Data availability
All data generated or analyzed during this study is included in this published article (and its Supplementary information files).