Evolutionary and phylogenetic insights from the mitochondrial genomic analysis of Diceraeus melacanthus and D. furcatus (Hemiptera: Pentatomidae)

The mitochondrial genomes of D. melacanthus and D. furcatus were sequenced and used to investigate the phylogenetic relationships with 54 species of Pentatomidae. Their mitogenomes are 17,197 and 15,444 bp-long, respectively, including 13 protein-coding genes (PCGs), 2 ribosomal RNA genes, and 22/21 transfer RNA genes, with conserved gene arrangement. Leu, Lys, and Ser were the most common amino acids in their PCGs. PCGs evolutionary analysis indicated their mitogenomes are under purifying selection, and the most conserved genes are from the cytochrome complex, reinforcing their suitability as markers for molecular taxonomy. We identified 490 mtSSRs in 56 Pentatomidae species, with large variation and a positive correlation between mtSSR number and genome size. Three mtSSRs were identified in each Diceraeus species. Only the mtSSR in the nad6 (D. melacanthus) and nad4 (D. furcatus) appear to have application as molecular markers for species characterization. Phylogenetic analysis confirmed the monophyly of Pentatomidae. However, our analysis challenged the monophyly of Pentatominae and Podopinae. We also detected unexpected relationships among some tribes and genera, highlighting the complexity of the internal taxonomic structure of Pentatomidae. Both Diceraeus species were grouped in the same clade with the remaining Carpocorini analyzed.

The green-belly bugs Diceraeus furcatus and D. melacanthus (Hemiptera: Pentatomidae) are important pests of soybean, winter cereal, and especially maize in Brazil [1][2][3][4] .Their increased abundance is attributed to the agricultural system in practice in Brazil, with the continuous cultivation of maize following soybean harvest.This system provides year-round food and shelter for green-belly bugs and other pests [5][6][7] .Despite their relevance as agricultural pests, few molecular studies have been conducted on these species 8,9 , especially concerning their genetic delimitation, given their morphological similarity and considerable phenotypic plasticity 10 .
The identification of some insects is challenging and it requires additional tools, such as the use of integrative diagnostic methods based on molecular, morphological, and paleontological data 11 .In recent decades, the use of DNA barcode markers based on a short region of the mitochondrial cytochrome oxidase I gene (cox1) has been the most used molecular tool.The cox1 became widely used for species recognition and delineation in various taxonomic groups 12 , although it has shown limited application for species delineation in taxonomic groups carrying low intraspecific variability 13 .
The mitochondrial genome (mtDNA) has emerged as a powerful tool for insect diversity and phylogenetic studies, expanding the potential pool of informative marker genes 13 .Its unique features, such as small size 14 , stable genetic composition 15 , maternal inheritance 16 , orthologous genes and low intermolecular recombination rates 17 , make it suitable for evolutionary studies.mtDNA serves as a source of molecular markers for population genetics, comparative evolution, divergence time analysis [18][19][20] , and gene rearrangements 21,22 .mtDNA markers are also instrumental in species recognition and delineation 23 , resolving conflicts of species complexes and in studying the biodiversity of taxonomic groups lacking experts for identification.
The mitochondrial genome of insects is circular, compact and up to 18 kb.It typically comprises 37 genes, including 13 protein-coding genes (PCGs), 2 ribosomal RNA genes (rRNA), and 22 transfer RNA genes

Sequencing, assembly, and annotation of the mitochondrial genome
Libraries for sequencing the mtDNA of D. melacanthus were prepared using the Nextera XT DNA Library Prep system (Illumina), while those for D. furcatus were prepared using the TruSeq Nano DNA Library Prep system (Illumina), given the total amount of mtDNA obtained for each of the studied species.The mitochondrial genome sequencing of D. melacanthus was conducted on the Illumina MiSeq V3 platform (2 × 250 bp), while that of D. furcatus was performed on the NextSeq platform (2 × 100 bp).
The de novo assembly of the mitogenomes was performed using the NovoPlasty algorithm 45 , using a partial sequence of the cytochrome oxidase I gene from D. melacanthus (SRX2888372) deposited in the National Center for Biotechnology Information (NCBI) database (https:// www.ncbi.nlm.nih.gov/) as a seed.Annotation of the assembled mtDNA was conducted using the MITOS Web Server for mitochondrial annotation (http:// mitos.bioinf.uni-leipz ig.de/ index.py) 46 .The same software was employed to determine gene, rRNA, and tRNA boundaries, as well as to predict RNA secondary structures.GenomeVX (http:// wolfe.ucd.ie/ Genom eVx/) 47 was used to visualize the complete mitogenomes.
Nucleotide and amino acid composition, the relative synonymous codon usage (RSCU), the number of non-synonymous substitutions per non-synonymous site (ka), the number of synonymous substitutions per synonymous site (ks), and the ratio (ka/ks) for each protein-coding gene (PCG) were calculated in Mega X 48 .Ka and Ks values were obtained using trimmed PCG sequences.AT-GC content and skew statistics were calculated using CGView 49 .
The complete mitogenomes of D. melacanthus and D. furcatus were also utilized to identify short mitochondrial DNA sequence repeats (mtSSRs) with the MIcroSAtellite tool (https:// webbl ast.ipk-gater sleben.de/ misa/) 50 .The minimum repeat length used was ≥ 12 for mononucleotides, ≥ 6 for dinucleotides, ≥ 4 for trinucleotides, and ≥ 3 for tetra-, penta-, and hexanucleotides.The interruption between two microsatellites was considered as zero.To facilitate comparisons of genomes of different sizes, SSRs were standardized per 1 kilobase (kb) of genome, calculating the relative abundance (RA) and relative density (RD) for the identified mtSSRs.RA provides the total number of SSR classes per kb of genome, and RD offers the total length (in nucleotides) contributing to 1 kb of genome nucleotides 51 .
The mitochondrial genomes of D. melacanthus and D. furcatus have been deposited in GenBank under accession numbers PP235949 e PP235950.

Phylogenetic analyses
Phylogenetic analyses were conducted using the mitogenomes of D. melacanthus and D. furcatus, and of 54 species of Pentatomidae available at the NCBI GenBank database (Table 1).Two additional mitogenomes from Aleyrodidae and Reduviidae were used as outgroups (Table 1).We pre-processed the dataset using the Mito-PhAST V3.0 pipeline 52 .Nucleotide sequences of PCGs were extracted, translated into amino acids, and aligned using Clustal Omega 53 .Poorly aligned regions were removed with trimAI 54 , and a supermatrix was constructed with the concatenated alignments of PCGs in the following order: nad1, nad4, nad3, cox3, cox1, nad4L, cytb, atp8, nad6, nad2, cox2, atp6, and nad5.Phylogenetic analyses were performed using Maximum Likelihood (ML) and Bayesian Inference (BI) methods.ML analyses were conducted in IQ-TREE v1.6.10 55 , using the mtInv + F + R5  57 .Two independent runs with four chains (three heated and one cold) were conducted simultaneously for 10 million generations, with sampling every 1000 generations and 25% of burn-in.The remaining samples were used to construct the consensus tree and Bayesian posterior probabilities.Another Bayesian analysis under a site-heterogeneous model was implemented using PhyloBayes MPI 1.5a on the CIPRES Science Gateway 3.3 58 .After removing constant sites from the alignment, two independent chains starting from a random tree were run under the CAT + GTR + C4 (GTR) and C6 models.The final trees were visualized using FigTree v.1.4.4 software (http:// tree.bio.ed.ac.uk/ softw are/ figtr ee/).Tree topologies obtained with each phylogenetic analytical method used were compared using the IQ-TREE web server (http:// iqtree.cibiv.univie.ac.at/) 59 .The data sets of P123 and AA were subjected to Kishino-Hasegawa (KH) 60 , Shimodaira-Hasegawa (SH) 61 , weighted Kishino-Hasegawa test (WKH) 60 , weighted SH test (WSH) 61 , expected Likelihood Weight (ELW) 62 , and the approximately unbiased (AU) tests 63 with 1000 replicates.
The A + T content (AT%) in the mitogenome of D. melacanthus is 74%, with a lower AT% in PCGs (71.3%) than in rRNAs (74.2%) and tRNAs (75.7%) (Table 2), resulting in a mitogenome with a positive AT skew (0.149) and negative GC skew (− 0.154).The PCGs exhibit a slightly negative AT skew due to the biased AT% in some of the nad genes (nad1, nad4, nad4L, nad5) (Table 2).The tRNAs and rRNAs also have a negative AT skew.The AT% of the mitogenome of D. furcatus was 72.4%, with PCGs carrying lower AT% (70.5%) than rRNAs (73.8%) and tRNAs (75.8%).The AT skews of tRNAs (0.003) and of the complete mitochondrial genome (0.174) of D. furcatus are positive, while those of PCGs (− 0.188) and rRNAs (− 0.163) are negative.The negative AT skew of PCGs is also likely due to the nad genes (Table 2).
The biased AT% of the whole mitogenomes, PCGs, rRNAs, and tRNAs genes of Diceraeus is similar to that of other Pentatomidae species 76,91 .Small variations in AT% among different genes have been attributed to DNA repair, selection, and mutations that occur during the evolutionary process 81,92 .

Species
The loss of tRNA is considered rare in insect mitogenomes 98 , although many reports can be found.The gene cluster trnT, trnQ, and trnM was not found in the mitogenome of the hemipteran Nabis apicalis 99 ; trnI was not detected in Aleurochiton aceris 89 ; trnS1 and trnQ were lost by Aleurodicus dugesii 89 ; Aleurocanthus camelliae lacks trnI 100 ; Neomaskellia andropogonis does not have trnA, trnR, trnN, trnI, and trnS2 89 .The loss of mitochondrial genes can be tolerated by the cell when their function becomes unnecessary or when it is taken over by a substitute gene with a similar role, or yet because it has been functionally transferred to the nucleus.The transference of gene functions from the mitogenome to the nuclear genome has many advantages [101][102][103][104][105] , including effects on species differentiation and the functional evolution of mitogenomes 106 .Genes encoding tRNAs in mitogenomes exhibit a mutation rate five times higher than nuclear genes, which undergo intense purifying selection 107 .However, there is no evidence that this occurred with the absent tRNA in this study.Instead, its function may have been replaced by nuclear tRNAs, given the frequent use of the isoleucine amino acid.trnL and trnS were duplicated in both mitogenomes.trnS copies were located in the same strand (sense), while trnL in different strands.Gene duplications also contribute to tRNA gene number variation among taxa 108 , and the loss of mitochondrial tRNAs in eukaryotes is not uncommon 101 .
The 16S and 12S rRNA genes are in the antisense strand of the mitogenomes of D. melacanthus and D. furcatus (Figs. 1 and 1S).The larger rRNA subunit (16S) is located between trnL1 and a spacer region, and has identical sizes in both species (657 bp).The smaller rRNA subunit (12S) is located between the trnV gene and another spacer region, with of 791 bp in length for D. melacanthus and 792 bp for D. furcatus.
We demonstrated that the mitogenomes of D. melacanthus and D. furcatus share the same base composition pattern and conservation of content and gene arrangement observed after comparative analyses with other Pentatomidae mitogenomes 14,29,70,82,94 .
Leucine (Leu), lysine (Lys), serine (Ser), methionine (Met), and asparagine (Asp) are the most frequently used amino acids in the mitogenome of Diceraeus spp.(Fig. 3).Both Diceraeus species showed similar patterns of relative synonymous codon usage (RSCU).The most frequent codon (UUA-Leu2) used in D. melacanthus is the same of other Pentatomidae 91 , differing from the preferred AGA (Ser1) and UUA (Leu2) in the mitochondrial genome of D. furcatus.The preferred codons in PCGs showed a strong AT bias due to the predominance of A or T nucleotides in the third position of codons, following a pattern observed in Pentatomidae 69,91,94,109 .
The number of mtSSRs identified in D. melacanthus and D. furcatus was within the range observed in the mitogenome of the 56 pentatomid species analyzed (3-20 mtSSRs).A total of 490 mtSSRs were found, representing less than 1% of the mitochondrial genomes analyzed and within the range of microsatellites identified in the nuclear genome of insects (0.02-3.1%) 114 .The number of mtSSRs was positively related with the size of the mitochondrial genome (r 2 = 0.83, p < 0.05), but with a wide variation in abundance among insect species (Fig. 6a), as reported elsewhere 39,51 .The lowest abundance (0.17 mtSSR/kb) and relative density (0.64 bp/kb) of mtSSRs were observed for D. melacanthus, while the highest values occurred in Plautia fimbriata (abundance = 1.26 mtSSR/ kb; relative density = 4.15 bp/kb) (Fig. 6b).The abundance and relative density of mtSSRs in the mitogenome of pentatomids did not correlate with mitogenome size as previously reported in studies of insect genomes 114 .The GC content (%) of mtSSRs among species was very similar, ranging from 21.9 to 28.3%.The existence of a relationship between the number of microsatellites, abundance, and relative density with the size and GC content of mitochondrial genome in plants 51 was not observed in pentatomids.
The majority of mtSSRs of pentatomids were found in coding regions (70%), especially tri-and tetranucleotides (Fig. 7a).No mononucleotides were found in non-coding regions.Studies on mitogenomes of lower eukaryotes 115 and animals 116 also detected a higher abundance of mtSSRs in genic regions, possibly due to the prokaryotic origin of mitochondria (endosymbiont hypothesis) 116 .This is different from nuclear genomes in which SSRs are predominantly located in non-coding regions, including introns 117 .A significant portion of mtSSRs of pentatomids located in coding regions (Fig. 7b) is found in NADH dehydrogenase genes involved in the electron transport chain.nad6 had the highest abundance of microsatellites (18%), followed by nad4 (14%) and nad2 (11%).Additionally, 13% of mtSSRs were located in 16S rRNA.
Tetranucleotides (43.1%) and trinucleotides (40.4%) were the most abundant types of mtSSRs, with all other types of mtSSRs representing less than 7% of abundance (Fig. 8).Studies on SSRs in insects are relatively uncommon but indicate a predominance of di-and trinucleotides 114 .In plants, where the majority of such studies are conducted, mono-, di-, and trinucleotides often predominate 39,[118][119][120][121][122] .Among all mtSSRs in the mitogenomes of pentatomids, 98.8% of them had only A and T nucleotides in their composition.Dinucleotides were represented

Phylogenetic relationships in Pentatomidae
Four phylogenetic trees were obtained using different analytical methods (ML, CAT + GTR + C4, C60, and BI) of concatenated amino acid sequences of PCGs from 56 species of Pentatomidae (+ 2 outgroup species) (Fig. 9 and 2S).Topology tests (Table 1S) indicated tree topologies produced using CAT + GTR + C4, C60, and BI methods resulted in significant different log-likelihoods (p < 0.05).We chose to explore the BI tree due to the highest posterior probability value obtained (Fig. 9).The recovered phylogeny demonstrated that the Pentatomidae family is a monophyletic group with maximum posterior probability support (PP = 1), reinforcing previous reports 68,69,[123][124][125][126] .However, our data challenge the monophyly hypothesis of Pentatominae and Podopinae, as representatives from both subfamilies intermingle, corroborating published findings 69 .Asopinae and Phyllocephalinae formed independent clades (PP = 1), although within Pentatominae.The monophyly and position of Asopinae within Pentatominae have been previously proposed based on morphology 124 and molecular taxonomy 81,127 , highlighting the need for more detailed internal classification, especially because Asopinae is not divided into tribes 126,128,129 .
Neojurtina typica (Pentatominae: Pentatomini) is placed as the oldest representative within the family 94,95 .Other Pentatominae species are also dispersed on the tree, such as Placosternum urus and Piezodorus guildinii, forming sister group relationships with representatives of Phyllocephalinae and Podopinae, respectively (Fig. 9).The internal taxonomic structure of Pentatomidae has generally been considered stable.However, Pentatominae has been the subject of continuous questioning, revisions, and adjustments 125,128,129 .These questions arise because the studies that formed the basis for taxonomic classifications lacked high-quality phylogenetic methodology, had incomplete sampling, and poorly defined characters, limiting the impact of their findings 126 .Pentatominae comprises more than 40 tribes, including many genera that are challenging to accommodate in other subfamilies 129,130 .
The Pentatomini tribe is recognized as having less precise taxonomic definition 76 .The Nezarini and Antestini tribes are positioned within the same clade, suggesting a possible relationship due to their significant morphological similarity 126,131,132 .However, they are not considered monophyletic, although combined, they form a monophyletic group 94 .
Representatives of Halyini, also within Pentatominae, resolved into a single clade with high support probability, although positioned separately from most Pentatominae representatives (Fig. 9).The tribes Eysarcorini, Strachiini, Halyini, and Caystrini, and the genera Pentatoma, Plautia, and Diceraeus proved to be monophyletic.The two available mitochondrial genomes for D. melacanthus resolved into parallel branches forming a sister group with D. furcatus, in a clade with high posterior probability, which also includes another Carpocorini, Euschistus heros, in a slightly more external position.However, the Carpocorini tribe is not monophyletic, a result supported by recent analyses involving morphological and molecular data 126 .
Eysarcoris gibbosus was the first to branch off into the Eysarcorini clade, however, with Carbula sinica separating it from the other representatives of its genus.The positioning of E. gibbosus out of the Eysarcoris group supports the proposed reclassification of E. gibbosus to the genus Stagnomus (tribe Eysarcorini) 133 , which supports the closer systematically positioning of E. gibbosus to C. sinica 76,133,134 .
Members of the Piezodorini tribe have often been associated with members of the Menidini 131 , Nezarini 132 , Eurysaspini 129 , and the genus Rhaphigaster 126 because of some common external morphological characteristics.However, molecular data based on mitogenome analysis did not support a relationship of Piezodorini with these groups but rather with the Podopinae subfamily (Scotinophara lurida and Graphosoma rubrolineatum), although with low support.The position of Graphosoma and Scotinophora raises some questions because they group with a clade of pentatomines, and positioned themselves as sister species.Others also agree with the monophyly of Podopinae, even though there are several apomorphies that disagree with this classification 126 .
Our data highlights the conservation of gene content and arrangement between D. melacanthus and D. furcatus, enhancing our understanding of the mitochondrial diversity in Pentatomidae.The identification and characterization of mtSSRs in coding regions offer new perspectives for phylogenetic and genetic studies within the family, suggesting potential molecular markers for population genetics and evolutionary analyses.The phylogenetic analysis underscores the need for revisions in internal taxonomic classification, challenging the monophyly of Pentatominae and Podopinae.In summary, this study not only contributes to the specific understanding of D. melacanthus and D. furcatus but also sheds light on broader issues of the taxonomy and evolution of Pentatomidae, providing insights for future research and refinements in taxonomic classifications.

Figure 3 .
Figure 3. Amino acid composition of the protein-encoding genes of the mitogenomes of Diceraeus melacanthus and D. furcatus.

3 1Figure 4 .
Figure 4. Relative synonymous codon usage (RSCU) of 20 amino acids and stop codons of the PCGs of the mitochondrial genome of (a) Diceraeus melacanthus and (b) D. furcatus.

Figure 5 .
Figure 5. ka (non-synonymous substitutions), ks (synonymous substitutions) and ka/ks values for each PCG in the mitochondrial genomes of Pentatominae.

Figure 6 .Figure 7 .
Figure 6.(a) Relationship between the number of mtSSRs and the mitogenome size (kb) from 56 Pentatomidae species; (b) Genome size (kb) and genomic density (bp/kb) of mtSSRs from 56 Pentatomidae species.The code names are based on the first two letters of the genus and species name (see Table1).

Figure 8 .
Figure 8. Relative percentage of six sizes of mtSSRs in 56 pentatomid mitogenomes.The percentages of mono-, di-, tri-, tetra-, penta-.and hexanucleotides are shown in different colors.The tribes of Pentatomidae are shown to the left of the species.Insects without a tribe belong to the subfamily Asopinae.

Figure 9 .
Figure 9. Phylogenetic relationships of tribes within Pentatomidae reconstructed from mtDNA sequences of 13 PCGs using the BI method.The numbers on the branches are the posterior probabilities (PP).The length of the branches is proportional to the genetic distance.Outgroup branches that don't follow the scale are dashed.The subfamilies of Pentatomidae are shown by the colors in each species, and the tribes by vertical bars.

Table 1 .
Mitogenomes of Pentatomidae species and of two external groups (Aleyrodidae and Reduviidae) used in the analyses to determine the phylogenetic relationships and systematic positioning of two Diceraeus species.

Table 3 .
Sequences of mtSSRs, number of repeats, total size, start and end positions in the mitogenome, and location region of mtSSRs found in Dicereaus melacanthus and D. furcatus.