Introduction

The small leafhopper subfamily Mileewinae has approximately 160 described species worldwide. The subfamily contains small to medium-sized, slender, usually darkly pigmented species that inhabit wet tropical forests, where they usually occur on herbaceous vegetation in the understory. Because the taxonomic status of Mileewinae has changed over time, the relationship between this subfamily and both Cicadellinae and Typhlocybinae remains controversial1,2,3,4,5. The subfamily Mileewinae includes four tribes: Makilingiini (Philippines and Thailand), Mileewini (Old and New World), Tinteromini (New World), and Tungurahualini (New World)6,7. All species of this subfamily in China belong to Mileewini and are now assigned to four genera: Mileewa Distant, Ujna Distant8, Processina Yang, Deitz & Li9, and Anzihelus Yan & Yang, a new genus established in 202110 that can be distinguished by its head being elevated above the pronotum and by the male Xth segment (basal anal tube segment) bearing a single caudal process. The great majority of Chinese Mileewini species belong to the genera Mileewa, which has 56 species; Ujna, which has eight species; and Processina, which has five species that are recorded only in China11.

Mileewa is a large, widespread genus of the tribe Mileewini, established by Distant (1908) with Mileewa margheritae. The genus has more than 80 valid species globally; these are dark dorsally, and the forewing is usually truncated or concave apically and sometimes distinctly expanded from base to apex. Moreover, in our recent studies, seven species of this genus were newly recorded from China11,12,13. The genus Processina was established by Yang, Deitz & Li in 2005 for species from China and contains five species. Processina dashahensis (Yang et Li, 2005) is one of these species and was recorded in the Dashahe Natural Reserve of Guizhou Province. The other species were recorded from several places, including Taiwan, Yunnan, Guizhou, and Sichuan. Traditionally, Mileewa and Ujna are separated by the following characters: whether the head is narrower than or subequal in width to the pronotum; the forewing apex, which is truncated or emarginated in Mileewa but rounded in Ujna; and the male style, which bears long setae near its middle section in Mileewa but lacks them in Ujna. In contrast, the genus Processina is distinguished from the other two genera mainly by its male style, which has a straight apical portion, a rounded apex, and dense setae at the apex14. However, some of these features are ambiguous, which makes it difficult to distinguish between species using traditional classification alone. Therefore, there is an urgent need to increase our knowledge of the mitochondrial genomes of Mileewinae species and to use these molecular data to determine the phylogenetic relationships between Mileewinae and other subfamilies and within this small group.

Insect mitochondrial genomes are mostly 14.5–17 kb double-stranded circular molecules comprising 37 typical genes, namely 13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNAs: s-rRNA and l-rRNA), and 22 transfer RNA genes (tRNAs), together with a control region (CR)15,16,17. To date, the mitogenomes of six Mileewinae species from China have been sequenced and deposited in the NCBI database: Mileewa rufivena (MZ326689)18, Mileewa ponta (MT497465)19, Mileewa margheritae (MT483998)20, Mileewa albovittata (MK138358)21, Mileewa alara (MW533151)22 and Ujna puerana (MZ326688)18. They are all from the genera Ujna and Mileewa, whereas no sequences of Processina species have been deposited in NCBI. In this study, we sequenced and annotated the complete mitogenomes of five Mileewinae species: one from the genus Processina and four from Mileewa. Furthermore, we analyzed the characteristics of these mitogenomes, including nucleotide composition, tRNA secondary structure, codon usage, gene overlaps, intergenic spacers, and CRs. The mitochondrial genome data from this study can help us understand the phylogeny and evolution of Mileewinae. The growing number of complete Mileewinae mitogenomes will yield more evidence and thus a better understanding of the taxonomic status of Mileewinae among the other Cicadellidae subfamilies.

Materials and methods

Sample collection and taxonomic identification

Specimens of adult male M. mira, M. lamellata, M. sharpa, M. amplimacula, and P. sexmaculata were collected from China (Table S1). All specimens were preserved in 100% ethanol immediately after collection and stored at −20 °C in a laboratory freezer until DNA extraction. The male specimens used in this study were identified by Maofa-Yang based on morphological characteristics, especially the male genitalia, according to the taxonomic system described by Dietrich (2011)6.

DNA extraction and sequencing

After species identification, total genomic DNA was extracted from the head and thorax tissues of a single adult of each of the five species using the Tissue and Blood Genome DNA Extraction Kit (Qiagen, Hilden, Germany), according to the manufacturer’s protocols. Voucher DNA was stored at −20 °C, while the external genitalia were kept in glycerol. Both were deposited at the Institute of Entomology, Guizhou University, Guiyang, China (GUGC). The five new mitogenomes were sequenced by next-generation sequencing on the Illumina NovaSeq6000 platform (Berry Genomics, Beijing, China) with a 150-bp paired-end strategy.

Mitogenomes assembly, annotation, and analysis

The clean sequence reads were assembled using GetOrganelle 1.7.523 based on the Mileewa rufivena mitochondrial genome sequence from GenBank (Accession number: MZ326689.1)18. The five mitogenomes were initially annotated using MitoZ 2.4-alpha with the invertebrate mitochondrial genetic code24. The annotated results from MitoZ were then imported into Geneious Prime 2021.1 for further editing. We used the MITOS web server (http://mitos.bioinf.uni-leipzig.de/index.py)25 with the invertebrate genetic code to further revise the tRNA and PCG locations and to generate the secondary structures of the tRNAs, and we used the ARWEN program26 to check the MITOS results for tRNA secondary structures and locations. The ORF finder in Geneious Prime was also used to annotate PCGs using the invertebrate genetic code. The l-rRNA genes were defined according to the adjacent tRNA genes (trnL1 and trnV), whereas the s-rRNA genes were located by comparison with the homologous s-rRNA genes of other Cicadellidae species. Circular mitogenome maps were generated using the OGDRAW website (https://chlorobox.mpimp-golm.mpg.de/OGDraw.html)27. The nucleotide composition and relative synonymous codon usage (RSCU) were calculated using PhyloSuite v1.2.228. Strand asymmetry was calculated according to the formulas AT skew = [A − T]/[A + T] and GC skew = [G − C]/[G + C]29. Tandem repeats in the CR were identified using the Tandem Repeats Finder program (https://tandem.bu.edu/trf/trf.basic.submit.html)30. Finally, the five newly sequenced mitogenomes of Mileewinae were submitted to GenBank with a tbl file generated using GB2sequin (https://chlorobox.mpimp-golm.mpg.de/GenBank2Sequin.html)31 under the accession numbers ON464171–ON464175.
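The strand-asymmetry formulas above can be sketched in a few lines of Python. This is only a minimal illustration of the arithmetic, not the PhyloSuite implementation used in the study; the function names `skews` and `at_content` and the toy sequence are assumptions introduced here.

```python
from collections import Counter


def skews(seq):
    """Return (AT skew, GC skew) for a nucleotide sequence.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    Ambiguous bases (e.g. N) are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if a + t else 0.0
    gc_skew = (g - c) / (g + c) if g + c else 0.0
    return at_skew, gc_skew


def at_content(seq):
    """Return the fraction of A and T among the unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ATGC")
    return (counts["A"] + counts["T"]) / total if total else 0.0


# Toy sequence (hypothetical, not from any of the five mitogenomes).
toy = "AAATTGCC"
at_skew, gc_skew = skews(toy)
print(round(at_skew, 3), round(gc_skew, 3), at_content(toy))
```

A positive AT skew means A outnumbers T on the analyzed strand, and a negative GC skew means C outnumbers G.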

Phylogenetic analyses

For phylogenetic analyses, mitochondrial genomes of 61 species belonging to 17 subfamilies were selected. Of them, mitogenomes of 59 Membracoidea species (50 leafhopper, 4 treehopper, and 5 newly sequenced species) comprised the ingroup, and those of two species—Callitettix braconoides (NC_025497)32 and Tettigades auropilosa (KM000129)—were used as outgroups. The detailed information and accession numbers for these mitogenomes are listed in Table S2.

For the five newly sequenced species, each PCG in the mitogenome was manually extracted using Geneious Prime 2021.1. The 13 PCGs were aligned in batches with MAFFT33 using the '--auto' strategy and codon alignment mode. The alignments were refined using the codon-aware program MACSE v. 2.0334, which preserves the reading frame and tolerates sequencing errors and sequences with frameshifts. Ambiguously aligned fragments of the 13 alignments were removed in batches using Gblocks35. The alignments of the individual genes were concatenated using PhyloSuite28. The best partitioning scheme and evolutionary models for 39 predefined partitions were selected using PartitionFinder236 with a greedy algorithm37 and the Akaike Information Criterion (AIC) for the Bayesian inference (BI) analyses, and were generated automatically in IQ-TREE38 with the Bayesian Information Criterion (BIC) for the maximum-likelihood (ML) analyses. Detailed model information is shown in Table S3. The BI analysis was performed using MrBayes 3.2.639 with four chains under an independent partition model (two parallel runs, 10,466,900 generations) with sampling every 100 generations. The initial 25% of sampled data were discarded as burn-in, and the remaining trees were used to generate a consensus tree and calculate Bayesian posterior probability (PP) values after the average standard deviation of split frequencies was < 0.01. The ML analysis was performed using IQ-TREE38 under an edge-linked partition model with 1000 ultrafast40 bootstrap replicates to obtain bootstrap support (BS) values. The BI and ML trees were viewed and edited using the iTOL online tool (https://itol.embl.de/)41.
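The text does not spell out how the 39 predefined partitions were defined. A common reading, assumed here, is 13 PCGs × 3 codon positions. The sketch below shows how such data blocks could be laid out over a concatenated codon alignment; the gene order, the 300-bp aligned lengths, and the helper names are hypothetical placeholders, not values from this study.

```python
def codon_partitions(gene_lengths):
    """Return one (name, start, end, step) block per gene and codon position.

    gene_lengths: ordered (gene, aligned_length_bp) pairs describing the
    concatenated alignment; coordinates are 1-based and inclusive.
    """
    blocks = []
    start = 1
    for gene, length in gene_lengths:
        if length % 3 != 0:
            raise ValueError(f"{gene}: codon alignment length must be a multiple of 3")
        end = start + length - 1
        for offset in range(3):
            # Codon position 1 starts at `start`, position 2 at `start + 1`, etc.
            blocks.append((f"{gene}_pos{offset + 1}", start + offset, end, 3))
        start = end + 1
    return blocks


def format_block(block):
    """Render a block in charset-style notation (name = start-end\\step;)."""
    name, start, end, step = block
    return f"{name} = {start}-{end}\\{step};"


# Hypothetical aligned lengths (placeholders only).
PCGS = ["atp6", "atp8", "cox1", "cox2", "cox3", "cytb", "nad1",
        "nad2", "nad3", "nad4", "nad4L", "nad5", "nad6"]
example = codon_partitions([(gene, 300) for gene in PCGS])
print(len(example))              # 13 genes x 3 codon positions
print(format_block(example[0]))
```

Model-selection software can then merge these starting blocks into a smaller best-fit scheme, which is the role the greedy search plays in the paragraph above.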

Results

Genome organization and composition

The five new complete mitogenomes of Mileewa mira, M. lamellata, M. sharpa, M. amplimacula, and P. sexmaculata (ON464171–ON464175) were circular double-stranded molecules ranging in length from 14787 to 15436 bp. Detailed annotations and circular maps are presented in Fig. 1 and Table 1. The mitogenomes were of medium length compared with those of other published Mileewinae species18,19,20,21,22. Each of the five newly sequenced mitogenomes contained the 37 typical mitochondrial genes (13 PCGs, two rRNA genes, and 22 tRNA genes) and a CR (Fig. 1, Table 1). All five mitogenomes displayed an identical gene order, consistent with previously published mitogenomes of Cicadellidae18,42,43. Of the 37 genes, 23 (9 PCGs and 14 tRNAs) were located on the heavy (H) strand, whereas 14 (4 PCGs, 8 tRNAs, and 2 rRNAs) were located on the light (L) strand.

Figure 1

Circular maps of the mitogenomes of Mileewa mira (A), Mileewa lamellata (B), Mileewa sharpa (C), Mileewa amplimacula (D), and Processina sexmaculata (E). Genes are shown as blocks of different colors. Color blocks outside the circle indicate genes located on the heavy strand (H-strand); color blocks within the circle indicate genes located on the light strand (L-strand).

Table 1 Organization of the four Mileewa and one Processina species mitochondrial genomes.

The nucleotide composition of the five mitogenomes is listed in Table 2. The AT content of the five mitogenomes ranged from 78.3% (P. sexmaculata) to 80.2% (M. lamellata), whereas the GC content ranged from 19.8% (M. lamellata) to 21.6% (M. sharpa and P. sexmaculata), revealing a strong AT bias. The highest and lowest AT contents were found in the CR (83.3%–86.6%) and the PCGs (77.1%–79.6%), respectively, and the order was CR > rRNAs > tRNAs > PCGs. All five whole mitogenomes exhibited a positive AT skew (0.057–0.098), indicating that A was more prevalent than T, and a negative GC skew (−0.170 to −0.101), indicating that C was more prevalent than G.

Table 2 Nucleotide composition and skewness of the four Mileewa and one Processina species mitochondrial genomes.

PCGs and codon usage

Among the five new mitogenomes, the total lengths of the 13 PCGs were 10953 bp in M. mira, 10947 bp in M. lamellata, 10956 bp in M. sharpa, 10947 bp in M. amplimacula, and 10944 bp in P. sexmaculata (Table 2). Each PCG was of comparable size across the five mitogenomes, with nad5 being the longest (1674 bp, except 1673 bp in M. sharpa) and atp8 the shortest (153 bp). Of the 13 PCGs, nine (cox1, cox2, cox3, atp6, atp8, nad2, nad3, nad6, and cytb) were located on the H-strand and the remaining four (nad1, nad4, nad4L, and nad5) on the L-strand (Fig. 1, Table 1). In the five newly annotated mitogenomes, most PCGs were initiated with a typical ATN codon (ATA, ATG, ATC, or ATT), whereas atp8 and nad5 were initiated with TTG, similar to previous reports18,19,20,21,22. The majority of PCGs terminated with the complete stop codons TAA and TAG; however, some PCGs, such as nad1 and cox2, used an incomplete stop codon (TA or T, as shown in Table 1). Incomplete stop codons are common in invertebrate mitogenomes and can be completed by post-transcriptional polyadenylation17,44. Furthermore, the nad2, atp8, atp6, and cox3 genes had the same start and stop codons in all five species. The AT contents of the PCGs of M. mira, M. lamellata, M. sharpa, M. amplimacula, and P. sexmaculata were 78.2%, 79.6%, 77.4%, 77.3%, and 77.1%, respectively. The AT contents of the first (1st), second (2nd), and third (3rd) codon positions were 73.5%–75.8%, 69.2%–70.2%, and 88.6%–92.8%, respectively, giving the order 3rd > 1st > 2nd. Negative AT skews (−0.172 to −0.147) and positive GC skews (0.014–0.047) were detected for the 13 PCGs of all five mitogenomes (Table 2).
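The per-position AT contents reported above can be obtained by slicing a coding sequence at every third base. The following is a minimal sketch under the assumption that the sequence is given in reading frame on its coding strand; the function name and the toy sequence are hypothetical, and the values in Table 2 were not produced by this code.

```python
def at_by_codon_position(cds):
    """Return the AT fraction at codon positions 1, 2 and 3 of an in-frame CDS.

    A trailing partial codon (for example an incomplete T or TA stop codon)
    is ignored so that all three positions cover the same number of codons.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    fractions = []
    for position in range(3):
        bases = cds[position:usable:3]
        at_count = sum(base in "AT" for base in bases)
        fractions.append(at_count / len(bases) if bases else 0.0)
    return fractions


# Toy CDS (hypothetical): codons ACA, GCT, TGT plus a trailing incomplete "T".
first, second, third = at_by_codon_position("ACAGCTTGTT")
print(first, second, third)
```

In this toy case the third position is the most AT-rich and the second the least, the same ordering (3rd > 1st > 2nd) that the paragraph above reports for the five mitogenomes.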

The relative synonymous codon usage (RSCU) values of the five mitogenomes are shown in Fig. 2 and Table S4. The RSCU values and codon numbers were similar among the mitogenomes. The four most frequently used codons were UUU (Phe), UUA (Leu2), AUU (Ile), and AUA (Met), all of which consist solely of A and U. This suggests that codon usage is biased toward A and T over G and C.
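RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates the calculation for only the four codon families named above (with ATA read as Met, as in the invertebrate mitochondrial code); the partial `FAMILIES` table, the function name, and the toy sequence are assumptions for illustration, and a real analysis would use the full genetic code table as PhyloSuite does.

```python
from collections import Counter

# Partial synonymous-codon table (DNA alphabet), limited to the families
# discussed in the text; Leu2 denotes the UUR family.
FAMILIES = {
    "Phe": ["TTT", "TTC"],
    "Leu2": ["TTA", "TTG"],
    "Ile": ["ATT", "ATC"],
    "Met": ["ATA", "ATG"],
}


def rscu(cds, families=FAMILIES):
    """Return {codon: RSCU} for the codon families supplied.

    RSCU = count(codon) * n_synonymous / total count for that amino acid.
    Families that do not occur in the sequence get 0.0 for every codon.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for synonymous in families.values():
        total = sum(counts[codon] for codon in synonymous)
        for codon in synonymous:
            values[codon] = counts[codon] * len(synonymous) / total if total else 0.0
    return values


# Toy CDS (hypothetical): TTT TTT TTC ATT ATA ATG
result = rscu("TTTTTTTTCATTATAATG")
print({codon: round(value, 3) for codon, value in result.items()})
```

An RSCU above 1 marks a codon used more often than expected under equal usage, which is how the preference for A- and U-ending codons shows up in Fig. 2.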

Figure 2

Relative synonymous codon usage (RSCU) values in protein-coding genes (PCGs) of the four Mileewa and one Processina species mitogenomes.

Transfer and ribosomal RNA genes

All five new mitogenomes contained the 22 typical tRNA genes; of them, 14 were located on the H-strand and the other eight on the L-strand (Table 1, Fig. 1). The lengths of the tRNAs ranged from 61 to 72 bp, and their concatenated total length ranged from 1422 bp (M. sharpa) to 1440 bp (M. amplimacula). The tRNA genes also showed a high AT bias, with AT content ranging from 79.9% to 81.4%. They exhibited a positive AT skew (0.014–0.02) in all species except M. amplimacula (−0.003), and a positive GC skew (0.16–0.197) in all species (Table 2). The secondary structures of the 22 tRNAs in the five newly sequenced mitogenomes are presented in Figures S1–S5. All tRNAs could be folded into the typical cloverleaf secondary structure, except for trnS1, in which the dihydrouridine (DHU) arm was absent and replaced by a simple loop, as has been observed in other Cicadellidae mitogenomes18,42,43,45,46,47. In addition, the anticodons of all tRNAs were the same as those in the reported Cicadellidae species18,48,49. Several mismatches, for instance UU and UG pairs and an extra A nucleotide, were detected among the tRNAs of the five species. Single A mismatches appeared in both the trnS1 (anticodon arm) and trnR (acceptor arm) genes of the five newly sequenced mitogenomes (Figures S1–S5). The combined length of the two rRNA genes ranged from 1920 bp (M. sharpa) to 1974 bp (M. amplimacula), with a higher AT content (80.3%–81.3%) than that of the tRNA genes. Both l-rRNA and s-rRNA were located on the L-strand and showed a negative AT skew (−0.179 to −0.101) and a positive GC skew (0.196–0.286) (Table 2, Fig. 1). The l-rRNA genes were located between trnL1 and trnV, with sizes ranging from 1180 bp (M. sharpa) to 1217 bp (M. amplimacula). The s-rRNA genes were located between trnV and the CR, with sizes ranging from 740 bp (M. sharpa) to 757 bp (M. amplimacula).

Overlapping and intergenic spacers

All overlaps and intergenic spacers are listed in Table 1. We identified 10 (P. sexmaculata) to 16 (M. sharpa) overlaps per mitogenome, ranging from 1 to 12 bp in length. The longest overlap (12 bp) was found in M. mira, between nad6 and cytb. The five mitogenomes had 5–14 intergenic spacers, with sizes ranging from 1 to 24 bp. The longest intergenic spacer (24 bp) was identified between the trnS2 and cytb genes in P. sexmaculata. Two intergenic spacers and three overlaps were identical across the five mitogenomes: the spacers at trnP-nad6 and nad4L-trnT (2 bp each), and the overlaps at trnN-trnS1 (1 bp), atp6-atp8 (7 bp), and trnW-trnC (8 bp). Additionally, the four Mileewa mitogenomes differed from the P. sexmaculata mitogenome at nad4-nad4L (7-bp overlap in the four Mileewa species; 1-bp overlap in P. sexmaculata) and at nad5-trnF (1-bp overlap in the four Mileewa species; 3-bp intergenic spacer in P. sexmaculata). Furthermore, 23-bp (nad4-trnH), 13-bp (cox1-trnL2), and 14-bp (trnY-cox1) intergenic spacers were identified in the mitogenome of P. sexmaculata but not in any of the other Mileewinae species.
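Overlaps and intergenic spacers follow directly from the annotated gene coordinates: the distance between the end of one gene and the start of the next is negative for an overlap and positive for a spacer. The sketch below illustrates this bookkeeping. The coordinates are hypothetical placeholders chosen so that the toy junctions reproduce the 7-bp atp8/atp6 overlap and the 2-bp trnP-nad6 spacer mentioned above; they are not the values in Table 1, and the junction across the origin of the circular map is not handled.

```python
def junctions(genes):
    """Return (junction name, gap) for adjacent genes that do not simply abut.

    genes: (name, start, end) tuples with 1-based inclusive coordinates on a
    common reference strand. gap > 0 is an intergenic spacer of that many bp;
    gap < 0 is an overlap of -gap bp. Abutting genes (gap == 0) are skipped.
    """
    ordered = sorted(genes, key=lambda gene: gene[1])
    result = []
    for (left, _, left_end), (right, right_start, _) in zip(ordered, ordered[1:]):
        gap = right_start - left_end - 1
        if gap != 0:
            result.append((f"{left}-{right}", gap))
    return result


# Hypothetical coordinates for illustration only.
block_a = [("atp8", 100, 252), ("atp6", 246, 899), ("cox3", 900, 1679)]
block_b = [("trnP", 1000, 1064), ("nad6", 1067, 1500)]
print(junctions(block_a))  # 7-bp overlap between atp8 and atp6
print(junctions(block_b))  # 2-bp spacer between trnP and nad6
```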

Control region

The CR in the five newly sequenced Mileewinae mitogenomes was located between s-rRNA and trnI, with sizes ranging from 446 bp (M. lamellata) to 1062 bp (M. amplimacula) (Fig. 1, Table 1). The CR was the longest non-coding region and had the highest AT content, ranging from 83.3% (P. sexmaculata) to 86.6% (M. amplimacula). The AT skew was positive (0.004–0.057) in all species except P. sexmaculata (−0.059). In addition, M. lamellata (0.096) and P. sexmaculata (0.042) had a positive GC skew in the CR, whereas the remaining three species had a negative GC skew (−0.06 to −0.03) (Table 2). The detailed structure of the CR in the five mitogenomes is shown in Figures S6–S10. The repeat units in the CRs of the five mitogenomes were as follows: one repeat unit in M. sharpa (2 × 107 bp) and in M. mira (2 × 168 bp), two in M. lamellata (unit 1, 2 × 15 bp; unit 2, 2 × 79 bp), three in M. amplimacula (unit 1, 2 × 39 bp; unit 2, 2 × 178 bp; unit 3, 2 × 144 bp), and four in P. sexmaculata (unit 1, 3 × 41 bp; unit 2, 2 × 151 bp; unit 3, 4 × 47 bp; unit 4, 2 × 102 bp). The largest repeat unit (178 bp) was found in M. amplimacula and the smallest (15 bp) in M. lamellata; both were present as two copies.

Phylogenetic relationships

Phylogenetic relationships were analyzed using a dataset of the 13 PCGs from the mitogenomes of 59 Membracoidea species and two outgroups. Two phylogenetic trees were reconstructed using the BI and ML methods (BI-PCGS and ML-PCGS, respectively) (Figs. 3,4). In both trees, the 11 Mileewinae species formed a monophyletic group with high support values (BS = 100, PP = 1) and were located at the apex of the trees, and each subfamily was recovered as a monophyletic group. Both analyses showed that the 11 Mileewinae species clustered together as sister to a clade comprising the subfamilies Idiocerinae, Evacanthinae, and Ledrinae, and that these four subfamilies (Mileewinae, Idiocerinae, Evacanthinae, and Ledrinae) together formed a sister group to Cicadellinae, with high support values (BS > 86, PP = 1). Moreover, the subfamilies Mileewinae, Idiocerinae, Evacanthinae, Ledrinae, and Cicadellinae clustered in one clade that formed a sister group to Typhlocybinae. Our study indicated that Typhlocybinae is more ancient than Mileewinae and Cicadellinae and that Mileewinae is not a sister group of Typhlocybinae, which differs from previous studies18,42,45,50.

Figure 3

Maximum likelihood (ML) phylogenetic tree analysis of 13 protein-coding genes (PCGs). Callitettix braconoides and Tettigades auropilosa are outgroups. Bootstrap support (BS) values are shown at the nodes. Newly sequenced mitogenomes are indicated in red.

Figure 4

Bayesian inference (BI) phylogenetic tree analysis of 13 protein-coding genes (PCGs). Callitettix braconoides and Tettigades auropilosa are outgroups. Posterior probability (PP) values are shown at the nodes. Newly sequenced mitogenomes are indicated in red.

Discussion

In the present study, we sequenced and comparatively analyzed the complete mitogenomes of five species belonging to two genera of Mileewinae. Processina sexmaculata is the first sequenced species from the genus Processina Yang, Deitz & Li, 2005, which was established in China. The length of the mitogenomes of the five Mileewinae species ranged from 14787 bp (M. lamellata) to 15436 bp (M. amplimacula). We compared these five mitogenomes with those of previously published Mileewinae species18,19,20,21,22 and found that the lengths of the individual PCGs and tRNA genes are quite similar, whereas the differences lie mainly in the rRNAs and CRs. All the PCGs have relatively conserved characteristics. We found, for the first time, that a single A mismatch appeared in both the trnS1 (anticodon arm) and trnR (acceptor arm) genes in the five newly sequenced mitogenomes. Some overlapping and intergenic spacer regions common to the five mitogenomes were identified as well, such as a 7-bp overlap between atp8 and atp6, an 8-bp overlap between trnC and trnW, and a 2-bp intergenic spacer at both trnP-nad6 and trnT-nad4L. Moreover, we found that P. sexmaculata, the first Processina species with a sequenced mitogenome, has a 23-bp intergenic spacer located between trnH and nad4. A 24-bp intergenic spacer was found between cytb and trnS2 in the P. sexmaculata mitogenome but not in the mitogenomes of other Mileewinae species.

We further generated the BI and ML trees with the concatenated alignment of PCGs. Our results indicated that the subfamilies were well separated and that the monophyly of each was supported. The phylogenetic relationships were as follows: (Deltocephalinae + ((Hylicinae + ((Megophthalminae + (Smiliinae + (Aetalioninae + Centrotinae))) + (Macropsinae + (Coelidiinae + Iassinae)))) + (Typhlocybinae + (Cicadellinae + (Mileewinae + (Idiocerinae + (Evacanthinae + Ledrinae))))))) (Figs. 3,4). The BI-PCGS tree had higher support values than the ML-PCGS tree; this topology is not completely consistent with the findings of recent studies18,42,45,51, possibly because of differences in the data and phylogenetic models used. Furthermore, the 11 Mileewinae species clustered in a monophyletic clade (PP = 1, BS = 100), which is consistent with the studies by Yu et al., Dietrich et al., and our previous studies18,19,20,21,22,52. We also found that Mileewinae was at the apex of the phylogenetic trees, whereas Deltocephalinae formed the basal branch, which partly agrees with the result reported by Yu et al. In addition, Cicadellinae, Typhlocybinae, and Mileewinae were each recovered as a monophyletic group. In contrast to the studies by Yu et al. and Dietrich et al.18,50, and based on the phylogenetic trees from this study (Figs. 3, 4), we consider that Typhlocybinae is more ancient than Mileewinae and Cicadellinae and suggest that Typhlocybinae is not a sister group to Mileewinae. Within Mileewinae, all species maintained the same relationships and topologies in both the BI and ML analyses. The phylogenetic relationships with high support values (PP > 0.8, BS > 83) within the subfamily Mileewinae were as follows: (M. sharpa + (U. puerana + ((M. ponta + (M. mira + M. lamellata)) + ((M. albovittata + (M. margheritae + M. amplimacula)) + (M. rufivena + (P. sexmaculata + M. alara)))))) (Figs. 3,4). 
These results also differ from those presented in a recent study by Yu et al., who reported the topology (U. puerana + (M. ponta + (M. rufivena + M. alara) + (M. albovittata + M. margheritae))). In the topologies obtained here, the monophyly of the genera Processina, Mileewa, and Ujna was not supported, and the species of these three genera were not clearly separated. Nevertheless, we used only one species from Processina, which raises concerns about its representativeness. This study provides a reference framework for understanding the relationships between Mileewinae and other subfamilies and further enriches the mitogenome database of the tribe Mileewini. Molecular data from more genera of Mileewinae are required to determine the phylogenetic relationships within this group and to test the monophyly of Mileewinae.
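The statement about genus-level monophyly can be checked mechanically against the within-Mileewinae topology given above. The sketch below encodes that topology as nested tuples, read as a rooted tree exactly as written, and tests whether a set of taxa corresponds to one complete clade. The encoding and the helper names are assumptions made for illustration; support values are not considered.

```python
# Within-Mileewinae topology from the text, as nested tuples (rooted as written).
TREE = ("M. sharpa",
        ("U. puerana",
         (("M. ponta", ("M. mira", "M. lamellata")),
          (("M. albovittata", ("M. margheritae", "M. amplimacula")),
           ("M. rufivena", ("P. sexmaculata", "M. alara"))))))


def leaves(node):
    """Return the set of taxon names below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the leaf set of every clade in the tree, single leaves included."""
    yield frozenset(leaves(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, taxa):
    """True if some clade contains exactly the given taxa and no others."""
    return frozenset(taxa) in set(clades(tree))


mileewa = {name for name in leaves(TREE) if name.startswith("M. ")}
print(len(leaves(TREE)), len(mileewa))
print(is_monophyletic(TREE, mileewa))
print(is_monophyletic(TREE, {"M. mira", "M. lamellata"}))
```

Because U. puerana and P. sexmaculata are nested among the Mileewa species, the smallest clade containing all Mileewa species is the whole tree, so the check fails for Mileewa. The check is only informative for genera with more than one sampled species.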