Introduction

The potato cyst nematodes (PCN), including Globodera rostochiensis and G. pallida, originate in South America and are the most harmful plant-parasitic nematodes affecting potato yield worldwide1. PCN has been reported in most potato-producing regions of Europe, Africa, Asia, North, Central and South America and Oceania2. According to statistics, PCN has caused a 10–12% yield loss in global potato production3,4; in infested potato-growing areas, losses exceed 60% and can reach 80–90%, or even total crop failure, when the disease is serious1,5,6,7,8,9. PCN continues to be treated as a quarantine pest by regional plant protection organisations worldwide10.

Nematodes are multicellular eukaryotes. Mitochondria exist in almost all known eukaryotes11 and contain their own genome, independent of the nuclear genome, namely, the mitochondrial genome (mtDNA). Compared with the nuclear genome, mtDNA is smaller, evolves faster and is maternally inherited, which makes it widely used in classification, evolution, phylogeny, population genetic structure and other fields12,13,14. It is also an effective molecular target for the classification and identification of nematodes15. The plant-parasitic nematodes whose mtDNA has been completely sequenced are economically important species, including G. rostochiensis, G. pallida, G. ellingtonae, Aphelenchoides besseyi, A. medicagus, Bursaphelenchus xylophilus, B. mucronatus, Heterodera glycines, Meloidogyne incognita, M. javanica, M. arenaria, M. enterolobii, M. graminicola, M. chitwoodi, Xiphinema americanum and Hoplolaimus columbus16,17,18,19,20,21,22,23,24,25,26,27,28.

The mtDNA in metazoans is generally a highly conserved single circular molecule29 that contains a specific set of genes whose order is highly conserved throughout the phylum30. However, among the reported mtDNAs of plant nematodes, those of several species of the genus Globodera are multipartite, whereas those of other genera are monopartite16,17,18,19,20,21,22,23,24,25,26,27,28. The first metazoan reported to have multipartite mtDNA circles is G. pallida, which has at least six subgenomic mitochondrial circles (scmtDNAs), approximately 6–9 kb in size16,18. Subsequently, G. rostochiensis was reported to contain at least seven scmtDNAs of similar sizes19,31, whereas G. ellingtonae contains two scmtDNAs, 17,757 and 14,365 bp in size26.

At present, the COX1 gene in the mtDNA of nematodes has attracted much attention for its use in the identification of related species32 and for resolution at lower taxonomic levels such as species and subspecies groups33,34,35. By comparing the cytochrome b gene (CYTB), Ohki et al.36 showed that the G. pallida population from Japan was close to the populations from Europe and North America. Picard et al.37 and Plantard et al.38 combined phylogeography with phylogenetic analysis of CYTB and confirmed that G. pallida was introduced into Europe from North America and then spread to various parts of the world. Subbotin et al.35 collected 148 populations of Globodera species from 23 countries and conducted a phylogenetic analysis of the cytochrome c oxidase subunit I (COX1) and CYTB genes and the internal transcribed spacers of ribosomal DNA (ITS). The analysis showed that Globodera species are mainly divided into two branches: one branch from South America and North America, which mainly parasitizes Solanaceae plants, and a second branch from Africa, Europe and New Zealand, which parasitizes Asteraceae and other plants35. These studies indicated that mtDNA data can not only provide an effective framework for the higher-order phylogeny of nematodes but also effectively evaluate the genetic variation among related species to explore their genetic relationships and taxonomic status.

One species of Globodera was successively found in the potato fields of Guizhou, Sichuan and Yunnan provinces in China in 2018. Jiang et al.39, Gu et al.40 and Peng et al.41 identified this species as G. rostochiensis mainly based on its ITS sequences. However, Xu et al.42 found that although the ITS sequence of this species was very similar to that of G. rostochiensis, its morphological characteristics, such as the obvious vela on the spicules of males, were substantially different from those of G. rostochiensis. Moreover, the two differed substantially in host range, pathogenicity, field damage and symptoms. Therefore, this Globodera species was identified as a new species, named Globodera vulgaris (Gv)42. In the present study, the mtDNA of Gv collected from the potato fields in Guizhou was sequenced and analysed. Based on the sequencing results, the origin and phylogenetic relationships of Gv were analysed, and the role of the mitochondrial genome in the classification and identification of the genus Globodera was discussed.

Results

The mtDNA assembly and PCR verification after high-throughput sequencing

The mtDNA of Gv was assembled, and a 42,995-bp mtDNA was obtained, which comprised five scmtDNAs of 7–9 kb in size, named scmtDNA-I to -V (GenBank PP407208-PP407212) (Fig. 1). A total of 31 genes were encoded in the whole mtDNA: 12 protein-encoding genes, namely, one ATP synthase subunit gene (ATP6), three cytochrome c oxidase subunit genes (COX1, COX2 and COX3), one cytochrome b gene (CYTB) and seven nicotinamide adenine dinucleotide (NADH) dehydrogenase subunit genes (ND1, ND2, ND3, ND4, ND4L, ND5 and ND6); two ribosomal RNA genes (rrnL and rrnS); and 17 transfer RNA genes (tRNAs). The A + T content of each scmtDNA ranged from 63.2 to 69.2%, indicating a significant AT bias. No gene overlap was detected in the scmtDNAs, and gene spacer regions were present. To verify the assembly results, specific fragments of scmtDNA-I to -V were successively amplified with the corresponding primers using total DNA as the template, yielding amplified fragments of approximately 3 kb. The sequences were determined to be 3096 and 2884 bp, 3063 and 2961 bp, 3042 and 3006 bp, 3016 and 2943 bp, 2987 and 2915 bp in size, respectively, which were consistent with the estimated sizes (S1 Fig.).

Figure 1

Annotation and structure of subgenomic mitochondrial circles (scmtDNA) in Globodera vulgaris. The transcription direction of the in-circle genes is clockwise, whereas that of the out-circle genes is the opposite. Different functional genes are identified using different colours. The built-in dark grey histogram shows the GC content of the genome, and the middle grey line represents the 50% threshold.

Protein-encoding genes in the mtDNA of Gv

Twelve protein-encoding genes were annotated in the mtDNA of Gv. All protein-encoding genes had a complete initiation codon, whereas two genes (COX1 and ND1) had an undetermined termination codon (Table 1). The protein-encoding genes in the mtDNA of Gv showed a significant AT bias, and most of them exhibited a significant negative AT skew (0.56–0.39) and positive GC skew (0.29–0.65) (Table 1).

Table 1 Nucleotide composition and bias of protein-encoding genes in the mitochondrial genome of Globodera vulgaris.

The tRNAs and rRNAs of the mtDNA of Gv

The MITOS WebServer online tool was used to predict tRNAs in the mtDNA of Gv, and a total of 17 tRNAs with a length range of 46–79 bp were identified, 11 of which (trnA, trnE, trnF, trnI, trnK, trnL1, trnL2, trnM, trnP, trnS and trnV) were present as multiple copies in different scmtDNAs and had different secondary structures (Fig. 2). The secondary structures of most of the tRNAs were typical cloverleaf structures (including the TΨC arm, DHU arm and anticodon arm), of which 11 lacked TΨC arms. In addition, trnA in scmtDNA-III (trnA′) lacked the DHU loop, and trnV in scmtDNA-II (trnV″) lacked the DHU arm. The remaining tRNAs, namely trnI in scmtDNA-IV (trnI′), trnK in scmtDNA-III, trnL2 in scmtDNA-IV (trnL2′), trnM in scmtDNA-II, scmtDNA-III and scmtDNA-V, trnP in scmtDNA-IV (trnP′), trnR in scmtDNA-IV, trnS2 in scmtDNA-II and scmtDNA-III, trnV in scmtDNA-IV (trnV″) and trnW in scmtDNA-I, presented a complete cloverleaf structure. The rRNAs in the mtDNA of Gv were identified by comparison with those of similar species. The sizes of rrnL and rrnS were 846 and 673 bp, respectively, similar to the rRNA lengths of other plant-parasitic nematodes, and both had a high A + T content.

Figure 2

The secondary structure prediction of tRNAs in the mitochondrial genome of Globodera vulgaris. trnA: trnA in subgenomic mitochondrial circle I of G. vulgaris (scmtDNA-I); trnA′: trnA in scmtDNA-III; trnC: trnC in scmtDNA-I; trnE: trnE in scmtDNA-I; trnF: trnF in scmtDNA-I; trnG: trnG in scmtDNA-I; trnI: trnI in scmtDNA-II; trnI′: trnI in scmtDNA-IV; trnK: trnK in scmtDNA-II; trnK′: trnK in scmtDNA-III; trnL1: trnL1 in scmtDNA-II; trnL1′: trnL1 in scmtDNA-IV; trnL2: trnL2 in scmtDNA-I; trnL2′: trnL2 in scmtDNA-IV; trnM: trnM in scmtDNA-II, scmtDNA-III and scmtDNA-V; trnP: trnP in scmtDNA-I; trnP′: trnP in scmtDNA-V; trnR: trnR in scmtDNA-V; trnS1: trnS1 in scmtDNA-I; trnS2: trnS2 in scmtDNA-II and scmtDNA-III; trnT: trnT in scmtDNA-III; trnV: trnV in scmtDNA-I; trnV′: trnV in scmtDNA-II; trnV′′: trnV in scmtDNA-IV; trnW: trnW in scmtDNA-I.

Non-coding regions in the mtDNA of Gv

The mtDNA of Gv contained 46 non-coding regions of variable sizes, ranging in length from 1 to 5327 bp. Each scmtDNA of Gv had a long non-coding region, with lengths of 5327, 2223, 2297, 3472 and 4490 bp, located between trnW and trnL in scmtDNA-I, trnM and trnS in scmtDNA-II, trnS and trnM in scmtDNA-III, trnL and trnI in scmtDNA-IV, and tRNA and trnP in scmtDNA-V, with AT contents of 65.3%, 59.8%, 59.9%, 60.0%, and 59.4%, respectively. The circles shared a region of high sequence identity within their longest non-coding regions: 98% of sites were identical in the ~2.3 kb shared sequence region between scmtDNA-III positions 983–2297, scmtDNA-II positions 4700–6914 and scmtDNA-V positions 6002–8218.

Comparison of the mtDNA of four species of the genus Globodera

The mtDNA of Gv obtained in this study was compared to that of three other species of the genus Globodera. The results (Table 2) showed that the length of the mtDNA of Gv was close to that of G. rostochiensis and G. pallida, with lengths of 42,995, 4160 and 45,071 bp, respectively, but the number of scmtDNAs differed, with five, seven and six, respectively. The mtDNA of G. ellingtonae was the shortest, with a length of 32,122 bp and only two subgenomic circles.

Table 2 Comparison of the mitochondrial genomes (mtDNA) of Globodera nematodes.

The A + T content, AT skew and GC skew were used to measure differences in base composition. AT skew describes the difference between the contents of A and T, and GC skew describes the difference between the contents of G and C43. The A + T content in the mitochondrial genomes of the four species of Globodera was between 65.6 and 67.0% (Table 3), with an average of approximately 65%, showing a significant AT bias, and the differences between species were not significant. However, the Gv mitochondrial genome showed a positive AT skew and negative GC skew, whereas the other three species showed a significantly negative AT skew and positive GC skew, indicating that the base composition of the Gv mitochondrial genome was different from that of other species of Globodera and that the nucleotide usage of Gv was more inclined to A and C.

Table 3 Base content (%), AT skew and GC skew of the mitochondrial genome of Globodera.

Comparison of protein encoding genes in the mtDNA of four Globodera species

Compared to Gv, G. rostochiensis lacks ND2 and ND6, whereas G. pallida lacks ND2, ND4L and ND5 and has an incomplete ATP6 sequence16,18,19,26,31. Eight protein-coding genes were shared by the mtDNAs of all four species of Globodera, namely ATP6, COX1, COX2, COX3, CYTB, ND1, ND3 and ND4. However, there were some differences in gene length and similarity (Table 4). The lengths of COX3, CYTB and ND3 were similar among the four species, whereas those of ATP6, COX1, COX2, ND4 and ND1 varied by 45 to 511 bp. The gene similarity comparison showed that the ATP6, COX1, COX2 and ND1 genes of Gv were quite different from those of the other three species. At the nucleotide level, the most divergent gene was ND6, and the least divergent was CYTB. Most of the protein-encoding genes of Gv had the highest similarity with those of G. rostochiensis and the lowest similarity with those of G. pallida. In conclusion, significant differences are present in the category, size and sequence similarity of the protein-encoding genes in the mitochondrial genomes of the different species of Globodera.

Table 4 Comparison of the length (bp) and similarity of the mitochondrial protein-encoding genes in Globodera.

Codon usage bias of the mtDNA of Globodera species

The effective number of codons (ENC) quantifies how far the codon usage of a gene departs from equal usage of synonymous codons and is often used to reflect codon usage bias. The smaller the ENC value, the stronger the codon usage bias44. A gene with an ENC value lower than 35 is generally considered to have an obvious codon usage bias45. The ENC values of the protein-encoding genes of Gv, G. rostochiensis, G. pallida and G. ellingtonae were 44.1 (28.8–52.8), 41.2 (35.8–47.1), 43.6 (38.8–47.7) and 39.1 (32.1–48.2), respectively (Table 5). The ENC values of COX1, COX2, CYTB and ND4L in Gv were the lowest among the four species, indicating that Gv had a relatively strong codon usage bias in these genes, whereas the ENC values of ATP6, COX3, ND2, ND4, ND5 and ND6 were the highest, indicating that the codon usage bias of those genes was relatively weak.
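The relationship between ENC and codon usage bias can be illustrated with a minimal Python sketch. It assumes the homozygosity-based estimator commonly attributed to Wright, in which each amino acid contributes the reciprocal of its codon homozygosity F, averaged within each degeneracy class. It is not the EMBOSS implementation used in this study; the function names and the toy codon counts are hypothetical, and published implementations add caps and special handling of rare amino acids that are omitted here.

```python
from collections import defaultdict


def homozygosity(counts):
    """Codon homozygosity F for one amino acid.

    counts: observed counts of each synonymous codon (zeros included).
    Returns None when fewer than two codons were observed.
    """
    n = sum(counts)
    if n < 2:
        return None
    sum_p2 = sum((c / n) ** 2 for c in counts)
    return (n * sum_p2 - 1) / (n - 1)


def effective_number_of_codons(families):
    """ENC from a dict: amino acid -> list of synonymous codon counts.

    The length of each list is the degeneracy of that amino acid under
    the genetic code in use, so unused codons must be listed as zeros.
    """
    f_by_class = defaultdict(list)   # degeneracy -> F values
    n_by_class = defaultdict(int)    # degeneracy -> number of amino acids
    for counts in families.values():
        k = len(counts)
        n_by_class[k] += 1
        f = homozygosity(counts)
        if k > 1 and f is not None and f > 0:
            f_by_class[k].append(f)

    enc = 0.0
    for k, n_aa in n_by_class.items():
        if k == 1:
            enc += n_aa  # a single-codon amino acid always contributes 1
            continue
        if not f_by_class[k]:
            raise ValueError(f"no usable amino acid with {k} synonymous codons")
        mean_f = sum(f_by_class[k]) / len(f_by_class[k])
        enc += n_aa / mean_f
    return enc


# Toy example with two two-fold amino acids (hypothetical counts).
biased = {"Lys": [20, 0], "Phe": [15, 0]}      # one codon used exclusively
balanced = {"Lys": [10, 10], "Phe": [8, 7]}    # near-equal usage
print(effective_number_of_codons(biased))
print(effective_number_of_codons(balanced))
```

In the toy data, exclusive use of one codon per amino acid gives the smallest possible value for two amino acids, whereas near-equal usage gives a larger value, which mirrors the rule that a smaller ENC indicates a stronger bias.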

Table 5 The effective number of codons (ENC) of the mitochondrial protein-encoding genes in Globodera species.

Phylogenetic analysis

The phylogenetic tree was constructed based on the COX1 gene of Gv and 27 other species (or groups) of Tylenchida, with Trichuris ovis as the outgroup (Fig. 3). Globodera spp., H. glycines, M. chitwoodi, M. incognita, Pratylenchus vulnus and Radopholus similis were clustered in separate branches. Globodera spp. were most closely related to H. glycines, followed by M. chitwoodi, M. incognita, P. vulnus and R. similis. Within the Globodera branch, different geographical populations were clustered into separate branches. Gv had the closest relationship with G. rostochiensis, particularly the German population, followed by G. tabacum. A second phylogenetic tree was constructed based on the ND1 gene of Gv and 18 other species, with Trichuris ovis as the outgroup (Fig. 4). Aphelenchida and Tylenchida were clustered in separate branches. The ND1 gene in scmtDNA-V had the closest relationship with G. ellingtonae and clustered with other Globodera spp. as well as Radopholus similis into a branch. These results showed that different target genes yielded different phylogenetic trees. Both the COX1 sequence and the ND1 sequence in scmtDNA-V clustered with Globodera spp., but the closest species differed.

Figure 3

The phylogenetic tree inferred by Bayesian inference (BI) based on COX1 sequences.

Figure 4

The phylogenetic tree inferred by Bayesian inference (BI) based on ND1 sequences.

Discussion

In metazoans, mtDNA generally has a single circular structure, and the division of mtDNA into multiple circles is a rare phenomenon29. Currently, the majority of plant-parasitic nematodes with completely sequenced mtDNA have a single circular structure, except for species of the genus Globodera16,17,18,19,20,21,22,23,24,25,26,27,28. The mtDNAs of the three previously reported species of Globodera were multipartite structures with different numbers of scmtDNAs: G. rostochiensis had at least seven scmtDNAs, whereas G. pallida had six, and G. ellingtonae had two16,18,19,26,31. This study showed that Gv had five scmtDNAs and that its mtDNA size was different from that of the other three species of Globodera. Therefore, it was speculated that the multipartite structure of mtDNA is an important phylogenetic feature and an important basis for determining the taxonomic status of the genus Globodera, distinguishing it from other genera. The size of the mtDNA and the number of scmtDNAs are important characteristics for species identification and differentiation in the genus Globodera.

The order of mitochondrial genes in metazoans is generally conserved within the same genus or even within the same order46. However, a comparison of mitochondrial protein-coding genes among the four species of Globodera revealed obvious differences in arrangement order and size. Therefore, mitochondrial genes might have important application value in species identification of the genus Globodera.

Seventeen tRNAs were annotated, and five tRNAs (trnD, trnH, trnN, trnQ and trnY) that exist in other nematodes were missing from the mtDNA of Gv. The trnS1 and trnH genes were absent in G. rostochiensis and G. pallida16,18,19,31. Multiple copies of trnA, trnE, trnF, trnI, trnK, trnL1, trnL2, trnM, trnP, trnS and trnV occurred in the mtDNA of Gv, whereas multiple copies of trnD, trnE, trnI, trnK, trnP, trnQ, trnT and trnY were observed in the mtDNA of G. rostochiensis, and of trnD, trnH, trnI, trnK, trnP, trnT and trnV in the scmtDNAs of G. pallida16,18,19,31. No multiple copies of tRNA were found in the mtDNA of G. ellingtonae26. Two copies each of trnS2 and trnV were reported in the mtDNA of two populations of M. graminicola, with substantial differences between the two copies25. It has been speculated that the copies of trnS2 and trnV may have originated independently or may have been “involved” in the evolution of other exogenous tRNAs47,48,49. The mtDNAs of Gv, G. rostochiensis, G. pallida and G. ellingtonae have multipartite structures with multiple scmtDNAs, suggesting that the multi-copy phenomenon of tRNA in the mtDNA of Globodera might be the result of gene recombination between the subgenomes. However, the copies of some tRNA genes in Gv differed considerably among subgenomes, which was not found in G. rostochiensis and G. pallida and might be the result of exogenous tRNA intervention. In addition, except for eight tRNAs of Trichinella spiralis that could be folded into a typical cloverleaf structure50, no tRNAs of other nematodes have been reported to have cloverleaf structures; in G. ellingtonae, none of the 22 tRNAs had a typical cloverleaf structure26,51, and the tRNA structures of G. rostochiensis and G. pallida have not been reported. In this study, nine tRNAs (trnI′, trnK, trnL2′, trnM, trnP′, trnR, trnS2, trnV″ and trnW) of Gv were found to fold into the complete cloverleaf structure. Thus, the tRNAs in the mtDNA of Gv were quite different from those in other nematodes, including G. rostochiensis, G. pallida and G. ellingtonae.

Nematode mitochondrial genomes are known to exhibit a strong nucleotide compositional bias52. The base composition bias of mtDNA is mainly caused by asymmetric mutation and selection pressure on the four bases (A, T, G and C), arising mainly during DNA replication and gene transcription. In other words, mitochondrial genes in some species undergo frequent rearrangements, mutations and selection pressure during replication and transcription, resulting in a reversal of the mitochondrial base composition bias; that is, the bias shifts from A and C to T and G53. The nucleotide skew is attributed to mutation pressure and selection pressure54. Interestingly, closely related species that have evolved in significantly different environments may demonstrate different base usage strategies. For instance, one species may exhibit an excess of thymine (T) over adenine (A), while a closely related species may show a bias towards using A over T55. Among the mtDNAs of Gv and the three other species of Globodera, Gv displayed a positive AT skew and negative GC skew, whereas the other three species showed a negative AT skew and positive GC skew. This suggests that Gv may have adopted a unique adaptive evolutionary strategy.

Methionine-encoding ATG is commonly used as the initiation codon for translation in the nuclear genome; however, owing to the obvious AT bias, the codons of mtDNA in nematodes consist mainly of A and T. The initiation codons of protein-encoding genes in nematodes include ATT, TTG, ATA, ATC, CTT, GTT and GTG, and TAA and TAG are used as translation termination codons in most nematodes56. Five initiation codons were used in the mtDNA of Gv: ATA was used six times; ATT, ATG and TTG were each used twice; and ATC was used once. The stop codons of two genes (COX1 and mt5-ND1) could not be determined, whereas among the genes with clear stop codons, TAG was used four times and TAA seven times. In G. ellingtonae, the initiation codon TTG was used eight times, whereas TTA was used twice, and ATA and ATT were each used once. The stop codons of three protein-encoding genes (COX1, COX2 and ND1) could not be determined, whereas among the genes with clear stop codons, TAG was used seven times and TAA twice26. Therefore, the codon usage rules of protein-coding genes differed between the mtDNAs of Gv and G. ellingtonae, although both had an obvious AT bias. The codon usage of protein-encoding genes in the mtDNA of G. rostochiensis and G. pallida has not yet been analysed and reported.
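Tallies of initiation and termination codons such as those above can be reproduced from annotated coding sequences with a short script. The following Python sketch is illustrative only: the gene names and sequences are invented, and treating any final triplet other than TAA or TAG (or a length that is not a multiple of three) as "undetermined" is an assumption made for this example rather than the annotation rule used in the study.

```python
from collections import Counter

# Stop codons named in the text for nematode mitochondrial genes.
STOP_CODONS = {"TAA", "TAG"}


def start_stop_usage(cds_by_gene):
    """Count first and last codons across a set of coding sequences.

    cds_by_gene: dict mapping gene name -> coding sequence (5' to 3').
    Returns (start_codon_counts, stop_codon_counts).
    """
    starts, stops = Counter(), Counter()
    for cds in cds_by_gene.values():
        cds = cds.upper()
        starts[cds[:3]] += 1
        # A complete stop codon requires an in-frame final triplet.
        last = cds[-3:] if len(cds) % 3 == 0 else ""
        stops[last if last in STOP_CODONS else "undetermined"] += 1
    return starts, stops


# Invented sequences, for illustration only.
example = {
    "geneA": "ATAGGGTAA",
    "geneB": "ATTCCCTAG",
    "geneC": "ATAGGGTT",   # truncated: no complete stop codon
}
starts, stops = start_stop_usage(example)
print(dict(starts))
print(dict(stops))
```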

Generally, different species have different codon usage patterns, and genes of the same species may adopt similar codon usage strategies57. Species with closer phylogenetic relationships or more similar living environments might have similar codon usage patterns58; this phenomenon is widespread in organisms, might result from natural selection and species adaptation, and reflects some evolutionary phenomena59,60. Genes with higher codon usage bias tend to have higher expression levels58. The codon usage bias of the mitochondrial genes of Gv was weaker than that of the other three species of Globodera, indicating that its gene expression and evolution levels were lower and suggesting that Gv might be a more primitive group.

The base composition and codon usage of the mtDNA of Gv differed from those of the other three species of Globodera, with more primitive characteristics in base selection and codon bias. In addition, the results of field investigations and experimental tests have shown that Gv can parasitize not only Solanaceae plants, such as potatoes and tomatoes, but also plants of many other families, such as Caryophyllaceae, Asteraceae and Amaranthaceae42. Therefore, we speculated that Gv is an indigenous group that is more primitive than G. rostochiensis. Gv might initially have parasitized a variety of weeds. With the introduction of potato into China and its large-scale cultivation, potato has become its dominant host, and its characteristics that resemble those of G. rostochiensis might be the result of host adaptation.

Conclusion

In this study, the mtDNA of Gv was assembled, and its comparison with the mtDNAs of other plant nematodes revealed that the multipartite structure of mtDNA is a phylogenetic feature of Globodera spp. and a taxonomic feature of the genus. The number of scmtDNAs is a distinguishing feature of species within the genus, and the number, arrangement and size of mitochondrial protein-encoding genes have important application value in species identification for the genus Globodera. The regularity of codon usage and base composition bias of mitochondrial genes might also hold significant value for studying the origin and phylogenetic relationships of Globodera. Furthermore, the number, variety and structure of tRNAs in the mitochondrial genome might have application value for classification, origin and phylogenetic studies in the genus Globodera. This study revealed that the base composition of Gv was biased towards A and C without bias inversion and that its mitochondrial codon usage bias was weaker, suggesting that it might have originated from a native and more primitive group of existing Globodera.

Materials and methods

Nematodes

Gv used in this experiment was collected from the potato root system and soil in potato fields in Hezhang County, Guizhou Province, China, and was isolated and identified by the Plant Nematode Laboratory of South China Agricultural University. It was preserved and propagated on potato roots. According to the method of Zasada et al.56, sodium orthovanadate solution (0.1 mg/mL) was prepared as an inorganic salt medium for hatching the second-stage juveniles (J2s). The cysts were sterilised with 0.5% sodium hypochlorite solution for 3 min, washed with sterile water five times, soaked in sterile water for 2 days, then transferred into the sodium orthovanadate solution and incubated in a constant-temperature incubator at 22 °C. The hatched J2s were collected every 2 days into a 1.5-mL centrifuge tube, disinfected with 0.15% sodium hypochlorite solution and washed with sterile water several times. Afterwards, they were suspended and stored at −80 °C for subsequent use.

Molecular identification of nematodes

To ensure that all materials used in the experiment were derived from the spheroid cysts of the same species, three J2s hatched from each cyst were collected using a pick needle under a microscope for PCR identification of a single nematode. The DNA from a single nematode was extracted using the method described by Xu et al.61 for ITS sequence amplification using rDNAF/rDNAR primers62, and the products were sent to Sangon Biotechnology Co., Ltd. (Shanghai, China) for sequencing. Sequence similarity analysis was performed using MEGA 11 software, and it was determined that all cysts belonged to the same species.

High-throughput sequencing of mitochondrial DNA, mtDNA assembly and annotation

A total of thirty thousand J2s, hatched and isolated from cysts, were used for total DNA extraction, yielding 0.5 μg of DNA at a concentration ≥ 20 ng/μl for high-throughput sequencing of mitochondrial DNA. The total DNA of Gv was sequenced, and its mtDNA assembled, using Illumina NovaSeq and Nanopore platforms; this work was entrusted to Huitong Biotechnology Co., Ltd. (Shenzhen, China). When total DNA was sequenced using an Illumina NovaSeq, the original image data were converted into sequence data via base calling. Low-quality data were screened with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and filtered in the following steps: removing adapter sequences from the reads; trimming non-ATGC bases at the 5′ end; trimming bases with a quality value below Q20; removing reads in which the proportion of N reached 10%; and removing adapter remnants and fragments shorter than 50 bp after quality trimming. Nanopore sequencing of the total DNA was completed according to the standard protocol provided by Oxford Nanopore Technologies, which comprised the following steps: purity, concentration and integrity detection of genomic DNA by Nanopore, Qubit and 0.35% agarose gel electrophoresis; recovery of large DNA fragments with the BluePippin automatic nucleic acid recovery system; and library construction using the SQK-LSK109 ligation kit.
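The read-filtering rules for the Illumina data can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the pipeline used by the sequencing provider: adapter removal is omitted, low-quality bases are trimmed only from the 3′ end, qualities are given as integer Phred scores, and all function names are hypothetical.

```python
def strip_leading_non_acgt(seq, quals):
    """Remove non-A/C/G/T bases from the 5' end of a read."""
    start = 0
    while start < len(seq) and seq[start] not in "ACGT":
        start += 1
    return seq[start:], quals[start:]


def trim_low_quality_tail(seq, quals, min_q=20):
    """Trim bases from the 3' end while their quality is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def filter_read(seq, quals, min_q=20, max_n_fraction=0.10, min_length=50):
    """Apply the trimming and filtering rules to one read.

    Returns the trimmed (sequence, qualities) pair, or None when the
    read is discarded.
    """
    seq, quals = strip_leading_non_acgt(seq.upper(), quals)
    seq, quals = trim_low_quality_tail(seq, quals, min_q)
    if len(seq) < min_length:
        return None  # too short after trimming
    if seq.count("N") / len(seq) >= max_n_fraction:
        return None  # proportion of N reaches the threshold
    return seq, quals


# Hypothetical read: one ambiguous 5' base followed by 60 good bases.
read = "N" + "ACGT" * 15
kept = filter_read(read, [30] * len(read))
print(len(kept[0]) if kept else "discarded")
```

In practice, dedicated trimming tools handle adapters, paired reads and sliding-window quality trimming; the sketch only makes the thresholds in the text concrete.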

The initial de novo assembly was completed with SPAdes 3.11.0 in careful mode, with read error correction (the hammer step) and correction of the assembled sequences by mapping the reads back to them. The mitochondrial genome and protein-encoding gene sequences of related species (G. rostochiensis and G. pallida) published in NCBI were used as reference sequences for BLASTN and Exonerate alignments, with the alignment thresholds set to an E-value of 1e−10 and a protein similarity of 70%. Sequences with longer matches and higher similarity were selected from the alignment results as target sequences. The PRICE and MITObim software were used to iteratively extend, merge and assemble the collected fragmented target sequences, with the number of iterations set to 50. For the iterative assembly, bowtie2 was used to map the original sequenced reads back to the target sequences, and the paired reads were screened. SPAdes was used for re-assembly, and VelvetOptimiser was used to optimise the assembly results. After assembly, sequence integration was performed to complete the circular genome.
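The selection of target sequences from the alignment results can be illustrated with a small Python sketch. It assumes that the BLASTN hits are available in the standard 12-column tabular output and that percent identity serves as the similarity measure; the study does not state how its 70% threshold was applied to nucleotide hits, so both the cut-off usage and the contig names below are hypothetical.

```python
import csv
import io

# Column order of the standard BLAST tabular output (12 columns).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(tabular_text, max_evalue=1e-10, min_identity=70.0):
    """Keep hits passing both thresholds, longest and most similar first."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(FIELDS, row))
        if (float(hit["evalue"]) <= max_evalue
                and float(hit["pident"]) >= min_identity):
            hits.append(hit)
    # Prefer longer matches, then higher identity.
    hits.sort(key=lambda h: (int(h["length"]), float(h["pident"])),
              reverse=True)
    return hits


# Hypothetical hits of assembled contigs against a reference mtDNA.
rows = [
    ["contig_1", "ref_mt", "85.0", "900", "100", "5", "1", "900", "10", "909", "1e-50", "700"],
    ["contig_2", "ref_mt", "65.0", "800", "200", "9", "1", "800", "10", "809", "1e-40", "500"],
    ["contig_3", "ref_mt", "90.0", "300", "30", "0", "1", "300", "10", "309", "1e-5", "90"],
    ["contig_4", "ref_mt", "92.0", "1200", "90", "2", "1", "1200", "5", "1204", "0.0", "1500"],
]
text = "\n".join("\t".join(r) for r in rows)
selected = [h["qseqid"] for h in filter_hits(text)]
print(selected)
```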

The sequences were aligned against the NCBI NT library to determine the species, annotated gene types and lengths of the most closely related sequences. Gene annotation was performed on the Mitos (http://mitos.bioinf.uni-leipzig.de/index.py) and MFannot (https://megasun.bch.umontereal.ca/cgi-bin/MFANNOT/MFANNOTINTERFACE.pl) online sites. The genetic codes were 05-InVertebrate and 5-Invertebrate Mitochondrial. The initiation and termination codons of the genes annotated by the two methods were manually corrected. The corrected annotated genes were aligned against the NCBI NT library for secondary correction.

Bioinformatic analysis of mtDNA

The structural diagrams of the scmtDNAs of Gv were drawn using graphics software, such as Photoshop CS2 9.0 and CorelDraw 12chs. The nucleotide composition of the mitochondrial genome and the codon usage of the protein-encoding genes were determined using MEGA 11 and DNA star software, respectively. The nucleotide composition bias of each gene was calculated with DNA star software using the formulas AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C)42, where A, T, G and C are the counts of the corresponding bases. The effective number of codons (ENC) for mitochondrial protein-encoding genes was calculated and analysed using Clustalx 1.83 software and EMBOSS Explorer online (https://www.bioinformatics.nl/emboss-explorer/)63. The similarity of different sequences was compared using Clustalx 1.83 and DNA star.
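The skew formulas can be made concrete with a minimal Python sketch. It is not the software used in this study; the function name and the example sequence are hypothetical, and ambiguous bases are simply ignored.

```python
from collections import Counter


def base_composition(seq):
    """Return A + T content (%), AT skew and GC skew of a DNA sequence.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    Characters other than A, T, G and C are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c
    if total == 0:
        raise ValueError("sequence contains no A, T, G or C")
    return {
        "at_content": 100.0 * (a + t) / total,
        "at_skew": (a - t) / (a + t) if a + t else 0.0,
        "gc_skew": (g - c) / (g + c) if g + c else 0.0,
    }


# Hypothetical sequence with A=3, T=2, G=4, C=1.
result = base_composition("AAATTGGGGC")
print(result)
```

A positive AT skew means that A is more frequent than T on the analysed strand, and a negative GC skew means that C is more frequent than G, which is the pattern reported for the whole Gv mitochondrial genome.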

PCR verification of sequencing results

To verify the results of high-throughput sequencing, two specific segments of approximately 3 kb were selected from each scmtDNA based on the sequencing results, and primers were designed (S1 Table). Total DNA was used as the template for PCR amplification, and the products were sequenced and analysed.

Phylogenetic analysis

Owing to the large size of the mtDNAs of Globodera and the variable arrangement of genes among species, a relatively well-conserved gene (COX1) and a variable gene (ND1) were selected to construct phylogenetic trees. Primers for the COX1 and ND1 genes were designed according to the sequencing results, and the DNA of Gv was used as the template for PCR amplification of each gene. The PCR products were sequenced and analysed. The nucleotide composition of the sequences was analysed using EditSeq software, and the A + T content was calculated. COX1 gene sequences of 26 nematode species (or populations) (S2 Table) and ND1 gene sequences of 18 nematode species (or populations) (S3 Table) published in the NCBI were used as reference sequences to construct the phylogenetic trees. The sequences were aligned using MEGA 11 with default parameters, and conserved regions were selected and redundant positions removed using Gblocks 0.91b with the half-gap setting. The parameter values of each model were calculated using PAUP, and the best-fit model was determined using MrModeltest under the Akaike information criterion (AIC). Phylogenetic analyses were performed using the Bayesian inference (BI) approach64 in MrBayes v3.2.1 with the following parameters: four independent Markov chain Monte Carlo (MCMC) chains were run for two million generations, sampling once every one hundred generations. After discarding the first 25% of samples as burn-in, the parameters were summarised, and the remaining samples were used to calculate Bayesian posterior probabilities (PP). The run was stopped once the average standard deviation of split frequencies fell below 0.01.
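The sampling arithmetic implied by these settings can be checked with a few lines of Python. The sketch below only performs the bookkeeping for one chain and ignores the initial state that some programs also record; the function names are hypothetical and nothing here calls MrBayes.

```python
def mcmc_sample_counts(generations=2_000_000, sample_every=100,
                       burnin_fraction=0.25):
    """Return (total, discarded, retained) sample counts for one chain."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


def has_converged(avg_std_split_freq, threshold=0.01):
    """Stopping rule: average standard deviation of split frequencies."""
    return avg_std_split_freq < threshold


total, discarded, retained = mcmc_sample_counts()
print(total, discarded, retained)
print(has_converged(0.008))
```

With the settings given in the text, each chain yields 20,000 samples, of which 5,000 are discarded as burn-in and 15,000 are retained for summarising posterior probabilities.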

Ethics statement

No specific permissions were required for the nematodes used in this study, as these nematodes are plant pests and are not protected by the government. This research was carried out in accordance with relevant designated guidelines and regulations. The manuscript complies with the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines.