Introduction

The Chinese Porcupine (Hystrix brachyura) is a member of the rodent family Hystricidae. As a common terrestrial wild animal, it occurs in a wide range of habitats, including primary and secondary forest, cultivated areas and plantations1. Artificial breeding of porcupines used to be very popular in China for its high medicinal and ornamental value2. Porcupines are relatively well-protected from predators but are subject to a degree of parasitic infestation. Thus, clarifying the species of parasites that porcupines are infected with is important for the control of parasitic diseases. Wellcomia compar, the so-called porcupine pinworm that belongs to Oxyuridae nematode, was first reported and the detail of its morphological structure was recorded by Leidy3. Similar to that of other pinworms, the life cycle of W. compar is simple and typically entails the ingestion of infective eggs orally, and then the larvae infiltrating the glandular fossa of the cecum mature into adult worms. The pinworm can adversely impact porcupine hosts by causing indigestion, weight loss, and poor coat condition. However, so far, few studies have reported the genetics, systematics, and biology of this pinworm.

The phylogeny and classification of nematodes have remained subjects of debate, and the poor resolution conferred by morphologic characteristics has resulted in incongruent classification systems. 18S rRNA was the first genetic marker used for phylogenetic classification of nematodes4. This and subsequent analyses based on 18S rRNA have led to the recognition of major lineages of nematodes5,6,7. Recent molecular analyses of mt genome sequences have provided selectable hypotheses for the phylogenetic relationships of nematodes based on 18S rRNA.

Mitochondria (mt) synthesise their own proteins and are involved in various physiological functions8. After a long period of evolution, the genetic composition of mt in most metazoans is now present consistently, suggesting that mt genes are important for maintaining the basic functions of mt9. The mt genome accumulates mutation rates at a faster rate than the nuclear genome and is characterised by fast evolution, maternal inheritance, and a simple structure. Therefore, it is considered a reliable marker in cladistic systematics10,11 and has been widely employed as a genetic marker for identifying and distinguishing organisms, particularly for population genetics among nematode species12,13,14. The 12 protein-coding genes (PCGs) of the mt genome are also often used to elucidate the phylogenetic relationships between species15,16,17,18,19,20,21.

Despite the availability of numerous advanced sequencing and bioinformatic methods, the 18S rRNA gene remains the primary genetic marker used for phylogenetic construction and molecular identification of Oxyuridomorpha. Additionally, other genes such as Cox1, ITS1, ITS2, and 28S rRNA are also employed for this purpose22,23,24,25,26. However, the use of complete genome mt for constructing and characterising Oxyuridomorpha phylogenies is infrequent. To date, only six Oxyuridomorpha species have had their complete mt genomes published17,18,19,20,21. Therefore, more mt genomes of Oxyuridomorpha nematodes are needed to provide useful molecular markers for population genetic and phylogenetic studies of pinworms.

In this study, we aimed to sequence and analyse the complete mt genome of W. compar and compared it to the gene arrangement order of other pinworms. In addition, we used the concatenated coding sequences (CDS) of 12 PCGs from 33 Spirurina nematodes and an Enoplea nematode to reconstruct a more reliable phylogeny.

Methods

Ethics statement

No animal was harmed during the course of this research. This study was approved by the Animal Ethics Committee of the Zunyi Medical University (approval no. LVRIAEC [2016] 2-059).

Parasite isolates and DNA extraction

Adult W. compar parasites were collected from infected porcupines in Zunyi, Guizhou Province, China. The samples were washed with sterile saline, preserved in 100% ethanol and stored in a − 20 °C refrigerator until use. The isolation of total genomic DNA from the samples was carried out using the Wizard Genomic DNA Purification Kit (Promega, Madison, Wisconsin, USA), following the instructions provided by the manufacturer. The DNA concentration and quality of the extracted samples were assessed using a Nanodrop 2000 spectrophotometer. The total DNA was detected by agarose gel electrophoresis using an Agilent 2100 Bioanalyzer.

Mitochondrial genome sequencing and assembly

The whole genome shotgun strategy was used to construct libraries containing DNA fragments of 400–500 bp that were randomly interrupted by a covaris ultrasonic crusher. Paired-end genomic library (approximately 400 bp inserts) was constructed according to the manufacturer’s instructions (Illumina), and were sequenced by next-generation sequencing technology based on the Illumina HiSeq sequencing platform. A5-miseq v2015052227 and SPAdesv3.9.0 (http://cab.spbu.ru/software/spades/)28 were used to assemble high-quality second-generation sequencing data to construct contig and scaffold sequences. The contig was compared with the nt library on NCBI using BLAST v2.2.31+, and the contig from mt was screened. Then, collinearity analysis was carried out using the mummer v3.129 software to determine the positional relationship and to fill the gaps between contigs. Finally, A5-miseq software was selected for splicing results, and Pilon v1.1830 software was used to correct the results to obtain the final mt sequence.

Gene annotation and analysis

The complete mt genome sequence was uploaded to the MITOS web server (http://mitos.bioinf.uni-leipzig.de/) for functional annotation31. The genetic code was set to 05 invertebrate, and the remaining were set according to the default parameters set by MITOS. For an undetectable tRNA gene, the approximate position was determined by comparison with the corresponding tRNA sequence in the reference genome32. The secondary structure was manually drawn to determine the final sequence and structure. The base compositions (%) of the complete mt genome, PCGs, and rRNA genes were calculated using Mega X33. The CG View visualisation software was used to map the entire mt genome34. The figure of Gene rearrangement was drawn using PhyloSuite v1.2.2 (http://phylosuite.jushengwu.com/installation_packages/PhyloSuite_v1.2.2_Win64_with_plugins.rar)35.

Phylogenetic analysis and genetic distance

The 33 mt genomes of nematode parasites were downloaded from NCBI (supplementary material Table S1), including 32 of Spirurina nematodes (mostly published papers) as ingroups and one of an Enoplea nematode, Trichuris suis, as an outgroup. Phylogenetic relationships among the 33 representative nematode species sequences and that of the W. compar mt DNA obtained in this study were reconstructed based on the CDS of 12 PCGs excluding the stop codon, and were concatenated. 12 PCGs were aligned in batches with MAFFT36 using the '–auto' strategy and codon alignment mode. Each gene was translated into amino acid sequences using the invertebrate mt genetic code in MAFFT and aligned based on its amino acid sequence using the default settings. The alignments were back translated into the corresponding nucleotide sequences, and refined using the codon-aware program MACSE v. 2.0337 with the invertebrate mt genetic code, which preserves the reading frame and allows the incorporation of sequencing errors or sequences with frameshifts. Ambiguously aligned fragments of 12 alignments were removed in batches using Gblocks38 with the following parameter settings: minimum number of sequences for a conserved/flank position (16/16), maximum number of contiguous non-conserved positions (8), minimum block length (10), and allowed gap positions (with half). The final sequences of the 12 PCGs were concatenated into single alignments for phylogenetic analyses using phylosuite35. ModelFinder39 was used to select the best-fit partition model (edge-linked) using Bayesian information criterion (BIC). Best-fit partition model for BI and ML trees are listed (supplementary material Tables S2, S3) according to BIC. Bayesian inference phylogenies were inferred using MrBayes v3.2.7a40 under the partition model (2 parallel runs, 4 independent Markov chains run for 3,000,000 metropolis-coupled MCMC generations, sampling a tree every 100 generations), in which the initial 25% of sampled data were discarded as burn-in. ML phylogeny was inferred using IQ-TREE v1.6.12 using an edge-linked partition model for 50,000 ultrafast bootstraps41,42. Phylograms were drawn using Interactive Tree Of Life (iTOL) v6 (https://itol.embl.de/itol.cgi)43. The calculation of pairwise genetic distances between phylogenetic trees was conducted using MEGA X33. Variance estimation was performed utilising concatenated PCGs sequences, employing the Bootstrao method with 1000 generations, and modeling the distances using the P-distance.

Ethics approval and consent to participate

No animal was harmed during the course of this research. We confirm that all experiments were performed in accordance with relevant guidelines and regulations. All experimental procedures involving animals were reviewed, approved and supervised by the Animal Ethics Committee of Zunyi Medical University (approval no. LVRIAEC [2016] 2-059).

Results and discussion

Genome structure, organisation and composition

The complete mt genome of W. compar (GenBank accession No. MW059037) was a 14,373 bp closed-circular molecule (Fig. 1), encoding an entire set of 36 genes, which included 12 PCGs (cox1-3, nad1-6, nad4l, atp6, and cytb), 22 tRNA genes, two rRNA genes (s-rRNA and l-rRNA), and a non-coding region (NCR)) (Table 1). However, this was different from the Enoplea nematode (such as Trichuris_suis) gene set, which has an atp8 gene44. All genes were transcribed in the same direction on the N-strand. The distribution of genes in the mt genome was identical to those of Wellcomia siamensis17 and Syphacia obvelata18. The overall base composition of the mt genome of W. compar was as follows A: 25.7%; T: 53.0%; G: 16.9%; C: 4.4%; G + C: 21.3%; and A + T: 78.7%. The overall A-T and G-C skews values were -0.346 and 0.586, respectively (Table 2). The intergenic spacer sequence was 613 bp in total and included 28 regions ranging in size from 1 to 147 bp. The overlapping nucleotide fragments were scattered in one place, located between trn H and rrn S (-2 bp).

Figure 1
figure 1

Circular map of the mt genome of W. compar. Scale is approximate. 12PCGs and 2 rRNA have standard nomenclature. The 22 tRNA genes are designated by a single-letter code for the corresponding amino acid, with numbers used to distinguish the two leucine and serine designated tRNAs. Grey region rerensents NCR. From the inside to the outside of the circle diagram, the first circle represents the scale, the second circle represents G-C skew, the third circle represents GC content, and the fourth circle represents the arrangement of PCGs, tRNA genes, and rRNA genes on the genome.

Table 1 Characterisation of the mt genome of W. compar.
Table 2 Base composition of the complete mt genome, PCGs, and rRNA genes of W. compar.

Gene annotation and analysis

The mt genome of W. compar encodes 12 PCGs, which contain 3420 codons with a total length of 10,260 bp. The content of A + T was 77.5%, which was far higher than G + C. The nucleotides in metazoan mt genomes are not randomly distributed, and this nucleotide bias is often linked to the unequal usage of codons45. The nucleotide usage of the 12 PCGs in the mt genome of W. compar is shown in Table 3. The codons TTT (phenylalanine, 19.2%), TTA (leucine-2, 8.2%), ATT (isoleucine, 7.5%) usage were most frequent. Therefore, the nucleotides in PCGs were biased towards A and T. Most of the PCGs had the start codon TTG, except for cox 1 that uses ATA, and the other three genes (nad 1, nad 4 and cox 2) that use ATT (Table 1). Three types of stop codons were used: TAG (nad 1, cob and nad 5), TAA (nad 6, nad 4 and nad 4l) and an abbreviated stop codon T (cox 1, atp 6, nad 2, cox 3, nad 3 and cox 2), which is an incomplete TAA stop codon and is completed by the addition of 3` A residues to the mRNA (Table 1).

Table 3 Nucleotide codon usage of the 12 PCGs of the mt genome of W. compar.

W. compar mt DNA contains 22 tRNA genes, which range from 53 bp (trnS1) to 64 bp (trnI and trnT). The two rRNA genes, rrnL and rrnS, were 959 bp and 732 bp in size, respectively (Table 1). rrnL is located between trnC and trnM, and rrnS is situated between trnH and trnA. The A + T content of rrnL and rrnS was 80.9% and 76.5%, respectively (Table 2).

The NCR, located between trnS2 and trnN, was 530 bp in length with a higher A + T content (97.2%) than of any other region of the mt genome.

Gene rearrangement in Oxyuridomorpha

The Oxyuridomorpha infraorder comprises seven families: Thelastomatoidae, Travassosinematidae, Hystrignathidae, Protrelloididae, Oxyuridae, Pharyngodonidae, and Heteroxynematidae5. To date, the mt genomes of many Oxyuridomorpha nematode infraorder lineages are still underrepresented or not represented, except for those of Oxyuridae and Heteroxynematidae. Oxyuridae includes seven species (including W. compar) and Heteroxynematidae, and we thus compared the gene rearrangement between the seven species of Oxyuridomorpha.

Gene rearrangement is mainly caused by mutations in the mt46, and for which the TDRL model is possibly the most widely accepted explanation hypothesis47,48 The mt gene arrangements in the seven species were not the same (Fig. 2), with W. compar being consistent with W. siamensis and S. obvelata but different from Oxyuris equi, Enterobius vermicularis, Aspiculuris tetraptera, and Passalurus ambiguus. The main difference was the occurrence of a transposition event (a position change of trnI for O. equi and E. vermicularis; a position change of trnY for A. tetraptera) or the number of NCR. trnI was inserted between NCR and trnN, and trnY was inserted between trnQ and trnC. P. ambiguus has two NCRs, consistent with other nematodes44,49,50,51, and an extra NCR in P. ambiguus was found between trnA and trnS2. Based on the order of gene arrangement, it can be inferred that W. compare has a closer evolutionary relationship with W. siamensis and S. obvelata.

Figure 2
figure 2

Gene orders in the seven species in Oxyuridomorpha. All genes are transcribed in the same direction on the N-strand.

Phylogenetic analyses and genetic distance

Pairwise genetic distances in the phylogenetic tree are displayed in Data S1 of the Supplementary Material, where we find that the genetic distances between species are smaller within the same superfamily than the distances with species from other families. In Oxyuridomorpha, the genetic distance between W. compar and W. siamensis was 0.0805, which was lower than that between W. compar and other pinworms. Phylogenetic analyses of W. compar and the selected Spirurina nematodes were performed using maximum likelihood (ML) and Bayesian inference (BI) methods based on concatenated mitochondrial CDS of 12 PCGs (BI/ML [Fig. 3]). The mt genome sequences may provide reliable genetic markers for examining the taxonomic status of nematodes, particularly when PCG sequences are used as markers for comparative analyses15,16,17,18,19,20,21. Because some superfamilies were represented by a single species, this topology should be interpreted with caution. Phylogenetic analysis showed that the BI and ML trees both divided Spirurina into two clades: Spiruromorpha formed one separate clade, and Oxyuridomorpha + Rhigonematomorpha + Gnathostomatomorpha + Dracunculoidea + Ascaridomorpha formed another clade. The second clade was further sub-divided into two clades, Rhigonematomorpha + Gnathostomatomorpha + Dracunculoidea + Ascaridomorpha and Oxyuridomorpha. These results are consistent with a recent study in Spirurina47.

Figure 3
figure 3

A phylogenetic tree of Spirurina inferred by BI and ML using Trichuris suis as the outgroup. Bayesian posterior probability (left) and bootstrap support (right) values of each clade are displayed. Wellcomia compar has been bolded. Inconsistent branches in ML were listed separately(right).

Within Spiruromorpha, many nodes were well-supported. Of note, our results indicate that Tetrameres grusi is a sister to Spirocerca lupi. Although S. lupi and Thelazia callipaeda belong to the superfamily Thelazioidea, T. callipaeda did not cluster together with S. lupi, but instead showed an early diverging position to S. lupi. These results are identical to those previously reported50,52. Within Ascaridomorpha, the superfamily Ascaridoidea forms a well-supported clade, whereas Rhigonematoidea and Heterakoidea form a sister branch53,54, and they are the sister group of Seuratoidea. Notably, Ascaridoidea, Heterakoidea, and Seuratoidea belong to the common infraorder: Ascaridomorpha. The topology of this clade indicated that Rhigonematoidea is closely related to the superfamily Ascaridomorpha. In many early morphology-based classification systems, Rhigonematomorpha was considered to be related to Oxyuridomorpha (pinworms)55. Subsequent research based on 18S rRNA phylogeny separated Rhigonematomorpha from Oxyuridomorpha as a distinct group6. Our results are inconsistent with those of previous research, and we suggest that it may belong to Ascaridomorpha.

Our results show that Dracunculoidea is more closely related to Ascaridomorpha than Spiruromorpha, which is in agreement with recent studies42. In our study, Gnathostomatomorpha forms a sister branch with Rhigonematoidea + Ascaridomorpha, which is consistent with previous studies47,51,56. However, previous studies have also suggested that Gnathostomatomorpha is nested within Ascaridomorpha57, or that Gnathostomatomorpha forms a sister branch with Ascaridomorpha + Oxyuridomorpha + Dracunculoidea + Spiruromorpha58.

The phylogenetic trees derived from two analytical methods exhibited a high degree of similarity in their topological structure. Notably, the branching patterns of the trees within the Gnathostomatomorpha clade displayed confuseing characteristics. Furthermore, dissimilar topologies were observed in the Oxyuridomorpha clade. Our results revealed that W. compar was sister to W. siamensis, with high statistical support (Bayesian posterior probability = 1.00; bootstrap support = 100). In Oxyuridomorpha, W. compar and W. siamensis are clustered together and have closer relationships with S. obvelata than with E. vermicularis in both trees, which is further supported by the comparison of gene orders and the genetic distance (PCGs) in this study. In previous research, it was observed that A. tetraptera and S. obvelata exhibited clustering, while P. ambiguus was identified as the sister species to O. equi18. Another study confirmed that W. siamensis was the sister species to P. ambiguus, as mentioned in a prior investigation19. Additionally, a recent study47 found P. ambiuguus to be sister to A. tetraptera. However, our results differ from these findings, possibly due to the limited availability of the complete mt genome of pinworms. In the present study, preference was given to BI tree classification owing to its higher support values. Despite pinworms are nematodes of human and animal health significance, the sequencing and publication of mitochondrial genomes have been limited to only seven species. Consequently, there is a need to increase taxon sampling in order to facilitate future phylogenetic investigations of this infraorder using mt genomic datasets.

The systematics of the suborder Spiruromorpha (Spiruromorpha, Oxyuridomorpha, Rhigonematoidea, Gnathostomatomorpha and Ascaridomorpha) has been controversial for decades. However, sequence analysis based on mt genomes presents roughly similar phylogenetic trees, which differ from those constructed based on 18S rRNA and single mitochondrial genes. To date, phylogenetic analysis based on 18S rRNA genes has accumulated more than 2700 sequences59, covering many but not all nematode taxa. However, it lacks the resolution needed to elucidate the phylogenetic relationships of the entire nematode tree, as different taxa exhibit different gene mutation rates, meaning that any single marker is unlikely to perform equally well in inferring the phylogeny of all nematode taxa. A better strategy for analysing nematode phylogeny involves targeting multiple genes rather than one gene. A large number of genes can help offset the adverse effects of recalcitrant genes in the analysis60,61. An analysis based on mt genomes fits this trend better, perhaps representing a better future for systematics.

Conclusions

The present study identified the complete mt genome sequence of W. compar, enriching the mt genome database of Oxyuridae. This is the first complete mt genome isolated from the body of Chinese porcupines and belongs to the same Wellcomia genus as W. siamensis previously isolated from Korea. We have compared the results with the mt genome of other Oxyuridomorpha pinworms. The gene sequence arrangement results show that W. compar is more closely related to W. siamensis and S.obvelata. Based on the mt genome sequence, we have determined the phylogenetic relationships within Spirurina, and the phylogenetic analysis indicates that W. compar is the sister group of W. siamensis, which is consistent with the gene arrangement results. In the phylogenetic analysis, we found that the species support of Oxyuridomorpha was low and inconsistent. Therefore, we believe that it is necessary to further investigate additional taxonomic groups and sequencing data to understand and document the evolution of this important group. This mt genome provides a new genetic marker for studying the molecular epidemiology, population genetics, and systematics of parasitic pinworms in animals and humans, and is of great significance for the diagnosis, prevention, and control of parasitic diseases in porcupines and other animals.