Gnathostoma spinigerum Mitochondrial Genome Sequence: a Novel Gene Arrangement and its Phylogenetic Position within the Class Chromadorea

Human gnathostomiasis is an emerging food-borne parasitic disease caused by nematodes in the genus Gnathostoma. In spite of their significance as pathogens, these parasites remain poorly understood at the molecular level. In the present study, we sequenced the mitochondrial (mt) genome of G. spinigerum, which infects a range of definitive hosts including dogs, cats, tigers, leopards and humans. The mt genome of G. spinigerum is 14,079 bp in size and shows substantial changes in gene order compared to other nematodes studied to date. Phylogenetic analyses of mt genome sequences by Bayesian inference (BI) revealed that the infraorder Gnathostomatomorpha (represented by G. spinigerum) is closely related to the infraorder Ascaridomorpha. G. spinigerum is the first species from the infraorder Gnathostomatomorpha for which a complete mt genome has been sequenced. The new data will help understand the evolution, population genetics and systematics of this medically important group of parasites.

Oxyuridomorpha 10 . The phylogenetic relationships among the infraorders of the Spirurina have been assessed using nuclear small subunit (SSU) rRNA (five infraorders) gene and mitochondrial (mt) gene/ genome sequences (four infraorders) and there are inconsistencies between the nuclear phylogeny and the mt phylogeny [11][12][13][14][15][16][17][18] . In the phylogeny inferred from SSU rRNA gene sequences, the Ascaridomorpha is sister to the Rhigonematomorpha, and the Spiruromorpha is sister to the Ascaridomorpha + Rhigon ematomorpha, the Oxyuridomorpha is sister to the Spiruromorpha + Ascaridomorpha + Rhigonemato morpha, and the Gnathostomatomorpha is sister to the Oxyuridomorpha + Spiruromorpha + Ascarido morpha + Rhigonematomorpha 12,19 . In the mt gene/genome phylogeny, however, the Ascaridomorpha is sister to the Rhabditomorpha + Diplogasteromorpha in most analyses, and the Spiruromorpha is sister to the Rhabditomorpha + Diplogasteromorpha + Ascaridomorpha + Rhigonematomorpha + Panagrolaimo morpha + Tylenchomorpha + Oxyuridomorpha [15][16][17] . More recently, Kim et al. 20 inferred the phylogeny with mt genome sequences and showed that the Rhigonematomorpha is sister to the Ascaridomorpha. Taxon sampling was limited in both the nuclear SSU rRNA and the mt gene/genome phylogenetic analysis; furthermore, no species from the Gnathostomatomorpha was included in any of these mt analyses.
Animal mt genomes are typically a circular DNA, 15-20 kb in size, containing 36-37 genes: 12-13 protein-coding genes (PCGs), 22 transfer RNAs (tRNA) and two ribosomal RNA (rRNA) genes 21,22 . Mt genome sequences are commonly used for phylogentic, population genetic and taxonomic investigations of animals 23,24 . To understand the phylogenetic relationship of the infraorder Gnathostomatomorpha with other infraorders of the class Chromadorea, we sequenced the mt genome of G. spinigerum.

Results and Discussion
General features of the mt genome of G. spinigerum. The complete mt genome of G. spinigerum (GenBank accession no. KP410547) was 14,079 bp in size (Fig. 1). This genome contains 12 PCGs (cox1-3, nad1-6, nad4L, atp6 and cytb), 22 tRNA genes, two rRNA genes (rrnL and rrnS) and two non-coding (AT-rich) regions. All genes are transcribed in the same direction. As in most other nematodes of the class Chromadorea, atp8 gene is not present in the mt genome of G. spinigerum (Table 1). The mt genome sequence of G. spinigerum is biased toward A and T (71.1%), similar to that of other nematodes in the suborder Spirurina [25][26][27][28] . This nucleotide composition of the 12 PCGs of G. spinigerum was strongly skewed away from A, in favour of T (AT skew between − 0.58 and -0.27), and the GC skew was between 0.41 and 0.77 (Table 2). Codons composed of A and T were more frequently used in PCGs, reflecting the high A + T content in the mt genome of G. spinigerum. The most frequently used amino acid was TTT (Phe; 13.14%), followed by TTG (Leu; 8.79%), ATT (IIe; 5.96%) and GTT (Val; 5.46%) (Table 3). ATA, TTG and ATT were used as initiation codons and TAA and TAG as termination codons; incomplete All genes are on the same DNA strand and are transcribed clockwise. Protein-coding and rRNA genes are indicated with the standard nomenclature. tRNA genes are indicated with the one-letter code of their corresponding amino acids. There are two tRNA genes for leucine: L 1 for codons CUN and L 2 for UUR; and two tRNA genes for serine: S 1 for codons AGN and S 2 for UCN. "NCL" refers to the large non-coding region. "NCR" refers to a small noncoding region.
Scientific RepoRts | 5:12691 | DOi: 10.1038/srep12691 termination codons (T or TA) were also identified ( Table 1), which is consistent with the arrangement in the mt genomes of other nematodes [29][30][31] . Twenty-two tRNA genes were identified in the mt genome of G. spinigerum, which range from 54 to 68 bp in size. The secondary structures inferred for the 22 tRNAs of G. spinigerum are similar to those of other nematodes [29][30][31] (Fig. 2). rrnL is located between trnH and nad3 in the G. spinigerum mt genome; rrnS is between trnE and trnS 2 ( Table 1). The two non-coding regions in the mt genome of G. spinigerum were located between trnI and trnN (designated NCL, 750 bp), and between nad1 and atp6 (designated NCR, 77 bp) respectively. No repetitive sequences were detected in the non-coding regions of G. spinigerum, as in other Spirurina nematodes 26,27 . Gene arrangement in the mt genome of G. spinigerum. The 32,33 ). In this study, we compared the 30 gene arrangement patterns in nematodes, and found that the gene order for all of the protein-coding and rRNA genes were principally conserved, but tRNA genes/clusters were not conserved in nematodes. Furthermore, mt gene arrangement events among the 30 patterns observed in nematodes were analyzed with CREx 34 ; numerous transposition and tandem-duplication-random-loss (TDRL) events could be inferred. Our results support the view that the evolution of mt gene order in nematodes was mostly driven by transposition and TDRL 14 ; inversion and reverse-transposition played a minor role relatively. Compared to the most common pattern of mt gene arrangement seen in nematodes (i.e. GA3) 17 , a block of 12 genes (from trnV to trnK, Fig. 3) in G. spinigerum has been broken and moved to four locations. Seven genes (trnV, nad6, nad4L, trnW, trnE, rrnS and trnS 2 ) between trnP and trnN in GA3 pattern, was found between apt6 and cox1 in GA26 pattern in G. spinigerum. trnN was located between trnS 2 and trnY in GA3 pattern but was between trnI and trnR in GA26 pattern. Three genes (trnY, nad1 and atp6) between trnN and trnK in GA3 pattern were located between nad4 and trnV in GA26 pattern. Additionally, trnK, which was between atp6 and trnL 2 in GA3 pattern, was located between trnC and trnM in GA26 pattern. CREx analysis modeled that one transposition and two TDRL events were required to convert GA3 to GA26. Two rearrangement events involved PCGs while the other two events involved only tRNA genes.
Phylogenetic analyses. We inferred the phylogenetic relationship between G. spinigerum and other 57 species of Chromadorea nematodes with concatenated amino acid sequences of the 12 mt PCGs (Fig. 4). Phylogenies of the Chromadorea nematodes were inferred with mt genome sequences in previous studies [14][15][16][17][18] ; however, several major lineages including the infraorder Gnathostomatomorpha were not represented.
Our Bayesian analysis showed that G. spinigerum was most closely related to Cucullanus robustus with moderate support [Bayesian posterior probabilities (Bpp) = 0.88, Fig. 4]. Our maximum likelihood (ML) and maximum parsimony (MP) analyses also recovered this relationship but the bootstrapping frequency (Bf) was weak (not shown). This grouping was inconsistent with those from morphological and molecular studies 12,19,[35][36][37] . The infraorder Ascaridomorpha was monophyletic in the present study (Bpp = 0.67, Fig. 4). The two species from the two families, Cucullanidae and Ascaridiidae, of the infraorder Ascaridomorpha were more closely related to G. spinigerum (infraorder Gnathostomatomorpha) and R. thysanophora (infraorder Rhigonematomorpha) than they were to other six species from the infraorder Ascaridomorpha. R. thysanophora was most closely related to Ascaridia galli (Bpp = 0.67). A recent study based on mt genome sequences also indicated that R. thysanophora was most closely related to an Ascaridia species 20 . Using nuclear SSU rRNA gene sequences, Meldal et al. 12 showed that the infraorder Ascaridomorpha was sister to the group that included the infraorders Ascaridomorpha, Rhabditomorpha, Spiruromorpha, Oxyuridomorpha, and Gnathostomatomorpha. These controversial results surrounding phylogenetic placement of members in Spirurina may reflect the different evolutionary rates of the nuclear and mt genomes 38,39 .
Our analysis also showed that the infraorder Rhabditomorpha was paraphyletic with respect to the Diplogasteromorpha. Two species from the families Rhabditidae and Heterorhabditidae of the Rhabditomorpha were more closely related to Pristionchus pacificus (Neodiplogasteridae) than they  Table 3. Codon usage of Gnathostoma spinigerum mitochondrial protein-coding genes. were to the other 22 species from the Rhabditomorpha. The close relationship between the species of the families Rhabditidae and Heterorhabditidae and P. pacificus was strongly supported in BI (Bpp = 1, Fig. 4). The results were consistent with that of a previous study using nuclear SSU rRNA gene dataset 11 . The Oxyuridomorpha (2 species) and the Spiruromorpha (12 species) were both monophyletic with strong support in the present analysis (Bpp = 1, Fig. 4). The nine species of the suborder Tylenchina included in this study were from two infraorders: Panagrolaimorpha (2 species), and Tylenchomorpha (7 species). Both of these infraorders were paraphyletic in the present analysis (Bpp ≥ 0.67, Fig. 4). The Oxyuridomorpha and the Spiruromorpha were both monophyletic with strong support in the current analyses (Bpp = 1 for Oxyuridomorpha and Bpp ≥ 0.79 for Spiruromorpha, Fig. 4). For decades, there have been controversies surrounding the systematics of the suborder Spirurina (infraorders Ascaridomorpha, Spiruromorpha, Rhigonematomorpha, Gnathostomatomorpha and Oxyuridomorpha) 10 . Given the demonstrated utility of mt datasets, there is now an opportunity to test the phylogenetic relationships of a wide range of Spirurina nematodes using expanded mt datasets. Analyses of mt genome sequences in the current study and several previous studies [14][15][16][17][18] have provided insights into the phylogenetic relationships among major lineages of the Spirurina nematodes. However, some lineages of the suborder Spirurina are underrepresented or not represented in these analyses. So, more mt genome data from the suborder Spirurina would be required in future analyses to understand the phylogeny of the suborder Spirurina.
In summary, this is the first determination of a complete mt genome of a parasite belonging to the infraorder Gnathostomatomorpha. Although the length, gene and AT content are similar to other nematode mt genomes, the mt genome of G. spinigerum exhibits some interesting features. The gene order of G. spinigerum is distinct from that of other nematodes. Phylogenetic analysis shows that G. spinigerum was most closely related to Cucullanus robustus with moderate support, which is inconsistent with that from morphological and molecular studies. Our results provided insights into the phylogenetic relationships among several major lineages of nematodes.

Methods
Ethics statement. Specimens of G. spinigerum were collected from an Asian swamp eel, in accordance with the animal ethics procedures and guidelines China. All experimental protocols were approved by the Animal Ethics Committee of Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences.

Collection of G. spinigerum and DNA isolation.
Larval specimens of G. spinigerum were collected from an infected Asian swamp eel, Monopterus albus, imported from Indonesia, and were identified to species morphologically 40 . The specimens were fixed in ethanol and stored at -20 °C until use. Total genomic DNA was isolated from individual worms using small-scale sodium dodecyl-sulphate (SDS)/proteinase K digestion and spin-column purification (Wizard ® SV Genomic DNA Purification System, Promega). The identity of G. spinigerum specimens (coded GS5) was also verified by sequencing regions of ITS-2 and cox1 genes 41 ; both regions had 100% similarity with those of G. spinigerum from Thailand and Indonesia (GenBank accession Nos. AB181155 and JN408304).
Long-PCR amplification and sequencing. Fragments of cox1 and nad1 genes of G. spinigerum were amplified by PCR with primer pairs JB3-JB4.5 42 and JB11-JB12 43 (Table 4). After we obtained partial cox1 and nad1 sequences for G. spinigerum, we designed specific primers from these fragments for long PCR amplification. The complete mt genome of G. spinigerum (coded GS5) was amplified by long-PCR as two segments (~10 kb and ~4 kb) using genomic DNA extracted from a single specimen; the gaps between the two segments were filled by the short cox1 and nad1 fragments amplified initially. PCR was conducted in 25 μ l using 2 mM MgCl 2 , 0.2 mM each of dNTPs, 2.5 μ l 10 × Taq buffer, 2.5 μ M of each primer and 0.5 μ l LA Taq DNA polymerase (5 U/μ l, Takara) in a thermocycler (Biometra). The cycling conditions were: 92 °C for 2 min (initial denaturation), then 92 °C for 10 s (denaturation), 56 °C (10 kb) or 54 °C (4 kb) for 30 s (annealing) and 60 °C for 10 min (extension) for 10 cycles, followed by 92 °C for 10 s, 56 °C (~10 kb) or 54 °C (~4 kb) for 30 s (annealing), and 60 °C for 10 min for 20 cycles, with a cycle elongation of 10 s for each cycle and a final extension at 60 °C for 10 min. PCR products were sequenced at Sangon Company (Shanghai, China) using a primer walking strategy 44 .
Sequence analyses. Sequences obtained from the PCR amplicons of G. spinigerum were assembled manually and aligned with the mt genome sequences of roundworm and pinworm (GenBank accession numbers: NC_016128 and NC_011300) using the program MAFFT 7.122 45 to identify gene boundaries. The sequence of each protein-coding gene was translated into amino acid sequence using the invertebrate mt genetic code in MEGA 5 46 ; the amino acid sequences were aligned using default settings. tRNAscan-SE 47 and ARWEN 48 were used to identify all of the tRNA genes except trnS 2 , which was identified manually by sequence comparison with trnS 2 of other nematodes reported previously 49 . The two rRNA genes were identified by BLAST searches and were verified by sequence comparison with these two genes of other nematodes reported previously 14 . Tandem repeats in the non-coding regions were found using Tandem Repeat Finder program (http://tandem. bu.edu/trf/trf.html) 50 . The rearrangement events in the mt genomes were modelled with CREx (http://pacosy.informatik.uni-leipzig.de/crex) 34 .

Phylogenetic analyses.
We combined the mt genome sequence of G. spinigerum with those of selected 57 other Chromadorea nematodes (Table S1) retrieved from the GenBank for phylogenetic analysis; Trichuris suis (GenBank accession number GU070737) was used as the outgroup 51 . Amino acid sequences inferred from the sequences of 12 mt PCGs were aligned individually first using MAFFT 7.122 and were then concatenated to form a single dataset; ambiguously aligned regions were excluded using Gblocks 0.91b (doc) 52 with the default parameters (allow smaller final blocks, allow gap positions within the final blocks and allow less strict flanking positions). Phylogenetic analyses were conducted using Bayesian inference (BI). The MtArt + I + G + F model of amino acid evolution was selected as the most suitable model of evolution by ProtTest 2.4 53 based on the Akaike information criterion (AIC). As MtArt model is not implemented in the current version of MrBayes, an alternative model, CpREV, was used in BI and four chains (three heated and one cold) were run simultaneously for the Monte Carlo Markov Chain. Two independent runs for 2,000,000 metropolis-coupled MCMC generations, sampling a tree every 100 generations in MrBayes 3.1.1 54 ; the first 5,000 trees represented burn-in and the remaining trees were used to calculate Bpp. Bayesian analysis was run until the potential scale reduction factor approached 1 and the average standard deviation of split frequencies was less than 0.01. All sites were coded as unordered and equally weighted characters. The topology was reconstructed using the 50% majority rule and the support values were assessed by 1000 bootstrap replicates. Phylograms were drawn using the program FigTree v.1.4 (http://tree.bio.ed.ac.uk/software/figtree).