Mitochondrial genome diversity in dagger and needle nematodes (Nematoda: Longidoridae)

Dagger and needle nematodes included in the family Longidoridae (viz. Longidorus, Paralongidorus, and Xiphinema) are highly polyphagous plant-parasitic nematodes in wild and cultivated plants and some of them are plant-virus vectors (nepovirus). The mitochondrial (mt) genomes of the dagger and needle nematodes, Xiphinema rivesi, Xiphinema pachtaicum, Longidorus vineacola and Paralongidorus litoralis were sequenced in this study. The four circular mt genomes have an estimated size of 12.6, 12.5, 13.5 and 12.7 kb, respectively. Up to date, the mt genome of X. pachtaicum is the smallest genome found in Nematoda. The four mt genomes contain 12 protein-coding genes (viz. cox1-3, nad1-6, nad4L, atp6 and cob) and two ribosomal RNA genes (rrnL and rrnS), but the atp8 gene was not detected. These mt genomes showed a gene arrangement very different within the Longidoridae species sequenced, with the exception of very closely related species (X. americanum and X. rivesi). The sizes of non-coding regions in the Longidoridae nematodes were very small and were present in a few places in the mt genome. Phylogenetic analysis of all coding genes showed a closer relationship between Longidorus and Paralongidorus and different phylogenetic possibilities for the three Xiphinema species.

The nucleotide composition of the mtDNA genomes studied showed an A + T content similar among dagger nematode species (66.50%, 68.86% and 68.50% for X. americanum, X. rivesi and X. pachtaicum, respectively), but slightly lower in needle nematode species (L. vineacola and P. litoralis, 63.64% and 63.89%, respectively) ( Table 1). These levels were lower than that for other members of the class Enoplea such as Romanomermis culicivorax (79.34%) or Hexamermis agrotis (78.42%). The GT-rich sequences in Xiphinema species were 54.00% and 56.63%  (G + T) in one of the strands for X. rivesi (GT rich strand not containing the cox1 gene) and X. pachtaicum (strand containing the cox1 gene), respectively; while, for Longidorus and Paralongidorus these differences were minimal (50.51% and 50.75%, respectively). These differences influence the coding genes, rRNA and tRNA distribution in the genome. In X. rivesi, the GT rich strand had sense sequences of 8 protein coding genes (PCGs), 12 tRNAs and the 2 rRNA genes whereas for X. pachtaicum 10 PCGs, 12 tRNAs and the 2 rRNA genes were detected. These differences were minimal for Longidorus and Paralongidorus between AC-rich vs GT-rich strands with 6 vs 6 PCGs, 10 vs 11 tRNAs and 1 rRNA gene in each of the strands in the case of P. litoralis (GT-rich strand containing the cox1 gene) and 5 vs 7 PCGs proteins, 14 vs 8 tRNAs, and the 2 vs 0 rRNA genes in the case of L. vineacola (GT-rich strand containing the cox1 gene). The mtDNA genomes of L. vineacola, P. litoralis, X. pachtaicum and X. rivesi contained 12 PCGs, viz. cox1-3, nad1-6, nad4L, atp6 and cob; and two ribosomal RNA genes (rrnL and rrnS) but the atp8 gene was not detected ( Fig. 1; Table 2). The gene arrangement within Longidoridae was very different within dagger and needle nematode species (Fig. 1), with the exception of X. americanum and X. rivesi, in which it was identical. This is in concordance with the very high degree of variation in mtDNA genome gene arrangements across the Metazoa 22 . A comparison of closely related species with different gene orders suggests that there are several types of "elementary" rearrangement events 5 : inversions, transpositions, inverse transpositions (i.e. a transposition in which the re-inserted fragment is inverted), and tandem duplications followed by the random loss of one of the copied genes.
Protein encoding genes and codon usage. Protein encoding genes were transcribed from both strands in the four mt genomes sequenced. Genes nad4L and nad3 were always together and separated by a non-coding region in all the studied species (Fig. 1). The gene order of two genes in nad5-nad6, atp6-nad4 and nad1-cox1 were conserved between X. americanum/X. rivesi vs X. pachtaicum. On the other hand, the gene order between Longidorus and Paralongidorus species was kept in two regions: cob-nad4L-nad3-cox1 and cox2-cox3-nad2. Only the gene associations of nad5-nad6 (inverted gene sense in L. vineacola) and atp6-nad4 were kept between Paralongidorus and Xiphinema species. We could not find coincidences for arrangement of PCGs between the genus Longidorus and Xiphinema. For instance, the association between nad5-nad6 and atp6-nad4 seem derived from an inversion between Longidorus and Paralongidorus.
All PCGs shared an ATA start codon. The PCGs all terminated with a potential TAA stop codon with the exception of cox3 and nad3 for X. rivesi; cox1, cox3, nad4, nad4L, nad6 and cob for X. pachtaicum, and nad1 and nad3 for P. litoralis (Table 2). Some genes in all the nematode species studied partially overlapped and probably terminated with T/TA (Table 2). These features were not conserved in these genes among the studied species. Only the cox2 termination codon was conserved in X. americanum, X. rivesi, X. pachtaicum and L. vineacola. However, they were partially conserved between X. americanum and X. rivesi for cox1 and atp6 genes. Additionally to these termination codons, some genes overlap several tRNA codons and a few bases (1 or 2) with other genes ( Table 2). Coding gene overlapping seems to be a common feature in Longidoridae, since the five sequenced species showed this feature in their genomes. Overlaps were detected in the same or in the opposite direction. However, gene overlap in the same sense strand was not detected in P. litoralis. In X. americanum, X. rivesi and X. pachtaicum a 1 bp overlap was detected between nad2 and cox2 in the same sense strand, X. pachtaicum has a 4 bp overlap between nad4 and cox3, and L. vineacola has a 1 bp overlap between cox2 and cox3 in the same sense strand in both cases. In the case of overlapping coding genes in humans (atp8/atp6 and nad4L/nad4) both  pairs of genes result in dicistronic messengers 23 . In Ascidian species, two cases of overlapped genes, expressed as transcripts with no poly(A) tails downstream of the first open reading frame, suggested the presence of mature bicistronic transcripts 24 . The majority of genes kept the transcription sense with 2 or more consecutive genes, but there are exceptions where unique genes have a different sense to other genes in the mitochondria, as cox1 in the case of X. americanum and X. rivesi, and nad1 in the case of L. vineacola and P. litoralis. It is suggested that gene arrangement is affected by the transcription mechanism, for example by the need to co-regulate the expression of some genes or the required stoichiometry of the gene products 22 .

Protein genes
As in other published nematode mtDNAs, the mtPCGs of our sequenced species were notably biased toward using amino acids encoded by T-rich codons 13 (Table 3). The three most frequently used codons were TTT, TTA and ATT (Table 3) in Xiphinema species. These T-rich codons accounted for 21.97% (X. americanum), 24.37% (X. rivesi) and 23.68% (X. pachtaicum) of the total codons used. In the case of Longidorus and Paralongidorus the most used triplets were TTA, TTT and ATA. These codons account for 18.61% (P. litoralis) and 17.01% (L. vineacola) of the total codons used. However, these differences in T-rich codons were smaller in comparison to other Chromadorean nematodes such as B. xylophilus or P. vulnus, which account for 40.30% and 31.44% of the total codons used, respectively, while our range of the most used codons was similar to other Enoplean nematodes such as Trichinella or Trichuris species 25,26 . The higher frequency of amino acids encoded by T-rich codons, and unequal synonymous codon usage with bias against C-rich codons is consistent with the high percentage of A + T content in the nucleotide composition of PCGs as in other mt genomes for nematodes 13 .

AA Codon
No. * % ***  Transfer RNA (tRNA) and ribosomal RNA genes. The mt genome of metazoans typically encodes 22 tRNAs 8 . However, only L. vineacola showed the typical 22 tRNAs, whereas 21 tRNAs were identified in X. rivesi and in P. litoralis in which we could not identify the mt-tRNA N gene, and 19 tRNAs were identified in X. pachtaicum in which we could not identify the mt-tRNA N , mt-tRNA A and mt-tRNA R genes. The position predicted for mt-tRNA N in mt genomes of X. rivesi and P. litoralis based on sequence similarity in the genes predicted by Jühling et al. was deep inside the l-rRNA annotation and in the same sense strand 27 . The annotation of our l-rRNA was based on similarity to sequences deposited in GenBank and mainly with the X. americanum l-rRNA annotation done by He et al. using mRNA sequencing 18 . The same situation was also found in X. pachtaicum for mt-tRNA N where additionally, the other two mt-RNAs, mt-tRNA A and mt-tRNA R , were not found. These tRNAs would therefore need to be imported from the nucleus, implying that a mechanism of this sort exists in nematodes. Import of tRNA from the nucleus to the mitochondrion has been demonstrated in marsupials 28 and in a protozoan (Trypanosoma bruceii) 29 . Another possibility is that the tRNA detection methods used in this study (Mitos-MITFI and Arwen v1.2) could not identify them. The tRNA structures detected in the four studied mtDNA genomes shared some features with those of other metazoans 30 including a 5 bp anticodon stem, a 7 base anticodon loop with a T always preceding an anticodon as well as a purine always following an anticodon (Figs S1-S3). Four secondary structures have been found in Longidoridae (Table 4; Figs S1-S3): (i) typical nematode tRNAs structure with a D-arm but no T-arm; (ii) structure retaining their T-arm and lacking the D-arm; (iii) structures lacking both the T-arm and the D-arm; and (iv) the intact clover-leaf structure is also present. The conventional cloverleaf structure it is also present in the mt-tRNAs of Trichuris ovis, Trichuris discolor 26 and Trichinella spiralis 31 . All of the tRNAs with a clover-leaf structure found in the species included in this study were coincident with those found in T. spiralis and only the tRNA Y which showed a clover-leaf structure in X. rivesi, does not appear with this secondary structure in T. spiralis. These diversity of structures in our mt tRNAs support the hypothesis expressed by Jühling et al., that mt-tRNA not only evolve rapidly at the sequence level but also exhibit a variety of deviations from the common clover-leaf structure 27 .
Non-coding regions. The sizes of the non-coding regions in Longidoridae nematodes in this study were very small and in few places in the mt genome (Table 5). In the mt genome of X. americanum, the longest noncoding region was just 96 bp in length between the nad3 and the nad4L genes with an A + T content of 72%, and inverted or direct repeats were not present 32 . A sequence motif (5′ -GAGACCTGAGCCCAAGATA-3′ ) was present in this 96-bp noncoding region for X. americanum 32 that was similar to the conserved promoter element sequence (5-CA(G)ACC(G)CC(A)AAAGATA-3) around the transcription start site in the D-loop region of the human mt genome 33 . We could not find a clear similarity in our non-coding sequences to this promoter element sequence. However, the position of a non-coding region and a stable gene arrangement between nad3 and  nad4l within all the studied Longidoridae nematodes pointed out the importance of this region in the viability of the nematode mitochondria. This feature, the strong secondary structure (Figs 2 and S4) and the location of the Control Region (CR) for X. americanum were the basis for annotating this sequence as CR. The CRs were of different size and composition in comparison to the A + T rich sequence found in X. americanum. Sizes of     We could not find a good explanation for these ratios, but in the case of P. litoralis and L. vineacola, this region was accompanied with strong secondary structures and a similar number of coding genes in both strands. Probably, this composition has strong effects on replication and transcription. Lewis et al. found that mtDNA synthesis in the C. elegans gonad produces branched-circular lariat structures with multimeric DNA tails; they were able to detect multimers up to four mtDNA genome unit lengths 34 . These authors have raised the possibility that the rolling circle mtDNA replication mechanism may be an ancestral trait among metazoans. However, C. elegans has two well-defined non-coding regions and with our data only one was detected and in a similar position in the studied Longidoridae species. Many species of Arthropoda, Nematoda, Mollusca and Annelida harbor palindromes and inverted repeats in their CRs 35 . Length of the CR in the nematode mtDNA genome could be variable because of the presence of repeated sequences 22 . We only found a partially conserved palindrome sequence in the CR of P. litoralis (5′ -TAGTAATAACTATTTTCAGTA-3′ ) applying the procedure explained in Arunkumar and Nagaraju 35 . Another interesting aspect of these mt genomes was in the number of short non-coding regions and in their distribution in different positions in the genome ranged from 1 to 6. We found important differences between the studied species (excluding CR) ( Table 5). The number of non-coding regions longer than 30 bp ranged from 1 to 6 non-coding regions. Non-coding regions longer than 50 bp were only found in L. vineacola and P. litoralis (4 and 1, respectively; Table 5). Secondary structures were drawn in Figs S5 and S6. Long non-coding regions showed strong structures, but they were located in different regions in L. vineacola and P. litoralis. Some non-coding regions were at the same position in all the studied species, as it is the case between tRNA N and tRNA Q , with a similar size (from 36 to 42 bp). This is coincident with the non-coding region having a strong arm structure (Figs S5 and S6). It is possible that these regions function as splicing recognition sites during processing of the transcripts. Mitochondrial genomes in Longidoridae have the important trait of genome size economy and in having fewer non-coding regions. These features were retained even with the important differences in gene arrangements between the studied sequences.
Mitochondrial phylogeny of Enoplea nematodes. We conducted BI and ML phylogenetic analyses of an amino acid sequence dataset (13 protein-coding genes) for 25 nematode species. Phylogenetic analysis separated four different clades and subclades including Trichinellidae, Trichuridae, Mermithidae and Longidoridae (Figs 3 and 4). In our analysis, Trichinellidae and Trichuridae formed a well-supported group which is sister to Mermithidae, and all these three groups formed a clade weakly supported in both analysis (BI and ML). This phylogenetic position is mainly consistent with recent reports based on mt genome analysis 11,36 , but those were only based on one sequence for Longidoridae (viz. X. americanum). Meanwhile, Longidoridae clade, with 5 representatives in our analysis, was a sister clade with the other groups in the Bayesian inference (BI) and Maximum likelihood (ML) analyses in the amino acid dataset (Figs 3 and 4). In Longidoridae, two well-supported subclades were formed, one of them with Longidorus, Paralongidorus and X. pachtaicum with Bayesian probability values (BPP) of 0.97 in the BI analysis, while the position of X. pachtaicum was not associated to any specific clade in the ML analysis, and another with the other Xiphinema species. Xiphinema americanum and X. rivesi were closely related phylogenetically and this is shown with the strong support of their relationship in both analyses (100% bootstrap support (BS) and 1.00 BPP), while the position of X. pachtaicum was well resolved in this clade by BI (1.00 BPP), but not by ML (65% BS). Nucleotide phylogenetic analysis showed a similar pattern of well-supported phylogenetic clades as in the aminoacid dataset forming two sister clades, one with Trichinellidae and Trichuridae and another with Mermithidae and Longidoridae in BI and ML analysis (Figs S7 and S8). However, this phylogenetic position with Longidoridae as a sister clade to Mermithidae was well-supported in BI and weakly supported in ML analysis. Longidoridae clade was formed by two subclades, one for Paralongidorus and Longidorus and the other for Xiphinema species. This clade was strongly supported in the BI analysis and weakly supported in the ML analysis. Xiphinema americanum and X. rivesi were closely related phylogenetically as in the case of amino acid dataset (100% BS and 1.00 BPP). This different position of the Longidoridae clade was shown by other researchers with only one sequence from Longidoridae (X. americanum) 11,36 . The results of Kim et al. 36 were similar to our phylogenetic analysis using a BI approach and using amino acid and nucleotides datasets, while the phylogenetic approach used by Humphreys-Pereira and Elling 11 was similar in the amino acid dataset and different in the nucleotide dataset excluding the third codon position. In order to assess whether the phylogenetic relationships recovered by the mt genome sequences were influenced by the use of an entirely maternal marker, we assessed the phylogeny of the Enoplea using an available nuclear marker, partial 18S rRNA. Enoplea partial 18S (Fig. S9) showed two main well-supported clades (100% BPP) for Enoplia and Dorylaimia. Longidoridae species were closely related phylogenetically and forming a well-supported clade (100% BPP) with Nordiidae, Qudsianematidae, Dorylaimidae and Aporcelaimidae. However, no member of these families has a complete mt genome currently available. The Longidoridae clade was formed by two subclades, the superior subclade is well-supported (BPP = 0.99) comprising four different genera, Longidorus, Paralongidorus, Xiphinema americanum group and Xiphidorus. Longidorus species were phylogenetically related with Paralongidorus species forming a well-supported clade (BPP = 1.00) and the Xiphinema americanum group which formed a sister-clade with Xiphidorus, however the BPP values for this sister-clade is moderately supported (BPP = 0.95). The second subclade is well-supported (BI = 1.00) and was formed by Xiphinema non-americanum group species. The relationships between these three groups were not well-defined in these analyses or in other phylogenetic analysis using other phylogenetic markers with more Longidoridae species 37 . Unfortunately, we could not obtain a complete mt genome sequence for a Xiphinema non-americanum group species, so this point could not be resolved. Taking into account the sequence evolution in both clades (Longidorus-Paralongidorus) and (Xiphinema-Xiphidorus) neither of these clades seems more ancient than the others. But, clearly there are parallels in gene arrangements, phylogenetic relationships and non-coding regions between Longidorus and Paralongidorus. Additionally, L. vineacola has the longest non-coding regions followed by P. litoralis. For this reason, we hypothesized a less evolved mt genome or different selection pressure in the evolution of these species. Another observation found in recent phylogenetic analysis, is the extreme diversity of some regions (cox1 gene) with Longidorus orientalis showing incongruence of phylogenies inferred from ITS1 rRNA and cox1 genes 38 while other species show low differences within the populations sampled (i.e. X. pachtaicum and X. index) 39 . The high variation observed in the cox1 priming sites in L. orientalis can adversely affect the certainty of the nematode identification by barcoding 38 and thus integrative taxonomical approaches are needed for an accurate identification of these and other plant-parasitic nematodes [40][41][42] . Our complete mt sequences for Xiphinema, Longidorus and Paralongidorus genera could help in resolving these problems by comparing sequences for primer design.
While mapping the gene arrangement with the species in this study, we found an important variability in Enoplea (Figs 3 and 4). Some genera such as Trichinella and Trichuris were homogenous with their PCG arrangements, however, the genus Romanomermis showed an important variability in its arrangement in a similar way to Longidoridae species. We could not find similarities in the PCG arrangement between the species studied, with the exception of very closely related species (X. americanum and X. rivesi).

Material and Methods
Samples and nematode extraction. Soil samples from which nematodes were extracted were collected in 2015 in Spain from several crops and wild habitats ( Table 6). Soil samples were collected with a shovel discarding the upper 5-cm topsoil profile, from a depth of 5-to 40-cm, in the close vicinity of active roots. Nematodes from the soil were extracted from a 500-cm 3 sub-sample using the magnesium sulphate centrifugal-flotation method 43 . Xiphinema pachtaicum, X. rivesi, L. vineacola, and P. litoralis were identified using integrative taxonomy as described in previous studies 37,42,44,45 . Only live and individual nematodes were used for DNA extraction. No pure populations were multiplied in pots in greenhouse and nematodes were extracted from original sampling points.
Mitochondrial DNA extraction and amplification. For the molecular analyses, in order to avoid complications from mixed populations in the same sample, at least two live nematodes from each sample were temporarily mounted in a drop of 1 M NaCl containing glass beads (to avoid crushing the nematode) and diagnostic morphological characters were observed and measurements were taken to confirm the species identity. The slides were then dismantled and DNA extracted. Nematode DNA was extracted from single individuals and PCR assays were conducted as described by Subbotin et al. 46 . A portion of the cox1 gene was amplified as described by Lazarova et al. using primers COIF (5′ -GATTTTTTGGKCATCCWGARG-3′ ) and COIR (5′ -CWACATAATAAGTATCATG-3′ ) 47 and PCR cycling conditions as described by He et al. 32 . PCR products were purified using ExoSAP-IT (Affmetrix, USB products) and used for direct sequencing in both directions. The resulting products were run on a DNA multicapillary sequencer (Model 3130XL genetic analyzer; Applied Biosystems, Foster City, CA, USA), using the BigDye Terminator Sequencing Kit v.3.1 (Applied Biosystems, Foster City, CA, USA), at the Stab Vida sequencing facilities (Caparica, Portugal).
For all mt DNA amplification, the DNA extraction protocol was similar to that described in Subbotin et al. 38 with the exception that several live nematodes previously identified under microscope were used for each extraction and the proteinase K digestion was performed at 50 °C. Primers were designed using the cox1 sequences for each species. The primer design was performed using Primer3 48 (Table 6) with the correct sense for long-range PCR which was carried out using Advantage ® 2 PCR Kit (Clontech, Takara Biotechnology, Japan). Each reaction contained 0.3 μ M primer, 1X BD Advantage 2 PCR buffer and 2 μ l of DNA nematode extraction in a final PCR volume of 25 μ l. Long-range PCR conditions were as follows: initial denaturation at 94 °C for 3 min followed by 40 cycles of 94 °C for 15 s, annealing between 55 and 57 °C depending on the primer (Table 6) for 30 s and extension at 68 °C for 15 min during the first 10 cycles and after 15 min + 15 s/cycle. The product was visualized using an agarose gel; gel purified using Cut & Spin (Grisp, Portugal) and quantified using a Nanodrop spectrophotometer.
Ion-torrent Sequencing and read processing. Ion-torrent sequencing platform was performed at Stab Vida sequencing facilities (Caparica, Portugal). The 200-300 bp insert size library was constructed using enzymatic fragmentation of amplified DNA, Ion-torrent specific adapter ligation, size selection and amplification.
The concentration and size distribution of library DNA fragments was determined using Qubit ® fluorometer 2.0 (Invitrogen) and Bioanalyzer 2100 (Agilent Technologies). The numbers of reads obtained were 351,698; 665,110; 339,777; and 596,021 for X. rivesi, X. pachtaicum, L. vineacola and P. litoralis, respectively. Raw data obtained were analyzed and trimmed using CLC Genomics Workbench 7.5.1 (Qiagen) following standard procedures described by the manufacturer in Stab Vida facilities.
Mitochondrial genome assembly and annotation. Filtered data was de novo assembled using CLC Genomics Workbench 7.5.1 (Qiagen). Prediction of protein-coding genes and rRNA genes was done by using  a combination of BLAST searches in Artemis v. 16.0.0 49 , and MITOS online software 50 . Putative tRNA (transfer RNA) genes were identified in the MITOS online software 50 , which uses the strategy presented in Jühling et al. 27 . Additionally, these tRNA predictions were checked using the program Arwen v1.2 51 . The assembled genomes were annotated using Artemis v. 16.0.0 49 . Annotated sequences were submitted to GenBank with the accession numbers KU746820, KU746821, KU746818 and KU746819 for X. rivesi, X. pachtaicum, L. vineacola and P. litoralis, respectively. Codon usage was studied on-line using the server http://www.bioinformatics.org/sms2/codon_ usage.html.
Phylogenetic analyses. Phylogenetic analyses were performed on amino acid (AA) and nucleotide data sets and the nuclear partial 18S rRNA. The newly obtained and published sequences for mt coding genes in complete Enoplea mt genomes were used for phylogenetic reconstruction. Lithobius forficatus (NC002629) and Limulus polyphemus (NC003057) were used as outgroups, according to previous studies 11,36 . For multiple alignments of AA sequences, the nucleotide sequences of each of the protein coding genes (PCG) were initially translated into AA with MEGA6 52 using the invertebrate mt genetic code setting. The amino acid sequences for each PCG were aligned individually using ClustalW 53 , implemented in MEGA6 under default settings. Conserved regions in the alignments of the 13 PCGs were selected using the Gblocks v 0.91b server set at the "less stringent" option (allow smaller final blocks, allow gap positions within the final blocks and allow less strict flanking positions) 54 . Each individual gene alignment was tested for the best-fit substitution model using ProtTest 2.4 55 based on the Akaike information criterion (AIC). All AA alignments of the 13 PCGs were concatenated into a single alignment using Mesquite v3.04 56 . The final alignment included 2468 out of 3926 AA, representing 63% of the original sequence alignment. Similarly, aligned nucleotide sequences from the PCG using the aligned amino acid sequences were trimmed, concatenated in a similar procedure as for AA dataset. The best fitted model of DNA evolution for each individual gene alignment was obtained using jModelTest v. 2.1.7 57 with AIC. Model used in both datasets are shown in Table S1. Selected models from the available programs in MrBayes 3.1.2 and RAxML 8.2.2 which best fit our dataset from the ranked models in Protest 3.2 and jModelTest 2.1.7. were used in the phylogenetic analysis The partial 18S rRNA data set consisted of 88 Enoplea sequences from GenBank and was 1666 bp in length, and comprised representatives of the main families. Lithobius forficatus (EU024571) and Limulus polyphemus (HQ588741). Sequences for partial 18S were aligned using MAFFT v. 7.205 58 . Conserved regions in the alignments were trimmed using the Gblocks v 0.91b server set at the "less stringent" option 54 . The best fitted model of DNA evolution was obtained using jModelTest v. 2.1.7 57 with AIC.
Phylogenetic analyses of the AA data sets were performed based on maximum likelihood using the rapid bootstrap algorithm in RAxML v. 8.2.2 59 with 200 bootstrap replicates. BI was performed using MrBayes 3.1.2 60 . In both cases, the models were included in the analysis. BI was performed including the model in a partition setting with the model for each PCG and for nucleotides and was run under a general time reversible of invariable sites and a gamma-shaped distribution (GTR + I + G) model with four chains for 1 × 10 6 generations. After discarding burn-in samples and evaluating convergence, the remaining samples were retained for further analyses. The topologies were used to generate a 50% majority rule consensus tree. Trees were visualized using TreeView 61 and FigTree v1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).