Mitochondrial coding genome analysis of tropical root-knot nematodes (Meloidogyne) supports haplotype based diagnostics and reveals evidence of recent reticulate evolution

The polyphagous parthenogenetic root-knot nematodes of the genus Meloidogyne are considered to be the most significant nematode pest in sub-tropical and tropical agriculture. Despite the crucial need for correct diagnosis, identification of these pathogens remains problematic. The traditionally used diagnostic strategies, including morphometrics, host-range tests, biochemical and molecular techniques, now appear to be unreliable due to the recently-suggested hybrid origin of root-knot nematodes. In order to determine a suitable barcode region for these pathogens nine quickly-evolving mitochondrial coding genes were screened. Resulting haplotype networks revealed closely related lineages indicating a recent speciation, an anthropogenic-aided distribution through agricultural practices, and evidence for reticulate evolution within M. arenaria. Nonetheless, nucleotide polymorphisms harbor enough variation to distinguish these closely-related lineages. Furthermore, completeness of lineage sorting was verified by screening 80 populations from widespread geographical origins and variable hosts. Importantly, our results indicate that mitochondrial haplotypes are strongly linked and consistent with traditional esterase isozyme patterns, suggesting that different parthenogenetic lineages can be reliably identified using mitochondrial haplotypes. The study indicates that the barcode region Nad5 can reliably identify the major lineages of tropical root-knot nematodes.


Results
Sampling and isozyme electrophoresis. Among the 80 populations of root-knot nematodes examined 10 different esterase profiles and three different malate dehydrogenase patterns were identified (Table 1). These profiles represent 11 lineages of Meloidogyne, of which two appear new to science (see Fig. 1). In total seven populations of M. enterolobii were identified, including specimens originating from the type localities of M. enterolobii and its junior synonym M. mayaguensis 37 . The most frequently-occurring lineages in the dataset were M. incognita (28 populations) and M. javanica (26 populations), originating from a range of host plants and a wide geographical distribution (Table 1). Meloidogyne incognita was represented by both the I1 and I2 phenotype, although the I1 phenotype only occurred when esterase bands were weakly visible, indicating that the absence of the secondary esterase band represents an analysis artifact. For this reason all M. incognita esterase phenotypes in this study are defined as I1. Meloidogyne arenaria is represented by three isozyme profiles: the A2N1 type (4 populations), the A2N3 type (4 populations) and the A3N1 type (1 population). Three populations of M. luci, two populations of M. inornata and one population of M. ethiopica had an L3, I3 and E3 esterase phenotype, respectively. In addition to these known isozyme patterns two previously unrecorded esterase profiles were discovered (Fig. 1); one occurring in two populations, one originating from China from Ficus and one from Tanzania from Heliconia. The Mdh pattern of these populations was characterised as the N1 phenotype (Fig. 1b). The esterase phenotype displays three clear bands, of which the two fast migrating bands are positioned in the same location as for the M. arenaria A2 phenotype, while the slowest migrating band and its slightly faster migrating weak band occur in a similar position as the S1-M1 phenotype (Fig. 1a). Both the A2 and S1-M1 phenotype have previously been associated with M. arenaria 9 indicating that our combined A2-S1-M1 pattern should be considered an atypical, possibly hybrid, M. arenaria pattern. A second novel pattern was associated with two Meloidogyne populations, one originating from Spain (Beet) and one from Guatemala (Tomato). The Mdh activity displayed a N1 phenotype (Fig. 1d) and the esterase phenotype consists of two bands of which the faster migrating band is more pronounced (Fig. 1c). This esterase phenotype is not directly related to any other described Meloidogyne esterase phenotype indicating a new, undescribed lineage. The slow migrating band is in the S1 position 9 while the fast migrating band is in a new position, herein named A1a. This new esterase phenotype is therefore referred to as A1a-S1.
Identification of polymorphic sites within the mitochondrial genome. From the 80 geographically widespread populations, 305 sequences and 22 mitochondrial haplotypes were generated (Table S3), corresponding to 11 isozyme based lineages of clade I root-knot nematodes. Identical results with different primer combinations and different DNA polymerases. Sequence results for multiple individuals within one population (tested for 11 populations, see Table S3). Cloning data revealed limited heteroplasmy within a single individual, but never associated with informative species specific nucleotide positions, i.e. 0.16% variation (1 nucleotide position) in one clone of T417, 0.16% variation in two clones of T520, 0.33% variation in one clone of T540, while no variation was detected in T515.
Meloidogyne enterolobii. The seven populations of M. enterolobii showed identical sequences for the eight analysed gene fragments (Table S3) The Nad3 fragment was identical for all 17 representative sequenced populations (Table S3). Conversely, with 15 polymorphic positions, the Nad5 fragment is the most variable with a variation of 2.46% and representing 13 haplotypes out of 78 sequenced populations (Fig. 2). The M. incognita I1 esterase type is represented by three closely-related haplotypes differing in only one or two positions from each other. Of 27 M. javanica populations, 25 shared the same haplotype, while two populations had closely-related haplotypes differing in only one position. Another common haplotype was shared by M. inornata, Meloidogyne sp. 1, M. arenaria H3, M. ethiopica and two of the three M. luci populations. This haplotype most likely corresponds to the haplotype G, as defined and recovered by Pagan et al. 24 based on restriction fragment analysis of the intergenic region between 16S and Cox2. The third M. luci population included in the current study had a slightly different haplotype, differing in two positions from the other two populations and also Meloidogyne sp. 1, which is associated with a second distinct haplotype. Additionally, two M. arenaria haplotypes were recovered, each differing in one or two nucleotide positions. However, no link was observed between isozyme phenotypes and mitochondrial haplotypes among the different populations of M. arenaria. Meloidogyne floridensis is associated with the most divergent haplotype, differing in at least four positions to its closest relatives. Finally for the Nad5 fragment, a lineage-specific haplotype was recorded for Meloidogyne sp. 2.
Multi-gene haplotype network. A haplotype network was constructed using a concatenated alignment of the Cox1, Cox2, Cox3, 16S, Nad2 and Nad5 gene fragments. This multi-gene haplotype network clearly separates the major lineages of root-knot nematodes (Fig. 3). Both M. javanica, M. incognita, M. floridensis and Meloidogyne sp. 2 each occur in a clearly-separated position, supported by several lineage-specific sites. This separated position is confirmed by the low intra-/inter-lineage variability ratio of these lineages (Fig. 3). This intra-/ inter variability ratio is higher for M. arenaria and M. luci, suggesting that intra-lineage variability is lower relative to inter-lineage variability with the nearest neighboring lineage. High P ID (liberal) values 38 indicate a high probability of correctly identifying these lineages using BLAST, DNA Barcoding or tree placement. Interestingly, however, both M. incognita and M. javanica show some intraspecific mitochondrial variability. Furthermore, the closely related isozyme phenotypes of M. ethiopica, M. inornata and M. luci occupy separate positions in accordance to their mitochondrial haplotypes. Two distinct haplotypes for M. luci were also observed, occurring in a paraphyletic position, one shared between two populations originating from Iran and Slovenia and another haplotype from Guatemala. All ten included M. arenaria populations form a largely unresolved cloud of closely related haplotypes (Fig. 3). Surprisingly, the A3 esterase phenotype shared a haplotype with an A2 phenotype population and the haplotype extracted from the mitochondrial genome of M. arenaria 39 for which the isozyme profile is unknown. This indicates that different isozyme phenotypes are not necessarily associated with different mitochondrial haplotypes. The slightly different haplotypes of the two Meloidogyne sp. 1 populations are closely associated with the M. arenaria cloud, indicating that Meloidogyne sp. 1 should be considered an M. arenaria variant, as already indicated by its esterase isozyme phenotype. Overall, the concatenated mitochondrial haplotype network clearly separates different lineages of parthenogenetic root-knot nematodes, demonstrating a clear link with isozyme phenotypes.

Discussion
Although the appointment of lineage-specific barcodes for MIG root-knot nematodes is known to be problematic, since several nuclear and mitochondrial candidate genes were found to be unsuitable 28,30,35 , we were able to find consistent differences between 11 isozyme lineages of root-knot nematodes based on nucleotide polymorphisms originating from nine coding genes of the mitochondrial genome. While non-coding genes have been shown to contain insertions prone to heteroplasmy 24,36,39,40 , we found no evidence for variable insertions within coding genes, only a very limited amount of heteroplasmic positions within a single individual were recovered. However, this variation was not associated with species specific nucleotide positions, indicating that barcode accuracy is not influenced by heteroplasmy.
As previously highlighted in various studies 30,35,39,41 the only clearly divergent species in clade I is M. enterolobii, differing in all seven sequenced mitochondrial gene fragments (7.5%-18% divergent). Consequently M. enterolobii can easily be identified using any of the sequenced mitochondrial coding gene fragments. Moreover, its haplotype is identical to the mitochondrial genome sequence of M. enterolobii 39 with virtually no mitochondrial variation between the seven geographically widespread populations of M. enterolobii observed. Between the type locality of M. mayaguensis 42 and M. enterolobii 43 , just one single mutation in Cox3 was observed. However, this single mutation is considered insufficient to re-erect M. mayaguensis as a separate taxon, and thus further supports the synonymisation between M. mayaguensis and M. enterolobii 37 based on host range, isozyme phenotype and morphological data.
Except for M. enterolobii, clade I comprises extremely closely-related parthenogenetic lineages, known as the MIG 30,34,39,41,44 . This close relationship is here confirmed, based on the mitochondrial coding genes, such as the Nad3 gene fragment, which is completely identical for all MIG lineages. Also, the widely used barcode gene Cox1 is too conserved 35 , as it can only reliably differentiate Meloidogyne sp. 2 from the other MIG lineages. The limited diversity of mitochondrial coding genes confirms the recent origin of these MIG root-knot nematodes 30 , yet our study reveals that most of the mitochondrial coding genes exhibit some degree of diversity, generally varying between 0 and 1.5%, resulting in informative mitochondrial haplotypes. Remarkably, these mitochondrial haplotypes correspond clearly with isozyme patterns. This reflects earlier restriction fragment analysis of the intergenic region between 16S and Cox2 34 , indicating that the low, but consistent, diversity between different haplotypes can be informative in lineage identification.
Pagan et al. 24 described one M. incognita-specific and one M. javanica-specific site within the 16S gene, which were subsequently used as lineage-specific restriction sites, the specificity of which was confirmed using numerous root-knot nematode populations from Africa. In the current study, we further confirm these lineage-specific sites, and additionally identify four more M. incognita-specific sites and five M. javanica-specific sites, which are directly connected to I1 (I2) and J3 esterase phenotypes, respectively. The specificity of these sites was confirmed based on 30 M. incognita and 27 M. javanica populations of widespread geographic origin, indicating that lineage sorting is complete. Moreover, the most common M. incognita mitochondrial haplotype was identical to the haplotype from the mitochondrial genome sequence of M. incognita generated by Humphreys-Pereira & Elling 45 and to the mitochondrial haplotype extracted from the complete genome sequence of M. incognita 46 . Also, the most common M. javanica haplotype corresponded with its recently published mitochondrial genome 39 . We also found that the unique three banded M. floridensis esterase phenotype is associated with a lineage-specific mitochondrial haplotype containing seven lineage-specific sites and a separate position in the haplotype network. This confers with its aberrant meiotic parthenogenetic mode of reproduction and its isolated position according to RAPD data 47 .
Furthermore, our analysis indicates that even very closely-related esterase phenotypes can be reliably identified using mitochondrial haplotypes. For example, the E3, I3 and L3 phenotypes of M. ethiopica, M. inornata and M. luci, respectively, were for the first time connected to distinct but related haplotypes. Surprisingly though, we recovered two separate haplotypes for M. luci, which were both more closely related to M. inornata than to the other M. luci haplotype. Both haplotypes occurred in populations originating from separate geographical regions (e.g. Guatemala, Iran and Slovenia), indicating either that the L3 phenotype evolved convergently from the I3 pattern, or alternatively, that the I3 phenotype is the result of reticulate evolution, which was recently suggested to play an important role during the evolution of the MIG 29 . This latter scenario seems plausible as the E3, I3 and L3 phenotypes are associated with striking variations in somatic chromosome numbers, varying from 2n = 36-38 over 2n = 42-46 to 3n = 54-58, respectively 48 , inferring that these haplotypes and associated isozyme profiles originate from different hybridization events. To clarify the precise origin of these closely-related parthenogenic lineages and their taxonomic status, additional assessment of a broader range of populations from across a wide geographic distribution, in combination with genomic analysis, is necessary.
The three isozyme patterns observed for Meloidogyne arenaria (A2N1, A3N1, A2N3) are represented by a largely-unresolved cloud of related haplotypes in our network, where some level of variability is observed. Specifically the A2N3 phenotype occurs in two separate positions in the network, some displaying an identical mitochondrial haplotype with an A3N1 phenotype population, while the A2N1 phenotype appears to be linked to different mitochondrial haplotypes, indicating that Mdh phenotypes are not lineage-specific. Interestingly, intraspecific variability of both H1 and H3 Mdh phenotypes, has already been reported for M. mali 49 . That different isozyme phenotypes can be associated with the same mitochondrial haplotype in M. arenaria is consistent with a wide variation in karyology, with chromosome numbers varying from 30-38 over 40-48 to 51-56 9,50 , indicating several levels of polyploidy. Consequently, the available evidence combined indicates that different lineages of M. arenaria have been involved in recent hybridization events. This assumption is further supported by the fact that the A3 phenotype appears to be associated with higher (52-54) chromosome numbers 9 , while sharing all its esterase alleles with the A1 phenotype (absent in our analysis) and the A2 phenotype. Moreover, the A2 S1-M1 phenotype of Meloidogyne sp. 1 (haplotypes are close to the M. arenaria cloud) appears to be a combination of two previously reported M. arenaria phenotypes (A2 and S1-M1) 9 . This suggests that M. arenaria actually comprises a random combination of lineages with different isozyme phenotypes and mitochondrial haplotypes.
The newly described esterase isozyme phenotype (Fig. 1c,d) of Meloidogyne sp. 2 shows a very distinct mitochondrial haplotype, displaying 16 lineage-specific mutations, establishing it as the most divergent lineage of the MIG to date (Fig. 3). Additional information on this deviating lineage, which appears to have a wide geographical distribution (Spain and Guatemala) is necessary, including its mode of reproduction, cytogenetic composition and host range, in order to understand its divergent phylogenetic position. The observation that the two newly determined isozyme patterns also relate to distinct mitochondrial haplotypes provides a strong indication of the high potential value of mitochondrial haplotypes for separating lineages and root-knot nematode diagnostics. With the exception of M. arenaria A3, it is further demonstrated that each esterase phenotype is associated with a specific mitochondrial haplotype and that our multi-gene haplotype network shows a high degree of resemblance with the phylogenetic tree of Esbenshade and Triantaphyllou 8 as derived from the evaluation of isozymic data. This accordance between biochemical and molecular identification techniques provides a great opportunity to evaluate Meloidogyne species concepts, especially in combination with upcoming genomic information.
Due to the mostly parthenogenetic nature and suggested hybrid origin of Meloidogyne lineages in clade I (Lunt et al. 2014), it is difficult to link mitochondrial haplotypes with actual speciation events. Haplotype variation can occur among lineages with the same isozymic pattern (e.g. Meloidogyne sp. 1, M. Luci, M. incognita, M. javanica), while in rare cases, reticulate evolution enables species to possess the same mitochondrial haplotype but different isozymatic patterns, indicating a different genomic composition (e.g. M. arenaria A3 and A2, see above). Haplotype variation may be a consequence of accumulated mutations following hybridization and can be considered intraspecific variability. Alternatively, these nucleotide polymorphisms could reflect the diversity generated by crosses within an ancestral gene pool, which were later fixed by hybridization and parthenogenesis 41 . In the latter case, individual mitochondrial haplotypes can be considered separate hybrid lineages, possibly each with a distinct genomic composition. Arguably, both explanations may have played a role in shaping the presently-observed genetic diversity within root-knot nematode mitochondrial genomes. To unravel the precise origin and diversity of clade I mitochondrial haplotypes, additional knowledge on the structure and origin of their genome is crucial towards revealing whether hybridization in this group is traceable to unique hybridization events or, alternatively, that hybridization is rampant and constantly leads to new lineages of parthenogenetic root-knot nematodes. Nevertheless, in both scenarios a mitochondrial haplotype-based identification is preferable over nuclear gene-based identification or morphological determination 5,24,34 , especially since mitochondrial haplotypes are unequivocally linked with isozyme phenotypes, which continue to be considered a superior diagnostic strategy for root-knot nematodes 11,12 . Furthermore, it is suggested that while the species conundrum within the MIG continues to be resolved, 'lineages' should be used as a preferred term over 'species' . Yet, for convenience, the term species remains useful for the well-established 'species' , although in effect they represent a more or less random combination of lineages.
Interestingly, most root-knot nematode lineages identified in the current study have a global distribution ( Table 1). The observation that identical multi-gene mitochondrial haplotypes can have a global distribution favors the hypothesis that this distribution was caused by humans through agricultural practices and does not pre-date human crop exchange and agricultural development 5 . If worldwide distribution would pre-date agricultural development a much larger variation in mitochondrial haplotypes between lineages from distant locations could be expected, especially because parthenogenetic reproduction most likely implies that single nucleotide polymorphisms remain separated between different populations.
The current study demonstrates that root-knot nematodes from clade I can be reliably identified using mitochondrial haplotypes. The Nad5 gene fragment contains the largest number of variable positions and is therefore the preferred barcoding gene for clade I Meloidogyne spp. Sequencing the Nad5 fragment allows a reliable identification of the most common MIG lineages, i.e. M. incognita, M. javanica, most populations of M. arenaria but also M. floridensis and Meloidogyne sp. 2. However, the relatively uncommon, closely-related lineages i.e. M. ethiopica, M. inornata, M. luci, Meloidogyne sp. 1 and some M. arenaria are clustered in one Nad5 haplotype. In comparison, most of these lineages were also grouped in the same haplotype by Pagan et al. 24 (therein described as haplotype G) based on restriction fragment analysis of the intergenic spacer between Cox2 and 16S. To separate these closely-related lineages requires sequencing of an additional gene (Cox2), preferably in combination with isozyme electrophoresis. In comparison with other diagnostic strategies the proposed DNA barcoding method has several distinct advantages: (i) it is not life stage dependent, which is vital in studying root-knot nematodes, as second-stage juveniles (and males) represent the only free-living stage 5,19 ; (ii) a single individual is sufficient, which is important as species mixtures are common; (iii) barcoding does not provide a yes/no answer but does help to identify unforeseen plant threats or unknown lineages; (iv) the protocol can be performed in a relatively short time span, in combination with the suggested quick DNA extraction method, which omits a time-consuming proteinase K step, enabling sequence-based lineage identification within a single day; (v) the resulting sequences can be analysed in a comparative population genetic framework using haplotype networks; (vi) barcoding using coding genes does not suffer from heteroplasmy between or within single individuals; (vii) and possibly most importantly, barcoding can produce highly reproducible results between laboratories.

Collection of populations, morphological identification and culturing. For this study 85 separate
Meloidogyne populations were examined. Thirty-seven populations were obtained from pure cultures originating from the National Plant Protection Organization (Wageningen, the Netherlands), while the other populations were collected during four field surveys in three countries. Sampling of field-cultivated crops was undertaken in Tanzania, Pakistan, and Nigeria between 2012 and 2013, providing 48 populations. Comprehensive information on the geographical origin and the host plant species was collected for all populations (Table 1), which were all morphologically characterised based on second-stage juveniles 51 and perennial patterns, when females were available, in order to ensure clade I species were included. Subsequently, each population was inoculated onto Lycopersicon esculentum cv. Moneymaker plants, individually, in pots containing sterile potting media, using a few egg masses or juveniles. Populations were maintained in the greenhouse at Wageningen at 23 °C. Isozyme analysis. To confirm the morphological identification and purity of the cultures, esterase and malate dehydrogenase isozymes were analysed according to Karssen et al. 52 . First, ten young females of each culture were isolated from roots in isotonic (0.9%) NaCl solution. Individual females, after desalting in reagent-grade water on ice for 5 minutes, were loaded to sample wells containing 0.6 μl extraction buffer (20% sucrose, 2% triton X-100, 0.01% Bromophenol Blue), and subsequently macerated using a glass rod. This mixture was homogenised, and protein extractions were loaded onto a (8-25) polyacrylamide gradient gel and electrophoretically fractioned using a PhastSystem (Pharmacia Ltd, Uppsala, Sweden). In addition to the ten test samples, two M. javanica protein extractions were added to the centre of each gel to serve as a reference. After electrophoresis, gels were stained to examine for malate dehydrogenase (Mdh) and esterase (Est) activity for 5 and 45 minutes, respectively, rinsed with distilled water, and fixed using a 10% glycerol, 10% acetic acid, distilled water solution.
Mitochondrial DNA analysis. Genomic DNA of crushed individual females was extracted using worm lysis buffer and proteinase K 53 . Genomic DNA of individual second-stage juveniles or males was extracted using a quick alkaline lysis protocol adapted from Schneider et al. 54 ; individual nematodes were transferred to 10 μl 0.05N NaOH, with 1 μl of 4.5% tween added. The mixture was heated to 95 °C for 15 min, and after cooling to room temperature 40 μl of double-distilled water was added.
PCR amplification was carried out using the standard Taq DNA polymerase mixture (Qiagen, Germany), employing 2 μl genomic DNA extraction and 0.4 mM of each primer. The PCR amplification conditions were: initial desaturation for 2 min at 94 °C, followed by 40 cycles of 60 secs at 94 °C, 60 secs at 45 °C, 90 secs at 72 °C, and finally an extension for 10 min at 72 °C. For NADH dehydrogenase subunit 1, Cytochrome c oxidase and Cytochrome b the annealing temperature was increased to 55 °C. PCR products were electrophoretically fractioned on a 1% agarose gel and stained with ethidium bromide. Successful reactions were purified and sequenced commercially by Macrogen Inc. (Europe) in forward and reverse direction. Consensus sequences were assembled using GENEIOUS R6. All contigs were subjected to a BLAST search on the NCBI website (http://www. ncbi.nlm.nih.gov) to check for possible contaminations. Reliability and reproducibility of PCR amplification was tested by sequencing Nad1 twice using a different primer combination NAD1F1 (TCA AAT TCG TTT AGG ACC AAC) and Nad1R1 (CGA ATT GTT TAT CCT CGT TTT C) and by substituting Taq     and aligned by its amino acid translation using MAFFT 7.157 59 . Haplotype networks were calculated using the median joining algorithm as implemented in Network 4.6 60 , http://www.fluxus-engineering.com/), gaps were coded as unknown characters. Haplotype diagrams were redrawn in ADOBE ® ILLUSTRATOR ® CS3. Liberal P ID values, inter-and intra-lineage species variability were calculated with the species delineation plugin of GENEIOUS R6 38 using a UPGMA tree, and distances were calculated according to the Jukes-Cantor model. In all analyses the generated sequences in the current study were complemented with mitochondrial haplotypes extracted from the mitochondrial genome sequences of M. incognita 45 , M. javanica, M. enterolobii and M. arenaria 39 and haplotypes extracted from whole genome sequences of M. incognita 46 and M. floridensis 29 .