Rapid identification of nine species of diphyllobothriidean tapeworms by pyrosequencing

The identification of diphyllobothriidean tapeworms (Cestoda: Diphyllobothriidea) that infect humans and intermediate/paratenic hosts is extremely difficult due to their morphological similarities, particularly in the case of Diphyllobothrium and Spirometra species. A pyrosequencing method for the molecular identification of pathogenic agents has recently been developed, but as of yet there have been no reports of pyrosequencing approaches that are able to discriminate among diphyllobothriidean species. This study, therefore, set out to establish a pyrosequencing method for differentiating among nine diphyllobothriidean species, Diphyllobothrium dendriticum, Diphyllobothrium ditremum, Diphyllobothrium latum, Diphyllobothrium nihonkaiense, Diphyllobothrium stemmacephalum, Diplogonoporus balaenopterae, Adenocephalus pacificus, Spirometra decipiens and Sparganum proliferum, based on the mitochondrial cytochrome c oxidase subunit 1 (cox1) gene as a molecular marker. A region of 41 nucleotides in the cox1 gene served as a target, and variations in this region were used for identification using PCR plus pyrosequencing. This region contains nucleotide variations at 12 positions, which is enough for the identification of the selected nine species of diphyllobothriidean tapeworms. This method was found to be a reliable tool not only for species identification of diphyllobothriids, but also for epidemiological studies of cestodiasis caused by diphyllobothriidean tapeworms at public health units in endemic areas.

Human diplogonoporiasis refers to a tapeworm infection caused by Diplogonoporus balaenopterae (syn. Diplogonoporus grandis 10), which usually infects minke, sei and humpback whales 11 and has been found almost exclusively in Japan 10,12,13. The details of its life cycle are not clear, but various marine fish, such as raw whitebait, are suspected as potential sources of human infection 14,15. Human sparganosis is a harmful zoonosis caused by plerocercoid larvae (spargana) belonging to the genus Spirometra or Sparganum. Sparganosis is classified into two forms, nonproliferative and proliferative. The former is caused by Spirometra erinaceieuropaei and Spirometra decipiens in China 16, Japan 17, Korea 18,19, Taiwan 20 and Thailand 21, and by Spirometra mansonoides in the USA 22. The latter is caused by the more pathogenic and disseminating Sparganum proliferum, which has been reported only sporadically in Japan 23, Taiwan 24 and Thailand 25 and very rarely in Paraguay, Venezuela and the USA 26,27. Spirometra species require two intermediate hosts for the completion of their life cycle. Copepods serve as the first intermediate host for the procercoid larvae. Vertebrates (reptiles, amphibians, birds and mammals) serve as the second intermediate and paratenic hosts for the plerocercoid larvae. Human infection occurs through the ingestion of water contaminated with copepods carrying procercoids or through the ingestion of raw or undercooked meat, such as frog, snake or chicken, infected with plerocercoids 21,28. The plerocercoid migrates predominantly to subcutaneous tissue, but it can also affect the brain and eyes. Spargana of Spirometra species can, in rare cases, develop into adult worms in the human intestine 29. The details of the Sp. proliferum life cycle are not clear and its adult stage is unknown.
Since the eggs, larvae and adult tapeworms of the genera Diphyllobothrium, Adenocephalus and Spirometra are morphologically similar, their identification in human tissues is exceedingly difficult and requires special expertise. Only one study has successfully used morphometric and ultrastructural (surface morphology) egg features to distinguish among eight species of diphyllobothriids 30. For these reasons, many molecular methods using different genetic markers have been developed for the identification of diphyllobothriid tapeworms: cox1 for Diphyllobothrium 31-34 and S. erinaceieuropaei 16,35, sdhB for Sp. proliferum and S. erinaceieuropaei 36, and nad3 for Sp. proliferum 5. These molecular methods are reliable and valuable tools for the identification of diphyllobothriidean tapeworms.
Pyrosequencing is a DNA sequencing method that uses enzyme-coupled reactions and bioluminescence to monitor, in real time, the pyrophosphate (PPi) released when nucleotides are incorporated during template-directed synthesis of short nucleotide fragments 37. The method relies on a cascade of four enzymes acting during DNA synthesis, the last step of which produces a detectable light signal upon nucleotide incorporation. The detection is based on the PPi released when a nucleotide is incorporated into the DNA strand, and the signal intensity reflects the number of bases added. This method can be used for mutation detection, single-nucleotide polymorphism (SNP) genotyping of large sets of screening samples and high-throughput DNA analysis 38. As an alternative identification method, it has been used for species-specific discrimination of Entamoeba species 39 and for identification of various parasite taxa such as Plasmodium 40, Trichinella species 41, Paragonimus 42 and lymphatic filaria 43.
Here we developed a pyrosequencing methodology that characterizes cox1 gene amplicons for the identification of diphyllobothriidean tapeworms in humans and intermediate hosts.

Results
Based on sequence alignments of the cox1 gene from the nine diphyllobothriid species shown in Table 1, the 41 nucleotides following the 3′ end of the sequencing primer showed high accuracy and were used as the target region for species identification (Fig. 1 and Table 2). The nucleotide sequence pattern in the target region of each diphyllobothriidean species is shown as pyrogram results (Fig. 2). Of the 41 nucleotides, 12 positions were sufficiently variable to discriminate among the diphyllobothriidean species (Tables 2 and 3). Diphyllobothrium dendriticum and A. pacificus were found to show intra-specific variation (Table 2). The pyrosequencing results were confirmed by Sanger sequencing of the amplicons, conducted at First BASE Laboratories Sdn Bhd (Selangor, Malaysia) using the BigDye terminator v3.1 cycle-sequencing kit (Applied Biosystems (ABI), Carlsbad, CA). Both strands were directly sequenced using the PCR primers as sequencing primers (Model 310 or 3100, ABI), which yielded identical sequence data. No PCR products could be obtained from DNA samples of other parasites, human leukocytes or uninfected fish, indicating the high specificity of the PCR primers.
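How a small set of variable positions can separate species may be illustrated with a short sketch. The example below uses a made-up 10-nucleotide toy alignment with placeholder species names, not real cox1 data, and assumes that the first nucleotide of the target region is position 881 (the base following the 3′ end of the sequencing primer). It also assumes one sequence per species, so the intra-specific variation noted above is not modelled. All function names are illustrative.

```python
from typing import Dict, List


def variable_positions(aligned: Dict[str, str], first_pos: int = 881) -> List[int]:
    """Return the coordinates of alignment columns in which the sequences differ."""
    if len({len(seq) for seq in aligned.values()}) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    columns = zip(*aligned.values())
    return [first_pos + i for i, col in enumerate(columns) if len(set(col)) > 1]


def signatures(aligned: Dict[str, str], first_pos: int = 881) -> Dict[str, str]:
    """Reduce each sequence to the bases found at the variable positions."""
    offsets = [p - first_pos for p in variable_positions(aligned, first_pos)]
    return {name: "".join(seq[i] for i in offsets) for name, seq in aligned.items()}


def is_discriminating(aligned: Dict[str, str]) -> bool:
    """True if every species has a unique signature at the variable positions."""
    sigs = signatures(aligned)
    return len(set(sigs.values())) == len(sigs)


# Hypothetical toy alignment (NOT real cox1 sequences).
toy = {
    "species_A": "ATGGCTTACA",
    "species_B": "ATGACTTACA",
    "species_C": "ATGGCTTGCA",
}

print(variable_positions(toy))  # [884, 888]
print(signatures(toy))          # {'species_A': 'GA', 'species_B': 'AA', 'species_C': 'GG'}
print(is_discriminating(toy))   # True
```

In this toy case two variable columns are enough to give each of the three sequences a unique signature; the same check applied to real aligned target regions would indicate whether the chosen positions separate all species of interest.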

Discussion
The taxonomy of the diphyllobothriidean tapeworms generally depends on morphological classification, but morphological characteristics are not always decisive criteria, particularly in larval and immature worms or in inadequately preserved specimens. Diphyllobothrium dendriticum, D. latum, D. nihonkaiense, A. pacificus and Spirometra species are of medical importance and have become an emerging public health problem, especially in countries in which fish consumption is expanding and new culinary habits involving raw or undercooked preparation are gaining ground 6,33,44. For the differential detection of diphyllobothriidean tapeworms in diagnostic laboratories, rapid, specific and cost-effective tools for routine diagnosis of parasites have been developed 6,8,33. Molecular methods for differentiation among diphyllobothriideans are important in situations in which the morphology and epidemiological information are similar 6,34,45,46. Many molecular methods have been developed for the identification of diphyllobothriidean tapeworms, for example, multiplex PCR for D. latum, D. dendriticum, D. nihonkaiense and A. pacificus 33, PCR for study of the taxonomic relationship between Di. balaenopterae and Di. grandis 10, and PCR-restriction fragment length polymorphism for Diphyllobothrium and Diplogonoporus species 8. However, the limitation of these methods is that analyzing the results requires gel electrophoresis, which is imprecise, has restricted throughput and can only be used to differentiate up to eight species 30. Moreover, the differentiation among D. ditremum, Di. balaenopterae, S. decipiens and Sp. proliferum has not been evaluated. This prompted us to turn to a different tool, namely pyrosequencing 47. The pyrosequencing operating system offers two modes, sequence analysis (SQA) and SNP. Both are available to the user, and each mode detects the sequenced nucleotides in a different manner. 
The SQA mode sequences short and medium-length DNA fragments (approximate size of 25-50 bp) 48, whereas the SNP mode searches only for single base variations. Pyrosequencing has been used in various applications such as differentiation of Rickettsia species in hard tick samples 49 and detection of benzimidazole-resistance-associated β-tubulin SNPs in cattle nematodes 50. It is also suitable for identifying, screening and/or predicting antibiotic-resistant strains of bacteria 51. Here, we report on a new tool for rapid identification of nine diphyllobothriidean tapeworms using pyrosequencing. This method can identify D. dendriticum, D. ditremum, D. latum, D. nihonkaiense, D. stemmacephalum, Di. balaenopterae, A. pacificus, S. decipiens and Sp. proliferum (Table 1) at the species level. Another advantage of pyrosequencing is that short PCR products are sufficient for the analysis. Thus, DNA that has degraded due to formalin fixation is still suitable for pyrosequencing. In this study, we found that pyrosequencing was able to identify formalin-fixed A. pacificus (Fig. 2G).
The pyrosequencing technique developed in this study is a high-throughput tool that can be completed within 4 hours (excluding the DNA extraction period) and can process 96 samples simultaneously. The approximate estimated cost is $5 per test. However, the cost can vary depending upon the commercial origin of the reagents used. In addition to its medical applications, this pyrosequencing technique can be useful for identification of larvae isolated from copepods, eggs and adults among different taxa that share similar morphology. Pyrosequencing is a promising alternative high-throughput method for molecular genotyping of nine species of diphyllobothriidean tapeworms and can have important implications for epidemiological studies. This applies not only to endemic countries but also to non-endemic areas into which travellers coming from endemic areas bring the parasites 5,52.

Materials and Methods
Parasite materials. Specimens
Ethical Approval. All methods in the study protocol were approved by the Ethics Committee of the National Institute of Infectious Diseases, Tokyo, Japan (No. 177) and were performed in accordance with the relevant guidelines and regulations. No experiments involving human subjects were performed. All parasites were derived from leftover specimens (informed consent was obtained from all subjects and the data were analyzed anonymously) or from specimens bought at markets. We confirmed that the process did not involve endangered or protected species.
Table 2 for GenBank accession nos.) The following primers were designed from the A. pacificus sequence (GenBank accession no. AB548654) using PyroMark Q96ID software version 2.0 (Biotage, Uppsala, Sweden) (Fig. 1): the PCR primer set, consisting of Psedo_F (5′-TTTGGGTAGTGTTGTGTGGG-3′, corresponding to positions 846-865) and biotinylated Psedo_R (biotin-5′-GGCTCACGTAAAGAAACACGACT-3′, corresponding to positions 1010-988), and the sequencing primer Psedo_S (5′-GGGGTCATCATATGTTTA-3′, corresponding to positions 863-880). The Diphyllobothrium latum sequence (GenBank accession no. AM712906) was used as the reference for determining the nucleotide positions.
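As a consistency check on the coordinates given above, the following sketch recomputes each primer's length from its stated positions, the expected amplicon size, and the coordinates of the target region. It assumes inclusive coordinates on the D. latum reference numbering and takes the 41-nucleotide target length from the Results; the helper and variable names are illustrative only.

```python
# Primer sequences and positions as stated in the text: (sequence, 5' position, 3' position).
PRIMERS = {
    "Psedo_F": ("TTTGGGTAGTGTTGTGTGGG", 846, 865),
    "Psedo_R": ("GGCTCACGTAAAGAAACACGACT", 1010, 988),  # reverse primer, biotinylated
    "Psedo_S": ("GGGGTCATCATATGTTTA", 863, 880),        # sequencing primer
}
TARGET_LENGTH = 41  # nucleotides read after the sequencing primer (from the Results)


def span(a: int, b: int) -> int:
    """Number of positions covered by an inclusive coordinate pair, in either orientation."""
    return abs(a - b) + 1


# Each primer's length should match the span of its stated coordinates.
for name, (seq, five_prime, three_prime) in PRIMERS.items():
    assert len(seq) == span(five_prime, three_prime), name

# The amplicon runs from the 5' end of Psedo_F to the 5' end of Psedo_R.
amplicon_length = span(PRIMERS["Psedo_F"][1], PRIMERS["Psedo_R"][1])

# The target region starts at the base after the 3' end of the sequencing primer.
target_start = PRIMERS["Psedo_S"][2] + 1
target_end = target_start + TARGET_LENGTH - 1

print(amplicon_length, target_start, target_end)  # 165 881 921
```

Under these assumptions the primer lengths agree with their coordinates, the amplicon is 165 bp, and the 41-nucleotide target region spans positions 881-921, which lies within the amplified fragment.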
DNA extraction, plasmid preparation, DNA amplification by polymerase chain reaction (PCR) and pyrosequencing. Genomic DNA samples from 29 diphyllobothriid specimens (Table 1) were extracted using a DNeasy Blood & Tissue kit (Qiagen, Hilden, Germany). In addition, positive control plasmids for all diphyllobothriidean species were produced by ligation of each species-specific amplicon into a pGEM-T easy vector (Promega, WI), as has been described previously in other reports 53. Each inserted gene was Sanger sequenced in both directions and the resulting sequences were identical to the gene sequences from which the primers were designed. The cox1 gene fragments (165 bp) were PCR amplified from the diphyllobothriidean DNA. The total PCR reaction volume was 25 μl and included 2 μl of the DNA template, 2.
[Figure caption fragment (pyrograms): The actual sequence obtained by pyrosequencing is displayed below the panels following "Seq". The Y-axis shows the level of fluorescence emitted by the incorporation of a nucleotide base, and the X-axis shows the total number of bases added at that point in time; G, C, T, A, nucleotide bases. The underlined letters indicate the nucleotides used for identification of diphyllobothriidean tapeworms.]
using 1.5% agarose gel electrophoresis. After PCR amplification, biotinylated PCR products were added into 96-well plates and processed as described elsewhere 40,42. The pyrosequencing system included DNA polymerase I, ATP sulfurylase, luciferase and apyrase. Pyrosequencing is a DNA sequencing technique that is based on the detection of released PPi during DNA synthesis. In a cascade of enzymatic reactions, visible light is generated that is proportional to the number of incorporated nucleotides. The cascade starts with a nucleic acid polymerization reaction in which inorganic PPi is released as a result of nucleotide incorporation by polymerase. 
The released PPi is subsequently converted to ATP by ATP sulfurylase, which provides the energy for luciferase to oxidize luciferin and generate light. Because the added nucleotide is known, the sequence of the template can be determined. Light is only generated when a newly added nucleotide is complementary to the next unpaired base in the template strand. The intensity of light is proportional to the number of sequential identical bases in a homopolymer, but determining the exact length of a homopolymer can be a problem for this technique 37,53. In cases in which the target sequence showed up to four homopolymers, the readout was analyzed manually to ensure accuracy.
[Table fragment, truncated in the source. Each row gives, for successive comparison species, the number of differing nucleotide positions followed by the substitutions; the column headings and the first row label were not recovered.]
(row label and count not recovered): A/C885G, T906C, C912T, T921A
D. latum: 4 (A882G, A/C885G, C912T, T921A); 1 (C906T)
D. nihonkaiense: 2 (A/C885G, T921G); 4 (G882A, C906T, T912C, A921G); 3 (G882A, T912C, A921G)
D. stemmacephalum: 3 (C912T, T918G, T921A); 4 (G882A, G885A, C906T, T918G); 3 (G882A, G885A, T918G); 4 (G885A, C912T, T918G, [truncated]
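The relationship between dispensed nucleotides and light signal described in the pyrosequencing paragraph above can be illustrated with an idealized simulation. The sketch below assumes a fixed cyclic dispensation order (G, C, T, A), noise-free signals and peak heights exactly equal to the homopolymer length. These are simplifying assumptions and do not represent the instrument settings used in this study; the function names are hypothetical.

```python
from typing import List, Tuple


def simulate_pyrogram(read: str, dispensation: str = "GCTA") -> List[Tuple[str, int]]:
    """Return idealized (nucleotide, peak height) pairs for a synthesized strand.

    `read` is the strand being synthesized (5'->3'). Nucleotides are dispensed
    cyclically; a dispensed nucleotide yields a peak equal to the number of
    consecutive matching bases at the current position, or 0 if it does not match.
    """
    if set(read) - set(dispensation):
        raise ValueError("read contains bases that are never dispensed")
    peaks: List[Tuple[str, int]] = []
    pos = 0
    while pos < len(read):
        for nt in dispensation:
            run = 0
            while pos < len(read) and read[pos] == nt:
                run += 1  # each incorporated base releases one PPi
                pos += 1
            peaks.append((nt, run))
            if pos == len(read):
                break
    return peaks


def decode(peaks: List[Tuple[str, int]]) -> str:
    """Reconstruct the read from idealized peak heights."""
    return "".join(nt * height for nt, height in peaks)


peaks = simulate_pyrogram("GGTTTAC")
print(peaks)          # [('G', 2), ('C', 0), ('T', 3), ('A', 1), ('G', 0), ('C', 1)]
print(decode(peaks))  # GGTTTAC
```

In real pyrograms the peak height is only approximately proportional to homopolymer length, so rounding a measured signal to an integer run length becomes less reliable as runs get longer, which is consistent with the manual inspection of homopolymer readouts described above.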