Novel molecular approach to define pest species status and tritrophic interactions from historical Bemisia specimens

Museum specimens represent valuable genomic resources for understanding host-endosymbiont/parasitoid evolutionary relationships, resolving species complexes and nomenclatural problems. However, museum collections suffer DNA degradation, making them challenging for molecular-based studies. Here, the mitogenomes of a single 1912 Sri Lankan Bemisia emiliae cotype puparium, and of a 1942 Japanese Bemisia puparium are characterised using a Next-Generation Sequencing approach. Whiteflies are small sap-sucking insects including B. tabaci pest species complex. Bemisia emiliae’s draft mitogenome showed a high degree of homology with published B. tabaci mitogenomes, and exhibited 98–100% partial mitochondrial DNA Cytochrome Oxidase I (mtCOI) gene identity with the B. tabaci species known as Asia II-7. The partial mtCOI gene of the Japanese specimen shared 99% sequence identity with the Bemisia ‘JpL’ genetic group. Metagenomic analysis identified bacterial sequences in both Bemisia specimens, while hymenopteran sequences were also identified in the Japanese Bemisia puparium, including complete mtCOI and rRNA genes, and various partial mtDNA genes. At 88–90% mtCOI sequence identity to Aphelinidae wasps, we concluded that the 1942 Bemisia nymph was parasitized by an Eretmocerus parasitoid wasp. Our approach enables the characterisation of genomes and associated metagenomic communities of museum specimens using 1.5 ng gDNA, and to infer historical tritrophic relationships in Bemisia whiteflies.

Since its description by Gennadius in 1889, the taxonomy of the whitefly Bemisia tabaci has proven a challenge. There had been various significant taxonomic revisions 1-3 within this nominal species, but the lack of unique morphological features associated with these led to all being synonymised under the name 'B. tabaci' . The confusing nomenclature was revised using allozyme and DNA markers that indicated substantial sub-clustering, eventually giving rise to the biotype concept. This has more recently been superseded, by thorough integration of sequence and biological data which demonstrated that B. tabaci is in fact a complex of more than 43 cryptic biological species 4 .
The recognition that B. tabaci is a species complex presented a further challenge, which is to link collection specimens (all tagged with the nomenclature at the time of identification) with the newly adopted genetically-based structure 5 . Tay et al. 6 determined the identity of the original 1889 Gennadius B. tabaci whitefly specimen through Sanger sequencing of multiple PCR products. Characterisation of the partial mitochondrial DNA cytochrome oxidase subunit I (mtCOI) gene from a single museum individual originally collected by Gennadius in 1889 subsequently showed that the Mediterranean ('MED') member of the complex was the true 'B. tabaci' . Since then, there have been no further studies using museum specimens to address taxonomic issues in the remaining members of the B. tabaci complex.
One of the most important concerns in working with historical specimens is the finite amount of material available. PCR primers for the partial mtCOI region 7 can be inefficient, due to the large diversity within B. tabaci clades 4,8 , and multiple PCR amplification attempts can rapidly deplete the sample with no guarantee of success.

Results
The single 1912 B. emiliae 4 th instar nymph ("puparium") yielded a total of 4.62ng of double stranded gDNA, and the 1942 'B. tabaci' 4 th instar from Japan yielded 27.85 ng of double stranded gDNA. The Illumina MiSeq run generated 4,520,563 pair-end sequence reads (i.e., 9,041,126 reads) for B. emiliae, and 2,906,587 pair-end sequence reads (i.e., 5,813,174 reads) for the Japanese 'B. tabaci' . Pair-end sequences from both the 1912 B. emiliae and the Japanese 'B. tabaci' supported the mitogenomes as circular molecules. The draft mitogenomes of both museum specimens were assembled using the mitogenome of B. tabaci Asia I (KJ778614) as a reference. We recovered a complete mitogenome for B. emiliae (15,515 bp from 37,089 reads) (GenBank KX714967) and a partial mitogenome (GenBank KX714968) for the 1942 'B. tabaci' individual from 2,607 NGS genomic fragments (Fig. 1). An estimated 3,635 bps distributed across six regions of the mitogenome were missing from the 1942 Bemisia specimen, impacting eight protein coding genes (PCGs), eight tRNAs, and one rRNA ( Fig. 1; Supplementary  Table 1). Alignment with the assembled B. tabaci Asia II-7 ( Fig. 1, Supplementary Table 1) indicated that these missing mitogenome regions included regions spanning partial ATP8 to partial ND5, part of Cyt b, part of the large subunit rRNA and small subunit rRNA, missing tRNA Asn , tRNA Arg , tRNA Ala , ND3, tRNA Gly , and also part of COIII. Gene orientation and gene order between the 1912 and 1942 Bemisia specimens were identical overall and as confirmed via the De Novo Assemble algorithm within Geneious version 8.0.5 (Biomatters Ltd, Auckland, NZ) (data not shown), although gene orientation for ATP6, ND3, and the missing tRNAs for the Japanese Bemisia specimen could not be ascertained. The assembled partial mitogenome of the Japanese Bemisia consisted of 2,255 sequences, with an estimated sequence genome length of 15,214 bp.
The 1912 B. emiliae cotype specimen had a 100% partial mtCOI (657 bp) sequence identity to three members (i.e., GQ139492, AJ748378, DQ174523) of the B. tabaci Asia II-7 clade, and ranged between 98% (AY686075) and 99% sequence identity (AM408899, DQ174523, AJ748372, DQ116650, DQ116661, DQ116660, DQ174521, AJ748375, DQ116662, AY686064) with other B. tabaci Asia II-7 members. All reported Asia II-7 members were of Asian origin (e.g., India, Taiwan, China). Similarly, based on partial mtCOI sequence (777 bp) identity, the 1942 Bemisia specimen matched 99% with members of the Bemisia genetic group of 'JpL' 13 (GenBank accession numbers AB308111, AB308114-AB308119, AB240967, accessed 02-Jun-2016), all of which are from Japan. Phylogenetic analysis 5, 13 based on the same partial mtCOI gene region indicated a basal position of the 'JpL' genetic group to the 'B. tabaci' species complex, providing support that this 1942 Japanese 'B. tabaci' was likely a non-'tabaci' species. Its basal position clusters with other members of Bemisia that were, until recently, members of Lipaleyrodes, which was synonymised with Bemisia in 2009 14 .  Metagenomics analysis showed that 90.4% of B. emiliae sequences was of bacterial origin, of which 84.3% belonged to the Gram-negative Proteobacteria phylum, and only 6% to Arthropoda. This contrasted significantly with the metagenomic compositions of the 1942 Bemisia individual where only 26% was of bacterial origin and 61% of arthropod origin (Table 1). Interestingly, both B. emiliae and the 1942 Bemisia specimens had low (0.2% and 0.1%, respectively) sequences that matched to Bemisia, and is reflective of the absence of a Bemisia genome. A total of 1.5% of sequences from B. emiliae matched sequences corresponding to Hymenoptera (i.e., Nasonia, Harpegnathos, Camponotus, Apis), while this was 36.5% for the 1942 Bemisia specimen (Table 1). We detected the primary (P)-endosymbiont Candidatus Portiera aleyrodidarum in both hosts, as well as similar proportions of the facultative secondary (S)-endobacteria Cand. Hamiltonella and Rickettsia, and a higher proportion of Wolbachia was detected in the 1942 Bemisia than in B. emiliae. Finally, both Arsenophonus (Enterobacteriaceae) and Cand. Cardinium (Bacteroidaceae) were detected only in B. emiliae (Table 1).
High proportions of sequences matching Hymenoptera (Table 2) in the 1942 Bemisia suggested either contamination, or alternatively parasitism by a hymenopteran parasitoid wasp. Mining the NGS sequence data successfully assembled a DNA contig of 8,951 bp (from 3,410 DNA fragments) that spanned the complete 16 S to 28 s ribosomal RNA (rRNA) genes and included the intergenic spacer (ITS) 1, the 5.8 s rRNA, and the ITS2 region (GenBank KX714966). A DNA contig of 1,926 bp (GenBank KX714952) was also assembled that included the complete mtDNA COI gene (1,536 bp; 512 amino acids), and two tRNAs (tRNA Lys and tRNA Met ) genes. The predicted tRNA Met is 68 bp ( Fig. 2A) and is within the typical tRNA lengths of 60-80 bp in Hymenoptera 15,16 and has the (TAT) anticodon. The tRNA Lys has the (TTT) anticodon reported in various Hymenoptera species including species within the Chalcidoidea superfamily (e.g., Encarsia formosa), and may represent mutation from the ancestral state of (CTT) anticodon 17 . Interestingly, this tRNA Lys is only predicted to be 48 bp in lengh and having a two-arms clover leaf secondary structure due to the absence of the TψC arm and loop (Fig. 2B). The prediction of this 48 bp tRNA Lys secondary structure is unlikely to have been affected by contig assemblies using short NGS DNA fragments due to its centrally located position within multiple pair-end sequences, some of which are  Fig. 1). Though unusual, two-arms clover leaf tRNA secondary structures are known in diverse organisms from mammals (e.g., bovine 18 ) to arthropods (e.g., trnS1 and trnS2 of B. emiliae, trnS2 of the 1942 Japanese Bemisia sp., this study; Habronattus jumping spiders 19 ; Scelionidae parasitic Hymenoptera 20 ), while the missing part of the TψC arm and loop is, to our knowledge, the first to be reported in an Aphelinidae parasitoid wasp species (see below). Various hymenopteran mtDNA partial genes were further assembled and included COII ( Extensive local rearrangements and translocations within Hymenoptera mitogenomes are known 17 , and while gene orders for tRNA Thr -tRNA Pro -NADH6-Cyt b were similar to those reported for various parasitoid hymenopteran wasp species 20, 21 , identification of tRNA Met -tRNA Lys -COI, as well as the NADH4-tRNA Arg gene orders would nevertheless suggest presence of novel mitogenome gene rearrangements in the parasitoid wasp, likely to be an Eretmocerus species (see below). Confirmation of such gene rearrangements will require future characterisation of the complete mtDNA genomes of Eretmocerus wasp species.
Phylogenetic analysis of Eretmocerus species. Sequence identity of the 779 bp partial mtDNA COI C-terminal region matched the Aleyrodidae parasitoid Eretmocerus cocois (EU017333) at 90% identity, and between 88-89% sequence identity with 10 other publicly available (ie., through GenBank) but unpublished reports of Eretmocerus species including Eret. desantisi, Eret. cocois, Eret. mundus, Eret. hayati, and an unnamed Eretmocerus sp. YBZ-2013 (sequence identity to Eret. hayati = 98%) from China Xinjiang province (KF859899). Phylogenetic analysis (best substitution model: HKY85+G+I+F, proportion of invariable sites: estimated (0.387); number of substitution rate categories: 6; Gamma shape parameter: estimated (0.319); Lkl: −5517.437; AIC: 11260.873; K = 113) of the aligned 657 bp partial mtCOI sequences, and included various Chalcidoidea wasp species indicated that our unknown hymenopteran entity was from a separate evolutionary lineage basal to the sister clade that included Eret. mundus, Eret. hayati, and Eret. sp. YBZ-2013. The two sister clades of Eret. mundus-Eret. hayati-Eret. sp. YBZ-2013, and the Japanese 1942 hymenopteran individual were clustered with 100% bootstrap node support that indicated a shared most recent common ancestor. A third Eretmocerus basal sister clade included the Caribbean/New World species of Eret. cocois and Eret. desantisi from the French overseas territory of Guadeloupe 22 (Fig. 3  Amino acid positions of the best matched hymenopteran species within the Chalcidoidea superfamily and the percentage identity are also shown. Note: (A) A total of 168 reads for the assembly of (i, ii, iii). (B) A total of 139 reads for the assembly of (vi, vii). (C) A total of 198 reads for the assembly of (xiv, xv). (D) A total of 8 reads for the assembly of (xv, xvi, xvii). (E) A total of 74 reads for the assembly of (xiv, xx).

Discussion
To our knowledge, this is the first investigation into tritrophic interactions in historical Bemisia specimens. This was made possible by developing a protocol using low amounts of input double stranded gDNA (only 1.5 ng). This novel approach offers enormous potential to resolve existing taxonomic questions and also contribute to genome-wide surveys of individual historical Bemisia specimens, and can potentially be adopted for solving nomenclatural and taxonomic confusion in other cryptic species complexes. Since the synonymising of Bemisia species including B. inconspicua Quaintance (1900; collected in Florida, USA, synonymised by Russell 3 ), B.   23 ), genetic data based on partial mtDNA COI gene 4,5,8,24 , and mating behaviour studies 4, 25-28 have increasingly supported the recognition of B. tabaci as being a complex of cryptic species. By analysing historical specimens collected from the same geographic localities and at similar time frames prior to mass global commodity movements, it is possible to help ascertain historical species habitat boundaries, and ultimately contribute to rectify incorrectly synonymised species. In this respect, the cotype 1912 B. emiliae individual with partial mtCOI gene matching at high percentage (98-100%) with B. tabaci Asia II-7 may be seen as the next stage of solving Bemisia whitefly nomenclatural confusion, since the identification of the true B. tabaci 6 . We further provided an example of cryptic species misidentification in the 1942 Japanese Bemisia species, and highlighted the on-going challenges in understanding the genetic diversity and species status of this global agricultural pest species complex. Metagenomic compostions of various B. tabaci cryptic species have been investigated 29-31 using gene-specific primers, with secondary endosymbionts such as Arsenophonus, Cardinium, and Wolbachia reported in Chinese B. tabaci ' Asia II-7' samples 29, 31 , while Bing et al. 29 also identified Rickettsia in their Asia II-7 material. In the 1912 B. emiliae individual, in addition to Arsenophonus, Cardinium, Rickettsia and Wolbachia, Hamiltonella was also detected. While we showed that both B. emiliae and the Japanese Bemisia species had, to a certain extent, similar P-and S-endobacterial compositions, there were also significant endosymbionts metagenomic profile differences. For example, the endosymbionts Arsenophonus and Cardinium were only detected in B. emiliae, as well as significant amount (14.9%) of Alphaproteobacteria, in contrast to only 4% detected in the 1942 Bemisia species (Table 2).
We identified two regions of a partial wsp Wolbachia gene (106 bp and 230 bp; GenBank KX714969) that showed between 98% and 99% sequence identity, to the B. tabaci Asia II-7 Wolbachia wsp gene (KJ600634) as reported by Ahmed et al. 32 . The 230 bp partial wsp gene region from our historical 1942 specimen also showed high sequence homologies (99-100%) with diverse organisms, including a 99% sequence identity with the Wolbachia wsp gene from B. afer, and with native members of the B. tabaci species complex from China (e.g., Asia  33 ), although reduced sequence identity (81.12-93.91%) were also detected between the historical wsp partial gene sequence and the partial sequences of 'B. tabaci' Wolbachia W4 and W6 strains ( Table 3) 33 . Similarly, the 106 bp partial wsp gene that were identified also shared 100% sequence identity with the Wolbachia wsp gene of B. afer (AJ291370), and between 98.11-100% with B. tabaci (GU968901, JN315980, HQ404797) ( Table 3), all of which were clustered at 73% on the same W1/W2 phylogenetic branch 33 . A lower sequence identity of 87.74% was also shared with the Wolbachia wsp gene (AJ291379) ( Table 3) from a B. tabaci host that was shown to be phylogenetically basal to the W1-W6 Wolbachia strains reported by Ji et al. 33 , and highlighted the difficulty of pin-pointing the identity of the Wolbachia strain(s) in our 1942 Miseq sequence data due to the short sequence nature of this partial gene region.
Alternatively, it is also possible that the Eretmocerus parasitoid larva within the 1942 Bemisia nymph was itself infected with a different Wolbachia strain. Exploring the MiSeq data indentified a 189 bp coxA sequence (GenBank KX714965) that shared between 91% and 94.7% sequence identity with the majority of B. tabaci cryptic species Wolbachia coxA gene 32, 34 but 100% sequence identity with B. afer coxA ST382 at this 189 bp region, as well as 100% sequence identity with the coxA partial gene of Wolbachia isolated from six diverse species including Philaenus spumarius (Hemiptera, Aphrophoridae) (KM377724), Hypoponera ant (Hymenoptera, Formicidae) (KF490396), Macrosteles fascifrons (Hemiptera, Cicadellidae) (HQ404763), Sogatella furcifera (Hemiptera, Delphacidae) (FJ713762), Teleogryllus taiwanemma (Orthoptera, Gryllidae) (DQ842303), and Acraea encedon (Lepidoptera, Nymphalidae) (DQ842269) (data not shown). Horizontal transfer of Wolbachia strains between phylogenetically distantly related arthropods has been reported previously [35][36][37] , and our partial coxA gene shared 100% sequence identity with the hemipteran (i.e., P. spumarius, S. furcifera) and orthopteran (i.e., T. taiwanemma) hosts that were also known to be present in Japan. The coxA locus between Bemisia species including B. tabaci cryptic species complex and B. afer species have been characterised (e.g., Ghosh et al. 34 ) and are highly similar. The 189 bp coxA partial gene from the 1942 Bemisia specimen captured only 23 SNPs of the total 40 SNPs present in the characterised coxA gene across a diverse groups of B. tabaci cryptic species, and shared 100% SNP identity with B. afer ST382 (Table 4). Given the partial coxA gene sequence matched both B. afer's Wolbachia partial coxA ST382 sequence as well as non-Bemisia hosts, it suggests that the increase in Wolbachia sequences detected in the 1942 Bemisia specimen may well be due to infections in both the Bemisia nymph and the parasitoid larva.
Primer efficacy issues may have contributed to the lack of detection of Hamiltonella by Bing et al. 29 and Zchori-Fein et al. 31 , although host plant utilisation and environmental factors including adaptation to pesticides could also play a role in influencing endobacterial communities in Bemisia [38][39][40] . Hamiltonella from B. tabaci MED/MEAM1 has been proposed to present no parasitoid-resistance due the inactivation of the APSE phage 41 . However, it can enhance hosts' survival rates through an endosymbiont-mediated defense against parasitoid wasps in aphids 42 , In addition, Hamiltonella can also increase host growth rates in nutrient-poor environments 40 , or enhance reproductive rates and nymph growth rates, as reported for the invasive B. tabaci MED 39 . It is possible that host association with the Hamiltonella endosymbiont may therefore diminish in nutrient-rich environments (i.e., applications of crop fertilisers). Similary, applications of pesticides could result in the reduction of beneficial insects including parasitoids, and therefore reduce the benefits provided by this endosymbiont to its host (i.e., defense against parasitoid wasps). These anthropogenic factors often associated with the green revolution   43 , this endosymbiont was not detected in both museum specimens analysed in this study. This again suggested potential differences in endobacterial metagenomic signatures between B. tabaci cryptic species from evolutionary diverse clades, as well as reflecting potential impact from both antropogenic (e.g. agricultural) activities and/or environmental/climatic differences. Within the bacterial sequences from B. emiliae, a total of 291,123 sequences (30.1%) showed high homology to the proteobacteria Acidovorax genus, a genus that included highly-damaging agricultural species capable of damaging seeds and fruit crop (e.g., A. citrulli is capable of causing seeding blight and bacterial fruit blotch in cucurbits 44,45 ). The metagenomic analysis also identified a total of 40,391 sequences (4.2%) with homology to Verminephrobacter species (MR-RAST ID 4661244.3 and 4690946.3), which was closely related to Acidovorax bacteria. Verminephrobacter species has been reported only in the Lumbricidae earthworm Eisenia foetida so far. It is difficult to provide plausible explanations for the detection of significant number (i.e., greater number of sequences than the p-endosymbiont Cand. P. aleyrodidarum) of such sequences. Although environmental contamination remained a possibility, the 4 th instar B. emiliae larva was immersed in 100% ethanol for 24 hours prior to gDNA extraction, a process that would likely have reduced potential bacterial contaminations on the surface of the whitelfy nymph. This metagenomic bacterial signature is absent in the 1942 Bemisia specimen, and its detection in B. emiliae could be due environmental to factors or possibly of saprophytic origins.
Three species of Eretmocerus are currently known from Japan 51 : Eret. aleurolobi (Ishii 1938), Eret. furuhashii (Rose and Zolnerowich 1994) and Eret. serius (Silvestri 1927), but none has yet been sequenced for mtCOI genes. Eret. aleurolobi has only been recorded from Aleurolobus marlatti. Eret. serius is a well-known parasitoid of Aleurocanthus species in the Oriental region, with two published records from B. tabaci, neither of which appears to be reliable 52,53 . The third species, Eret. furuhashii is known only from Parabemisia myricae. However, Hoelmer and Goolsby 54 record 'Eret. near furuhashii' as one of the species introduced from Taiwan into USA against B. tabaci. This identification would have been made by Rose, who clearly considered this B. tabaci parasitoid close to, but distinct from, Eret. furuhashii which he had co-described in 1994. There is thus strong circumstantial evidence that our historical Eret. sp. from Japan is this undescribed species from Taiwan that Rose designated 'near furuhashii' . In China, at least 19 aphelinid wasps have been recorded to parasitise Bemisia, with Eret. sp. nr. furuhashii being the most abundant, followed by En. bimaculata (Heraty and Polaszek), being responsible for 15-87.3% parasitisim in agricultural crops [55][56][57] . Given that the partial mtCOI phylogeny did not support our unknown parasitoid as being an Encarsia species, it would rule out En. bimaculata, leaving Eret. sp. nr. furuhashii (Fig. 4) as the most probable candidate. Two further pieces of supporting evidence that Eret. sp. near furuhashii is the likely candidate came from the partial mtCOII (cytochrome oxidae subunit II) gene (424 bp; GenBank KX714953) and the 28 s rRNA gene (GenBank KX714966) assembled and identified from the 1942 Japanese Bemisia nymph specimen, where these two partial hymenopteran genes (i.e., mtCOII, 28 s rRNA) matched Eret. sp. nr. furuhashii from Sanya China, between the mtCOII at nucleotide positions 64 and 332 (i.e., JF820015, 266 bp) with 98.14% sequence identity, and between the 28 s rRNA at nucleotide positions 4,348 and 4,804 (i.e., JF899345, 457 bp) with 100% sequence identity.
Although intriguing, it remains to be seen whether the 1942 Japanese native Bemisia species is a natural host of Eret. sp. nr. furuhashii, and whether the abundance of this aphelinid wasp is the same in Japan as in China. The identity of the Eretmocerus species that parasitized the Japanese 1942 Bemisia nymph will for now remain unconfirmed and intra-species genetic diversity survey, followed by morphological and molecular characterisation of aphelinid wasps in this native Japanese Bemisia species will be needed for its future identification.
Our study demonstrated the possibility of investigating tritrophic interactions between host, parasitoid and endosymbionts in museum specimens from just 1.5 ng of double stranded gDNA. While Tin et al. 58 demonstrated between 14-220 ng of gDNA would be sufficient for genome-wide SNP analysis via the Restriction-Associated DNA sequencing (i.e., RAD-Tag/RADseq) method for museum specimens (collected from 1910 to 1976), our protocol further reduced the required gDNA for NGS methods to approximately one-tenth of the 14 ng used by Tin et al. 58 . Our method also overcame the challenge of working with highly fragmented gDNA starting material without performing an initial 'end-repair' step 59 that may lead to the loss of gDNA. Although non-destructive extraction of gDNA is possible from a range of small insects 60 including Aphelinidae wasps 47 , it was not attempted for these Bemisia museum specimens, and could be adopted to provide a non-destructive NGS protocol for the recovery of these historical specimens. Our method described here will offer significantly improved opportunity to better investigate the genomics of historical specimens, and together with new genome-wide SNP generating methodologies that have been developed 58 , will deepen our knowledge regarding how current ecological and

Material and Methods
A single Bemisia emiliae pupa (cotype) was collected from Hakgala, Sri Lanka (formerly Ceylon), by E. E. Green in May 1912. It was formerly described 14 years later 61 . The specimen was deposited at the Smithsonian Institution, Washington D.C. The second sample is a single whitefly 4 th instar nymph collected by S. Kanda in Muroto, Shikoku, Japan on 08-Aug-1942 and morphologically identified as B. tabaci. All samples were made available for use in this study by Greg Evans, USDA APHIS NIS.
The extraction protocol consisted of first placing the pupa in 1,000 µL of analytical grade 99.9% ethanol for 24 hours in a sterile 1.5 mL Eppendorf tube. The ethanol was then removed and the specimen and air-dried for 15 minutes at room temperature. Once dried, the pupa was crushed in a sealed P200 sterile pipette tip and extracted using a modified protocol that combined the Qiagen Blood and tissue DNA extraction kit and the Zymo Research DNA concentrator protocol. The protocol was modified to improve the quality and yield of DNA extracted. It involved digesting the crushed pupa in 196 µL of AL (Qiagen) and 4 µL of proteinase K (Qiagen; >600 mAU/mL) for 24 hours at 56 °C. At the end of the digestion step, 4 µL of RNase A (Qiagen; 100 mg/mL) was added and incubated at room temperature for 5 minutes. The digestion buffer with the pupa was pulse-spun and 200 µL of wash buffer AW1 (Qiagen) was added. The liquid was then transferred to the Zymo Research genomic DNA concentrator column, and centrifuged as per the Zymo Research genomic DNA concentrator protocol (10,000 g, 1 min). The washing step was repeated and the gDNA was finally eluted in 15 µL of Buffer EB (Qiagen, Cat. # 19086). We used 2 µL of the eluted gDNA to estimate DNA concentration using Qubit v2.0 (Life Technologies, Grand Island, NY) dsDNA HS assay. The remaining amount of eluted gDNA (ca. 3.98 ng (B. emiliae); ca. 24.01 ng (1942 'Bemisia')) in Elution Buffer (~12.5 µL) was allowed to evaporate at room temperature for 16 hours (in a sterile fume hood, uncapped, but loosely covered by a piece of clean Kimwipe tissue). The dried gDNA from the pupa was then re-suspended in 5 µL nuclease-free water (Qiagen, Cat.129114) for 60 minutes at room temperature.
Preparation and normalisation of Nextera XT library to 2 nM of amplicon molecules. Although the Nextera XT protocol recommended only 1.0 ng of double stranded gDNA as starting input material, we used 1.5 ng of double stranded gDNA for very small museum specimens such as Bemisia nymphs, to compensate for the likely presence of very fragmented gDNA (i.e., <50 bp) and to allow for targeting of slightly larger fragments for 2 × 300 bp PE sequencing. The amount of 1.5 ng of double stranded gDNA was sampled from each of B. emiliae and the 1942 Bemisia specimens and processed using the Nextera XT DNA Library Preparation kit (Illumina, Cat. # FC-131-1096). Individual Bemisia pupae gDNA was tagmented (tagged and fragmented) by the Nextera XT transposome. The tagmented gDNA was amplified in a limited cycle PCR reaction to add indexes and Illumina adapter sequences (Illumina, Cat. # FC-131-2003) for sample tracking and cluster formation. For each Bemisia specimen, the amplified library was purified and size-selected using AMPureXP beads (Beckman Coulter, Cat. # A63881). Purified libraries were then quantified by Qubit dsDNA HS assay and the size distribution checked by HS D1000 screentape assay (Agilent, Cat. # 5067-5584) on an Agilent 2200 Tapestation. The insert sizes of our gDNA libraries were determined to range between 160-500 bp for B. emiliae and between 154-1,234 bp for the 1942 'JpL' Bemisia species, and both have a peak fragment size of 256 bp. The Illumina Nextera XT libraries were then normalized to a final concentration of 2 nM, denatured and diluted to a final concentration of 10 pM and combined with a Phi X control library (Illumina, Cat. # FC-110-3001), spiked in at 2.5%. The Illumina libraries were then sequenced on an Illumina MiSeq sequencer using a MiSeq reagent kit v3, 600 cycles (Illumina, Cat. # MS-102-3003) to perform a 2 × 301 bp paired-end sequencing run.
We used Geneious version 8.0.5 (Biomatters Ltd, Auckland, NZ) to assemble of the full mitogenome of B. emiliae, using the Bemisia tabaci cryptic species Asia I mitogenome (KJ778614) 11 as a reference. We used Illumina reads with quality checking performed within Geneious version 8.0.5 for mitogenome assembly and annotations. The assembly parameters consisted of 10% maximum mismatches per read, minimum overlap of 25 bp, maximum gap size of 3 bp, and minimum overlap identify of 80%. The mitogenome annotation was conducted using MITOS 62 , with manual fine-tuning of putative start codon position for all protein coding genes within Geneious. Pairwise mitogenome alignments between the 1942 'B. tabaci' , B. emiliae, and B. tabaci Asia I (KJ778614) were performed using MAFFT v7.017 63  We conducted a phylogenetic analysis to infer species relationships of the unknown Hymenoptera to selected Chalcidoidea parasitoid wasp species belonging to six genera (i.e., Eretmocerus, Nasonia, Philotrypesis, Coccophagoides, Encarsia, Eurytoma; Supplementary Table 2), based on partial mtCOI gene as obtained from GenBank nucleotide database (accessed 01-Jun-2016). Sequences were imported into CLC Sequence Viewer version 7.6 (Qiagen Aarhus A/S) and followed by alignment using default settings (Gap open cost = 10; Gap extension = 1.0). Alignment of the partial mtCOI gene is generally straight forward across diverse insect groups as it typically does not involve INDELs. Trimming of sequences therefore only involved removal of flanking 5′ and 3′ regions that extended beyond the region and length of interest. Trimmed sequences (657 bp) were used