Integrative taxonomy reveals hidden species within a common fungal parasite of ladybirds

Our understanding of fungal diversity is far from complete. Species descriptions generally focus on morphological features, but this approach may underestimate true diversity. Using the morphological species concept, Hesperomyces virescens (Ascomycota, Laboulbeniales) is a single species with global distribution and wide host range. Since its description 120 years ago, this fungal parasite has been reported from 30 species of ladybird hosts on all continents except Antarctica. These host usage patterns suggest that H. virescens could be made up of many different species, each adapted to individual host species. Using sequence data from three gene regions, we found evidence for distinct clades within Hesperomyces virescens, each clade corresponding to isolates from a single host species. We propose that these lineages represent separate species, driven by adaptation to different ladybird hosts. Our combined morphometric, molecular phylogenetic and ecological data provide support for a unified species concept and an integrative taxonomy approach.

species delimitation when phenotypic plasticity allows for overlapping morphologies in distinct species or when morphological traits have not yet arisen in the process of speciation or incipient speciation. For example, the genus Protoparmelia s. str. (Ascomycota, Lecanorales) consists of 12 species based on morphological and chemical features but a phylogenetic-coalescent approach recognizes 23 species 10 . Another widely cited example is that of Dictyonema glabratum (Basidiomycota, Agaricales), a single morphological species that constitutes 126 species using a Generalized Mixed Yule Coalescent (GMYC) analysis of a large dataset of the internal transcribed spacer (ITS) DNA region, and even more than 400 species based on predictive modeling 11 .
Many species of fungi form associations with other organisms and these associations may be critical in species recognition. As a result, fungal species may be circumscribed based on the property of host associations. Host specificity represents an ecological condition; it entails resource availability and niche specialization. The concept of "ecological species" generally refers to reproductive isolation evolved through adaptation to different environments. The micro-evolutionary process of natural selection among diverging populations or subsets of a single population acts in contrasting directions between environments and leads to the fixation of alleles, which may be advantageous in one environment but not in others [12][13][14] . The ecological species concept dates from the 1940s, when Theodosius Dobzhansky 15 wrote that "[s]peciation in Drosophila proceeds mainly through evolving physiological complexes which are successful each in its environment. " An interesting case is Leccinum (Basidiomycota, Boletales), a genus of ectomycorrhizal fungi forming associations with many plant hosts. A recent study 16 found high host specificity in all species included, except for the generalist L. aurantiacum. The authors raised the point that ecological information on its own ("the ability to grow on a new host") does not a priori provide evidence for a species hypothesis. More recently, three species were described within the ant-parasitic Ophiocordyceps unilateralis species complex (Ascomycota, Hypocreales) based on the combination of molecular, micro-morphological and ecological (host specificity) data 17 . All this is in line with de Queiroz's (2007) view 3 that multiple properties provide evidence for lineage separation, that is, divergence of populations, and, thus, speciation.
In this paper, we explore species limits in an enigmatic group of microscopic fungi, the Laboulbeniales. These are obligate, microscopic ectoparasites of arthropods. Around 2200 species in 141 genera are known to infect various groups in three arthropod subphyla (Chelicerata, Hexapoda, Myriapoda) and they are known from all continents except Antarctica [18][19][20][21] . Laboulbeniales never form mycelia; the ascospores do not form germ tubes but rather divide mitotically after attachment to the host to form thalli of up to thousands of cells by determinate growth. Hesperomyces virescens (Figs 1, 2) is a species of Laboulbeniales that has been reported to parasitize around 30 species of ladybirds (Coleoptera, Coccinellidae) in all continents but Antarctica. It grows exclusively on adult ladybird hosts in 21 genera in 5 subfamilies 21 . Since its discovery on the invasive ladybird Harmonia axyridis, biologists have discussed H. virescens as a candidate model for studying host-parasite co-evolution and biological control programs 22 . Contrary to what is the consensus for Laboulbeniales, H. virescens has been reported to have negative effects on its hosts 21,22 . Based on intra-and interspecific transmission experiments, Cottrell and Riddick 23 suggested that different lineages of H. virescens exist and that each of these lineages may have a high degree of host specificity. The question whether H. virescens truly is a single species or an assemblage of morphologically similar but genetically distinct species 19 has provided the starting point for the present research study. To identify H. virescens, mycologists have used morphological characters that can be compared across a range of different host species. Here, we combine morphological, molecular and ecological data as independent lines of evidence to infer the number of species within H. virescens, following the unified species concept proposed by de Queiroz 2,3 .

Results
Morphometric approach. Details of measurements and ratios for all 181 thalli included in the analysis are given in Supplementary Table S1. For a majority of variables, the best model to explain differences in measurements contained host species (Mod1 in Table 1). Inclusion of host species as an explanatory variable considerably improved model performance. Of the 35 studied variables, 10 did not differ significantly between host species: W cell I, L cell II, W cell II, L cell VI, lngst. proj., L/W cell II, total L rec./total L, tier II/L perith., tier III/L perith., and lngst. proj./L perith. We only considered ratios for PCA to focus on shape rather than natural variation in absolute size. Significant differences were observed for the following ratios: L/W cell I, L/W cell III, total L app./total L, L/W cell VI, tier IV/L perith., lobes/L perith., L/W perith., and L perith./total L. Statistical processing of these ratios revealed two principal components (PCs) that together accounted for 81.54% of the observed variation in thallus morphology of H. virescens between C. propinqua, H. axyridis and O. v-nigrum. PC1, 48.39% variation explained, represents L/W cell I, L/W perith., and L/W cell VI (Fig. 3a,b). PC2, 33.15% variation explained, represents L/W cell VI and L/W cell I (Fig. 3a,c). In the morphospace formed by the two first PCs, clouds of individuals from the 3 different host species overlap partly, but they also occupy a considerable part of the morphospace without overlap (Fig. 4).

Phylogenetic inferences.
The Hesperomyces clade has maximum support in the ITS and LSU datasets. In both datasets, each monophyletic clade within the H. virescens complex consists of isolates from thalli removed from a single host species. There is one exception, the clade consisting of isolates from two host species in the same genus, Adalia bipunctata and A. decempunctata. Henceforward, in the text we will refer to these distinct clades by the first letters of the host genus and species names: Ab for Adalia bipunctata, Ad for Adalia decempunctata, Ao for Azya orbigera, Cp for Cheilomenes propinqua, Cs for Cycloneda sanguinea, Ha for Harmonia axyridis, Hs for Halyzia sedecimguttata, Ov for Olla v-nigrum and Pv for Psyllobora vigintimaculata. In addition, figures are consistently color-coded by host species where appropriate.
In the ITS dataset, 8 clades are recognized, in addition to H. coleomegillae and H. palustris, which are positioned basally compared to the other clades (Fig. 5a). Of the 8 clades, 6 are strongly supported (BS ≥ 84): Ab+Ad, Cp, Cs, Ha, Ov and Pv. Only the singleton clades (Ao and Hs) lack support. In the LSU dataset, 6 clades are recognized (Fig. 5b). All of these clades have strong support (BS ≥ 81). In the three-gene dataset, again, the Hesperomyces clade has strong support and consists of 10 clades (Fig. 6). Of these, nine have high support (ML

Species delimitation analyses.
Results of the sequence-based methods for species delimitation are summarized in Fig. 6 and Table 3. The ABGD analysis resulted in 8 distinct groups, irrespective of distance metrics or gap width values. Eight species were identified within the Hesperomyces virescens clade from the bPTP analysis of the ITS topology. The bPTP analysis of the LSU topology resulted in 6 species (no LSU sequences were generated for isolates from Cycloneda sanguinea and Halyzia sedecimguttata). Support was lacking for the Cp clade in this analysis. The GMYC analyses of the ITS and concatenated trees resulted in 8 species, all with high support except for clades Ao and Hs (which comprised a single isolate only). The GMYC analysis of the LSU tree resulted in 6 recognized species, without support for Ao, Hs (each comprising a single isolate), and Cp and Ha.

Discussion
Our results illustrate that H. virescens encompasses several unique genetic lineages. Each of these lineages occurs on a single host species (or two host species in the case of Adalia), regardless of geographic origin of the collection. Some of the clades in our phylogenies have no or only moderate support but these are the clades for which only a single isolate is available (Ao and Hs). Some ladybird species were only recently discovered as hosts and others are not frequently found. For example, during fieldwork in Panama in 2015 we found Hesperomyces thalli on Azya orbigera, which had previously not been reported as a host. Out of 151 ladybirds, only 10 infected ones were found, each individual carrying a single thallus (1 individual carried 3 thalli). We tried two extraction protocols, each with 1 thallus. The extraction using the Extract-N-Amp PCR Plant Kit failed; the one with the REPLI-g Single Cell Kit was successful and we were able to generate SSU, ITS and LSU sequences (isolate D. Haelew. 928 g). In another case, a single Hesperomyces-infected individual of Halyzia sedecimguttata was found in the Netherlands in 2015. It bore 13 adult and 4 juvenile thalli. We chose to use 10 adult thalli for DNA isolation with the Extract-N-Amp Plant PCR Kit (isolate D. Haelew. 955b) but for this isolate we were only able to generate an ITS sequence. Since this report, no further infected specimens of H. sedecimguttata have been collected. Finally, we only had 3 infected individuals of Psyllobora vigintimaculata available for study, but these specimens carried sufficient thalli for both morphological study and molecular work.   Some of our host species were collected only from a single population. This is the case for Cheilomenes propinqua, Olla v-nigrum and P. vigintimaculata. However, specimens of O. v-nigrum, although from a single locality, were collected on multiple occasions in 2014, 2015 and 2016 (also from a laboratory colony for many generations). Adalia bipunctata and H. axyridis are the host species with the widest geographical range included in this study. Infected specimens of A. bipunctata were collected in Denmark, Italy and Sweden; specimens of H. axyridis were collected on different continents. Even so, both clades Ab+Ad and Ha each form two monophyletic lineages, in all datasets (ITS, LSU, SSU+ITS+LSU). In other words, there is no geographic signal. We conclude that phylogenetic structure is primarily determined by host specialization. Based on intra-and interspecific transmission experiments, Cottrell and Riddick 23 proposed that "isolates/strains of H. virescens may exist under field conditions and only infect closely related Coccinellidae or even a single species. " Based on the results of our species delimitations analyses, we propose that these lineages (or clades, as we refer to them) represent distinct species.
The clade Ab+Ad is the only clade consisting of H. virescens isolates taken from more than a single host species. In this case, the isolates originated from either Adalia bipunctata or A. decempunctata. Ladybirds in the genus Adalia are popular for studies in ecology and population genetics because they vary distinctively in their elytral and pronotum color patterns. Each of these color forms, or phenotypes, corresponds to a unique genotype 24 . Adalia bipunctata and A. decempunctata are very similar in ecology and habitat use; both coexist throughout temperate regions in Europe. They are also closely related genetically, and even some A. bipunctata × A. decempunctata hybrids have been reported 25,26 . This Ab+Ad clade makes for an interesting case because one could pose the question whether the specificity of H. virescens relates to host species-level or genus-level. Based on the currently available data we cannot solve this question, but we will continue to collect ladybirds and thus increase our dataset of ladybird host species.
Comparison of species delimitation methods. Whereas molecular data provide a valuable tool to validate morphology-based species descriptions, the application of species delimitation methods can increase confidence if several methods offer congruent estimates of species diversity within a given dataset. Incongruences in results imply that multiple methods differ in their delimitation power. Alternatively, it is also possible that users make incorrect assumptions when employing a given species delimitation approach 27 . In the event of incongruent results, it is better to be conservative, rather than to designate entities that do not actually represent evolutionary metapopulation lineages as species. In any case, the multiple species delimitation analyses that we used in our study identified congruent species boundaries. The combination of BS, pp and species delimitation support provides strong evidence for H. virescens being a complex of multiple species.
In the bPTP analysis of the LSU topology, the Cp clade was not supported. PTP models speciation in terms of number of nucleotide substitutions 28 . Upon manual inspection of the multiple alignment, for all clades it is the case that the number of nucleotide substitutions within the clade is zero. The only exception is the Cp clade, with 2 substitutions between the isolates of this clade. The number of substitutions between Cp and other clades ranges from 5 to 13. The PTP model did probably not interpret the Cp clade as a distinct species because of these within-clade substitutions for the Cp clade. Our PTP analyses based on single-gene trees are consistent with the results obtained by ABGD. We also performed a bPTP analysis on the concatenated SSU+ITS+LSU dataset (only Bayesian support values shown, Table 3). Although the number of species is the same as in the PTP analysis of single gene topology and the ABGD and GMYC approaches, the Bayesian support dropped significantly; none of the delimited clades have support higher than 0.52. We repeat earlier findings 28,29 that PTP is most accurate with single gene trees.
Kekkonen and Hebert 30 put forward that GMYC usually delimits more species compared to other methods. In our analysis, the results from GMYC are congruent with ABGD and bPTP of the ITS topology (and bPTP of the LSU topology, noting that two clades were missing in this analysis). Two clades that lack support -Ao and Hs -consist of single isolates only. GMYC looks at intraspecific branching versus interspecific branching, and thus it is no surprise that these singleton clades have no support from this approach. The low support for the Ha clade in the GMYC analyses of the LSU and three-gene topologies may be explained by the fact that for a number of isolates coming from Harmonia axyridis, sequences are incomplete (missing or only partial SSU, ITS or LSU). Because this clade holds many isolates, missing sequence data for a number of these isolates may influence generating an ultrametric tree, which is a computationally intensive and error-prone process. Since GMYC is dependent on the accuracy of this input tree, any alterations will strongly influence species delimitation analyses.
Upon the introduction of PTP, it was noted that delimited groups represent "putative" species 28 . PTP uses phylogenetic reconstructions inferred from single gene datasets, which are gene trees rather than species trees.  Also GMYC is based on a single-gene tree. As a consequence, more data should be collected to confirm and validate the species boundaries set by these delimitation approaches, in an integrative taxonomy framework across disciplines [31][32][33][34][35] . Note that this framework is in line with the unified species concept 3 .
Comparison of ITS and LSU as barcode markers. Molecular identification of fungi relies on the availability of good DNA barcode markers. Currently, DNA-based identifications focus on genes that code for ribosomal RNA (rDNA), because these regions have many copies in the genome and thus are well-suited target regions for PCR amplification. The internal transcribed spacer (ITS) region has been proposed as universal barcode for Fungi 36 . This means that for a majority of fungi, the interspecific variation at this marker should exceed the intraspecific variation, and for over 70% of fungi the ITS is indeed effective in recognizing species. A number of considerations have been made since the acceptance of this barcode marker 36-39 : (1) RPB1 is actually better in discriminating species but its amplification success is much lower; (2) whereas ITS performs best overall across the fungal kingdom, its identification power is equal to LSU for subphyla Pezizomycotina and Saccharomycotina (Ascomycota); (3) ITS does not contain enough variation to discriminate between species for some groups of fungi, such as Aspergillus and economically important plant pathogens in the genera Alternaria, Diaporthe, Fusarium and others; (4) arbuscular mycorrhizal-forming species in Glomeromycota are multinucleate and extremely intraspecific divergent in their ribosomal DNA. These challenges have driven mycologists to developing other, lineage-or genus-specific barcodes. These secondary barcodes are often more difficult to amplify (because of the lack of universal primers) but have a better delimiting power than the ITS 40 . Examples of secondary barcodes include calmodulin (CaM) for Aspergillus; the translation elongation factor (TEF1), topoisomerase I (TOP1) and phosphoglycerate kinase (PGK) for Fusarium; the LSU rDNA region for Amanita; and the Apn2-Mat1-2 intergenic spacer and partial mating type (Mat1-2) gene (ApMat) and glutamine synthetase (GS) combined for Colletotrichum 38,41,42 . We experience low amplification success for the ITS region with the Laboulbeniales using general fungal-specific primers. There are many possible reasons for failed ITS amplification, ranging from simple primer mismatch as is the case in the Archaeorhizomycetes 43 to significant intragenomic 44,45 or intraspecific variability 46 . Although the 5.8S region is highly conserved, both spacer regions (especially ITS1) appear to be rapidly evolving 47 . Previously generated sequences of Laboulbeniales suggest that the ITS differs significantly among genera and we have currently no idea of the extent of this variability. Still, ITS may be useful and important as a marker to study intrageneric relationships. During our studies of Hesperomyces we designed and are routinely using primers that specifically target conserved regions of the ITS (ITShespL, 5′-CTCCTGTAGAACCTACACATC-3′ and ITShespR, 5′-CAAATTTAAGCTTTTGCCGC-3′). During the course of this study, we constructed phylogenies based on single genes (SSU, ITS, LSU) and on a concatenated SSU+ITS+LSU dataset. The SSU gene is very conservative and has no discriminative power at the species level. But, both the ITS and LSU datasets result in high support for the individual clades of the H. virescens complex (Fig. 6). In addition to its discrimination power, amplification of the LSU region poses virtually no problem within the Laboulbeniomycetes so far investigated. The commonly used fungal primers for the LSU region, such as LR0R/LR5 and LIC24R/LR3 48,49 , generally work well for most species of Laboulbeniomycetes. Based on these preliminary results, the LSU region should be further investigated as barcode for species delimitation in Laboulbeniomycetes.
Hesperomyces virescens, a complex of cryptic species?. Recent molecular (phylogenetic) studies point at a dazzling diversity of the kingdom Fungi. However, it is not always possible to infer this diversity from morphological features. Species that "have been classified as a single nominal species because they Olla v-nigrum, Roberto Güller (Flickr); Psyllobora vigintimaculata, Katja Schulz (iNaturalist). To the right of the phylogeny, the results of species delimitation analyses are summarized, from left to right: ABGD analysis of the SSU+ITS+LSU alignment; bPTP analysis of the ITS topology; bPTP analysis of the LSU topology; and GMYC analysis of the ITS, LSU and SSU+ITS+LSU ultrametric trees (without outgroups) generated in BEAST. Hatching implies lack of support, whereas no coloration means that clade was absent in that analysis.  60 . The general habitus of thalli was stable, but size was related to host species, habitat of the host and position of the fungus on the host integument. Subsequently, De Kesel and Haelewaters 61 tested differences in thallus shape and dimension between two morphologically similar species of Laboulbenia. Most variables were significantly different between both species, particularly the shape of the receptacle was different regardless of size or growth position. In this study, we generated the largest morphometric dataset to date for any species of Laboulbeniales, including measurements and ratios of 181 thalli from 3 host species. Our PCA suggests that the shapes of cell I, cell VI and the perithecium contribute most to the observed variation within the dataset. If we were to formally describe clades Cp, Ha and Ov as separate species, we expect to find most descriptive features in cell I of the receptacle and in the perithecium and its basal cell (VI).
In both the ITS and three-gene phylogenies, we retrieved H. coleomegillae and H. palustris 62 (on Coleomegilla maculata) basal to the H. virescens clade. The basal-most clade within H. virescens is Ao in the ITS and three-gene analysis. In terms of morphology, the thalli on Azya orbigera are structurally quite different compared to thalli from other host species. For example, the appendage is only 3-celled with the third cell carrying two antheridia. This structure is completely different than thalli from any of the other hosts in our dataset and also than H. coleomegillae and H. palustris 21,[62][63][64] . The appendage of these thalli consists of a single row of at least 4 cells, and every cell starting from the second carries an antheridium (the distal-most cell carries 2 antheridia). In the LSU phylogeny, the Ov clade is basal-most positioned. However, this tree lacks internal support, except for the sister relationship between clades Ao and Pv. Also the ITS phylogeny lacks internal support and all internal clades collapse to a basal polytomy. On the other hand, incongruence between trees is possible as a result of incomplete lineage sorting. Gene trees are not equal to species trees; a multi-locus approach is preferred for investigating species divergence and gauging relationships between species 65 . In the three-gene phylogeny, the backbone relationships within the H. virescens clade are well resolved, as indicated by high pp values. The sister relationship between clades Cs and Ov is without support, as is that between clades Ab+Ad and Cp. This may indicate that there is a lack of taxon sampling, which is no surprise given the broad host range of H. virescens 21 . Given the available data, we accept the clades Ab+Ad, Ao, Cp, Cs, Ha, Hs, Ov and Pv to be part of the H. virescens complex, or H. virescens sensu lato, and to represent independent evolutionary lineages. Following the strict host specificity detected by this study, we propose to restrict H. virescens sensu stricto to those thalli found on Chilocorus stigma, the host species on which the fungus was originally described 64 . Conclusions. Through DNA isolation, PCR amplification, sequencing and analysis methods, thousands of characters became available for minute fungi that do not have many morphological features and do not grow in culture. These remarkable improvements in the collection of character data will help us answer questions about the validity of "worldwide" and "cosmopolitan" geographic distributions ascribed to many morphological forms of the Laboulbeniomycetes. Here, we provided answers in the case of Hesperomyces virescens, which we have shown to be a complex of different species, each with its own host (genus). We are only starting to unravel patterns of speciation in this group of fungi. The findings of this paper are not only promising for future studies, but they also emphasize the necessity for an integrative approach in taxonomic research. We hope with this contribution to include the Laboulbeniales ectoparasitic fungi in contemporary discussions considering molecular evolution and speciation patterns, rather than treating them as obscure fungi for specialists only.   Collection of Laboulbeniales. Preserved insects were examined for the presence of Laboulbeniales under a dissecting microscope at 10-50× magnification. Hesperomyces thalli were removed from their hosts using Minuten Pins (BioQuip, #1208SA, Rancho Dominguez, CA) inserted onto wooden rods. Following standard procedure by Benjamin 67 , we removed thalli or groups of thalli and mounted them in Amann's medium, a liquid solution. Before applying Amann's medium and to facilitate microscopic observations, thalli first had to be arranged and fixed onto the microscope slide. To make thalli a bit sticky, they were first placed in a droplet of Hoyer's medium (30 g arabic gum, 200 g chloral hydrate, 16 mL glycerol, 50 mL ddH 2 0). Next, thalli were individually picked up and arranged in one or two rows. After a brief period of drying, the slide was closed using a cover slip with a drop of Amann's medium (drop facing downward) and subsequently sealed with nail polish or B-72 in acetone (Gaylord, #AB72, Syracus, NY). We viewed mounted specimens at 400-1000× magnification using an Olympus BX40 microscope equipped with an XC50 camera (Olympus, Waltham, MA Morphological studies. To assess morphological variation in thalli we took measurements of 22 parameters per thallus (Fig. 1): total length of the thallus including haustorium (total L w foot, point a-point x in Fig. 1 app., f-g), total length of appendage (total L app., f-k), length of longest antheridium (L lngst. anth., h-j), length of longest antheridial neck (L anth. neck, i-j), length of cell VI (L cell VI, m-z), width of cell VI (W cell VI, p-y), perithecium length (L perith., w-z), perithecium width (W perith., r-x), length of second tier of perithecial wall cells (tier II, q-r), third tier (tier III, r-s), fourth tier (tier IV, s-t), length of lobes (lobes, t-w) and length of longest projection (lngst. proj., u-v). To correct for natural variation in length and width, these ratios were calculated: L/W cell I, L/W cell II, L/W cell III, total L rec./total L, total L app./total L, L/W cell VI, tier II/L perith., tier III/L perith., tier IV/L perith., lobes/L perith., L/W perith., L perith./total L and lngst. proj./L perith. Measurements were made at 400-1000× magnification with cellSens Standard 1.14 software (Olympus) using the Polyline measuring tool. We measured at least 30 adult thalli from each host population. Maturity was judged by the presence of ascospores within the perithecium. To exclude potential position-induced morphological variation 62 , only thalli from the elytra were measured and used in this study.
We analyzed variation in morphology of thalli from different host species and populations using generalized linear mixed models (GLMM), implemented in the R package lme4 70 . Random effects for insect specimen were included, because we measured several thalli from the same host individuals. Hypothesis testing was done using likelihood ratio tests, with P-values calculated based on chi-squared distributions, declaring an effect significant when P ≤ 0.05. Two models were compared for each variable, the null model (Mod0) and the model with host species as explaining variable (Mod1). Model selection happened using the Akaike Information Criterion 71 . For a selection of variables with significant differences between host species in the GLMMs, principal component analysis (PCA) followed by exploratory biplots were made. PCA was only done for ratios to visualize variation in shape and structure independent of size. PCA and biplots were obtained using the R package factoextra 72 .
DNA extraction methods. We extracted DNA from 1-18 Hesperomyces thalli either using the QIAamp DNA Micro Kit (Qiagen, Stanford, CA), a modified Extract-N-Amp Plant PCR kit (Sigma-Aldrich) procedure 73 or a modified REPLI-g Single Cell Kit (Qiagen, St. Louis, MO) protocol 74 . The QIAamp DNA Micro Kit protocol was followed as per the manufacturer's instructions. One major change we implemented was the increase of the incubation time at 56 °C for complete lysis, to several days. With the Extract-N-Amp Plant PCR kit, thalli were removed at the foot with a tiny drop of Hoyer's medium or glycerin at the tip of a Minuten Pin and placed in a 0.5 µL PCR tube filled with 20 µL of Extraction Solution. The tube was incubated at room temperature for 10-30 min and then at 95 °C for 20 min. The extract was then diluted with 60 µL of Dilution Solution (3% Bovine Serum Albumin). The REPLI-g Single Cell Kit is different from the previous protocols because it adds a whole-genome amplification (WGA) step to the DNA isolation, thus providing a considerable benefit when material is scarce. A Minuten Pin was submerged in glycerin to remove a single thallus from its host and place it in a droplet of glycerin on a microscope slide. The thallus was carefully placed in a 0.2 mL PCR tube with 2 µL of phosphate-buffered saline (PBS). These steps were done at 40× magnification under a stereomicroscope. After adding 1.5 µL of prepared D2 buffer, the tube was incubated at 65 °C for 20 min. Subsequent steps followed the manufacturer's instructions. All steps of this procedure were performed under a laminar flow hood to ensure sterile conditions. For a majority of our isolates, we applied pre-treatments to increase the likelihood of successful isolation and subsequent PCR amplification. These pre-treatments included subsequent cycles of freezing on liquid nitrogen and heating to 95 °C, prolonged incubation at 56 °C in 180 µL ATL buffer + 20 µL proteinase K or in 20 µL Extraction Solution using a Shake 'N Bake Hybridization Oven (Boekel Scientific, model no. 136400-2, Feasterville, PA) and homogenization in a FastPrep FP120 Cell Disrupter at 5.0 m/sec for 15 s (Thermo Fisher Scientific, Waltham, MA). For both the QIAamp DNA Micro Kit and the Extract-N-Amp Plant PCR kit, we often manually crushed thalli in 1.5 mL tubes using a 1.5 mL pellet pestle (Kimble, #749521-1500, Vineland, NJ). In the REPLI-g Single Cell Kit, the single thallus was often cut in half through the perithecium using a sterile no. 10 surgical blade on a disposable Bard-Parker handle (Aspen Surgical, Caledonia, MI) before placing it in the 0.2 mL PCR tube.
Maximum likelihood (ML) analyses were run using PAUP on XSEDE 84 in Cipres 81 . Nucleotide substitution models were selected statistically with the help of jModelTest 85 by considering the Akaike Information Criterion (AIC). For the ITS dataset, the TVM+G model was selected (lowest -lnL = 3566.8229). For the LSU dataset, the TIM1+G model gave the best scoring tree (-lnL = 3151.3620). ML was inferred for each dataset under the appropriate model; rapid bootstrap (BS) analysis was implemented with 100 replicates.
We performed ML and Bayesian analyses for the SSU+ITS+LSU dataset. For ML, the dataset was divided into three partitions. The best substitution model was selected using jModelTest 85 by considering the Akaike Information Criterion (AIC). The following models were selected: TPM2uf (SSU), TVM+G (ITS) and GTR+G (LSU). [We did not use all LSU isolates for the three-gene concatenated dataset, hence the selection of a different substitution model.] ML was inferred under partitioned models using IQ-tree 86,87 ; ultrafast bootstrap (BS) analysis was implemented with 1000 replicates 88 . Also for Bayesian inference, the three-gene dataset was divided into partitions. Analyses were done with a Markov chain Monte Carlo (MCMC) coalescent approach implemented in BEAST 89 , with an uncorrelated lognormal relaxed molecular clock allowing for rate variation across the tree. We selected the Birth-Death Incomplete Sampling speciation model 90 as tree prior with JC (for SSU), TPM2uf+G (for ITS) and TIM1+G (for LSU) nucleotide substitution models (considering the Bayesian Information Criterion from jModelTest) and a lognormal ucld.mean (mean = 5.0, stdev = 1.0). Five independent runs were performed from a random starting tree for 10 million generations, with a sampling frequency of 1000. Prior settings were entered in BEAUti 89 to generate an XML file, which was run using the BEAST on XSEDE tool in Cipres 81 (three runs) and locally from the command line (two runs). The resulting log files of the five runs were entered in Tracer v1.6.0 91 to check trace plots for convergence and effective sample size (ESS). Burn-in was adjusted to achieve ESS values of ≥200 for the majority of sampled parameters. While removing a portion of each run as burn-in, log files and trees files were combined in LogCombiner. TreeAnnotator was used to generate consensus trees with 0% burn-in and to infer the maximum clade credibility tree, with the highest product of individual clade posterior probabilities.

Species delimitation analyses.
Morphology-based identification of H. virescens thalli 63,64,68,69 may mask multiple species within a geographical context or with strict host specificity. Therefore, we used 3 species delimitation methods to validate species hypotheses 28,92,93 : Automatic Barcode Gap Discovery (ABGD) and General Mixed Yule Coalescent methods (GMYC), and a Poisson tree processes (PTP) model approach.
ABGD is based on the detection of a "barcode gap, " which is observed when nucleotide divergence among isolates of the same species is smaller than divergence among isolates of different species in a given multiple alignment. Gaps are identified and used to partition (or: split) the data into the maximum number of groups, which represent species hypotheses 92 . We used the web version of ABGD (at http://wwwabi.snv.jussieu.fr/public/ abgd/abgdweb.html) to identify barcode gaps in the SSU+ITS+LSU dataset. Genetic distances were calculated using both available distance metrics (JC69, K80) with the following parameters: Pmin = 0.001, Pmax = 0.01, steps = 10 and Nb bins = 20. To assess consistency of the species recognized by ABGD, we evaluated results for four gap width values (X): 0.1, 0.5, 1.0 and 1.5. In the PTP model approach, the number of nucleotide substitutions is directly used to model speciation rate. The underlying assumption is that the number of substitutions between species is significantly higher than the number of substitutions within species 28 . Compared with GMYC, Zhang and colleagues 28 found that PTP performs best especially when the evolutionary distances between species are small. PTP is intended for the delimitation of species in single-gene trees. As a result, we applied this method to both the ITS and LSU phylogenetic reconstructions separately. As input for the PTP model approach, we used phylogenetic trees generated by Bayesian analyses for the two datasets. The MCMC analyses were done under a strict molecular clock, with the Yule speciation tree prior and the appropriate nucleotide substitution model, as selected by the Bayesian Information Criterion from jModelTest 2.1. For the ITS dataset, the TVM2uf+G model was selected (lowest -lnL = 3573.5434). For the LSU dataset, the K80+I model gave the best scoring tree (-lnL = 1672.9035). Two independent runs were performed from a random starting tree for 10 million generations, with a sampling frequency of 1000. The two resulting log files were combined in LogCombiner with 1% burn-in. Consensus trees with 0% burn-in were generated and the maximum clade credibility tree was constructed in TreeAnnotator. We used the bPTP web server (http://species.h-its.org). The "b" in bPTP stands for the Bayesian support values that are added to delimited species. The different parameters were set as default (number of MCMC generations, thinning, burn-in, seed). For both analyses the outgroups were removed from the dataset prior to constructing the phylogenetic tree.
GMYC uses a fully-resolved ultrametric tree 93 inferred from a single marker to model processes at the population level (coalescence) and processes at the species level (speciation). As input we used the ITS and LSU maximum clade credibility trees generated for PTP. In addition, we reconstructed a maximum clade credibility tree in BEAST 89 using the concatenated SSU+ITS+LSU dataset. For this analysis we removed the outgroups from the dataset, because the inclusion of distantly related species makes it more difficult for GMYC to detect closely related species. The MCMC analysis was done under an uncorrelated lognormal relaxed molecular clock, with the Birth-Death Incomplete Sampling speciation model 89 tree prior, the GTR+I+G nucleotide substitution model (as selected under the Bayesian Information Criterion) and a lognormal ucld.mean (mean = 5.0, stdev = 1.0). Two independent runs were performed from a random starting tree for 80 million generations, with a sampling frequency of 8000. The two resulting log files were combined in LogCombiner with 10% burn-in. The maximum clade credibility tree was again constructed in TreeAnnotator. Species were delimited based on this generated ultrametric tree with the GMYC method using R packages rncl 94 and splits 95 .

Data Availability
Voucher specimens of infected ladybirds will be deposited at the Harvard Museum of Comparative Zoology, Cambridge, MA and the Brabant Museum of Nature in Tilburg, The Netherlands. Voucher slides of Hesperomyces virescens are deposited at Farlow Herbarium at Harvard University (FH) (barcodes available in Supplementary Table S1). All generated sequences have been uploaded to GenBank (accession numbers MG745336-MG745358, MG757798-MG757831, MG760581-760611). The following data are available from figshare https:// doi.org/10.6084/m9.figshare.c.3944749.v1: sequence alignments generated during this study (in NEXUS format), input XML and output log and trees files from Bayesian analyses, R code used for the GMYC analysis.