Elucidating speciation mechanisms is considered crucial for understanding the dynamics and function of biodiversity. Alternative speciation modes, such as allopatric, sympatric and parapatric speciation are increasingly understood with the help of phylogenetics1. A special process in this field is adaptive radiation, the phenomenon in which rapid speciation is combined with niche differentiation of the evolving species. Studying radiations has proven to be particularly promising to shed light on the causes and mechanisms driving speciation especially when dealing with species confined to a relatively closed system such as lakes2. One of the most prolific vertebrate radiations are the cichlid fishes (Teleostei, Cichlidae) of the East African Great Lakes3.

Lake Tanganyika, the oldest and deepest of these lakes, harbours the genetically and phenotypically most diverse cichlid community of these African lakes4. Its cichlid assemblage is subdivided into 12 to 17 mostly endemic tribes5. One of these tribes, the monophyletic Tropheini, is phylogenetically nested within the tribe Haplochromini and represents the sister group of the species flocks of Lake Malawi and the Lake Victoria region and of several East African riverine lineages6. Tropheini consists of 23 endemic nominal species. Although considerable knowledge gaps exist regarding their taxonomy and distribution7, their phylogeny is well-resolved and updated8,9. Most species are adapted to rocky shores and representatives of most genera occur sympatrically7,8. Tropheini contains generalist as well as specialist species that exhibit variable levels of genetic and phenotypic structuring, related to differences in habitat preference, dispersal ability and territoriality8. All these factors sparked substantial scientific interest and rendered the Tropheini radiation a “natural experiment” for species formation.

However, regardless of this showcase of biodiversity, the most spectacular radiations are found among parasites10. Mutual evolutionary pressures maintain genetic diversity in host and parasite and fuel the rate of genetic diversification11. Moreover, the availability of numerous niches across a host’s body is an additional factor fostering parasite within-host diversification12 and hence speciation. Organisms with a parasitic lifestyle account for most of Earth’s biodiversity13. However, biodiversity studies tend to focus on conspicuous faunas, ignoring the vast biomass and species-richness of helminths and other less sizeable animals14,15. As such, the potential to understand speciation through the study of parasite evolution remains almost unexplored12,16 and the contribution of parasites to the species richness of the African Great Lakes has remained largely overlooked17,18.

We combine speciation research on cichlid hosts and their monogenean flatworm parasites. Monogeneans are mostly ectoparasites of cold-blooded aquatic or amphibious vertebrates although some infect aquatic invertebrates or exhibit an endoparasitic lifestyle19. Cichlid monogeneans provide a good model for elucidating parasite speciation20,21. Their direct (single-host) life cycle makes them particularly interesting, as it may be difficult to discern host factors that influence parasite evolution for parasites with an intermediate host22. Previous studies on Lake Tanganyika monogeneans uncovered a diverse and largely endemic fauna belonging to Gyrodactylus von Nordmann, 1832 and Cichlidogyrus Paperna, 196018,23,24,25. The latter gill parasites represent the most abundant and prevalent monogenean genus on Tanganyika cichlids23. In most tropheine cichlid populations screened to this end, over two-thirds of fish individuals were infected by representatives of this genus26,27,28. Eggs of Cichlidogyrus develop and hatch on the bottom, after which a free-living ciliated larvae infects a host fish29.

We want to understand speciation by reconstructing the phylogenetic history of parasites belonging to Cichlidogyrus, retrieved from almost all tropheine host species with nuclear ITS rDNA and mitochondrial COI sequences. Taxonomic coverage is important in phylogenetic studies in general1 and in cophylogenetic work in particular30,31. The taxonomical and phylogenetic background that exists for the tropheine hosts (see above) is essential for a successful analysis of parasite diversification31. (i) We hypothesize that species of Cichlidogyrus infecting tropheine cichlids are host-specific, with a higher species richness on more stenotopic hosts. The generality of these patterns will be assessed to complement the few morphology-based reports23,28. (ii) The morphology of Cichlidogyrus suggests an influence of tropheine phylogeny on host choice24 and therefore we hypothesize that host and parasite phylogenies are to a certain extent congruent. However, following the results of Mendlová et al.32 for West African species of Cichlidogyrus, we expect that other speciation modes have also contributed to parasite diversification. Possible mechanisms of species formation in parasites include host-switching (ecological transfer between host species), cospeciation (concomitant speciation of host and parasite), duplication (within-host parasite speciation) and sorting (parasite extinction).


Sequence diversity and host-specificity

The dataset based on a rDNA fragment of 1399 bp length contained 82 haplotypes and 575 variable sites, of which 323 were parsimony-informative. As two rDNA sequences each from parasite of Petrochromis trewavasae trewavasae and from Petrochromis trewavasae ephippium and three from parasites of Tropheus brichardi differed to the extent that meaningful alignment was hampered, they were excluded from further analyses. For COI, the dataset amounted to 583 bp, 73 haplotypes, 278 variable sites and 230 parsimony-informative sites. The concatenated dataset included 62 haplotypes and totaled 1935 nucleotide positions. This alignment contained 621 bp of ITS-1, 157 bp of 5.8S rDNA, 574 bp of ITS-2 and 583 bp of COI. There were 861 variable sites, of which 659 parsimony-informative. Table 1 provides an overview of the number of sequences and unique haplotypes retrieved for parasites of each host species, of corrected pairwise genetic distances between parasites of the respective host species and of the estimated number of species sampled based on the species-level cut-offs proposed for the respective nuclear and mitochondrial sequences (see Materials and Methods). Our largest ITS rDNA dataset covers parasites of Lobochilotes labiatus, Simochromis diagramma and ‘Ctenochromishorei. Using ITS based species richness estimates, these cichlids harbor seven, three and one species of Cichlidogyrus, respectively. Identical parasite haplotypes were consistently retrieved from conspecific hosts. This host-specificity pattern is stronger than obvious from the phylogenetic tree (Fig. 1) as most haplotypes represent multiple individuals (Table 2).

Table 1 Summary of the sequence dataset of Cichlidogyrus flatworms from Lake Tanganyika.
Table 2 Overview of cichlids, parasites and locations sampled.
Figure 1
figure 1

Consensus cladogram of flatworms belonging to the genus Cichlidogyrus infecting Lake Tanganyika cichlids.

Cladogram based on the combined nuclear ITS-1, 5.8S rDNA, ITS-2 and mitochondrial COI sequences of Cichlidogyrus parasitizing Lake Tanganyika tropheine cichlids and the outgroups mentioned in Table 2. Statistical support is shown as posterior probability under BI/ML bootstrap. Clades that neither yield a support value of 85 nor of 70 under BI or ML, respectively, were collapsed; “–” indicates that a clade was not recovered in a particular analysis. Tip labels indicate host species with sampling locality and country (C: Democratic Republic of Congo; B: Burundi; T: Tanzania and Z: Zambia) and are coloured according to host genus consistent with Fig. 4. Monophyletic assemblages infecting one host species are boxed. Inlet: Cichlidogyrus parasites (300–400 μm in length) on the gills of Sarotherodon melanotheron Rüppell, 1852 (photograph taken by author A.P.).

Phylogenetic analyses

Tree reconstruction on the basis of the concatenated dataset (Fig. 1) showed that all Tanganyika monogeneans are grouped in a well-supported clade. In the tree, the parasites of the following taxa are grouped together: the non-tropheine Tanganyika endemic Neolamprologus fasciatus, the haplochromine Astatotilapia burtoni, which is endemic to the Tanganyika Basin but not to the lake proper and the Tropheini. Within this group, Cichlidogyrus infecting tropheines is supported as a monophyletic group. Cichlidogyrus sclerosus and its congener C. zambezensis, the latter hosted on the riverine haplochromine Serranochromis robustus jallae, are clearly separated from the Tanganyikan parasites. Well-supported clusters of Cichlidogyrus are organised according to host species. Irrespective of sampling locality, ‘Ctenochromishorei, ‘Gnathochromispfefferi, Limnotilapia dardennii, Lobochilotes labiatus, Simochromis diagramma and southern Pseudosimochromis babaulti all harbour monophyletic parasite clades. The parasites of Ps. curvifrons are paraphyletic with respect to the lineages infecting northern Ps. babaulti and Ps. marginatus. This is in agreement with the hosts’ affinities within one genus and with the recent removal of Ps. babaulti and Ps. marginatus from Simochromis25. The Tropheus parasites cluster at the host genus level. The clades of T. annectens and T. brichardi parasites do not strictly follow host species boundaries. However, haplotypes were never shared between Tropheus species. Petrochromis does not host monophyletic parasite assemblages, which is in agreement with its paraphyly and need for taxonomic revision8,24.

A decrease in speciation rate towards the present can be observed on the LTT plot (Fig. 2). A positive value (17.41) for the difference in AIC score between the best-fit rate-constant and rate-variable model indicates that the speciation rate of Cichlidogyrus changed with time. This difference was significant as it outnumbered all differences in AIC scores comparing the same models for 5000 randomly generated trees of the same size.

Figure 2
figure 2

Lineages-through-time plot for Cichlidogyrus parasites of Tropheini.

Lineages-through-time plot based on an ultrametric Bayesian ITS rDNA tree, constructed under a relaxed clock model, of Cichlidogyrus living on tropheine hosts; x-axis: time; y-axis: number of lineages (logarithmic scale).

Cophylogenetic analyses

The two best solutions proposed by the CoRe-Pa software, either based on a resolved parasite tree or a parasite tree with a basal polytomy, are presented in Table 3. The best solution for the resolved tree and the second best solution in the polytomy case invoked no host-switching and an unrealistically high cost for it, in comparison to the costs inferred for other events. These scenarios are not biologically plausible. Cost schemes should not only be judged on a purely statistical basis but also include the biological context33. Hence these reconstructions were not further considered and, in case of the resolved tree scenario, replaced by the second best proposal (Fig. 3). The two retained solutions, with and without a polytomy in the parasite tree, proposed comparable numbers for all events, with a high number (33–40) of sortings, similar frequencies of cospeciation (11–13) and within-host speciation events (12-11) and a low number of host-switches (4-3) (Table 3; Fig. 3). In the software package TreeMap the inferred number of 12 cospeciation events was found statistically significant at a level of 0.05, because only 421 out of 104 random reconciliations included 12 or more cospeciation events. This indicates topological congruence between host and parasite trees. Overall congruence between the host and parasite mitochondrial genotypes was significant in a distance-based cophylogenetic analysis (P < 0.01) although only 11 out of 47 links were reported as significant at a level of 0.05.

Table 3 Cophylogenetic reconciliations proposed by CoRe-Pa.
Figure 3
figure 3

Co-phylogenetic reconciliations of Cichlidogyrus and Tropheini trees.

CoRe-Pa reconciliations of Tropheini (based on AFLP as published by Koblmüller et al.8) and Cichlidogyrus (based on the concatenated nuclear-mitochondrial dataset) trees. Left: second best reconciliation based on a fully resolved parasite ML tree; right: best reconciliation based on a parasite ML tree where nodes with bootstrap support under 70 were collapsed. Branches and tips represent hosts (dark gray) or parasites (dashed/light gray). Drawings made by author T.H.; photographs taken by authors P.I.H. (I. loocki, L. labiatus, Pe. famula, T. moorii), M.P.M.V. (Ps. curvifrons) and M.V.S. (L. dardennii) and reproduced with kind permission from Radim Blažek (‘C.horei, ‘G.pfefferi, northern Ps. babaulti, S. diagramma, T. duboisi) and Ad Konings (A. burtoni, Pe. fasciolatus, Pe. macrognathus, Pe. polyodon, Pe. trewavasae ephippium, Ps. marginatus, southern Ps. babaulti, T. brichardi).


The well-studied cichlid tribe Tropheini of Lake Tanganyika, a lineage of endemic and mainly rock-dwelling species, was used as a framework to study parasite diversity and speciation. Phylogenetic reconstruction of its monogenean parasites belonging to Cichlidogyrus covered nearly all nominal tropheine host species and resulted in a clear pattern of host-specificity and congruence between host and parasite trees. We explore how parasite diversity relates to the biology of the respective host species and how parasite speciation mechanisms relate to the radiation within Tropheini.

Representatives of Cichlidogyrus are abundant on tropheine hosts: our sampling shows a picture throughout the tribe’s populations, similar to previous case studies26,27,28, of two-thirds to all of the hosts infected (unpublished data). Many species within the genus infect just one (or a set of closely related) host species21. However, host-specificity varies among species and lineages of Cichlidogyrus and the degree of host-specificity may correlate with the biology of the host34. In Lake Tanganyika, a rather generalist species of Cichlidogyrus infects pelagic bathybatines18. Conversely, based on the few morphology-based case-studies, representatives of Cichlidogyrus seemed to be host-specific on littoral Tanganyika cichlids (overview in Pariselle et al.18). We confirm this host-specificity genetically for parasites belonging to Cichlidogyrus of the entire tribe Tropheini. Conspecific cichlid populations host the same parasite species even when geographically separated by hundreds of kilometers. This is clearly exemplified by the monophyletic and monospecific parasite clades of ‘C.horei, ‘G.pfefferi, L. dardennii and southern Ps. babaulti (Table 1; Fig. 1). Conversely, sympatric host species had their unique set of parasite species. On several localities in the D.R. Congo and Zambia, sampling comprised representatives of the four main clades within Tropheini, namely Lobochilotes, Petrochromis/Interochromis, Tropheus and the ‘substrate dwellers’ including ‘Ctenochromis’, ‘Gnathochromis’, Limnotilapia, Pseudosimochromis and Simochromis8. Their parasite fauna never overlapped. As eggs of Cichlidogyrus develop away from their parental host and infective larvae have to actively colonize a new fish, each parasite individual in the survey may be considered an independent observation. Hence there is little chance of overestimating host-specificity, unlike species with clonal reproduction on the host (e.g. within Gyrodactylus35).

This consistent host-specificity is remarkable in view of tropheine ecology; with many species sympatrically inhabiting shallow rocky habitat, opportunities for parasite transfer are plenty. Monogeneans recognize their host using the species-specific chemical and physical properties of the fish’ integument36. Interindividual and interspecific variation in chemical cues characterize cichlids, as evidenced by tests for olfaction-based mate recognition37. Chemical cues emitted by the tropheine hosts might hence explain how flatworms belonging to Cichlidogyrus discern between host species. The life history of Cichlidogyrus seems to select for successful colonization through host specialisation, as larvae are shortlived and therefore have to find a suitable host soon. Moreover they do not have a second chance because they cannot switch hosts after attachment29. It should be noted that factors other than colonization of the host may also explain host-specificity. For example, differential survival after reaching an ant host colony is considered important in the specificity of “cuckoo species” of myrmecophilous lycaenid butterflies38. Competition can also mediate host-specificity39. Anyhow, the narrow host-range of Cichlidogyrus from the tropheine system is striking, given that some congeners display a much wider host range, both within18 and outside Lake Tanganyika34,40. More specific monogeneans tend to be found on larger-bodied or longer-lived fishes and their specialisation is considered to be a consequence of higher predictability of host resources41,42. While Mendlová and Šimková34 did not find evidence of this predictability hypothesis in Cichlidogyrus with regard to host body size or longevity, we assume that Lake Tanganyika tropheine littoral cichlids are predictable resources as regards their ecology because their abundance and species-richness are higher than for cichlids in the pelagic realm5,17. It has been observed in many systems that abundant hosts harbor more specialised parasite species43.

Our estimates of species richness (Table 1) have to be regarded with caution because sample size plays a role when estimating species numbers. The risk of underestimating parasite diversity is especially valid in rarely sampled host species. However, for the tropheine hosts of which the species of Cichlidogyrus are well characterized morphologically, ITS based estimates correspond to the number of formally described parasite species: one in ‘Ctenochromishorei, ‘Gnathochromispfefferi and Limnotilapia dardennii44 and three in Interochromis loocki24 and Simochromis diagramma25. By focusing on the best-sampled species, a pattern emerges. A stenotopic host such as Lobochilotes labiatus harbors more species of Cichlidogyrus than eurytopic cichlids such as Simochromis species and ‘C.’ horei8. This matches with the morphology-based observation of Grégoir et al.28, which was based on just two tropheine species. Hence, as suggested by Pariselle et al.18, host isolation or migration influences the species richness of the Cichlidogyrus community. Many factors were mentioned to determine the number of congeneric parasites a host supports, but the evolutionary mechanisms remain poorly understood45. Isolation among host populations is suggested as a strong driver of parasite genetic structuring16,46 and is here proposed to also promote parasite speciation.

Parasites provide useful complementary data on their fish host, e.g. in elucidating their hosts’ biogeography, identification or phylogeny47. The affinities between representatives of Cichlidogyrus in closely related hosts corroborate recent findings on the phylogenetic relationships among the Tropheini in several aspects. Firstly, the monophyletic clustering of Tropheus parasites includes parasites found on T. duboisi. The position of this host species within Tropheus was confirmed only recently8. Secondly, Ps. marginatus parasites are phylogenetically nested within those of the closely related Ps. curvifrons, corroborating their shared parasite species25. Thirdly, although they also share a species of Cichlidogyrus25, northern and southern Ps. babaulti are also infected by monogeneans which are not closely related. Hence parasite data agree with the morphological and genetic differentiation of geographically separated Ps. babaulti populations. This corresponds with the historical division of the Lake in subbasins, the influence of which on cichlid diversity is well-documented48.

The significant congruence between host and parasite phylogenies in distance-based and topology-based cophylogenetic analysis suggests that cospeciation played an important role in the diversification of the parasite fauna of the Tropheini. Although host-specificity may promote cospeciation49, it does not preclude speciation through host-switching50. Since the seminal paper of Hafner et al.51 which showed cospeciation between gophers and their ectoparasitic lice, cospeciation has rarely been demonstrated. Many empirical studies have shown that the combination of host and parasite traits and biogeography led to little or no congruence between host and parasite phylogenies, even when there is host-specificity (e.g. for Rhabdomys four-striped mice and Polyplax sucking lice52). Even in the case of phylogenetic congruence, other factors have been shown to be stronger drivers of speciation than coevolutionary interactions, such as geographic isolation (yuccas and associated prodoxid moths53) or phylogenetically constrained host-switching (gobies and Gyrodactylus54). For several systems, including Cichlidogyrus of West African cichlids, host-switching and duplication were suggested to be the underlying mechanisms of parasite diversity32. The difference with our results could be explained by two factors. West African cichlids are infected by a combination of specialist and more generalist monogeneans and the ecological differences between the lacustrine tropheines and the more generalist cichlids included in Mendlová et al.32 are considerable. Topological reconciliations propose that cospeciation and duplication events are three to four times as frequent as host-switches in the Cichlidogyrus-Tropheini system and that the number of parasite extinctions is high (Table 3). Frequent extinction is likely in the case of overdispersion, which is common in Monogenea20. In addition, the often small and fluctuating population sizes of monogeneans55 may promote extinction. The fact that the outcome is very similar for fully resolved and unresolved parasite trees (Table 3; Fig. 3) suggests that the proportions are robust and not an artefact of poor phylogenetic resolution. In order to discriminate cospeciation from preferential host-switching, an absolute timeframe is required to establish temporal congruence54,56. Most parasite genetic distances between host species range between 2 and 7% (ITS). Using the 2.4 Mya estimate of Koblmüller et al.8 for the most recent common ancestor of Tropheini, this would translate into an ITS mutation rate of Cichlidogyrus of 0.4–1.5% my−1. This is lower than the rate of 5.5% my−1 calculated for Gyrodactylus57 but this monogenean has a much shorter generation time35, which likely results in a higher mutation rate58. Therefore, four conclusions come to mind. (1) Divergence within Tropheini and the associated Cichlidogyrus fauna was probably concomitant. (2) The basal polytomy in the parasite tree (Fig. 1) represents a true (“hard”) polytomy, congruent with the rapid radiation of the host which led to a similar polytomy8. (3) It is likely that the diversification of these flatworms happened within the confines of Lake Tanganyika in view of the monophyly of the ingroup. (4) Simultaneous diversification of tropheine cichlids and their parasites belonging to Cichlidogyrus might explain the decrease in parasite speciation rate towards the present (Fig. 2), which is indicative of a radiation event1.

Cospeciation has rarely been observed in fish parasites in general22,33,59 or in monogeneans in particular60. In terrestrial systems it is in many instances a by-product of restricted contact between host species (phthirapteran chewing lice of geomyid pocket gophers51 or of seabirds61) or predominantly vertical transmission (Buchnera symbiotic bacteria of Uroleucon aphids30). Neither of them apply to representatives of Cichlidogyrus infecting the many sympatrically occurring8 tropheine cichlids. Our survey covers aquatic hosts occurring in the same lentic microhabitats, with parasites that have a free-living and actively recolonizing larval stage, hence offering plenty of opportunities for host-switching. However, the colonisation mode of the parasite, with a single attachment event to the host, a short survival time away from the host and sufficient access to the typical host species in the littoral habitat, seems to select for a specific host choice and against ecological transfer. Both the topology-based phylogenetic analysis and the monophyletic host-associated clusters point to the equally important role of within-host speciation. Its importance has been reported in other dactylogyridean monogeneans, e.g. on West African cichlids32, European cyprinids62 and Asian pangasiids63. Hence, we propose that the monogenean fauna diversified as a result of reproductive isolation following host speciation, in combination with within-host duplication resulting in higher parasite than host diversity. When parasites on a shared host species are sister taxa like in the present case, Poulin64 suggests that they may be named parasite species flocks. Hence, we propose that the Cichlidogyrus fauna on tropheine cichlids is an overlooked case of the numerous invertebrate radiations of Lake Tanganyika17.


Sampling and data collection

Cichlidogyrus specimens were collected on 18 of the 23 nominal species of Tropheini. Nominal species excluded are Tropheus kasabae Nelissen, 1977, a junior synonym of T. moorii65; T. polli Axelrod, 1977, a junior synonym of T. annectens65; Petrochromis horii Takahashi & Koblmüller, 2014, which was unknown at the time of sampling; Petrochromis orthognathus Matthes, 1959 and Simochromis margaretae Axelrod and Harrison, 1978. The latter species is known only from four museum specimens, none of which suited for molecular analyses; inspection of two of these individuals did not yield gill monogeneans. We distinguish between northern and southern Ps. babaulti. Indeed, the southern populations form a separate clade8 and were classified at the time of sampling as a separate species: Ps. pleurospilus (Nelissen, 1978). The latter species was only recently synonymized with Ps. babaulti25. Hence, host taxon sampling is almost as exhaustive as possible. Given the position of the Tropheini within the haplochromines (see above), two haplochromines were included to use their parasites belonging to Cichlidogyrus as outgroup. These are Astatotilapia burtoni, a derived haplochromine which occurs in Lake Tanganyika tributaries and a more basal representative of the Haplochromini, Serranochromis robustus jallae from southern Africa. In addition, the lamprologine Neolamprologus fasciatus was sampled as it shares the rocky littoral habitat with many tropheines. Figure 4 and Table 2 provide a detailed overview of species and locations from which samples were retrieved.

Figure 4
figure 4

Lake Tanganyika localities sampled for monogenean cichlid parasites belonging to Cichlidogyrus.

Colour codes refer to the respective host genera; the bottom right map details the sampling localities and major cities. For details, see Table 2. Photographs were taken by authors P.I.H. (I. loocki, L. labiatus, Pe. famula, T. moorii), M.P.M.V. (Ps. curvifrons) and M.V.S. (L. dardennii) and reproduced with kind permission from Radim Blažek (‘C.horei, ‘G.pfefferi, S. diagramma). Map created using ArcMap v.10 and reproduced with kind permission by Tobias Musschoot.

Cichlids were collected in the rocky littoral of Lake Tanganyika using gill nets. Sampling protocols were approved by the following competent national authorities and carried out in accordance with research permit no. 2007-258-CC-2006-151 from the Tanzania Commission for Science and Technology (COSTECH); the memorandum of understanding between the Karl-Franzens University of Graz, the University of Zambia and the Department of Fisheries, Zambian Ministry of Agriculture and Co-operatives; and mission statement no. 013/MNRST/CRHU/2010 from the Ministère de la Recherche Scientifique et Technologique–CRH-Uvira. Newly collected fish were kept alive in aerated tanks until they were sacrificed by severing the spinal cord or with an overdose of MS-222. They were identified to species level in situ and in the laboratories of the RMCA, where host vouchers are kept (Table 4). Host fish or their branchial arches were fixed and stored in pure ethanol. Gills were inspected for monogeneans under an Olympus SZX12 stereomicroscope. Parasites were isolated with a dissection needle and stored in 5 μl of milli-Q H20 at −20 °C awaiting further processing. A small number of flatworms were stored in the field on FTA Classic Cards (Whatman). In total, sequences were obtained from 220 parasite specimens, retrieved from 84 cichlid individuals. DNA extraction, PCR amplification and sequencing followed Vanhove23 (Supplementary Methods S1). Table 4 shows the GenBank accession numbers of the parasite sequences obtained.

Table 4 Accession numbers of parasite sequences and host vouchers used to reconstruct a combined nuclear-mitochondrial phylogeny of Cichlidogyrus infecting Lake Tanganyika tropheine cichlids.

Phylogenetic analyses of parasites belonging to Cichlidogyrus

The closest BLAST hit from GenBank, the complete ITS-1 sequence of Cichlidogyrus sclerosus Paperna and Thurston, 1969 (DQ537359) was included as additional outgroup for rooting. Sequence alignment was performed by MUSCLE v.3.866 under default distance measures and sequence weighting schemes. The resulting alignments were visually inspected and improved in MEGA v.567. In the case of COI, alignment was straightforward as there were no gaps and translation into amino acids (using the echinoderm and flatworm mitochondrial code) did not result in nonsense or stop codons. These nuclear and mitochrondrial datasets were used separately for an assessment of genetic diversity. jModelTest v.0.1.168 was used to select the optimal molecular evolution model starting from a maximum likelihood (ML) optimized tree. Based on the corrected Akaike information criterion (AICc), the TVM + Γ model was selected for the nuclear alignment and the TIM2 + I + Γ model for the mitochondrial dataset (with gamma shape parameter of 0.40 for ITS rDNA and 0.11 for COI). Gamma-corrected pairwise genetic distances were calculated in PAUP* v.4.01b (Swofford, 2001, Sinauer Associates). As a rough estimate of the species diversity contained in the sample, the number of haplotypes displaying at least 1% (for ITS rDNA) and 2% divergence (for COI) was determined. This follows the ITS divergence cut-off proposed to match morphospecies boundaries in the best-studied monogenean, Gyrodactylus57 and the threshold of sequence divergence between species commonly used in barcoding69.

For tree reconstruction, a concatenated dataset was built on the basis of the specimens that yielded both nuclear and mitochondrial sequences. For an assessment of the phylogenetic content of the dataset, we performed a likelihood mapping analysis based on quartet puzzling70 implemented in TREE-PUZZLE v.5.271. In this combined alignment, the proportion of fully resolved quartets was 85.2%, with 9.4% partly resolved and 5.5% unresolved. In view of its relatively high phylogenetic content and the use of independently evolving (unlinked) markers, such concatenated nuclear-mitochondrial dataset allows for more robust (co-)phylogenetic hypotheses to be put forward (see also the recommendations by de Vienne et al.56).

Bayesian inference of phylogeny (BI) was carried out in MrBayes v.372. Posterior probabilities were calculated over 107 generations. Stationarity of the Markov chain was reached, as evidenced by a standard deviation of split frequencies of 0.008, by a potential scale reduction factor converging to 1 and by the absence of a trend in the plot of log-probabilities as a function of generations. The Markov chain was sampled with a frequency of 102 generations; one-fourth of the samples were discarded as “burn-in”. A ML search was carried out in RAxML v.7.3.073, assessing nodal support through 1000 bootstrap samples. The evolutionary model was optimized for each fragment separately, suggesting HKY + Γ for ITS-1, JC for 5.8S rDNA, TPM1uf + Γ for ITS-2 and GTR + Γ for COI. These models were substituted by GTR + Γ in RAxML and, in the case of ITS-2, also in MrBayes, as this was the implemented model with the best AICc score. In the latter software, all parameter estimates for the various sequence portions were unlinked.

To assess the rate of diversification as a function of time, we started from the largest dataset (ITS rDNA) in order to include a maximal sample size. Identical sequences were removed, as well as haplotypes differing less than 0.01, with the help of the CD-HIT Suite web server74 and additional manual removal in case of length differences. This sequence selection (see above for the rationale behind this cut-off) ensures a focus on speciation rather than on intraspecific variation. For this nuclear alignment, MEGA selected the HKY + Γ model as the optimal model of molecular evolution based on the Bayesian information criterion. Under this model, an ultrametric tree was built in BEAST v.1.8.175 applying four rate categories with the initial and average value of the gamma shape parameter set to 0.29, under the Yule tree prior and an uncorrelated relaxed log-normal clock model. Indeed, a likelihood-ratio test conducted in TREE-PUZZLE had rejected the molecular clock hypothesis. The Markov Chain Monte Carlo run was run for 107 generations with a sample frequency of 103 generations; a burn-in of one-tenth was applied. Based on this tree, a lineages-through-time (LTT) plot was constructed in APE76. In view of the possibly ambiguous interpretation of the course of a LTT, LASER77 was used to quantify possible changes in diversification rate over time. This package compares the likelihood of data under models with a constant versus variable rate of diversification by contrasting the AIC score of the best-fit rate-constant model with that of the best-fit rate-variable model. To assess statistical significance, the difference in AIC scores between the best-fit rate-constant and rate-variables models was calculated for 5000 randomly generated trees, using the same set of models and the same tree size (37 terminal nodes).

Cophylogenetic analyses

A range of methods exists to compare the phylogeny and divergence in host-parasite or other symbiotic systems, inferring the speciation patterns that contributed to the consistencies or inconsistencies in their evolutionary trajectories. Of these, methods reconciling host and parasite tree topologies are often considered to maximize cospeciation events and to take topological congruence as an evidence for cospeciation, which is not always justified56. To minimize the risks of such assumptions, host and parasite tree were reconciled in the software package CoRe-Pa v.0.578, checking 104 cost sets using a simplex method on the quality function. This software carries out an event-based analysis. A major asset is its parameter-adaptive approach that allows for the automated estimation of proportional event costs, thus avoiding a priori cost assignment to the different categories of parasite speciation mechanisms79,80. Root-to-root mapping was enforced and the chronological consistency of events checked. To work with a fully resolved tree that best approaches the “species tree”, the ML parasite phylogram reconstructed by RAxML for the combined nuclear-mitochondrial dataset, was included in cophylogenetic analysis. The AFLP tree of Koblmüller et al.8 was coded with the help of TreeSnatcher81 to provide a host topology. The selection of optimal trees obviously entails some uncertainty for this topology-based analysis. Therefore, the analysis was repeated with a parasite phylogeny in which all nodes receiving less than 70% of bootstrap support were collapsed. As within-host speciation in terminal taxa can artificially infer cospeciations at the cost of additional duplications56,63, terminal monophyletic clades associated with a single host species were collapsed. For the same reason, the software was set not to bill an additional duplication for a host-switching event. Host and parasite tree were also reconciled in a heuristic search in TreeMap v.1.0a82. Because this topology-based software is known to maximize the number of cospeciation events, the number of the various speciation mechanisms that TreeMap proposes will not be taken into consideration. The test for the significance of the number of cospeciation events inferred is only considered as a measure of topological congruence rather than of cospeciation56. This was assessed by randomizing host and parasite topologies (104 random trees used) under the proportion-to-distinguishable model.

Distance-based cophylogenetic analyses test for correlation between phylogenies without assuming congruence to be produced by cospeciation and are hence considered less biased than topology-based methods56. Distance-based cophylogenetic analysis was carried out in Copycat v.1.1483 making use of AxParafit and AxPcoords84 using 9999 permutations under default settings. In this analysis, the independence between host and parasite patristic distances is tested. To this end, distance matrices were constructed with the help of T-rex85. For the hosts, the mitochondrial ND2 and control region data from Koblmüller et al.8 were used, as these mitochondrial fragments are better suited for distance calculations than AFLP data (long terminal and short internal branches in the AFLP tree). Because of the large number of insertions/deletions in the nuclear fragment which inevitably bias genetic distance estimates over the whole dataset, only all COI mitochondrial fragments, omitting the third codon position, were used to infer parasite patristic distances.

Additional Information

How to cite this article: Vanhove, M. P. M. et al. Hidden biodiversity in an ancient lake: phylogenetic congruence between Lake Tanganyika tropheine cichlids and their monogenean flatworm parasites. Sci. Rep. 5, 13669; doi: 10.1038/srep13669 (2015).