Owing to their taxonomic, phenotypic, ecological and behavioural diversity and propensity for explosive diversification, the assemblages of cichlid fish in the East African Great Lakes Victoria, Malawi and Tanganyika are important role models in evolutionary biology. With the release of five reference genomes and many additional genomic resources, as well as the establishment of functional genomic tools, the cichlid system has fully entered the genomic era. The in-depth genomic exploration of the East African cichlid fauna — in combination with the examination of their ecology, morphology and behaviour — permits novel insights into the way organisms diversify.
Why is species richness so unequally distributed across the tree of life? Why did some organismal lineages diversify into new forms in a seemingly explosive manner, whereas others have lingered phenotypically unvaried over millions of years? These questions have puzzled generations of biologists ever since Darwin and Wallace jointly introduced their theory of evolution by natural selection1. Some 160 years of scholarly study later, there is a reasonable understanding of how and under which circumstances new species can originate2,3,4,5,6,7. However, the causal factors that determine species richness and the rate at which new species form remain largely elusive8,9. Particularly in light of the global biodiversity crisis that our planet is currently facing10, it is no longer of purely academic interest to know how novel species form and, consequently, how biodiversity arises.
Unravelling how variation at the genomic level is interlinked with phenotypic evolution is key to understanding organismal diversification2,11,12,13. To this end, we must understand how organisms evolve, how they function and how they interact with other organisms and the environment. The problem is that many widely used model organisms provide limited insights into the underpinnings of comparatively rapid organismal diversification: many traditional laboratory-based model organisms tell us little about how organisms adapt, behave and diversify in the wild, whereas model species in ecology and evolution often lack tractability in the laboratory and fundamental data on genomics and development. Importantly, most established model organisms do not belong to extensively diversifying clades.
Instances of adaptive radiation — that is, the rapid (sometimes ‘explosive’) origin of taxonomic, ecological and morphological diversity as a consequence of adaptation to novel or hitherto underutilized ecological niches14,15 — combine the advantages of laboratory and natural model species in the context of the genesis of biodiversity. Therefore, iconic examples of adaptive radiation, such as Darwin’s finches on the Galapagos archipelago, anole lizards on the islands of the Caribbean, threespine stickleback fish in post-glacial rivers and lakes, and cichlid fish in East Africa (Box 1), have long been recognized as essential models to study organismal diversification12,16,17,18. Scientific interest in many of these radiations can be traced back to the 19th century, such that, for a long time, these groups have been intensely investigated with respect to their evolution, ecology, ontogenetic development and behaviour. The close relatedness of the species emerging from adaptive radiations facilitates genetic and genomic investigations12, for example, on the basis of hybrid crosses or divergence mapping. Moreover, representatives of these adaptive radiations were among the first vertebrates to have their genomes sequenced19,20,21,22,23.
The species flocks of cichlid fish in the East African Great Lakes Victoria, Malawi and Tanganyika represent the most species-rich and phenotypically diverse adaptive radiations in vertebrates and are characterized by exceptionally fast diversification rates18,24,25 (Box 1; Fig. 1). To put cichlid radiations into a temporal context, during the evolutionary time span of our own species, starting with the split between chimpanzees and humans some 5–7 million years ago, approximately 2,000 species of cichlid fish evolved in East Africa, the geographic region where the chimpanzee–human split initially occurred. Within the time span that it took for 14 species of Darwin’s finches to evolve on the Galapagos archipelago22, about 1,000 cichlid species evolved in Lake Malawi alone26,27,28. In addition, since the last ice age, which is when sticklebacks began to diverge into replicate species pairs in the Northern hemisphere20, hundreds of cichlid species evolved in Lake Victoria29,30.
In this Review, I discuss how the examination of recently available genome-wide sequence data of East African cichlids has deepened our understanding of the phenomena of adaptive radiation and explosive diversification in general and in cichlids in particular. I start with a discussion of the explosive nature of species formation in East African cichlids and the resultant difficulties in delineating species. Then, I focus on the challenges that emerge in the reconstruction of the evolutionary history of rapidly diversifying clades at the interface between population genetics and phylogenetics. Finally, I summarize what we have learned about the genomes of East African cichlids thus far and discuss which features in their genomes are potentially linked to their propensity to diversify explosively.
Explosive diversification in cichlids
The adaptive radiations of cichlid fish in Lakes Victoria, Malawi and Tanganyika (Fig. 1) differ from all other cases of adaptive radiation in vertebrates (including those of cichlids elsewhere) by their unparalleled degree of phenotypic and taxonomic diversity in sympatry24,31. That cichlids are unusual was already recognized during the earliest biological explorations of the African Great Lakes in the late 19th century32,33. Ever since, researchers have repeatedly concluded that the current understanding of speciation was insufficient to explain the plethora of cichlid species in East Africa17,34,35. With specific reference to the cichlids, Woltereck in 1931 (ref.34) introduced the term ‘Artexplosion’ for outbursts of endemic diversity on islands and in some ancient lakes. More widely known as explosive speciation, this term refers to a substantial increase in speciation rate in a clade relative to a comparable group, irrespective of ecological and morphological differentiation, and is thus distinct from the phenomenon of adaptive radiation36.
The process of species formation
Generally speaking, speciation is the formation of a new species that is distinct from all other species; it is commonly defined as the build-up of reproductive isolation between an ancestral species and the newly formed species2. Most of the earlier work on this topic has focused on the role of geographical separation in the origin of species (reviewed in ref.2). More recently, the field has shifted towards a more process-oriented approach, emphasizing the importance of ecology in speciation5,6 — through divergent natural selection in distinct environments — and establishing that speciation is possible, perhaps even common, in light of some levels of gene flow between the diversifying lineages37,38. Both of these features, ecological speciation and gene flow, are particularly common in adaptive radiations, in the course of which new species typically form in the absence of geographical barriers14,15.
Speciation is often (but not always) a gradual process that has a clearly defined starting point — a single species — and is completed when at least one new species (if the ancestral species continues to exist) or a minimum of two species (if the ancestral species becomes extinct) has emerged. In between, along the so-called speciation continuum, it is impossible to conclusively determine whether there is one or more than one species. Any consideration of the process of speciation is therefore inextricably interwoven with the questions of what a species is and how it can be distinguished from other, closely related ones.
Species delineation in cichlids
The delineation of species is not an easy undertaking in cichlids. Although species belonging to different lineages (that is, tribes or genera) in the longer persisting cichlid radiations of Lakes Tanganyika and Malawi are phenotypically and ecologically clearly distinct from one another (Fig. 1), the status of species within such lineages as well as within younger species flocks is often unclear. This is, in part, due to the sheer number of species, which makes it difficult for taxonomists to keep track of taxonomic units and to sort species according to diagnostic characteristics39,40. However, the main issues in deciding what a species is in cichlids emerge from their close relatedness and the fact that radiations are still ongoing: morphologically distinct sister taxa are typically of very recent origin4,30,41,42 (in the range of a few hundred to a few thousand years), and, in many cases, it is not clear whether they have reached the end of the speciation continuum. In other cases, geographically isolated sister taxa are connected through intermediate forms43,44,45, much as in the case of ring species, rendering it impossible to define clear boundaries between them. By contrast, closely related cichlids of uncertain taxonomic status often mate assortatively with respect to their own source population46,47,48, that is, individuals mate more frequently with members of their own population than expected under a random pattern, or even occur in sympatry in parts of their distribution ranges43,44 (Fig. 2), suggesting that these sister taxa are valid species.
The classic species concepts provide little guidance for species delineation in cichlids. The most widely used definition in biology for the category species, the biological species concept49,50, is not very practical when applied to cichlids and often fails on the basis that reproductive isolation is usually incomplete between sister taxa. In fact, many East African cichlid species are intercrossable51,52, even when belonging to distinct phylogenetic lineages and being derived from different adaptive radiations53,54, and cichlids do interbreed in the wild as evidenced by occasionally observed hybrid specimens55 as well as molecular analyses demonstrating substantial levels of gene flow between species41,56,57. Delineating cichlid species by means of genetic markers is problematic too. DNA barcoding, a widely used method for identifying species on the basis of the mitochondrial COX1 gene58, performs poorly when applied to cichlids59. This is not surprising, given the high levels of DNA sequence similarity in cichlids (for example, the average genome-wide sequence divergence between Lake Malawi cichlids is only 0.1–0.25%)27 as well as mitochondrial haplotype sharing between species42,43. Defining species according to the phylogenetic species concept is equally problematic, as there is no a priori level of genetic distinctiveness above which two sister taxa should be considered different species and because reciprocal monophyly of sister taxa does not help in deciding whether these are populations of the same species or different species. Grouping individuals (or lineages thereof) into species according to their shared ecology — as suggested in the ecological species concept — is difficult in cichlids, as there is substantial niche and resource overlap and, hence, little competitive exclusion (whereby two species cannot stably coexist in the same ecological niche) between some species17,60,61. 
In practice, to facilitate the expedient naming of distinct taxonomic units in cichlids, species are seen as clusters of individuals that are morphologically and ecologically similar and distinct from other such clusters (that is, the vernacular species concept)62.
Taken together, there is no straightforward way of species delineation in cichlids. Phenotypically distinct yet closely related cichlids within species flocks may perhaps best be characterized as multispecies, that is, sets of closely related species that co-occur and that occasionally exchange genes63. This is exemplified by two closely related Pundamilia species from Lake Victoria, Pundamilia nyererei and Pundamilia pundamilia, which, when co-occurring, are genetically more similar to one another than two geographically separated (allopatric) populations of the same species4,41. Importantly, what makes species delineation so difficult in cichlid species flocks — namely, that there are many species, that species are very young and that speciation is often ‘caught in the act’ — is exactly what makes cichlids such a useful model for the study of the diversification process. Note that to account for the situation in which, in many cases, speciation is not yet complete, one should use the more general term ‘diversification’ instead of ‘speciation’ in the context of the adaptive radiations of East African cichlids, as the latter designation requires completion of the process.
Reconstructing cichlid evolution
The availability of accurate, comprehensively sampled and time-calibrated phylogenetic hypotheses is crucial to understanding the progression of adaptive radiations and explosive diversification64, as well as to interpreting general patterns of these phenomena (Box 2). At the same time, the close relatedness of species within adaptive radiations, occasional gene flow between species and the pace at which new species form pose considerable challenges to the phylogenetic reconstruction of rapidly diversifying clades22,27,64,65,66 (Fig. 3), even if genome-wide data are available.
The first cichlid genomes
The initial round of genome sequencing in cichlids involved a set of five phylogenetically representative African species19: the Nile tilapia (Oreochromis niloticus), as a member of a less species-rich yet geographically widespread sister lineage to the cichlid radiations in the East African Great Lakes; the Princess of Burundi (Neolamprologus brichardi), from the most species-rich tribe within Lake Tanganyika, the Lamprologini; and three members of the tribe Haplochromini, namely, Burton’s cichlid (Astatotilapia burtoni) from Lake Tanganyika and nearby rivers, the Zebra mbuna (Metriaclima zebra) from the mbuna clade of Lake Malawi and P. nyererei from Lake Victoria (Figs. 1,4). To facilitate genome annotations, the study of Brawand et al.19 also established reference transcriptomes for these five species. In the meantime, many more genomes of East African cichlids have been sequenced at low coverage with short-read (Illumina) sequencing approaches27,41,57,67,68. The assemblies of two of the reference genomes, the Nile tilapia and the Zebra mbuna, have been much improved with long-read (PacBio) sequencing69,70, and additional transcriptomes from more species and more tissues have been generated71,72,73,74,75. These advances have confirmed that the genomes of rapidly diversifying cichlids are frequently subjected to incomplete lineage sorting and introgression (Fig. 3c,d), resulting in mosaic genomes that consist of small segments with different evolutionary histories12,19,27,57 (Fig. 3g).
The phylogeny of East African cichlids
Incongruence between gene trees and between gene trees and the species tree is particularly pronounced in East African cichlids. For example, analysis of the initially sequenced cichlid reference genomes revealed that more than 40% of all single nucleotide polymorphisms (SNPs) support topologies among the three representatives of the tribe Haplochromini that are in conflict with the species tree19. Phylogenetic analyses of thousands of segments in the genomes of 5 closely related Neolamprologus species from Lake Tanganyika showed that all of the 15 possible topologies connecting the 5 species received support from at least a few dozen segments in the genome, whereby one topology stood out as being supported by about half of all segments57. In addition, the analysis of 2,543 non-overlapping windows, each containing 8,000 SNPs, across the genomes of 73 Lake Malawi cichlid species produced 2,542 different topologies27. Obviously, in such situations, a single tree-like phylogeny can no longer capture the entire evolutionary history of a group, making the concept of a clade and the quest for bifurcating branching diagrams questionable.
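To make the topology counts above more tangible, the following minimal sketch computes the number of strictly bifurcating tree shapes for a given number of taxa. It assumes that the figure of 15 topologies for 5 species refers to unrooted bifurcating trees, for which the count is (2n − 5)!!; this reading is an assumption, not something stated in the cited study, and the function names are illustrative only.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 or 2 (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def n_unrooted_topologies(n_taxa: int) -> int:
    """Number of unrooted, strictly bifurcating topologies: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    return double_factorial(2 * n_taxa - 5)


def n_rooted_topologies(n_taxa: int) -> int:
    """Number of rooted, strictly bifurcating topologies: (2n - 3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    for n in (4, 5, 73):
        count = n_unrooted_topologies(n)
        # Report large counts by their number of decimal digits only.
        print(n, "taxa:", count if count < 10**6 else f"{len(str(count))} digits")
```

Under this assumption, tree space for 73 species is so vast that nearly every genomic window recovering its own topology, as reported for Lake Malawi, is not in itself surprising once windows differ even slightly in their signal.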
The analysis of genome-wide markers nevertheless provides novel insight regarding evolutionary relationships in cichlids, not least because incomplete lineage sorting and (introgressive) hybridization can now be looked at on a genome-wide scale and across many taxa26,27,76. By and large, the phylogeny of the East African cichlids reflects the dynamic geological history of the area (Fig. 4). Lake Tanganyika, the oldest and geologically most stable of the African Great Lakes31, is home to the phenotypically most diverse cichlid species flock17. Taxonomically, the Tanganyikan cichlids have been grouped into 14 tribes, which differ substantially in species number (ranging from 1 species in, for example, Boulengerochromini to about 100 species in Lamprologini)61,77. Some tribes — even if endemic to Lake Tanganyika today — must have evolved elsewhere, that is, before the formation of the present lake some 9–12 million years ago (notably, the Bathybatini, Boulengerochromini and Trematocarini), whereas the origin of other tribes is compatible with a scenario of in situ evolution26,78,79. Although the respective monophyly of these tribes is usually well supported in phylogenetic analyses using genome-wide data26,78,79,80, there is strong evidence for past gene flow between some of them26,78,79 (Fig. 4).
The radiations in Lakes Malawi and Victoria involve only one of the African cichlid tribes, the Haplochromini (note that in all three African Great Lakes, a few members of the Oreochromini are found, which did not radiate). In the early stages of the cichlid adaptive radiation in Lake Malawi, three major clades emerged in closely timed lineage-splitting events: a pelagic clade formed by Diplotaxodon and Rhamphochromis; a clade including shallow and deep benthic species as well as the utaka lineage; and the mbuna clade. The presumed ancestor of the radiation, the generalist species Astatotilapia calliptera, continued to exist in rivers and lakes in the area (including Lake Malawi)27 (Fig. 4); this species is, hence, phylogenetically nested within the Lake Malawi cichlid radiation27,81. The inspection of genome-wide data further revealed multiple events of gene flow within and between the main cichlid clades in Lake Malawi27. The current age estimates for the onset of the Lake Malawi cichlid radiation26,27 are compatible with a time-calibrated paleoecological record82, which revealed that the lake transitioned into its current state as a more or less closed system with deepwater habitats approximately 800,000 years ago, making cichlid diversification into deepwater habitats possible only afterwards28.
The situation in Lake Victoria differs somewhat given that the cichlid fauna of the lake is part of a geographically more extended species assemblage, the so-called Lake Victoria region super-flock, which includes the adaptive radiations of cichlids in Lakes Victoria, Edward, Albert and Kivu, among others42,76 (Fig. 1). The onset of the diversification of the super-flock has been estimated at 100,000–200,000 years ago26,42,76,83, and most, if not all, species within Lake Victoria must have evolved within the past 15,000 years following its refill after complete desiccation in the late Pleistocene29,30. The extremely young age of the roughly 700 species in this super-flock thus makes classic phylogenetic analyses difficult. The most thorough analysis so far on the basis of restriction-site associated DNA (RAD) sequencing suggests that not all radiations in the lakes of the Lake Victoria region are reciprocally monophyletic and that hybridization played a key part in triggering these radiations76.
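As a rough illustration of what such ages imply, the sketch below converts a species count and a clade age into a net diversification rate under a pure-birth (Yule) model, r = ln(N / N0) / t. The model choice, the single founding lineage and the neglect of extinction are simplifying assumptions made here for illustration (and sit uneasily with a hybrid origin of the super-flock); only the species number and the age range are taken from the text.

```python
import math


def yule_rate(n_species: int, age_myr: float, n_founders: int = 1) -> float:
    """Net diversification rate (per lineage per million years) under pure birth.

    Assumes constant-rate speciation, no extinction, and n_founders
    lineages at time zero; all three are simplifying assumptions.
    """
    if n_founders < 1 or n_species < n_founders or age_myr <= 0:
        raise ValueError("invalid species count, founder count or age")
    return math.log(n_species / n_founders) / age_myr


def doubling_time_myr(rate: float) -> float:
    """Time (in million years) for species number to double at a given rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return math.log(2) / rate


if __name__ == "__main__":
    # Roughly 700 species in the super-flock; onset 100,000-200,000 years ago.
    for age in (0.1, 0.2):
        r = yule_rate(700, age)
        print(f"age {age} Myr: r = {r:.1f} per Myr, "
              f"doubling time = {doubling_time_myr(r) * 1000:.1f} kyr")
```

Such point estimates are best read as order-of-magnitude summaries, since they ignore extinction, rate variation and uncertainty in both species counts and ages.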
Genomic basis of cichlid diversification
The phenomenon of explosive diversification in cichlids has long been linked to the particular environment in which cichlid evolution has taken place, as well as to intrinsic features of the cichlids themselves17,35,84,85,86. Without a doubt, the ample ecological opportunity provided by the African Great Lakes is conducive to diversification through adaptive radiation, not only in cichlids but also across different groups of animals31, which is corroborated by the scaling of the number of endemic species in a lake with its size, relative stability and depth31,84,87. Still, the number of endemic cichlid species in Lakes Victoria, Malawi and Tanganyika is at least an order of magnitude higher than the number of endemic species in any other family resident in these lakes (except, perhaps, ostracods in Lake Tanganyika)31, and none of approximately 20 non-cichlid fish families that also occur in each of these lakes has brought forth more than a handful of endemic species17,88. These findings suggest that there is something special about cichlids (some kind of ‘cichlidness’) that permits these fish to diversify explosively. That cichlids in general feature an intrinsic propensity for diversification is substantiated by numerous examples of smaller-scale (in comparison to the ones in the African Great Lakes) adaptive radiations in rivers and lakes throughout their distribution range84,89,90,91,92,93, of which the ones in small volcanic crater lakes in Africa and Central America are the most widely known3,67,94,95,96. Early on, the extraordinary diversity of cichlids spurred speculation that particular genomic features might underlie their propensity to diversify35. Examining this ‘genomic substrate’ for diversification has become one of the main research targets within the cichlid genome project19.
Comparative cichlid genomics
The in-depth comparative analysis of the initial cichlid genome sequence data19 identified several distinctive features in the genomes of the radiating East African cichlids that could potentially (individually or jointly) be responsible for explosive diversification in this group, thereby confirming earlier assumptions. First, the cichlid genomes turned out to be genetically more diverse than expected in light of the very recent origin of the species flocks, which was attributed mainly to the accumulation of standing genetic variation before the radiations (as opposed to new mutations). Second, the four genomes of the explosively diversifying cichlid lineages from Lakes Victoria, Malawi and Tanganyika showed an increased rate of gene duplications compared with the Nile tilapia and other teleosts, and about one-fifth of the duplicated genes showed evidence of neo-functionalization, the gain of a new function in one of the gene copies. Third, the same four cichlid genomes are characterized by more dynamic gene regulatory processes than observed in other fish species, as evidenced by increased rates of regulatory element evolution and novel and functionally diverse microRNAs. Fourth, these genomes exhibit accelerated coding sequence evolution, as evidenced by elevated ratios of non-synonymous to synonymous substitution rates (dN/dS) compared with those in the Nile tilapia. Finally, three waves of transposable element expansions were detected in cichlids. Many more cichlid genomes have been inspected since, calling for a revisit to the question of what the genomic underpinnings of adaptive radiation and explosive diversification in cichlids are.
Sources of genetic variation
There is a general consensus that divergent natural selection, partly in combination with sexual selection, has played a key part in the diversification of cichlids18,19,31,84,97. Heritable phenotypic variation in fitness-related traits is a prerequisite for selection to operate. This variation can arise from mutations at the level of single nucleotides, of genes or of chromosomes and via the reshuffling of existing genetic material during meiosis (recombination), making these sources of variation at the molecular level prime targets in the quest for the genomic basis of explosive diversification in cichlids.
Interestingly, the nucleotide mutation rate is not very high in East African cichlids. On the contrary, it has been estimated by trio sequencing at 3.5 × 10^−9 per bp per generation (Table 1), which is 3–4-fold lower than the rate in humans27. Therefore, the exceptionally high speciation rates in East African cichlids cannot be linked to an elevated mutation rate.
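The arithmetic behind this comparison can be spelled out in a few lines. The sketch below takes the per-site rate and the 3–4-fold difference from the text; the haploid genome size of 1 Gb is a hypothetical round placeholder introduced here for illustration only (the actual values belong to Table 1, which is not reproduced in this section), and the function names are likewise illustrative.

```python
MU_CICHLID = 3.5e-9                 # per bp per generation (from the text)
FOLD_LOWER_THAN_HUMAN = (3.0, 4.0)  # "3-4-fold lower" (from the text)
ASSUMED_HAPLOID_GENOME_BP = 1.0e9   # hypothetical placeholder, NOT from the text


def implied_human_rate_range(mu, fold_range):
    """Human per-site rate implied by the stated fold difference."""
    low_fold, high_fold = fold_range
    return mu * low_fold, mu * high_fold


def expected_new_mutations(mu, haploid_genome_bp, ploidy=2):
    """Expected de novo point mutations per offspring per generation."""
    return mu * haploid_genome_bp * ploidy


if __name__ == "__main__":
    low, high = implied_human_rate_range(MU_CICHLID, FOLD_LOWER_THAN_HUMAN)
    print(f"implied human rate: {low:.2e} to {high:.2e} per bp per generation")
    per_offspring = expected_new_mutations(MU_CICHLID, ASSUMED_HAPLOID_GENOME_BP)
    print(f"expected new mutations per offspring (assumed 1 Gb genome): "
          f"{per_offspring:.1f}")
```

With so few new point mutations expected per individual and generation, new mutation alone seems an unlikely source of the variation that accumulated within radiations only thousands of years old, which is consistent with the emphasis the text places on standing variation.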
The rate of lineage-specific gene duplications, by contrast, has been found to be 4.5–6-fold higher in the common ancestor of the explosively diversifying East African cichlid lineages than in other fish; it is highest in the common ancestor of the most species-rich lineage in cichlids, the Haplochromini19, suggesting a link between gene duplication and diversification.
Transposable element insertions provide another source of genetic variation, and there is evidence from the cichlid genomes that the insertion of transposable elements near genes has altered their expression19. This is exemplified by a case study showing that the insertion of a short interspersed nuclear element (SINE) upstream of a previously unknown colour gene, fhl2b, in the ancestor of the Haplochromini lineage led to a gain of expression in iridophores (a specific type of pigment cells), which in turn has been linked to the origin of a new pigmentation trait in this group71.
At the level of entire chromosomes, there is little variation among the East African cichlids, with chromosome numbers ranging from 2n = 40 to 2n = 46 according to karyotyping; most species have 2n = 44 (refs98,99) (Table 1). Differences in chromosome number, thus, do not seem to have an important role in the origin or maintenance of cichlid species. The contribution of smaller-scale chromosomal rearrangements (such as inversions) to cichlid diversification is less clear to date, as its examination is currently hampered by the insufficient quality of genome assemblies based on short-read sequence data, and too few genome assemblies based on long-read data in combination with genetic mapping are currently available.
Besides mutational change, recombination is the other major factor that can lead to an increase in genetic variation. The effect of recombination on genetic variation is expected to be especially strong when chromosomes from genetically more distinct parents are involved, as is the case when two species hybridize. In this way, hybridization can instantaneously boost genetic variation100. In East African cichlids, hybridization has been identified as an important factor for establishing and maintaining genetic variation in species flocks27,57,76. For example, it has been shown that explosive diversification in cichlids of the Lake Victoria region was predated by a hybridization event involving two distantly related riverine lineages, creating a hybrid swarm, whereby fixed differences in the parental lineages recombined to form many new combinations of alleles in the emerging species76.
Obvious signatures of introgression were found in the cichlid assemblages of all three large lakes26,27,41,57,79 (Fig. 4). Based on the analysis of genome-wide data, it has further been suggested that there is occasional gene exchange between the cichlid faunas in rivers and the African Great Lakes26,101. Recurrent introgression events, potentially triggered by large-scale environmental changes in the form of lowstands of the lake level that lead to increased water turbidity and/or bring together populations in dense contact27,28,44,82, incomplete lineage sorting and, at least in some cases, hybridization at the onset of adaptive radiation can explain why cichlid genomes are genetically diverse and why much of the variation is shared between cichlid species (Fig. 3). For example, within Lake Malawi, more than 80% of the heterozygous sites are shared between species27, and it has been estimated that more than half of the SNPs that are polymorphic in Lake Malawi are also polymorphic in cichlids outside this lake101. Taken together, it seems that, in East African cichlids, the reshuffling of existing genetic variation has played a major part in creating the raw genetic material for selection to act upon.
Comparative genomic analysis in East African cichlids suggests that selection is multifarious during explosive diversification and targets many loci across the genome, as evident from the multiple genomic regions with elevated levels of divergence (so-called outlier regions) between sister taxa19,67,102,103, which is in turn a probable consequence of syndrome selection12. For example, in a comparison between two A. calliptera ecomorphs that are on the verge of speciation in a small crater lake near Lake Malawi, Malinsky et al.67 identified 55 genomic regions that were characterized by high divergence (in both F_ST and D_XY) relative to the genome-wide background. Interestingly, these ‘islands of speciation’ are not randomly distributed across the genome; approximately half of these islands cluster on only five chromosomes67, that is, they form archipelagos104. Note, however, that the fairly large number of outlier regions is somewhat at odds with theoretical models, which predict that rapid diversification is most probable if the number of underlying loci is small15. Note also that genomic regions of high divergence are not necessarily indicative of divergent selection105.
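The two divergence measures mentioned above can be illustrated with a minimal window scan. The sketch below computes absolute divergence (D_XY, often written d_XY) as the mean number of pairwise differences per site between two populations, and relative divergence as Hudson's estimator, F_ST = 1 − (mean within-population diversity) / D_XY. The choice of estimator, the non-overlapping windows and the toy haplotypes are assumptions made here for illustration; they are not the pipeline of the cited study.

```python
from itertools import combinations, product


def hamming(a: str, b: str) -> int:
    """Number of sites at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def pi_within(haps):
    """Mean pairwise differences per site within one population."""
    pairs = list(combinations(haps, 2))
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * len(haps[0]))


def d_xy(haps_x, haps_y):
    """Mean pairwise differences per site between two populations."""
    pairs = list(product(haps_x, haps_y))
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * len(haps_x[0]))


def hudson_fst(haps_x, haps_y):
    """Hudson's F_ST; returns 0.0 where the populations show no divergence."""
    between = d_xy(haps_x, haps_y)
    if between == 0:
        return 0.0
    within = (pi_within(haps_x) + pi_within(haps_y)) / 2
    return 1 - within / between


def windowed_stats(haps_x, haps_y, window):
    """(start, D_XY, F_ST) for consecutive non-overlapping windows."""
    n_sites = len(haps_x[0])
    stats = []
    for start in range(0, n_sites - window + 1, window):
        wx = [h[start:start + window] for h in haps_x]
        wy = [h[start:start + window] for h in haps_y]
        stats.append((start, d_xy(wx, wy), hudson_fst(wx, wy)))
    return stats


if __name__ == "__main__":
    # Toy data: fixed differences in the first window, none in the second.
    pop_x = ["0000", "0000"]
    pop_y = ["1100", "1100"]
    for start, dxy, fst in windowed_stats(pop_x, pop_y, window=2):
        print(start, dxy, fst)
```

Requiring windows to be outliers in both measures, as in the example described in the text, guards against regions where relative divergence is high merely because within-population diversity is low.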
Gene candidates emerging from divergence mapping — that is, genes showing elevated levels of divergence between diversifying populations or more rapid coding sequence evolution — include genes with functions in morphogenesis19,67, cytoskeleton development67, the sensory system (including the visual opsin genes)4,19,27,67,76,102,103, oxygen transport27,103, hormone signalling27,67, protein translation27,67, pigmentation19 and the immune system27,102. To date, very few of these gene candidates have been studied in sufficient detail to allow their assignment to discrete functions and to varying phenotypes in cichlids, and a direct link between any of these genes and explosive diversification is currently lacking. Future work should thus focus on the identification and in-depth functional characterization of the genes involved in diversification in cichlids, making use of divergence mapping and laboratory crosses.
Conclusions and outlook
The first wave of genomic exploration of the exceptionally diverse cichlid species flocks of the East African Great Lakes Victoria, Malawi and Tanganyika has identified gene duplication, accelerated coding sequence evolution, transposable element insertion and regulatory evolution, but not increased nucleotide mutation rate, as candidate genomic features underlying explosive diversification19,27. The reshuffling of existing allelic variation via hybridization at the onset or in the form of introgression in the course of adaptive radiation has further been identified as an important factor boosting genetic variation27,57,76, which can be tied directly to novel phenotypes via transgressive segregation106,107. Occasional gene exchange between evolutionary lineages in cichlids is possibly facilitated by the relative overall stability of their genomes in terms of genome size and the number of chromosomes as well as comparably low nucleotide mutation rates (Table 1), allowing species to hybridize across fairly large phylogenetic distances (see refs53,54). It is less clear whether hybridization has a causal role in the generation of taxonomic and phenotypic diversity throughout the course of cichlid adaptive radiations or whether hybridization is rather a by-product of the proliferation and subsequent coexistence of numerous recently diverged taxa that are reproductively isolated primarily based on pre-mating isolation mechanisms.
An important question, then, is what maintains the taxonomic and phenotypic diversity in the cichlid species flocks of the African Great Lakes in light of (occasional) gene flow between lineages. Is assortative mating alone sufficient to keep species stably apart, or are other mechanisms, be they genetic or ecological, involved? Clearly, more experimental work, ideally under (semi-)natural conditions, is needed to disentangle the relative roles of signal and mate choice evolution, genetic incompatibilities, habitat preferences and/or spatiotemporal isolation in cichlid diversification, which in turn would expand our knowledge of the nature of cichlid species. Similarly, more experimental work is needed to clarify the relative contribution of phenotypic plasticity to cichlid evolution108,109 and the potential role of the high turnover rates in sex determination systems110,111,112,113.
Ultimately, it will be important to know more about the genes underlying explosive diversification in cichlids, as well as their exact functions. The possibility of generating and rearing artificial hybrids for genetic mapping of particular traits51,52,113, in combination with divergence mapping in natural populations67,102,114 and the prospect of CRISPR–Cas9 gene editing in cichlids115, promise exciting new insights into this topic, especially when these strategies are combined. Overall, only a small fraction of the East African cichlid fauna has been subjected to genome (re-)sequencing to date. Many more genomes from many more species, and of much higher quality, will be needed to fully understand the genomic underpinnings of adaptive radiation and explosive diversification in cichlids. The cichlid system offers an ideal comparative framework in this context, as excessively proliferating lineages can be compared with their non-radiating sister lineages living outside the lakes as well as with comparatively species-poor lineages that diversified side by side with species-rich ones in the same lake and that descend from the same common ancestor (Fig. 4). Importantly, such considerations make sense only in an integrative framework, that is, when efforts similar to those invested in genome sequencing are undertaken to scrutinize the morphology, ecology, physiology and behaviour of the East African cichlids, rather than focusing solely on ecomorphological traits in the study of cichlid divergence. These empirical efforts should be accompanied by the development of new theory tailored to the phenomenon of explosive diversification and the establishment of novel analytical pipelines to handle these data. Finally, more emphasis should be placed on incorporating information provided by the fossil and paleoecological record buried in the lakes’ sediments116, including the examination of ancient DNA.
The author is grateful for generous support from the European Research Council (ERC) and the Swiss National Science Foundation (SNF) and to three referees for valuable comments.
Nature Reviews Genetics thanks G. F. Turner and the other anonymous reviewer(s) for their contribution to the peer review of this work.
- Model organisms
Non-human species studied in detail in the context of a particular research question with the motivation to be able to make more general statements about the functioning of organisms.
- Clades
Branches on an evolutionary tree, consisting of a common ancestor and all its descendants (a clade is, hence, equivalent to a monophyletic group).
- Divergence mapping
Genetic-marker-based search for genomic regions exhibiting exceptionally strong differentiation between different biological entities, such as populations or species.
- Species flocks
Unusually taxon-rich assemblages of closely related species that coexist in the same area, such as an island, a lake or a section of a river.
- Lateral line
A sensory system along the body of aquatic vertebrates that consists of mechanosensory cells detecting water movement and pressure changes.
- Sympatry
The existence of two or more species in the same geographic area such that they encounter each other frequently.
- Reproductive isolation
Any property that prevents (or reduces the probability of) members of one species breeding successfully with members of another species.
- Gene flow
The movement or exchange of genes into or through a population by interbreeding or by migration and subsequent interbreeding.
- Speciation continuum
The bandwidth of variation between diversifying populations ranging from virtually no variation (panmixia) through partially discontinuous variation (incipient barriers to gene flow) to strongly discontinuous variation (complete reproductive isolation).
- Tribe
The taxonomic rank between the genus and the family level.
- Sister taxa
Reciprocally closest relatives of one another.
- Ring species
Two reproductively isolated populations are connected through a geographic ring of populations that interbreed; no morphological character can be used, except arbitrarily, to divide the ring into discrete taxonomic units.
- Biological species concept
A concept positing that species are groups of actually or potentially interbreeding natural populations that are reproductively isolated from other such groups.
- Haplotype
A stretch of DNA on a single molecule (chromosome, plastid or mitochondrion) that is inherited as a single unit.
- Phylogenetic species concept
A concept positing that species are ‘tips’ on a phylogeny, that is, the smallest set of organisms that share an ancestor and can be distinguished from other such sets.
- Monophyly
The condition of forming a monophyletic group (that is, a clade).
- Ecological species concept
A concept positing that species are sets of organisms that are adapted to a particular set of resources, that is, to the same ecological niche.
- Incomplete lineage sorting
The imperfect segregation of a gene into all evolutionary lineages, that is, a gene fails to coalesce within the duration of a species.
- Introgression
Also referred to as introgressive hybridization. The transfer of genetic material from one species into another via repeated and asymmetrical backcrossing with one of the parental lineages after a hybridization event.
- Mosaic genomes
Genomes heterogeneous in ancestry, emerging from introgression, incomplete lineage sorting or lateral gene transfer.
- Gene trees
Actual evolutionary relationships between the versions of a gene as present in different taxa.
- Species tree
Actual evolutionary relationships between a set of species. The species tree reflects the ‘true’ evolutionary history of a clade.
- Ecological opportunity
The availability of ecologically accessible resources that may be exploited by an evolutionary lineage.
- Trio sequencing
Whole-genome sequencing of two parents and one of their offspring, allowing accurate phasing (and the determination of the nucleotide mutation rate if sequence coverage is high enough).
- Syndrome selection
Selection on a combination of traits (for example, body and mouth morphology and coloration in cichlids) in a given environment.
- FST
Known as the fixation index, FST is a measure of differentiation between two populations due to genetic structure.
- DXY
Average number of differences between two individuals randomly sampled from two populations.
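
The two differentiation statistics defined above can be made concrete with a short sketch. It assumes aligned haplotype sequences of equal length for each population, expresses DXY per site, and uses one of several available FST estimators, a Hudson-style ratio of within-population to between-population diversity. The function names `dxy`, `pi_within` and `fst_hudson` and the toy sequences are hypothetical, and the estimator is a choice made here for illustration, not one prescribed by this glossary.

```python
from itertools import combinations, product


def _differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def dxy(pop_x, pop_y):
    """Mean per-site differences between one sequence from X and one from Y."""
    n_sites = len(pop_x[0])
    total = sum(_differences(sx, sy) for sx, sy in product(pop_x, pop_y))
    return total / (len(pop_x) * len(pop_y) * n_sites)


def pi_within(pop):
    """Mean per-site differences between two sequences from the same population."""
    n_sites = len(pop[0])
    pairs = list(combinations(pop, 2))
    total = sum(_differences(s1, s2) for s1, s2 in pairs)
    return total / (len(pairs) * n_sites)


def fst_hudson(pop_x, pop_y):
    """Hudson-style FST: 1 - (mean within-population diversity / DXY)."""
    between = dxy(pop_x, pop_y)
    if between == 0:
        return 0.0  # identical populations: no differentiation to measure
    within = (pi_within(pop_x) + pi_within(pop_y)) / 2
    return 1 - within / between


if __name__ == "__main__":
    # Toy haplotypes (hypothetical data), two per population.
    pop_x = ["AAAA", "AAAT"]
    pop_y = ["TTAA", "TTAT"]
    print(dxy(pop_x, pop_y), fst_hudson(pop_x, pop_y))
```

DXY is an absolute measure, whereas FST is relative to within-population diversity, so a region of low diversity can show high FST without elevated DXY. Comparing the two is one reason both statistics are reported side by side.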