Main

Candida species are yeasts, many of which live as commensals of animals, including humans. The clinical spectrum of candidiasis ranges from superficial infections of the vagina and the respiratory and intestinal tracts to systemic, potentially life-threatening invasive disease in immunocompromised individuals.

Although Candida albicans is the most common pathogenic species, infections with other Candida species are on the rise. In a recent study1, the genomes of eight members of the Candida clade were compared with those of nine pathogenic and non-pathogenic Saccharomyces species. The genome size of the Candida species varies from 10.6 to 15.4 Mb, with each genome encoding approximately 6,000 genes. Intrachromosomal inversions and local rearrangements are found in the otherwise syntenic genomes of four of the five diploid species. Conserved gene order is also maintained in two of the three haploid species, Candida guilliermondii and Debaryomyces hansenii. The pan-genomic comparison identified 21 gene families that are expanded in the pathogenic species, including expansions in the repertoires of extracellular enzymes, transporters and adhesins. In addition, species-specific gene families potentially associated with C. albicans hyphal growth were identified.

Pathogenic fungi exhibit phenomenal diversity in their sexual behaviour. Analysis of the major mating-type loci identified substantial variability in the Candida genomes that were examined. The absence of mating-type loci such as MAT1α and genes involved in regulating meiosis, such as the IME1 locus, in some species suggests there is a high level of plasticity in the meiotic pathways in Candida. This raises questions about our current understanding of the evolution of sexual reproduction in Candida and other yeasts.

A second recently published paper used an in-depth pairwise comparison to home in on the genomic changes that may have accompanied the evolution of pathogenicity2. As the closest relative of C. albicans, Candida dubliniensis shares many phenotypic characteristics with C. albicans but causes dramatically fewer infections in humans. Its 14.6 Mb genome, although karyotypically distinct, contains 5,758 protein-coding genes that are largely collinear with the C. albicans genome, with 98.1% of genes being positionally conserved. This conservation of genome structure extends even to complex repeat regions such as the subtelomeres, which in other organisms are typically devoted to gene families associated with pathogenicity and evasion of the host immune response3. Overall, 168 C. albicans-specific genes were identified, including genes encoding factors that are involved in the yeast-to-hyphal transition, which has been implicated in pathogenesis. The 115 C. dubliniensis pseudogenes that were identified include orthologues of filamentous-growth regulator genes, which have also been connected with pathogenesis.

Surprisingly, however, the authors did not identify large-scale expansion or degeneration of known hypha-promoting genes as the likely cause of the increased pathogenicity of C. albicans. Instead, they suggest that disparities in copy number and the diversification of members of several previously uncharacterized gene families are responsible. The authors exemplified this with an in-depth analysis of two gene families, the Ifa family of putative leucine-rich-repeat transmembrane proteins and the Tlo family of putative transcription factors. They concluded that, since speciation from the common ancestor, the C. dubliniensis genetic repertoire has been shaped by gene loss and the creation of pseudogenes, whereas the C. albicans Ifa and Tlo gene families have expanded. Indeed, experimental validation demonstrated that the loss of TLO1 in C. dubliniensis notably reduced hyphal formation in response to serum, a defect that could be rescued by complementation.

These two studies illustrate how large-scale, multi-species comparisons can assist our understanding of complex processes such as the evolution of mating behaviour, sexual reproduction and virulence in fungi. They also highlight the advantage of having the finished genome sequence for an organism and the types of questions that can be addressed by draft-quality sequences, a point that is worth remembering as the research community increasingly moves towards investigating genomic variation at the population level.