The Périgord black truffle (Tuber melanosporum Vittad.) and the Piedmont white truffle dominate today’s truffle market1,2. The hypogeous fruiting body of T. melanosporum is a gastronomic delicacy produced by an ectomycorrhizal symbiont3 endemic to calcareous soils in southern Europe. The worldwide demand for this truffle has fuelled intense efforts at cultivation. Identification of processes that condition and trigger fruiting body and symbiosis formation, ultimately leading to efficient crop production, will be facilitated by a thorough analysis of truffle genomic traits. In the ectomycorrhizal Laccaria bicolor, the expansion of gene families may have acted as a ‘symbiosis toolbox’4. This feature may however reflect evolution of this particular taxon and not a general trait shared by all ectomycorrhizal species5. To get a better understanding of the biology and evolution of the ectomycorrhizal symbiosis, we report here the sequence of the haploid genome of T. melanosporum, which at ∼125 megabases is the largest and most complex fungal genome sequenced so far. This expansion results from a proliferation of transposable elements accounting for ∼58% of the genome. In contrast, this genome contains only ∼7,500 protein-coding genes with very rare multigene families. It lacks large sets of carbohydrate-cleaving enzymes, but a few of them involved in degradation of plant cell walls are induced in symbiotic tissues. The latter feature and the upregulation of genes encoding lipases and multicopper oxidases suggest that T. melanosporum degrades its host cell walls during colonization. Symbiosis induces an increased expression of carbohydrate and amino acid transporters in both L. bicolor and T. melanosporum, but the comparison of genomic traits in the two ectomycorrhizal fungi showed that genetic predispositions for symbiosis (the ‘symbiosis toolbox’) evolved along different paths in ascomycetes and basidiomycetes.
The 125-megabase (Mb) genome of T. melanosporum is the largest sequenced fungal genome to date6, but no evidence for whole-genome duplication or large scale dispersed segmental duplications was observed (Supplementary Table 1 and Supplementary Information section 2). The approximately fourfold larger size of the truffle genome compared with other sequenced ascomycetes is accounted for by multi-copy transposable elements (TE) which constitute about 58% of the assembled genome (Fig. 1, Supplementary Figs 5, 6 and 8, Supplementary Information section 3). Estimated insertion times suggest a major wave of retrotransposition at <5 million years ago (Supplementary Fig. 7). TEs are not uniformly spread across the genome, but are clustered in gene-poor or gene-lacking regions (Fig. 1 and Supplementary Fig. 8). The expansion of regions between blocks of protein-coding genes results from an increased density of TEs. The proliferation of TEs within the truffle genome may result from its low effective population size7 during postglaciation migrations8 (Supplementary Information section 2.5).
The predicted proteome is in the lower range of sequenced filamentous fungi6, as only 7,496 protein-coding genes were identified (Supplementary Information section 4). They are mainly located in TE-poor regions and the gene density is heterogeneous when compared with that of other ascomycetes (Fig. 1, Supplementary Figs 8 and 9). Among the predicted proteins, only 3,970, 5,596 and 5,644 showed significant sequence similarity to proteins from Saccharomyces cerevisiae, Neurospora crassa and Aspergillus niger, respectively (Supplementary Fig. 10). This agrees with the predicted ancient separation (>450 Myr ago) of the Pezizomycetes from the other ancestral fungal lineages (Supplementary Fig. 4)9. Of the ∼5,650 T. melanosporum genes that have an orthologue, very few show conservation of neighbouring orthologues (synteny) in at least one of the other species (Supplementary Fig. 11, Supplementary Information section 5.2). The T. melanosporum genome shows a structural organization strikingly different from other sequenced ascomycetes; the largest syntenic region (with Coccidioides immitis) only contains 99 genes with 39 orthologues (Supplementary Fig. 12). TE proliferation probably facilitated genome rearrangements. Some regions of meso-synteny were however detected, suggesting that T. melanosporum could be used for assessing the genome organization of ancestral ascomycete clades.
Expression of most predicted genes was detected in free-living mycelium, ectomycorrhizal (ECM) root tips and/or fruiting body by custom oligoarrays, expressed-sequence-tag pyrosequencing and Illumina RNA-Seq (Supplementary Information sections 2.4 and 8, Supplementary Table 2, Supplementary Fig. 26). Only a low proportion of transcripts (7.6%) is differentially expressed (fold-ratio ≥4.0, P < 0.05) in either ectomycorrhiza or fruiting body by comparison to free-living mycelium (Table 1, Supplementary Table 4). Only 61 transcripts unique to ectomycorrhiza, fruiting body or free-living mycelium were detected (Supplementary Table 5). A few transcripts coding for a H-type lectin, an arabinogalactan protein, a LysM-domain containing protein, major facilitator superfamily (MFS) transporters, laccase/tyrosinase, a lipase and polysaccharide-degrading enzymes are strikingly enriched (>1,000-fold) in symbiotic tissues (Table 1). They may play a role in adhesion to host cells, detoxication of plant defence metabolites, nutrient exchange, and colonization of root apoplast through the deconstruction of cell walls.
A process that is crucial to the success of ECM interactions is the mutualistic exchange of nutrients between the microsymbiont and its host plant. A comparison with other fungi revealed that the total number of predicted transporters is lower in T. melanosporum (381 members) than in L. bicolor (491 members) or in saprotrophic and pathogenic ascomycetes (481–781 members) (Supplementary Table 26). However, 64 predicted membrane transporters showed upregulated expression in truffle ectomycorrhizas, suggesting increased fluxes of carbohydrates, oligopeptides, amino acids and polyamines at the symbiotic interface (Supplementary Table 27). PFAM classification of fungal genes induced in symbiotic tissues of either L. bicolor or T. melanosporum ECM root tips revealed strikingly divergent fungal symbiotic proteomes (Supplementary Fig. 15). However, the PFAM categories corresponding to the MFS transporters (PFAM00083), aquaporin-related major intrinsic proteins (PFAM00230) and amino acid permeases (PFAM00324) were among the most strongly overrepresented in genes that were transcriptionally upregulated in both L. bicolor and T. melanosporum ectomycorrhizas.
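The notion of a PFAM category being "overrepresented" among upregulated genes can be illustrated with a one-sided hypergeometric test. This is only a sketch: the text does not state which statistical test was used, so the choice of test is an assumption, and the counts below are invented toy values rather than figures from the study.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the domain of interest.

    Assumption: a hypergeometric model is used here purely for illustration.
    """
    upper = min(K, n)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical toy numbers (not taken from the paper):
# 1,000 genes in total, 20 carrying a given PFAM domain,
# 50 upregulated genes, 5 of which carry that domain.
p_enrichment = hypergeom_upper_tail(k=5, K=20, n=50, N=1000)
print(f"P(X >= 5) = {p_enrichment:.4g}")
```

With one domain-bearing gene expected by chance among the 50 upregulated genes, observing five gives a small tail probability, which is the sense in which a category would be called overrepresented.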
Orthologous genes (that is, reciprocal best hits, BLASTP e-value ≤10⁻⁵) significantly induced in the symbiosis represent only 1.5% and 4.1% of the ectomycorrhiza-upregulated genes in T. melanosporum and L. bicolor, respectively. Most of these rare transcripts code for membrane transporters involved in sugar, amino acid or sulphate uptake (Table 2). This transcriptome trait appears to be a hallmark of the mycorrhizal symbiosis. The resulting increased nutrient flux probably explains the beneficial effect of the symbionts on the growth of their host seedlings (Supplementary Information section 1 and Supplementary Fig. 3). Other overrepresented PFAM categories displayed different patterns in the two symbionts. None of the effector-like small secreted MiSSP proteins specifically expressed in L. bicolor ectomycorrhizas4 were detected among ectomycorrhiza-regulated T. melanosporum transcripts.
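The reciprocal-best-hit criterion used to define orthologues can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the hit tuples, the gene identifiers and the choice of the highest bit score as the "best" hit are all assumptions made for the example. Only the e-value cutoff of 10⁻⁵ comes from the text.

```python
def best_hits(hits, max_evalue=1e-5):
    """Map each query to its best subject.

    hits: iterable of (query, subject, evalue, bitscore) tuples.
    Assumption: "best" means highest bit score among hits passing the cutoff.
    """
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue  # discard hits weaker than the e-value cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-5):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
tm_vs_lb = [
    ("Tm1", "Lb1", 1e-50, 200.0),
    ("Tm1", "Lb2", 1e-20, 90.0),
    ("Tm2", "Lb2", 1e-30, 120.0),
    ("Tm3", "Lb3", 1e-3, 30.0),   # fails the e-value cutoff
]
lb_vs_tm = [
    ("Lb1", "Tm1", 1e-48, 198.0),
    ("Lb2", "Tm1", 1e-22, 95.0),  # Lb2 prefers Tm1, so Tm2/Lb2 is not reciprocal
    ("Lb3", "Tm3", 1e-3, 30.0),
]
orthologue_pairs = reciprocal_best_hits(tm_vs_lb, lb_vs_tm)
print(orthologue_pairs)
```

In this toy input only the Tm1/Lb1 pair survives, because Tm2's best hit is not reciprocated and the Tm3/Lb3 hits are too weak.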
One of the most striking characteristics of the T. melanosporum genome is the almost complete absence of highly similar gene pairs. Of the predicted 7,496 protein-coding genes, only seven pairs share >90% amino-acid identity in their coding sequence, whereas 30 pairs share >80% identity (Supplementary Information section 5.3, Fig. 2). This feature was also observed in the ascomycetous saprotroph N. crassa10. In striking contrast to the ECM L. bicolor4, multigene families in T. melanosporum are limited in number and comprise only 19% of the predicted proteome; most families have only two members (Supplementary Fig. 13). The rate of gene family gain is much lower than the rate of gene loss and among the 11,234 gene families found in ascomycetes, 5,695 appear to be missing in T. melanosporum (Supplementary Information section 5.4, Fig. 2). This compact gene coding space may reflect the genome organization of an ascomycete common ancestor, as the Pezizomycetes clade is the earliest diverging lineage within the Pezizomycotina (Supplementary Fig. 4). By comparison to other ascomycetes, gene families predicted to encode metabolite transporters (for example, amino acid and sugar permeases) and secondary metabolism enzymes (such as polyketide synthases and cytochrome P450s) are much smaller. Only 465 genes encoded by expanding gene families of L. bicolor are also found in the T. melanosporum genome (BLASTP, e-value ≤10⁻⁵) and 154 orthologues are shared between expanding gene families of both symbionts. None of them is differentially expressed in ectomycorrhizas. Differences in gene family expansion, in particular dynamic repertoires of genes encoding symbiosis-regulated effector-like proteins and sugar-cleaving enzymes (see below), are probably responsible for different symbiotic traits between T. melanosporum and L. bicolor, such as altered host specificity. The compact genome of T. melanosporum might be a product of selection for specialization; this is because genome expansion, as observed in L. bicolor, is probably driven by selection on the symbiont to exploit a diversity of encountered substrates provided by multiple potential hosts and by their diverse soils4,5.
The volatiles released by truffles are attractive to rodents and truffle flies11, which disperse their spores, but also to humans who consider this elusive mushroom a delicacy. T. melanosporum is the first sequenced fungus producing highly flavoured hypogeous fruiting bodies (Supplementary Information section 6.4, Fig. 3). Genomic signatures of the long-standing (>2,000-year-old) reputation of the black truffle as a gastronomic delicacy are its extremely low allergenic potential (Supplementary Fig. 18), coupled with the lack of key mycotoxin biosynthetic enzymes (Supplementary Information section 6.2, Supplementary Table 14), and the preferential overexpression of various flavour-related enzymes in the fruiting body (Supplementary Figs 19–21). Among the latter are specific subsets of sulphur assimilation and S-amino acid interconversion enzymes. These include cystathionine lyases known to promote the side-formation of methyl sulphide volatiles abundant in truffles12, as well as various enzymes involved in amino acid degradation through the Ehrlich pathway, which give rise to known truffle volatiles and flavours, for example, 2-methyl-1-butanal (Fig. 3, Supplementary Information section 6.4, Supplementary Figs 20 and 21). Also notable, given the subterranean habitat of this fungus, is the presence of various putative light-sensing components (Supplementary Information section 6.6), which might be involved in light avoidance mechanisms and/or in the control of seasonal developmental variations, especially those related to fruiting body formation and sexual reproduction.
The analysis of genes implicated in the mating process, including pheromone response, meiosis and fruiting body development showed that most sex-related components identified in other ascomycetes are also present in T. melanosporum (Supplementary Table 11). Sexual reproduction in ascomycete filamentous fungi is partly controlled by two different mating-type (MAT) genes that establish sexual compatibility13: one MAT gene codes for a protein with an α-box domain, whereas the other encodes a high mobility group (HMG) DNA binding protein (Supplementary Information section 6.5). It was widely believed that T. melanosporum was a homothallic or even an exclusively selfing species14. The sequenced Mel28 strain contains the HMG locus, and the opposite linked MATα locus was identified in another natural isolate (Supplementary Fig. 22), confirming recent hints that T. melanosporum is heterothallic and thus an obligate outcrossing species15. This result has major implications for truffle cultivation, which will be improved by the use of host plants harbouring truffle strains of opposite mating types. In most ascomycetes, the genomic regions flanking the MAT locus show an extended conservation13, but there is no synteny of the MAT loci between T. melanosporum and other sequenced fungi (Supplementary Fig. 23).
To determine whether T. melanosporum sugar-cleaving capabilities resemble those of other fungi, we have undertaken a comparison of the glycoside hydrolase (GH) and polysaccharide lyase (PL) repertoires16 of 18 completely sequenced fungi (Fig. 4). As expected for a symbiotic fungus living in the root apoplast, T. melanosporum has a relatively small number of GH-encoding genes (91 members; Supplementary Tables 23 and 24); much fewer than phytopathogens (for example, Magnaporthe grisea, Fusarium graminearum) and saprotrophs (for example, N. crassa, Podospora anserina). The T. melanosporum GH repertoire bears some similarity with that of L. bicolor4, especially a reduced spectrum of enzymes targeting the plant cell wall compared to saprobes, culminating in both fungi with the absence of cellulases from families GH6 and GH7. There are however significant differences in the spectrum of enzymes present in these two symbiotic fungi. For instance, T. melanosporum has hemicellulases from families GH10 and GH43, whereas L. bicolor has none. Similarly, T. melanosporum has a family GH45 cellulase that is absent from the L. bicolor genome. Other differences include different strategies to cleave pectin: whereas L. bicolor utilizes six hydrolytic GH28 pectinases, T. melanosporum has only two, but these are complemented by three pectin lyases and a pectin methylesterase that are missing in L. bicolor. Both fungi have a set of proteins, few in number, bearing cellulose-binding domains, but differences appear here too: the single cellulose-binding CBM1 motif of L. bicolor is appended to a GH5 endoglucanase, whereas T. melanosporum has two CBM1 motifs attached to a GH61 enzyme and to a protein of unknown function. GH61 enzymes have been reported to display weak cellulolytic activity17.
An extended comparison with other sequenced fungi (Fig. 4) shows that T. melanosporum clusters neither with L. bicolor nor with saprotrophic ascomycetes, most probably because its limited overall number of GHs and PLs makes it closer to yeasts and to fungi that interact with animals rather than with plants (Cryptococcus neoformans, Malassezia globosa). Differences between the enzyme repertoires of T. melanosporum and L. bicolor suggest differences in the mode of interaction of the two symbionts with their respective host plants. A striking difference is the presence of an invertase gene in T. melanosporum, whereas L. bicolor has none and is therefore completely dependent on its host for its provision of glucose5. In contrast, T. melanosporum could access and hydrolyse the plant-derived sucrose. This would suggest that although both fungi develop symbiotic relationships with plants, T. melanosporum is probably less dependent on its host than L. bicolor. The overall pattern of induction of genes coding for enzymes acting on polysaccharides is similar in both L. bicolor and T. melanosporum symbiotic transcriptomes, although a larger number of carbohydrate-cleaving enzyme transcripts are upregulated in L. bicolor for some families, for example GH16 (β-1,6-glucanases), GH18 (chitinases) and GT20 (α,α-trehalose-phosphate synthase) (Supplementary Table 25 and Supplementary Fig. 24). Intriguingly, a GH5 cellulase and a GH28 pectinase are among the rare transcripts that are highly upregulated in both L. bicolor and T. melanosporum ectomycorrhizas, suggesting that they play a key role in the symbiosis. On the other hand, the β-glucan synthesis-associated protein present in both ectomycorrhizas is involved in fungal cell wall remodelling16 and may play a role in the alteration of cell wall surface during infection to conceal the hyphae from the host.
The ability to establish ECM symbioses is a widespread characteristic of various ascomycetes and basidiomycetes3. The truffle genome reveals features of an ancestral fungal lineage that diverged from other lineages >450 Myr ago9. Despite their similar symbiotic structures and similar beneficial effects on plant growth, the ascomycete T. melanosporum and the basidiomycete L. bicolor encode strikingly different proteomes—compact with very few multigene families, versus large with many expanded multigene families—and symbiosis-regulated genes. Effector-like proteins, such as the L. bicolor ECM-induced SSP MiSSP7 (ref. 4), are not expressed in T. melanosporum ectomycorrhizas. On the basis of our results, the ECM symbiosis appears as an ancient innovation that developed several times during the course of Mycota evolution using different ‘molecular toolkits’18. Sequencing of the T. melanosporum genome has provided unprecedented insights into the molecular bases of symbiosis, sex and fruiting in a most popular representative of the only lifestyle not yet addressed by Ascomycota genomics19. This sequencing will be a major step in moving truffle research into the realm of ecosystem science, and a deeper understanding of the genome of the Périgord black truffle is expected to have substantial social and cultural impact.
A whole-genome shotgun strategy was adopted for sequencing and assembling the T. melanosporum genome (Supplementary Information section 2). All genomic DNA was obtained from the homokaryotic haploid strain Mel28. All data were generated by paired-end sequencing of cloned 3 kb and 10 kb inserts using Sanger technology. The pool of data available for the assembly consisted of 1,262,177 reads, with ∼1,250 Mb of sequence. The data were assembled using the ARACHNE assembler. The 4,464 contigs (N50 = 62 kb) were assembled into 398 supercontigs (N50 = 638 kb) corresponding to 124.946 Mb of sequence. The main genome scaffolds were covered at a sequencing depth of about 10×. Assemblies and annotations are available at INRA (http://mycor.nancy.inra.fr/IMGC/TuberGenome/) and Genoscope (http://www.genoscope.cns.fr/tuber).
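For readers unfamiliar with the assembly statistics quoted above, the following sketch shows how an N50 value and a mean sequencing depth are conventionally computed. The function names and the toy contig lengths are illustrative assumptions. Only the read total (∼1,250 Mb) and the assembly size (124.946 Mb) are taken from the text.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def mean_depth(total_read_bases, assembly_bases):
    """Average number of times each assembled base is covered by reads."""
    return total_read_bases / assembly_bases


# Toy contig lengths (hypothetical): total 300, so half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Figures quoted in the text: ~1,250 Mb of reads over a 124.946 Mb assembly.
depth = mean_depth(1_250e6, 124.946e6)
print(toy_n50, round(depth, 1))
```

The depth calculation uses raw read bases, so it slightly overstates the coverage of any region where reads were trimmed or left unassembled.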
The GAZE pipeline selected a best representative gene model for each locus on the basis of expressed-sequence-tag support and similarity to known proteins from other organisms, and predicted 7,496 protein-coding gene models (Supplementary Information section 4). All predicted genes were annotated using Gene Ontology and KEGG pathways. Protein domains were predicted using InterProScan. Gene families were built from proteins using Tribe-MCL.
Single dye labelling of cDNAs, hybridization procedures, data acquisition, background correction and normalization of custom-exon expression arrays were performed at the NimbleGen facilities following their standard protocol. A Student's t-test with false discovery rate correction was applied to the data using the ARRAYSTAR software (DNASTAR). Transcripts with a significant P value (<0.05) and ≥4-fold change in transcript level were considered as differentially expressed in ECM root tips or fruiting body.
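The selection rule described above (a corrected P value below 0.05 combined with a fold change of at least 4) can be sketched as follows. The text does not name the false discovery rate procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the transcript identifiers and values are invented. The sketch covers only upregulated transcripts; downregulated ones would need the inverse ratio.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(records, min_fold=4.0, alpha=0.05):
    """Select transcript ids passing both thresholds.

    records: list of (transcript_id, fold_ratio, raw_p_value), where
    fold_ratio is tissue signal divided by free-living mycelium signal.
    """
    adjusted = benjamini_hochberg([p for _, _, p in records])
    return [
        tid
        for (tid, fold, _), q in zip(records, adjusted)
        if fold >= min_fold and q < alpha
    ]


# Hypothetical transcripts, for illustration only.
toy_records = [
    ("t1", 1200.0, 0.0001),  # strong induction, clearly significant
    ("t2", 5.0, 0.04),       # raw P < 0.05 but fails after correction
    ("t3", 2.0, 0.001),      # significant but below the fold threshold
    ("t4", 8.0, 0.2),        # large fold change but not significant
]
selected = differentially_expressed(toy_records)
print(selected)
```

The second toy transcript shows why the correction matters: its raw P value passes the 0.05 threshold, but its adjusted value does not.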
Genome assemblies together with predicted gene models and annotations have been deposited at DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank under the project accession numbers CABJ01000001–CABJ01004455 (whole genome shotgun sequencing data) and FN429986–FN430383 (scaffolds and annotations). The complete expression dataset is available as series (accession number GSE17529) at the Gene Expression Omnibus at NCBI.
We thank the late L. Riousset and C. Dupré for providing the Mel28 isolate, and acknowledge F. Le Tacon and J. Weissenbach for continuous support. The genome sequencing of T. melanosporum was funded by the Génoscope, Institut de Génomique, CEA, and Agence Nationale de la Recherche (ANR). Genome annotation and transcriptome analysis were supported by INRA, the European FP6 Network of Excellence EVOLTREE, Région Lorraine, the ANR FungEffector project, Fondazione Cariparma, Compagnia di San Paolo-Torino, the Italian Ministry of Education, University and Research (MIUR), Regione Umbria and Instituto Pasteur Fondazione Cenci Bolognetti. We thank D. Hibbett and J. Heitman for comments on an early draft of the manuscript, and J. Plett for a critical reading of the paper.
Author Contributions B.H., M.-H.L., F.P., P. Bonfante, S.O. and P.W. contributed equally to this work as senior authors; A.K., C.M., R.B., P.M.C., O.J., B.M., E.M., B.N., R.P., B.P. and A.R. contributed equally to this work as second authors. F.M. initiated the project and coordinated the genome annotation, data analysis and manuscript preparation; P.W. coordinated the sequencing and automated annotation at Genoscope. F.M. and S.O. wrote the manuscript with input from P. Bonfante. R.B., P.M.C., B.H., O.J., A.K., M.-H.L., B.M., E.M., C.M., B.N., F.P., R.P., A.R. and P.W. also made substantial contributions (listed in alphabetical order). All other authors are members of the genome sequencing consortium and contributed annotation, analyses or data throughout the project, and are listed in alphabetical order. We also thank A. Bonfigli, M. Buffalini, S. Colafarina, T. Flutre, S. Kamal, P. Ceccaroli, C. Roux, R. Saltarelli, S. von Pall di Tolna and O. Zarivi for their assistance in annotation.
This file contains Supplementary Methods, Supplementary Results, a Supplementary Discussion, Supplementary References, Supplementary Tables 1–31 and Supplementary Figures 1–26 with legends.