Abstract
Gene clusters comprise genomically co-localized and potentially co-regulated genes that tend to be conserved across species. In eukaryotes, multiple examples of metabolic gene clusters are known, particularly among fungi and plants. However, little is known about how gene clustering patterns vary among taxa or with respect to functional roles. Furthermore, mechanisms of the formation, maintenance and evolution of gene clusters remain unknown. We surveyed 341 fungal genomes to discover gene clusters shared by different species, independently of their functions. We inferred 12,120 cluster families, which comprised roughly one third of the gene space and were enriched in genes associated with diverse cellular functions. Additionally, most clusters did not encode transcription factors, suggesting that they are regulated distally. We used phylogenomics to characterize the evolutionary history of these clusters. We found that most clusters originated once and were transmitted vertically, coupled to differential loss. However, convergent evolution—that is, independent appearance of the same cluster—was more prevalent than anticipated. Finally, horizontal gene transfer of entire clusters was somewhat restricted, with the exception of those associated with secondary metabolism. Altogether, our results provide insights on the evolution of gene clustering as well as a broad catalogue of evolutionarily conserved gene clusters whose function remains to be elucidated.
Data availability
All proteomes were downloaded from NCBI (April 2017). The full list of BioProjects can be found in Supplementary Table 1. All other data are available from the corresponding author upon request.
Code availability
EvolClust can be downloaded from our GitHub repository (https://github.com/Gabaldonlab/EvolClust/), and Evolclassifier can be found in the folder ‘additional scripts’ in the same repository.
Acknowledgements
The T.G. group acknowledges support from the Spanish Ministry of Economy, Industry and Competitiveness (MEIC) for the EMBL partnership, and grants ‘Centro de Excelencia Severo Ochoa 2013-2017’ SEV-2012-0208 and BFU2015-67107 co-funded by the European Regional Development Fund (ERDF); from the CERCA Programme/Generalitat de Catalunya; from the Catalan Research Agency (AGAUR) SGR857; and from the European Union’s Horizon 2020 research and innovation programme under grant agreement ERC-2016-724173 and the Marie Sklodowska-Curie grant agreement H2020-MSCA-ITN-2014-642095.
Author information
Contributions
T.G. and M.M.-H. designed the study. M.M.-H. gathered the data and performed the cluster prediction and phylogenomics analysis. T.G. and M.M.-H. analysed the results and wrote the manuscript. T.G. supervised the study.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary discussion and Supplementary Figs. 1–4.
Supplementary Tables
Supplementary Tables 1–19.
Supplementary Table 2
List of predicted clusters
About this article
Cite this article
Marcet-Houben, M., Gabaldón, T. Evolutionary and functional patterns of shared gene neighbourhood in fungi. Nat Microbiol 4, 2383–2392 (2019). https://doi.org/10.1038/s41564-019-0552-0
DOI: https://doi.org/10.1038/s41564-019-0552-0