Evolutionary and functional patterns of shared gene neighbourhood in fungi


Gene clusters comprise genomically co-localized and potentially co-regulated genes that tend to be conserved across species. In eukaryotes, multiple examples of metabolic gene clusters are known, particularly among fungi and plants. However, little is known about how gene clustering patterns vary among taxa or with respect to functional roles. Furthermore, the mechanisms underlying the formation, maintenance and evolution of gene clusters remain unknown. We surveyed 341 fungal genomes to discover gene clusters shared by different species, independently of their functions. We inferred 12,120 cluster families, which comprised roughly one third of the gene space and were enriched in genes associated with diverse cellular functions. Additionally, most clusters did not encode transcription factors, suggesting that they are regulated distally. We used phylogenomics to characterize the evolutionary history of these clusters. We found that most clusters originated once and were transmitted vertically, coupled to differential loss. However, convergent evolution (that is, independent appearance of the same cluster) was more prevalent than anticipated. Finally, horizontal gene transfer of entire clusters was somewhat restricted, with the exception of those associated with secondary metabolism. Altogether, our results provide insights into the evolution of gene clustering as well as a broad catalogue of evolutionarily conserved gene clusters whose function remains to be elucidated.


Fig. 1: Sunburst chart showing the percentage of clustered protein-coding genes per species.
Fig. 2: Example of gene cluster families classified as vertically evolving.
Fig. 3: Example of a gene cluster family classified as HGT.
Fig. 4: Example of a gene cluster family that has undergone convergent evolution.
Fig. 5

Data availability

All proteomes were downloaded from NCBI (April 2017). The full list of BioProjects can be found in Supplementary Table 1. All other data are available from the corresponding author upon request.

Code availability

EvolClust can be downloaded from our GitHub repository, and Evolclassifier can be found in the folder ‘additional scripts’ in that repository.




The T.G. group acknowledges support from the Spanish Ministry of Economy, Industry and Competitiveness (MEIC) for the EMBL partnership, and grants ‘Centro de Excelencia Severo Ochoa 2013-2017’ SEV-2012-0208 and BFU2015-67107, co-funded by the European Regional Development Fund (ERDF); from the CERCA Programme/Generalitat de Catalunya; from the Catalan Research Agency (AGAUR) SGR857; and from the European Union’s Horizon 2020 research and innovation programme under grant agreement ERC-2016-724173 and the Marie Sklodowska-Curie grant agreement H2020-MSCA-ITN-2014-642095.

Author information




T.G. and M.M.-H. designed the study. M.M.-H. gathered the data and performed the cluster prediction and phylogenomics analysis. T.G. and M.M.-H. analysed the results and wrote the manuscript. T.G. supervised the study.

Corresponding author

Correspondence to Toni Gabaldón.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary discussion and Supplementary Figs. 1–4.

Reporting Summary

Supplementary Tables

Supplementary tables 1–19

Supplementary Table 2

List of predicted clusters

About this article

Cite this article

Marcet-Houben, M., Gabaldón, T. Evolutionary and functional patterns of shared gene neighbourhood in fungi. Nat Microbiol 4, 2383–2392 (2019).
