Cellulosomes are large, multiprotein complexes that tether plant biomass-degrading enzymes together for improved hydrolysis [1]. These complexes were first described in anaerobic bacteria, where species-specific dockerin domains mediate the assembly of enzymes onto cohesin motifs interspersed within protein scaffolds [1]. The versatile protein assembly mechanism conferred by the bacterial cohesin–dockerin interaction is now a standard design principle for synthetic biology [2,3]. For decades, analogous structures have been reported in anaerobic fungi, where they are known to assemble through sequence-divergent non-catalytic dockerin domains (NCDDs) [4]. However, the components, modular assembly mechanism and functional role of fungal cellulosomes remain unknown [5,6]. Here, we describe a comprehensive set of proteins critical to fungal cellulosome assembly, including conserved scaffolding proteins unique to the Neocallimastigomycota. High-quality genomes of the anaerobic fungi Anaeromyces robustus, Neocallimastix californiae and Piromyces finnis were assembled with long-read, single-molecule technology. Genomic analysis coupled with proteomic validation revealed an average of 312 NCDD-containing proteins per fungal strain, which were overwhelmingly carbohydrate-active enzymes (CAZymes). In addition, 95 large fungal scaffoldins that bind to NCDDs were identified across four genera. Fungal dockerin and scaffoldin domains have no similarity to their bacterial counterparts, yet several catalytic domains originated via horizontal gene transfer with gut bacteria. However, the biocatalytic activity of anaerobic fungal cellulosomes is expanded by the inclusion of GH3, GH6 and GH45 enzymes. These findings suggest that the fungal cellulosome is an evolutionarily chimaeric structure: an independently evolved fungal complex that co-opted useful activities from bacterial neighbours within the gut microbiome.



  1. Cellulosomes: highly efficient nanomachines designed to deconstruct plant cell wall complex carbohydrates. Annu. Rev. Biochem. 79, 655–681 (2010).

  2. Facilitated substrate channeling in a self-assembled trifunctional enzyme complex. Angew. Chem. Int. Ed. 51, 8787–8790 (2012).

  3. Functional assembly of a multi-enzyme methanol oxidation cascade on a surface-displayed trifunctional scaffold for enhanced NADH production. Chem. Commun. 49, 3766–3768 (2013).

  4. The conserved noncatalytic 40-residue sequence in cellulases and hemicellulases from anaerobic fungi functions as a protein docking domain. J. Biol. Chem. 270, 29314–29322 (1995).

  5. et al. Characterization of a double dockerin from the cellulosome of the anaerobic fungus Piromyces equi. J. Mol. Biol. 373, 612–622 (2007).

  6. Anaerobic gut fungi: advances in isolation, culture, and cellulolytic enzyme discovery for biofuel production. Biotechnol. Bioeng. 111, 1471–1482 (2014).

  7. et al. Early-branching gut fungi possess a large, comprehensive array of biomass-degrading enzymes. Science 351, 1192–1195 (2016).

  8. et al. Assembly of minicellulosomes on the surface of Bacillus subtilis. Appl. Environ. Microb. 77, 4849–4858 (2011).

  9. Functional assembly of minicellulosomes on the Saccharomyces cerevisiae cell surface for cellulose hydrolysis and ethanol production. Appl. Environ. Microb. 75, 6087–6093 (2009).

  10. Functional display of complex cellulosomes on the yeast surface via adaptive assembly. ACS Synth. Biol. 2, 14–21 (2013).

  11. The anaerobic fungus Neocallimastix frontalis: isolation and properties of a cellulosome-type enzyme fraction with the capacity to solubilize hydrogen-bond-ordered cellulose. Appl. Microbiol. Biotechnol. 37, 125–129 (1992).

  12. et al. Phosphorylation of spore coat proteins by a family of atypical protein kinases. Proc. Natl Acad. Sci. USA 113, E3482–E3491 (2016).

  13. et al. beta-Glucosidase in cellulosome of the anaerobic fungus Piromyces sp. strain E2 is a family 3 glycoside hydrolase. Biochem. J. 370, 963–970 (2003).

  14. et al. The genome of the anaerobic fungus Orpinomyces sp. strain C1A reveals the unique evolutionary history of a remarkable plant biomass degrader. Appl. Environ. Microb. 79, 4620–4634 (2013).

  15. Conserved repeat motifs and glucan binding by glucansucrases of oral streptococci and Leuconostoc mesenteroides. J. Bacteriol. 186, 8301–8308 (2004).

  16. et al. Characterization of a cellulosome dockerin domain from the anaerobic fungus Piromyces equi. Nat. Struct. Biol. 8, 775–778 (2001).

  17. The cellulosomes: multienzyme machines for degradation of plant cell wall polysaccharides. Annu. Rev. Microbiol. 58, 521–554 (2004).

  18. Horizontal gene transfer of glycosyl hydrolases of the rumen fungi. Mol. Biol. Evol. 17, 352–361 (2000).

  19. et al. Mycocosm portal: gearing up for 1000 fungal genomes. Nucleic Acids Res. 42, D699–D704 (2014).

  20. et al. IMG/m 4 version of the integrated metagenome comparative analysis system. Nucleic Acids Res. 42, D568–D573 (2014).

  21. et al. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res. 40, D1178–D1186 (2012).

  22. et al. Noncatalytic docking domains of cellulosomes of anaerobic fungi. J. Bacteriol. 183, 5325–5333 (2001).

  23. Driving biomass breakdown through engineered cellulosomes. Bioengineered 6, 204–208 (2015).

  24. et al. Rnnotator: an automated de novo transcriptome assembly pipeline from stranded RNA-Seq reads. BMC Genomics 11, 663 (2010).

  25. Robust and effective methodologies for cryopreservation and DNA extraction from anaerobic gut fungi. Anaerobe 38, 39–46 (2016).

  26. FinisherSC: a repeat-aware tool for upgrading de novo assembly using long reads. Bioinformatics 31, 3207–3209 (2015).

  27. et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 10, 563–569 (2013).

  28. et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380 (2005); corrigendum 441, 120 (2006).

  29. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829 (2008).

  30. et al. Gap resolution: a software package for improving newbler genome assemblies. in Proceedings of the 4th Annual Meeting on Sequencing Finishing, Analysis in the Future 35 (2009).

  31. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).

  32. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).

  33. et al. Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948 (2007).

  34. Multiple alignment using hidden Markov models. Proc. Int. Conf. Intell. Syst. Mol. Biol. 3, 114–120 (1995).

  35. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

  36. Trimal: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  37. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006).

  38. et al. Cellulases and hemicellulases of the anaerobic fungus Piromyces constitute a multiprotein cellulose-binding complex and are encoded by multigene families. FEMS Microbiol. Lett. 125, 15–21 (1995).

  39. Microplate-based carboxymethylcellulose assay for endoglucanase activity. Anal. Biochem. 342, 176–178 (2005).

  40. et al. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 44, D447–D456 (2016).



The authors acknowledge funding support from the Office of Science (BER), US Department of Energy (DE-SC0010352), the US Department of Agriculture (award no. 2011-67017-20459), the National Science Foundation (DGE 1144085) and the Institute for Collaborative Biotechnologies through grant no. W911NF-09-0001 from the US Army Research Office. A portion of this research was performed under the Facilities Integrating Collaborations for User Science (FICUS) exploratory effort and used resources at the DOE Joint Genome Institute and the Environmental Molecular Sciences Laboratory, which are DOE Office of Science User Facilities. Both facilities are sponsored by the Office of Biological and Environmental Research and operated under contract nos. DE-AC02-05CH11231 (JGI) and DE-AC05-76RL01830 (EMSL). The authors acknowledge support from the California NanoSystems Institute (CNSI), supported by the University of California, Santa Barbara, and the University of California, Office of the President. SPR data were generated in the UCSB and UCOP-supported Biological Nanostructures Laboratory within the California NanoSystems Institute. The authors thank P.J. Weimer (US Dairy Forage Research Center) for lignocellulosic substrates. B.H. acknowledges IDEX Aix-Marseille (Grant Microbio-E) and Agence Nationale de la Recherche (grant no. ANR-14-CE06-0020) for funding.

Author information

Author notes

    • Kevin V. Solomon
    • Theo van Alen

    Present address: Agricultural and Biological Engineering, Purdue University, West Lafayette, Indiana 47907, USA (K.V.S.); Department of Microbiology, Faculty of Science, Radboud University, PO Box 9010, 6500 GL Nijmegen, The Netherlands (T.v.A.).


  1. Department of Chemical Engineering, University of California, Santa Barbara, California 93106, USA

    • Charles H. Haitjema
    • Sean P. Gilmore
    • John K. Henske
    • Kevin V. Solomon
    • Randall de Groot
    • Michelle A. O'Malley

  2. US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California 94598, USA

    • Alan Kuo
    • Stephen J. Mondo
    • Asaf A. Salamov
    • Kurt LaButti
    • Zhiying Zhao
    • Jennifer Chiniquy
    • Kerrie Barry
    • Igor V. Grigoriev

  3. Environmental Molecular Sciences Laboratory, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99354, USA

    • Heather M. Brewer
    • Samuel O. Purvine
    • Scott E. Baker

  4. Biological Sciences Division, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99354, USA

    • Aaron T. Wright

  5. Architecture et Fonction des Macromolécules Biologiques, Centre National de la Recherche Scientifique, Aix-Marseille Université, 13288 Marseille, France

    • Matthieu Hainaut
    • Bernard Henrissat

  6. INRA, USC 1408 AFMB, Marseille, France

    • Matthieu Hainaut
    • Bernard Henrissat

  7. Department of Evolutionary Microbiology, Radboud University, 6525 AJ Nijmegen, The Netherlands

    • Brigitte Boxma
    • Theo van Alen
    • Johannes H. P. Hackstein

  8. Department of Biological Sciences, King Abdulaziz University, 23218 Jeddah, Saudi Arabia

    • Bernard Henrissat

  9. Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA

    • Igor V. Grigoriev



C.H.H., S.P.G. and M.A.O. planned the experiments. C.H.H. and R.D. performed ELISA and S.P.G. performed SPR experiments. C.H.H., S.P.G., A.K. and M.A.O. wrote the manuscript. H.M.B., S.O.P. and A.T.W. performed proteomic analyses. K.V.S. and J.K.H. prepared and analysed genomic samples for N. californiae, P. finnis and A. robustus. B.B., T.v.A. and J.H.P.H. prepared and analysed genomic samples for Piromyces sp. E2. Z.Z. and J.C. sequenced, K.L. assembled, and A.K., S.J.M. and A.A.S. annotated and analysed genomes. B.H. and M.H. analysed and classified carbohydrate-active enzymes. M.A.O., S.E.B., K.B. and I.V.G. coordinated genome projects at JGI.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Michelle A. O'Malley.

Supplementary information

PDF files

  1. Supplementary Information

    Description of Supplementary Datasets, Supplementary Tables 1–7 and Supplementary Figures 1–9

Zip files

  1. Supplementary Datasets

    The supplementary datasets contain 3 types of files related to each of the 10 Pfam domains for which there is evidence of DDP HGT between Neocallimastigomycota and Bacteria.