Cellulosomes are large, multiprotein complexes that tether plant biomass-degrading enzymes together for improved hydrolysis [1]. These complexes were first described in anaerobic bacteria, where species-specific dockerin domains mediate the assembly of enzymes onto cohesin motifs interspersed within protein scaffolds [1]. The versatile protein assembly mechanism conferred by the bacterial cohesin–dockerin interaction is now a standard design principle for synthetic biology [2,3]. For decades, analogous structures have been reported in anaerobic fungi, which are known to assemble by sequence-divergent non-catalytic dockerin domains (NCDDs) [4]. However, the components, modular assembly mechanism and functional role of fungal cellulosomes remain unknown [5,6]. Here, we describe a comprehensive set of proteins critical to fungal cellulosome assembly, including conserved scaffolding proteins unique to the Neocallimastigomycota. High-quality genomes of the anaerobic fungi Anaeromyces robustus, Neocallimastix californiae and Piromyces finnis were assembled with long-read, single-molecule technology. Genomic analysis coupled with proteomic validation revealed an average of 312 NCDD-containing proteins per fungal strain, which were overwhelmingly carbohydrate-active enzymes (CAZymes), with 95 large fungal scaffoldins identified across four genera that bind to NCDDs. Fungal dockerin and scaffoldin domains have no similarity to their bacterial counterparts, yet several catalytic domains originated via horizontal gene transfer from gut bacteria. However, the biocatalytic activity of anaerobic fungal cellulosomes is expanded by the inclusion of GH3, GH6 and GH45 enzymes. These findings suggest that the fungal cellulosome is an evolutionarily chimaeric structure: an independently evolved fungal complex that co-opted useful activities from bacterial neighbours within the gut microbiome.
The authors acknowledge funding support from the Office of Science (BER), US Department of Energy (DE-SC0010352), the US Department of Agriculture (award no. 2011-67017-20459), the National Science Foundation (DGE 1144085) and the Institute for Collaborative Biotechnologies through grant no. W911NF-09-0001 from the US Army Research Office. A portion of this research was performed under the Facilities Integrating Collaborations for User Science (FICUS) exploratory effort and used resources at the DOE Joint Genome Institute and the Environmental Molecular Sciences Laboratory, which are DOE Office of Science User Facilities. Both facilities are sponsored by the Office of Biological and Environmental Research and operated under contract nos. DE-AC02-05CH11231 (JGI) and DE-AC05-76RL01830 (EMSL). The authors acknowledge support from the California NanoSystems Institute (CNSI), supported by the University of California, Santa Barbara, and the University of California, Office of the President. SPR data were generated in the UCSB and UCOP-supported Biological Nanostructures Laboratory within the California NanoSystems Institute. The authors thank P.J. Weimer (US Dairy Forage Research Center) for lignocellulosic substrates. B.H. acknowledges IDEX Aix-Marseille (Grant Microbio-E) and Agence Nationale de la Recherche (grant no. ANR-14-CE06-0020) for funding.
The supplementary datasets contain three types of files for each of the ten Pfam domains for which there is evidence of DDP horizontal gene transfer (HGT) between Neocallimastigomycota and Bacteria.