In soil ecosystems, microorganisms produce diverse secondary metabolites such as antibiotics, antifungals and siderophores that mediate communication, competition and interactions with other organisms and the environment1,2. Most known antibiotics are derived from a few culturable microbial taxa3, and the biosynthetic potential of the vast majority of bacteria in soil has rarely been investigated4. Here we reconstruct hundreds of near-complete genomes from grassland soil metagenomes and identify microorganisms from previously understudied phyla that encode diverse polyketide and nonribosomal peptide biosynthetic gene clusters that are divergent from well-studied clusters. These biosynthetic loci are encoded by newly identified members of the Acidobacteria, Verrucomicrobia and Gemmatimonadetes, and the candidate phylum Rokubacteria. Bacteria from these groups are highly abundant in soils5,6,7, but have not previously been genomically linked to secondary metabolite production with confidence. In particular, large numbers of biosynthetic genes were characterized in newly identified members of the Acidobacteria, which is the most abundant bacterial phylum across soil biomes5. We identify two acidobacterial genomes from divergent lineages, each of which encodes an unusually large repertoire of biosynthetic genes with up to fifteen large polyketide and nonribosomal peptide biosynthetic loci per genome. To track the expression of genes encoding polyketide synthases and nonribosomal peptide synthetases in the soil ecosystem that we studied, we sampled 120 time points in a microcosm manipulation experiment and, using metatranscriptomics, found that gene clusters were differentially co-expressed in response to environmental perturbations.
Transcriptional co-expression networks for specific organisms associated biosynthetic genes with two-component systems, transcriptional activation, putative antimicrobial resistance and iron regulation, linking metabolite biosynthesis to processes of environmental sensing and ecological competition. We conclude that the biosynthetic potential of abundant and phylogenetically diverse soil microorganisms has previously been underestimated. These organisms may represent a source of natural products that can address needs for new antibiotics and other pharmaceutical compounds.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank S. Spaulding for assistance with fieldwork, and M. Traxler and W. Zhang for helpful discussions. Sequencing was carried out under a Community Sequencing Project at the Joint Genome Institute. Funding was provided by the Office of Science, Office of Biological and Environmental Research, of the US Department of Energy Grant DOE-SC10010566, the Paul G. Allen Family Foundation and the Innovative Genomics Institute of the University of California, Berkeley.
Extended data figures and tables
Schematic showing major components of microcosm time-point sampling and metagenomic analyses.
Biosynthetic loci identified by both antiSMASH and PRISM from the Candidatus Eelbacter genome that contained at least 10 kb of biosynthetic genes. Predictions of the organization of the biosynthetic domains in each locus shown here were determined by PRISM. Smaller biosynthetic loci from this genome are not shown. Full names for the biosynthetic domains are given in Supplementary Table 11.
Biosynthetic loci identified by both antiSMASH and PRISM from the Candidatus Angelobacter genome that contained at least 10 kb of biosynthetic genes. Predictions of the organization of the biosynthetic domains in each locus shown here were determined by PRISM. Smaller biosynthetic loci from this genome are not shown. Full names for the biosynthetic domains are given in Supplementary Table 11.
The graph shows levels of transcriptional expression of genes containing NRPS and PKS protein domains across genomes from the four phyla of interest. Values are reported in log10-transformed transcripts per million and are summed across the 120 soil microcosm samples.
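As an illustration of the units used in this legend, the following is a minimal sketch of how transcripts per million (TPM) could be computed from read counts, summed across samples and log10-transformed. The function names, the pseudocount and the order of operations (summing before taking the logarithm) are assumptions made for this example and are not taken from the analysis pipeline used in the study.

```python
import math

def tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million for one sample.

    counts     -- reads assigned to each gene
    lengths_kb -- gene lengths in kilobases, in the same order
    """
    # Length-normalise first, so that longer genes are not over-counted.
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Scale so that the values in a sample sum to one million.
    return [r / total * 1e6 for r in rates]

def log10_summed_tpm(samples, lengths_kb, pseudocount=1.0):
    """Sum per-gene TPM across samples, then log10-transform.

    The pseudocount (hypothetical choice) avoids log10(0) for genes
    with no detected transcripts.
    """
    per_sample = [tpm(counts, lengths_kb) for counts in samples]
    summed = [sum(values) for values in zip(*per_sample)]
    return [math.log10(s + pseudocount) for s in summed]
```

Here `samples` would hold one list of per-gene counts for each of the microcosm samples.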
The levels of transcriptional expression of genes from biosynthetic gene clusters encoded in the Candidatus Eelbacter genome across 120 soil microcosm time-point samples grouped by extraction times (reported in hours) are shown. Expression levels are reported in log10-transformed transcripts per million.
The expression levels of biosynthetic gene clusters from all organisms studied (excluding the Candidatus Angelobacter data shown in Fig. 3a) that were significantly differentially expressed between time points (PERMANOVA; n = 120; P < 0.05, FDR = 5%) are shown across the 120 soil microcosm time-point samples. Expression levels are reported in log10-transformed transcripts per million.
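The legend reports a 5% false discovery rate but does not name the correction procedure. As a hedged illustration only, the sketch below applies the Benjamini-Hochberg adjustment, which is one common way to control the FDR across many per-cluster tests; its use here is an assumption, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant(pvalues, fdr=0.05):
    """Flag the tests whose adjusted P value is at or below the FDR threshold."""
    return [q <= fdr for q in benjamini_hochberg(pvalues)]
```

Under this assumption, a gene cluster would be called differentially expressed when its adjusted P value is at most 0.05.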
A transcriptional network of co-expressed Verrucomicrobia_AV7 genes from a module found to be significantly enriched in genes from the biosynthetic gene clusters Verrucomicrobia_nrps_156 and Verrucomicrobia_nrps_157 (P < 0.05; hypergeometric distribution) is shown. Genes from the biosynthetic locus are outlined with a dashed line.
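The enrichment test named in this legend can be sketched as an upper-tail hypergeometric probability: the chance of observing at least k biosynthetic genes in a module of n genes, given that K of the N genes in the genome belong to the biosynthetic gene clusters. The function and parameter names below are hypothetical, and the sketch is not the code used in the study.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """Upper-tail probability P(X >= k) for X ~ Hypergeometric(N, K, n).

    N -- total number of genes considered (population size)
    K -- genes in the population that belong to the gene clusters
    n -- genes in the co-expression module (number of draws)
    k -- module genes that belong to the gene clusters
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability mass for every outcome at least as extreme as k.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

A module would then be reported as enriched when this probability falls below the chosen threshold (P < 0.05 in the legend above).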