Main

Bacteria that are associated with plant roots and exert beneficial effects on plant development are referred to as plant growth–promoting rhizobacteria1. They competitively colonize plant roots and can simultaneously act as biofertilizers and as antagonists (biopesticides) of recognized root pathogens, including bacteria, fungi and nematodes. Plant growth–promoting rhizobacteria, most of which are Pseudomonas and Bacillus spp., are applied to a wide range of agricultural species to enhance growth, for example, by promoting seedling emergence, plant biomass and disease control. Plant growth–promoting rhizobacteria antagonize soil pathogens by competing for resources such as iron, or by producing antibiotics or lytic enzymes. The complete genome sequence of the plant growth–promoting and biocontrol agent Pseudomonas fluorescens Pf-5 was published recently2. Although biocontrol strains of fluorescent pseudomonads have contributed greatly to the understanding of the mechanisms that are involved in phytostimulation and disease suppression, biological preparations from spore-forming Bacillus spp. are preferred because their long-term viability facilitates the development of commercial products3. Compared to plant growth–promoting Pseudomonas rhizobacteria, relatively little is known about the lifestyle of plant-associated Bacillus spp., which were originally considered as typical soil bacteria, despite their well-established advantages for beneficial action on plant growth and biocontrol4,5. Owing to obvious differences in physiology between these representatives of Gram-negative and Gram-positive bacteria, the two genera may exhibit specific mechanisms that benefit plant-microbe interactions.

The plant root–colonizing B. amyloliquefaciens strain FZB42 is a naturally occurring isolate, distinguished from the model organism Bacillus subtilis 168 by its abilities to stimulate plant growth and suppress plant pathogens6. Previous analyses of this strain revealed numerous gene clusters involved in nonribosomal synthesis of cyclic lipopeptides7 and polyketides8 with distinct antimicrobial action.

In this study, we report the complete genome sequence of B. amyloliquefaciens FZB42, and through comparison with the domesticated strain B. subtilis 168, highlight genes that may contribute to its plant-associated lifestyle.

Results

General genomic features and DNA islands

The principal features of the B. amyloliquefaciens FZB42 genome are summarized in Table 1. The circular chromosome (3,918,589 bp) is somewhat smaller than that of the closely related B. subtilis9 and Bacillus licheniformis10,11, primarily owing to the absence of prophage islands, which are abundant in the B. subtilis 168 genome. Deletions, altogether spanning 495 kb, were detected in regions equivalent to B. subtilis prophages 1 to 7, the phage SPβC2 and the skin element (Supplementary Table 1 online). The majority of the B. amyloliquefaciens FZB42 protein-encoding sequences were found conserved in the closely related B. subtilis (3,181 b.p.) and B. licheniformis (2,857 b.p.), and most of them are arranged in a collinear manner in all three strains.

Table 1 Genomic features of the B. amyloliquefaciens FZB42 genome and comparison with genomes of other Bacillus spp.

Many of the 214 genes unique in B. amyloliquefaciens FZB42 are clustered in the 17 DNA islands (Supplementary Fig. 1 online, circle 3), which are defined by local deviations in the tetranucleotide usage patterns12 from the signature of the whole B. amyloliquefaciens FZB42 genome. In some of the islands, additional features, such as adjacent tRNAs, remnants of phages, transposase-like sequences and direct repeats, are indicative of horizontal gene transfer (Supplementary Table 1 online). Certain DNA islands appear to be linked with the plant-associated lifestyle of B. amyloliquefaciens FZB42. For example, the location of island 7 between 1,164 and 1,193 kb corresponds to a region occupied by prophage 4 in B. subtilis 168. The region is flanked on the left side by tRNA-Val, followed by two genes of the remnant B. subtilis prophage 4 in which direct repeats of 29 nucleotides are inserted. Several of the genes clustered in this 28,745-bp DNA island exhibit striking similarity to genes involved in extracellular arabinogalactan hydrolysis, galactose uptake by a sugar-specific phosphotransferase system IIABC and galactose catabolism (Leloir pathway) in enterococci, lactobacilli and Erwinia carotovora (Fig. 1). It can be assumed that acquisition of this molecular toolbox, comprising several elements derived from other soil- and plant-associated bacteria unrelated to bacilli, has enhanced the ability of B. amyloliquefaciens FZB42 to exploit plant-derived polysaccharides in the rhizosphere.
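
The island definition above rests on comparing the tetranucleotide composition of a local window with the signature of the whole genome. The following Python sketch illustrates that idea only; the window size, step, the plain Euclidean distance and the function names are assumptions chosen for illustration, and the published oligonucleotide-usage statistics of ref. 12 are not reproduced here.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible tetranucleotides over the DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetra_freqs(seq):
    """Relative frequency of each tetranucleotide in seq (overlapping words)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Words containing ambiguous bases (e.g. N) are ignored in the total.
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return {t: 0.0 for t in TETRAMERS}
    return {t: counts[t] / total for t in TETRAMERS}

def window_deviation(genome, window=5000, step=1000):
    """Yield (start, distance) for sliding windows along a linear sequence.

    distance is the Euclidean distance between the window's tetranucleotide
    frequencies and those of the whole sequence (an illustrative measure).
    """
    ref = tetra_freqs(genome)
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        local = tetra_freqs(genome[start:start + window])
        dist = math.sqrt(sum((local[t] - ref[t]) ** 2 for t in TETRAMERS))
        yield start, dist
```

Windows whose distance stands well above the background would be candidate islands under this simplified scheme; supporting features such as flanking tRNAs or direct repeats would still need to be checked separately, as described in the text.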

Figure 1: DNA island 7 harboring an acquired gal operon.

Gene arrangement of the 28-kb island and the G+C profile are shown at the bottom. Flanking genes, conserved in B. subtilis, are shown in green. A tRNA (Val) (blue) is found at the left border of the island. Parts of the island are distinguished by their unique GC content, deviation in oligonucleotide usage pattern and the presence of direct repeat sequences (DR, red-filled triangles). An extended gene cluster consisting of genes for extracellular arabinogalactan hydrolysis (ganA), a specific sugar-uptake system by membrane-bound PtsIIABC, phospho-β-galactosidase (LacG) and enzymes of the Leloir pathway (Gal K, E, T) is controlled by one promoter element (P), with a CcpA-binding region (cre) and a binding site for the divergently arranged LacRFZB42 transcription regulator. A MarR-related transcription regulator and a membrane protein of unknown function belonging to the major facilitator superfamily are localized at the opposite flanking region. The upper part shows uptake and metabolism of the galactosides derived from extracellular arabinogalactan hydrolysis. All reactions are catalyzed by the products of the gene cluster described above and the pgm gene located downstream of LacR. The resulting product, glucose-6-phosphate, is further catabolized by central routes of carbon metabolism (glycolysis and pentose phosphate cycle).

Of the 30 genes for sensor kinases present in B. amyloliquefaciens FZB42, five encode potential intramembrane-sensing histidine kinases (IM-HK) involved in cell envelope stress13. Two of these are absent in B. subtilis 168, but display weak similarity to BceRS-BceAB and YxdJK-YxdLM (Supplementary Table 2 online). Cell-envelope stress is also sensed by a specific group of membrane-bound σ factors with extracytoplasmic function14. B. amyloliquefaciens FZB42 possesses genes for 16 σ factors, six of which are predicted to have extracytoplasmic function and five of which are shared by B. subtilis 168 and B. amyloliquefaciens FZB42. The latter lacks B. subtilis 168 σY and σZ, but instead possesses a novel putative extracytoplasmic function σ factor, RBAM00641, together with its cognate anti-σ factor, RBAM00642 (Supplementary Table 3 online).

B. amyloliquefaciens FZB42 can use a wide range of substrates and analysis of the genome revealed 75 putative ATP-binding cassette (ABC) transporters, 29 amino acid permeases and at least 17 phosphotransferase system sugar transporters (Supplementary Table 4 online).

Small functional RNAs (sRNAs) and riboswitches are essential for a number of cellular processes such as regulation of gene expression, tRNA processing and protein secretion15. In B. subtilis, 2% of all genes are under riboswitch control alone16. Based on phylogenetic relationships (Supplementary Fig. 2 online), the Bacillus clausii, Bacillus cereus and Bacillus halodurans genome sequences were chosen for a comparative genomics-based scan of putative sRNAs, riboswitches and other RNA elements. Applying stringent parameters, 84 hits were found in the intergenic regions of the B. amyloliquefaciens FZB42 genome. By comparison with the contents of the Rfam database, three of these were predicted to encode RNase P RNA, tmRNA and the 6S RNA. Eighteen others were annotated as riboswitches (ten binding S-adenosylmethionine, three specific for thiamine pyrophosphate, two for purines and one each for glycine, flavin mononucleotide and lysine). A detailed report can be found at http://www.cyanolab.de/prediction/bacillus_fzb42/summary.html and entries have been included in the associated GenBank file.

Genes involved in bacterium-plant interactions

The ability of B. amyloliquefaciens FZB42 to efficiently colonize surfaces of plant roots is a prerequisite for phytostimulation. Rhizosphere competence is linked to the capability to form sessile, multicellular communities (biofilms). In liquid culture without shaking, B. amyloliquefaciens FZB42 forms robust pellicles at the liquid-air interface, whereas domesticated B. subtilis strains usually form thin, fragile pellicles17. The genome of B. amyloliquefaciens FZB42 contains the complete set of genes implicated in biofilm- and fruiting body–formation in B. subtilis, including the 15-gene exopolysaccharide operon epsA-O, apparently required for producing an exopolysaccharide that holds chains of cells together in bundles18. An additional gene cluster in the B. amyloliquefaciens FZB42 genome, which probably participates in exopolysaccharide and/or lipopolysaccharide biosynthesis, has no counterpart in B. subtilis. The unique genes RBAM00750, RBAM00751 and RBAM00754 encode proteins with a collagen-related GXT structural motif, and are probably involved in surface adhesion or biofilm formation (Supplementary Table 5 online).

B. amyloliquefaciens FZB42 displays a robust swarming phenotype. Both the protein encoded by swrA and the lipopeptide surfactin, previously demonstrated for B. amyloliquefaciens FZB42 (ref. 5), are thought to be essential for swarming motility. These factors permit colonization of surfaces and nutrient acquisition through their surface wetting and detergent properties18. Nonswarming mutants, whose surfactin biosynthesis is blocked, are also severely impaired in biofilm formation. Furthermore, a swrA gene homolog, sharing 88% identity with swrA wild-type alleles present in environmental B. subtilis isolates, was detected in the B. amyloliquefaciens FZB42 genome, together with other genes known to be necessary for swarming motility (Supplementary Table 5).

We identified 262 genes encoding proteins with putative secretion signals (Supplementary Table 6 online) recognized by signal peptidases type I (150 gene products), type II (98 gene products) and the Tat system (14 gene products). The B. amyloliquefaciens FZB42 alkaline serine protease AprE displays high similarity (98% identity) to cuticle-degrading extracellular proteases that eventually kill plant-pathogenic nematodes19. Many of the 55 proteins we have experimentally verified as constituents of the extracellular proteome of B. amyloliquefaciens FZB42 by two-dimensional (2D) gel electrophoresis and subsequent MALDI-TOF mass spectrometry (http://www.mpiib-berlin.mpg.de/2D-PAGE) possess counterparts in the secretome of B. subtilis 168 (refs. 20,21). The wide range of extracellular macromolecular depolymerases, including the arabinogalactanase discussed above, enables B. amyloliquefaciens FZB42 to grow on plant surfaces. Phytase was detected as a prominent member of the B. amyloliquefaciens FZB42 secretome and its concentration increases in the presence of plant root exudates. By contrast, a promoter mutation prevents phytase expression in B. subtilis22. The phytate-degrading activity of B. amyloliquefaciens FZB42 could generate phytate phosphorus to nourish plants under conditions of phosphate starvation and thus promote their growth. This is consistent with the finding that the transcription of the related B. amyloliquefaciens FZB45 phytase is regulated by the PhoP/R two-component system induced by phosphate starvation22.

Root exudates also stimulated upregulation of enzymes probably involved in the response to oxidative stress generated in plant roots (for example, thiol peroxidase) and of enzymes that catabolize compounds secreted by plant roots. Enzymes of the latter type include bacillopeptidase F, γ-glutamyl transpeptidase (which generates peptides, oligopeptides and amino acids) and phosphotransacetylase (which acts on organic acids) (Supplementary Table 5).

Our proteomic analysis suggests that the hook-associated flagellar proteins HAP1 and HAP2 and the HAG flagellin are differently affected by exudates secreted by plant roots. Although levels of the flgK gene product HAP1 were reduced in the presence of root exudates, expression of the fliD and hag gene products (HAP2, HAG) was upregulated (Supplementary Table 5). Flagellin proteins are thought to elicit a host basal defense against potential pathogens. It is likely that variations of the flagellins and other exposed bacterial proteins during colonization at surfaces of plant roots might enhance the ability of B. amyloliquefaciens FZB42 to tolerate unfavorable plant responses, and thereby contribute to its competence in the rhizosphere23.

It has been shown that a blend of volatile compounds, especially 3-hydroxy-2-butanone (acetoin) and 2,3-butanediol, emitted by the rhizobacteria B. subtilis and B. amyloliquefaciens can enhance plant growth24. These volatiles have also been implicated in eliciting induced systemic resistance by Bacillus strains GB03 and INR937a24. Acetolactate synthase catalyzes the condensation of two pyruvate molecules into acetolactate, which is decarboxylated by AlsD to acetoin. B. amyloliquefaciens FZB42 harbors the genes encoding the enzymes of the 2,3-butanediol pathway (Supplementary Table 5).

Bacillus spp. are thought to enhance plant growth through synthesis of the plant growth hormones gibberellic acid and an auxin, indole-3-acetic acid (IAA), although the genetic basis for the synthesis of any plant growth–regulating compound in bacilli has yet to be reported25. We have recently shown that representatives of the B. subtilis/B. amyloliquefaciens group produce substances with IAA-like bioactivity26 and that B. amyloliquefaciens FZB42 produces reasonable amounts of IAA when fed tryptophan27. A careful examination of the whole genome sequence of B. amyloliquefaciens FZB42 revealed three candidate genes with apparent homology to genes previously reported to be involved in IAA metabolism: ysnE (encoding a protein similar to IAA acetyltransferase from Azospirillum brasilense28), dhaS (similar to indole-3-acetaldehyde dehydrogenase from Ustilago maydis29) and yhcX (a putative nitrilase similar to nitrilase 2 from Arabidopsis thaliana30) (Supplementary Table 5). YsnE and YhcX, but not DhaS, participate in IAA synthesis27.

Nine giant gene clusters are involved in biocontrol

B. amyloliquefaciens FZB42 possesses a previously unrecognized potential to synthesize bioactive secondary metabolites. Nine gene clusters direct the synthesis of bioactive peptides and polyketides by modularly organized mega-enzymes, the nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs) (Fig. 2). Four (bmyD, pks2, pks3 and nrs) are not found in B. subtilis 168. Except for the gene cluster encoding bacilysin synthesis, the functional activities of the remaining gene clusters depend on Sfp, an enzyme that transfers 4′-phosphopantetheine from coenzyme A to the carrier proteins of nascent peptide or polyketide chains (Table 2). Recently, we have assigned biological functions to three NRPS-encoding (bmyD, fen, srf) and two PKS-encoding (pks1, pks3) gene clusters in the genome of B. amyloliquefaciens FZB42 (refs. 7,8).

Figure 2: Secondary metabolites with biocontrol and plant growth–promoting activities.

Indole-3-acetic acid (IAA) and 2,3-butanediol are shown in green. The acetolactate synthase AlsS (1) and the decarboxylase AlsD (2) catalyze the two-step conversion from pyruvate to acetoin. The genes ysnE and yhcX (3) encoding a putative tryptophan acetyl transferase and a putative nitrilase, respectively, are involved in tryptophan-dependent IAA synthesis27. Antibacterial polyketides are synthesized by membrane-anchored polyketide megasynthases (PKS-MS)29. Structures of bacillaene, difficidin and macrolactin are shown in brown. The lipopeptides surfactin, fengycin and bacillomycin D (red) are nonribosomally synthesized by modularly organized, giant peptide synthetases (NRPS), which are either diffusible or membrane anchored. Synthesis of polyketides and lipopeptides is dependent on a functional phosphopantetheinyl transferase, Sfp, and often on the membrane protein YczE (yellow-filled circle). NRPSs are also involved in synthesis of the dipeptide bacilysin (red) and the Fe2+ siderophore bacillibactin (blue). A further gene cluster harboring NRPS-encoding genes (NRS-NRPS) was detected in the genome of FZB42, but its putative synthesis product remains to be identified.

Table 2 NRPS and PKS gene clusters involved in synthesis of secondary metabolites in B. amyloliquefaciens FZB42 and B. subtilis 168a

PKS gene clusters

The bae (formerly pks1) gene cluster is involved in synthesis of bacillaene8, an inhibitor of prokaryotic protein synthesis. The structure of the bae gene cluster reflects the assembly line of bacillaene synthesis as apparent for the pksX gene cluster of B. subtilis31, with one notable deviation. The observation that the baeM gene of B. amyloliquefaciens FZB42 does not contain the superfluous last module present in the homologous pksM gene of B. subtilis suggests that this module is skipped during synthesis of the antibiotic in B. subtilis. In silico analysis of the pks2 gene cluster, renamed mln here, revealed that it could be attributed to the production of macrolactin, an inhibitor of bacterial peptide deformylase detected in both B. subtilis32 and B. amyloliquefaciens33. The domain structure of the mln gene cluster of B. amyloliquefaciens FZB42 reflects the pathway whereby the 24-membered ring lactone skeleton of macrolactin is synthesized by extension of an acetyl starter unit by 11 successive Claisen condensations with malonyl-CoA (Fig. 3). The polyketide megasynthase encoded by mlnB through mlnH comprises 11 modules, each containing at least three basic domains—a ketosynthase, a ketoreductase and an acyl carrier protein—but lacking an integrated acyl transferase (AT). In the growing family of type 1 PKS systems lacking AT activity, acyl units are transferred by a discrete and iteratively acting acyl transferase8, encoded by the mlnA gene in B. amyloliquefaciens FZB42. MlnA displays striking similarity to malonyl-CoA–specific trans-ATs and its activity should lead to the incorporation of the same extender unit in all modules lacking AT. The completed polyketide chain is released and cyclized by the thioesterase at the C terminus of the last module. Many of the modules are encoded on two separate genes; this split-module arrangement was previously reported in other PKS lacking AT, for example, those of bacilli6,31 and myxobacteria34 (Fig. 3). 
The chemical structure of the macrolactin ring is in accordance with the domain organization of the mln assembly line with three notable deviations. One enoyl reductase necessary for complete reduction and two dehydratase (DH) domains necessary for formation of double bonds are lacking in modules 2, 7 and 10, respectively (Fig. 3). Deviations from the colinearity rule in PKS without AT are not unusual and the absence of dehydratase and enoyl reductase domains has been reported, for example in the difficidin gene cluster8. Given the three exceptions mentioned above, the number and the overall structure of the modules found in the mln gene cluster agree perfectly with the macrolactin assembly line. In fact, by combining mass spectrometric and ultraviolet-visible data obtained from the culture filtrates of wild-type and dif/bae double-mutant strains with a database search, we identified four members of the macrolactin family (macrolactins A and D as well as 7-O-malonyl- and 7-O-succinyl-macrolactin) as mln product–specific metabolites. The technical details of identification of the macrolactins in B. amyloliquefaciens FZB42 are described elsewhere35. Currently, at least 17 macrolactins have been described and one of them, 7-O-malonyl macrolactin A, is efficient against Gram-positive bacterial pathogens36. Knowledge of the genetic basis of macrolactin assembly will greatly facilitate future efforts to enhance application of this interesting class of polyketides.
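
The chain-length bookkeeping behind this assembly line can be checked with a few lines of Python. The sketch assumes the standard polyketide rule that each decarboxylative Claisen condensation with malonyl-CoA adds two carbons to the growing chain; the function name and parameters are illustrative and not part of the original analysis.

```python
def polyketide_backbone_carbons(starter_carbons, extensions, carbons_per_extension=2):
    """Carbon count of a linear polyketide chain before release.

    Assumes every extension uses malonyl-CoA, which contributes two
    carbons per condensation (the third carbon is lost as CO2).
    """
    return starter_carbons + extensions * carbons_per_extension

# Acetyl starter (2 carbons) extended by 11 condensations, as described above.
print(polyketide_backbone_carbons(starter_carbons=2, extensions=11))  # 24
```

A chain of 24 carbons is in line with the macrolactone size given in the text. The sketch deliberately ignores reductions, dehydrations and cyclization, which change oxidation state but not the carbon count.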

Figure 3: The macrolactin assembly line deduced from the pks2 gene cluster (mlnBCDEFGH).

AT activity is delivered by the trans-acting mlnA gene product. Enzyme functions necessary but not detectable inside the corresponding modules are indicated by red filled circles. The thiolation domains (T, ACP) predicted to be inactive are not colored.

Besides the substantial similarity in their domain structure8, the three pks gene clusters present in B. amyloliquefaciens FZB42 share specific features such as iteratively used separate AT domains, unusual split modules at the interface of two separate synthases and the skipping of superfluous modules and domains. As the three pks gene clusters displayed no apparent deviation in GC content and oligonucleotide usage pattern, it seems unlikely that they were acquired by recent events of horizontal gene transfer. Instead, they probably evolved from an ancient pks gene cluster by several duplicative events. A recent report37 describes the existence of a membrane-bound, organelle-like protein complex responsible for bacillaene synthesis in B. subtilis. It is likely that similar superstructures also exist in the three PKS assembly lines encoded by B. amyloliquefaciens FZB42. Moreover, we propose that the membrane-spanning protein YczE, shown to be essential for synthesis of the three polyketides and the lipopeptide bacillomycin D, may anchor the mega-synthases at the membrane (Table 2 and Fig. 3).

Peptide antibiotics and siderophores

In addition to the previously identified gene clusters involved in nonribosomal synthesis of the lipopeptides surfactin, bacillomycin D and fengycin, B. amyloliquefaciens FZB42 also harbors the bac and dhb gene clusters responsible for synthesis of the antibacterial dipeptide bacilysin38 and the iron-siderophore bacillibactin in B. subtilis39. Both compounds were expressed in B. amyloliquefaciens FZB42 (Table 2). The 15-kb nrs gene cluster, which probably directs nonribosomal synthesis of a hybrid comprising a cysteine-containing peptide and a polyketide (Supplementary Fig. 3 online), is located at the genomic island 12 within a region with substantially lower GC content and variation in oligonucleotide usage. It seems to be the only NRPS/PKS gene cluster acquired by horizontal gene transfer in B. amyloliquefaciens FZB42. The nrs gene cluster is preceded by a SigA-dependent promoter, generating a transcript that was detected by RT-PCR. Although the product of the nrs gene cluster cannot be assigned yet, it may act as a siderophore and enhance the ability of B. amyloliquefaciens FZB42 to scavenge iron from the rhizosphere. Under conditions of iron starvation, growth of a double mutant with deletions in both of the bacillibactin and nrs gene clusters is more hampered than that of a mutant deficient in bacillibactin alone (X.H.C., unpublished results).

In total, we estimate that B. amyloliquefaciens FZB42 dedicates 344 kb to the synthesis of secondary metabolites used to cope with competing microorganisms in the plant rhizosphere (Table 2). As this bacterium colonizes plant roots, it inhibits growth of phytopathogenic bacterial or fungal competitors either by depriving them of the essential iron (e.g., through the action of bacillibactin and, possibly, the nrs-encoded peptide) or by directly inhibiting their growth (e.g., through antimicrobial lipopeptides and polyketides).

Genetic amenability

B. amyloliquefaciens FZB42 is naturally competent and amenable to genetic transformation using a modified one-step protocol, originally developed for B. subtilis 168 (ref. 40). This property was used to generate a set of mutants impaired in many distinct functions, such as production of extracellular enzymes6, secondary metabolites7,8, biofilm formation, alternative σ factors and plant-growth promotion27, by a gene-replacement strategy involving homologous recombination. B. amyloliquefaciens FZB42 exhibited its maximal competence somewhat earlier than B. subtilis, during late exponential growth27. Not surprisingly, the B. amyloliquefaciens FZB42 genome harbors the complete set of genes necessary for development of genetic competence (Supplementary Table 7 online). The majority of competence genes are highly homologous to their counterparts in B. subtilis 168, but the genes that control the competence quorum-sensing system of B. amyloliquefaciens FZB42 (comQ, comX, comP) exhibit low similarity to the respective genes of B. subtilis 168 (36%, 31% and 55%, respectively). This genetic variability correlates with specificity in the quorum-sensing response, so that each pheromone is sensed only by its strain-specific cognate receptor.

Discussion

Analysis of its 16S rRNA sequence indicates that B. amyloliquefaciens FZB42 is closely related, but not identical, to the B. amyloliquefaciens type strain DSMZ7 (ref. 6). A phylogenetic tree constructed from the tetranucleotide usage patterns of genomes of previously sequenced Bacillus strains confirmed that B. amyloliquefaciens FZB42 represents a separate branch, clearly distant from B. subtilis 168 (Supplementary Fig. 2). The gyrA and cheA gene sequences were used to resolve its taxonomic position more precisely; both genes have efficiently resolved closely related taxa of the B. subtilis group40. The resulting phylogenetic trees (Supplementary Fig. 4 online) revealed that B. amyloliquefaciens FZB42 is rather close to strains recently introduced as endophytic or plant-associated ecomorphs of B. amyloliquefaciens41. Essentially all Bacillus strains that are currently commercialized for their plant growth–promoting and biocontrol activities belong to this ecotype. Moreover, nearly all of these strains produced a wide spectrum of polyketides, including bacillaene, difficidin and macrolactin (Supplementary Fig. 4). This suggests that this plant growth–promoting bacterium can be considered a paradigm for plant-associated bacteria related to the B. amyloliquefaciens type strain DSMZ7. The observation that as much as 8.5% of the genome appears to be devoted to antibiotic production is remarkable, especially in the light of recent estimations that not more than 4–5% of an average Bacillus genome is devoted to antibiotic production42. By comparison, Streptomyces avermitilis, which is well known for producing a broad range of antibiotics, devotes 6.4% of its genome to secondary metabolite production43. Surprisingly, the genome of B. amyloliquefaciens FZB42 lacks known biosynthetic gene clusters of ribosomally synthesized antibiotics (lantibiotics), which are common in B. subtilis strains, even though it encompasses the regulatory and immunity genes directed against the lantibiotics mersacidin and subtilin (Supplementary Fig. 1; outmost circle). Nevertheless, B. amyloliquefaciens FZB42 seems to produce other conventionally synthesized lantibiotics or toxins, as it was recently shown that a sigW mutant strain of B. subtilis is extremely sensitive to culture fluids derived both from B. amyloliquefaciens FZB42 and its sfp derivative44.

The complete genome sequence of B. amyloliquefaciens FZB42, along with its amenability to genetic manipulation, should facilitate exploitation of FZB42's hitherto unappreciated potential to produce secondary metabolites for developing agrobiotechnological agents with predictable features. The genome should be equally valuable in revealing the complex interactions between Gram-positive rhizobacteria and plants and in studying biofilm formation and other biological processes that have been lost or attenuated during the intensive use of B. subtilis 168 as an experimental system over the past six decades45.

Methods

Strains, genome sequencing, assembly, annotation.

B. amyloliquefaciens FZB42 was deposited as 10A6 in the Bacillus Genetic Stock Center (BGSC). A set of mutant strains is also available from BGSC. Characteristics of regulatory mutants and the mutants used in this study for verifying bacilysin, bacillibactin and macrolactin production are compiled in Supplementary Table 8 online. Genomic DNA prepared from B. amyloliquefaciens FZB42 was used to produce whole-genome shotgun libraries. For the libraries, fragments of 1.5 to 3.0 kbp were separated by gel electrophoresis after mechanical shearing (Nebulizer; Invitrogen), end-repaired and cloned using vectors pTZ19R (Amersham) or pCR2.1-TOPO (TOPO TA Cloning Kit for Sequencing; Invitrogen). More than 20,000 plasmid DNAs were isolated using two BioRobots8000 (Qiagen). A total of 40,000 sequence reactions were automatically analyzed on ABI PRISM models 377-96 and 3730XL (Applied Biosystems) or MegaBace1000 and MegaBace4000 sequencers (GE Healthcare). The 40,000 generated sequences were assembled into contigs using the Phrap assembly tool (http://www.phrap.org/). Primer walking on plasmids, as well as PCR-based techniques, were used to close remaining gaps and to resolve misassembled regions caused by the high degree of repetitive sequences. All manual editing steps were performed using the GAP4 software package v4.5 and v4.6 (ref. 46). Prediction of protein-encoding sequences and open reading frames (ORFs) was initially accomplished with YACOP47 using the ORF-finding programs Glimmer, Critica and Z-curve. All ORFs have been manually curated. Initial annotation was done by a close comparison to B. subtilis 168 using Subtilist48 as reference. All predictions were verified and modified manually by comparing the protein sequences with the public databases SwissProt, GenBank, ProDom, COG and Prosite using the annotation software GeneSOAP49.

Prediction of sRNA, riboswitches and genomic islands.

A comparative method similar to that used to predict sRNAs in cyanobacteria50 was applied. After the extraction of intergenic regions and homology detection by BLASTN, homologous sequences were clustered. The resulting (multiple) sequence alignments were scored using RNAz51. Homologies to known sRNAs stored in the Rfam database were identified by BLAST. The gene islands were identified by the specific genetic repertoire and by alterations in relative abundance of tetranucleotides12. Variations in oligonucleotide usage patterns were calculated as described previously12.
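
The first step of the scan described above, extracting intergenic regions, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the coordinate convention (1-based, inclusive start and end), the minimum-length cutoff and the function name are hypothetical, and the wrap-around region between the last and first gene of the circular chromosome is not handled.

```python
def intergenic_regions(genes, genome_length, min_len=50):
    """Return (start, end) tuples for gaps between annotated genes.

    genes: iterable of (start, end) tuples, 1-based inclusive, any strand.
    Overlapping or nested genes are effectively merged, because only the
    furthest position covered so far is tracked.
    """
    gaps = []
    covered_until = 0  # last position already covered by a gene
    for start, end in sorted(genes):
        if start - covered_until - 1 >= min_len:
            gaps.append((covered_until + 1, start - 1))
        covered_until = max(covered_until, end)
    # Trailing region after the last gene (treated as linear here).
    if genome_length - covered_until >= min_len:
        gaps.append((covered_until + 1, genome_length))
    return gaps
```

Each returned interval would then be cut from the genome sequence and passed to the homology search and alignment-scoring steps named in the text, which are external tools and are not reproduced here.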

Proteomics.

Cells were grown under shaking (210 r.p.m.) at 24 °C for 14 h in medium 1C consisting of 0.7% tryptone, 0.3% peptone, 0.1% glucose, 0.5% NaCl and 10% soil extract. When appropriate, root exudates prepared from maize seedlings were added to the culture (250 mg/l). Preparation of the extracellular protein fraction, 2D-gel electrophoresis and protein identification by in-gel digestion with trypsin, extraction of peptides in the Ettan Spot Handling Workstation (Amersham) and determination of peptide masses in the Proteomics Analyzer 4700 (Applied Biosystems) were done as recently described52. The 2D-gel images were analyzed using the PD-Quest (Bio-Rad) software.

Mass spectrometry.

Polyketides were identified by mass-spectrometric analysis, as described previously8. High-performance liquid chromatography (HPLC)-electrospray ionization (ESI) mass spectrometry (MS) was performed from aliquots of acetonitrile-water extracts of the culture filtrates of the Bacillus strains. Before HPLC-ESI-MS was done, the extracts were desalted by solid-phase extraction. Every sample was measured in two different modes, negative and positive mode, and mass spectra were acquired in the range of m/z 300–800.
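
As a rough illustration of how candidate ions relate to the m/z 300–800 acquisition window, the Python sketch below computes singly charged ion masses from an elemental formula. The formula C24H34O5 used for macrolactin A is an assumption drawn from general chemical knowledge and is not stated in this text; the choice of protonated and deprotonated ions is likewise illustrative, and other adducts may dominate in practice.

```python
# Monoisotopic atomic masses in Da (standard values, rounded).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of H+ in Da

def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given {element: count}."""
    return sum(MONOISOTOPIC[element] * n for element, n in formula.items())

def ion_mz(formula, mode):
    """m/z of the singly charged [M+H]+ (positive) or [M-H]- (negative) ion."""
    mass = neutral_mass(formula)
    if mode == "positive":
        return mass + PROTON
    if mode == "negative":
        return mass - PROTON
    raise ValueError("mode must be 'positive' or 'negative'")

def in_window(mz, low=300.0, high=800.0):
    """True if mz falls inside the acquisition range quoted in the text."""
    return low <= mz <= high

# Assumed formula for macrolactin A (not taken from this article).
macrolactin_a = {"C": 24, "H": 34, "O": 5}
for mode in ("positive", "negative"):
    mz = ion_mz(macrolactin_a, mode)
    print(mode, round(mz, 4), in_window(mz))
```

Under these assumptions both ions fall inside the scanned range, which is consistent with measuring every sample in both modes.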

Accession number.

GenBank CP000560.

Note: Supplementary information is available on the Nature Biotechnology website.