Introduction

The slow-growing bacterial genus Bradyrhizobium has been shown to be one of the most abundant groups in soil1,2 including soils sampled from long-term field experiments in the UK (Rothamsted Research, Harpenden)3,4. A key characteristic of the order Rhizobiales including the genus Bradyrhizobium is the ability to form nitrogen-fixing symbioses with legumes to increase nitrogen availability to plants5,6,7,8. This ability is thought to have evolved through horizontal gene transfer as the genes involved in this process are usually located on symbiosis islands on the chromosomes of bradyrhizobia7,8,9 or on symbiotic plasmids in many rhizobia5. Some isolates of Bradyrhizobium have been shown to be non-symbiotic and do not possess the ability to form nodules. The absence of nodulation ability has been noted in the strain Bradyrhizobium sp. S23321 isolated from paddy soil in Japan8 although nitrogen fixation (nif) genes were present. Recently, Bradyrhizobium ecotypes from forest soils have been shown to lack both nodulation and nitrogen fixation genes2.

Bradyrhizobium is biologically important in soils, with different representatives found to perform a wide range of biochemical functions including photosynthesis, nitrogen fixation during symbioses, denitrification and aromatic compound degradation8. Nitrogen removal through heterotrophic denitrification is an important step in the global nitrogen cycle carried out by many groups including Bradyrhizobium10. The multiple roles of Bradyrhizobium in the nitrogen cycle make the ecology of this group important for agriculture.

Bradyrhizobium is studied extensively due to its symbiotic relationship with soybean and consequently several genomes have been published. Currently, there are seven complete Bradyrhizobium genomes in the NCBI database. Six of these are symbiotic and are able to fix nitrogen and form root nodules on legumes (B. diazoefficiens USDA 110, B. japonicum USDA 6, B. japonicum E109, Bradyrhizobium sp. ORS278, Bradyrhizobium sp. BTAi1 and B. oligotrophicum S58) with ORS278, BTAi1 and S58 able to form both stem and root nodules on the aquatic legume Aeschynomene6,11. The other bradyrhizobial genome (Bradyrhizobium sp. S23321) is free-living because it is unable to form nodules; however, it still contains the genes required for nitrogen fixation. Four genomes sequenced from North American forest soils were also missing nodulation and nitrogen fixation genes (Bradyrhizobium sp. LTSP849, Bradyrhizobium sp. LTSP857, Bradyrhizobium sp. LTSP885 and Bradyrhizobium sp. LTSPM299). These genomes were sequenced using shotgun sequencing of the soil community and were assembled to near completion2. Due to the availability of a diverse array of genome reference sequences, Bradyrhizobium is an appropriate model to study other soil bacteria: understanding the mechanisms of Bradyrhizobium adaptation to independent living in agricultural soils under contrasting management may reveal the genetic potential of this globally important genus.

Here we present the genome and gene annotations and carbon metabolism profiles of two free-living Bradyrhizobium isolates from the Highfield experiment at Rothamsted Research that has three long-term treatment regimes: grassland, arable (wheat) and bare fallow tilled regularly to maintain a plant-free soil. Maintenance of these treatments for 60 years has led to distinct differences in soil properties and the soil microbiome12. Bradyrhizobium sp. G22 and Bradyrhizobium sp. BF49 were isolated from soil taken from the permanent grassland and permanent bare fallow plots of the Highfield experiment respectively. These Bradyrhizobium strains are the first to be isolated and genome sequenced from European soil and the first free-living and non-diazotrophic isolates, without the presence of either nodulation or nitrogen fixation genes, to have their genomes sequenced and assembled from cultured samples. The isolates were interrogated for differences to determine the level of genetic heterogeneity in carbon metabolism between these isolates.

Results and Discussion

General genome description and comparisons

The genome of the grassland isolate G22 is 9,022,917 bp in size while the bare fallow isolate BF49 genome is 7,547,693 bp, constituting a 1.5 Mbp size difference in addition to a 364,482 bp plasmid in G22. The genome size for G22 is similar to nodulating strains B. diazoefficiens USDA 110, B. japonicum USDA 6, B. japonicum E109, B. oligotrophicum S58 and Bradyrhizobium sp. BTAi1, whereas for BF49 it is closer in size to the free-living strain Bradyrhizobium sp. S23321 and the photosynthetic, nodulating strain Bradyrhizobium sp. ORS278. Table 1 summarises the genome information of the two novel strains G22 and BF49 along with the other seven completed Bradyrhizobium genomes in the database. G22 had 19 contigs which could not be placed in the chromosome or plasmid. These were annotated and used in the subsequent analysis.

Table 1 Summary of the seven published complete Bradyrhizobium genomes and the two novel strains G22 and BF49.

G22 shows 1356 more genes (8787) than BF49 (7431) and this rises to 1902 when including the genes contained on the plasmid (541). GC content is similar between the strains at 63.7% and 63.8% for G22 and BF49 respectively, consistent with other Bradyrhizobium strains listed in Table 1. The G22 plasmid has a GC content of 60.7%, identical to the plasmid of BTAi1. The G22 and BF49 genomes show high pairwise sequence identity in comparison to both USDA 110 (G22: 84.4%, BF49: 83.4%) and S23321 (G22: 83.4%, BF49: 83.2%) using LASTZ (Large-Scale Genome Alignment Tool).

Orthologous gene clusters and core genome phylogeny

G22 and BF49 were compared with the free-living isolate, S23321 and the symbiotic isolate, USDA 110 (Fig. 1). This suggests that there is a core genome of 4562 genes which are present in all four genomes assessed. The 103 genes present only in the USDA 110 genome include those involved in nodulation and uptake hydrogenase. The 171 genes which are only present in the USDA 110 and S23321 genomes include nitrogen fixation genes. Only a small number of genes are unique to any one isolate (G22: 99, BF49: 90, S23321: 35, USDA 110: 103). OrthoVenn identified a core genome of 3442 homologous gene families across the nine complete genomes available in the database. The core genome SNP analysis (Fig. 2) shows that G22 clusters closest with the B. japonicum strains; E109 and USDA 6. BF49 is separate from G22 being more basal. The closest relatives are the free-living strain S23321 and the soybean nodulating type-strain USDA 110. Two of the photosynthetic, Aeschynomene-nodulating isolates (BTAi1 and S58) cluster together.

Figure 1
figure 1

Venn diagram showing orthologous gene clusters for G22, BF49 and two published genomes, S23321 and USDA 110.

Output generated using data from OrthoVenn.

Figure 2
figure 2

Phylogenetic tree for the 3442 homologous gene clusters in the core genome for the nine Bradyrhizobium genomes.

The node labels represent the certainty of that node in the phylogenetic tree where one is maximum certainty. Reference sequence: B. diazoefficiens USDA 110.

Isolate identification and 16S phylogeny

The 16S rRNA sequence from G22 shows 100% identity with Bradyrhizobium sp. VUPMI37 [Accession number: HG940535] and Bradyrhizobium sp. ICMP12674 [Accession number: AY491080]13 from the NCBI and RDP databases respectively. Bradyrhizobium sp. VUPMI37 was originally isolated from Vigna unguiculata nodules and Bradyrhizobium sp. ICMP12674 was originally isolated from Ulex europaeus13,14. The BF49 16S rRNA fragment shows 100% identity with B. canariense SEMIA928 from the NCBI database [Accession number: FJ390904]15 and B. canariense LG-6 from the RDP database [Accession number: GU306140]. B. canariense SEMIA928 was originally isolated from Lupinus spp. and B. canariense LG-6 was originally isolated from Lupinus angustifolius root nodules16. The 16S phylogeny (Fig. 3) clusters G22 with VUPMI37 and ICMP12674, which were the top hits from the NCBI and RDP databases. Similarly, BF49 clusters with the top hits from the NCBI and RDP databases (SEMIA928 and LG-6) in addition to B. lupini USDA 3051 which was the second hit from the NCBI database [Accession number: NR_134836]. This strain was originally isolated from Lupinus and was reclassified from Rhizobium lupini to B. lupini in 201517. Both G22 and BF49 are in the same clade as the free-living strain, S23321 and the soybean-nodulating strains, E109, USDA 6 and USDA 110. Two of the four strains from North American forest soils, Bradyrhizobium sp. LTSP849 and LTSP8572 were also in this clade. The photosynthetic strains (ORS278, BTAi1 and S58) are in a separate clade including the remaining two strains from North American forest soils Bradyrhizobium sp. LTSP885 and LTSPM299.

Figure 3
figure 3

Phylogenetic tree for the 16S rRNA gene using Neighbour-Joining clustering method with 1000 bootstraps.

Genes involved in nitrogen fixation and nodulation

Nitrogen fixation and nodulation genes including nifDKH, nodD and nodABC; are all absent from both G22 and BF49 (Table 2) and so we suggest that these isolates are not able to either form nodules or fix atmospheric nitrogen. The absence of both nif and nod genes is in contrast to other published, complete Bradyrhizobium genomes and indicates similarity with forest soil bacteria (LTSP849, LTSP857, LTSP885 and LTSPM299)2. B. diazoefficiens USDA 110 (previously B. japonicum USDA 110) is a known symbiotic strain and contains all nif and nod genes listed. The seven complete Bradyrhizobium genomes which have previously been published either contain both nif and nod genes (the soybean-nodulating strains B. diazoefficiens USDA 110, B. japonicum USDA 6 and B. japonicum E109) or just nif genes (the Aeschynomene-nodulating strains Bradyrhizobium sp. BTAi1, Bradyrhizobium sp. S58 and Bradyrhizobium sp. ORS278 and the free-living strain Bradyrhizobium sp. S23321) (Fig. 4). These strains use a nod-independent route for stem and root nodulation of Aeschynomene11. The FixLJ two component system is present in both the grassland and bare fallow isolate (Table 2). FixLJ acts in response to low oxygen conditions in soil and in the nodule and controls the expression of genes for both nitrogen fixation and denitrification18,19. This two-component system has also been shown to regulate the response to nitric oxide in Sinorhizobium meliloti being shown to regulate a high proportion of genes induced by the presence of nitric oxide20. The presence of fixLJ in G22 and BF49 is consistent with other Bradyrhizobium isolates including all seven completed genomes.

Table 2 Summary of presence or absence of genes of interest in the seven published complete Bradyrhizobium genomes, four isolates from North America forest soils and the two novel strains G22 and BF49 (+ = present, − = absent).
Figure 4
figure 4

Whole chromosome comparisons between the seven complete Bradyrhizobium genomes, BF49 and G22 showing positions of genes involved with nitrogen cycling, nodulation and photosynthesis on the reference genome sequence.

Reference sequence: USDA110 (a) and S23321 (b).

Genes involved in denitrification

Both genomes encode a nitrate reductase, NapA/B, which catalyses the reduction of nitrate to nitrite; the first stage in denitrification21,22,23. This is common among Bradyrhizobium including all seven complete genomes (Table 2 and Fig. 4) and the four genomes from North American forest soils (Table 2). In addition, all fully-sequenced isolates including both G22 and BF49 contain nirK, encoding a respiratory copper-containing nitrite reductase which is involved in the second stage of denitrification reducing nitrite to nitric oxide. The third stage of denitrification is the conversion of nitric oxide into nitrous oxide, a potent greenhouse gas and is catalysed by a nitric oxide reductase encoded by norB/C22,23. The presence of a nitric oxide reductase gene has been noted in all previously published and complete Bradyrhizobium genomes in addition to G22 and BF49. The denitrification pathway is not present in the four genomes of strains from North American forest soils.

The ability to convert the greenhouse gas, nitrous oxide into environmentally benign nitrogen gas in the final stage of denitrification is an attribute with global importance24. Nitrous oxide reductase encoded by nosZ catalyses this process but it is not ubiquitous across Bradyrhizobium22,23,25,26. Of the seven published complete genomes, only B. diazoefficiens USDA 110, Bradyrhizobium sp. BTAi1 and B. oligotrophicum S58 contain the nosZ gene. It is absent from the grassland isolate, G22 but present in the bare fallow isolate, BF49. The presence of the gene shows a potential for BF49 to perform this function and increases the agricultural and environmental importance of this isolate.

Uptake hydrogenase

The uptake of hydrogen is catalysed by uptake hydrogenase which is encoded by the hup genes27,28,29. This process produces ATP which is used by nitrogen-fixing bacteria to mediate for energy lost through the nitrogen fixation process27,29. The nickel-iron hydrogenase, encoded by hupSL29, is absent from both G22 and BF49 but is present in all of the symbiotic strains of Bradyrhizobium with complete genomes. These genes are also absent from LTSP849, LTSP857, LTSP885 and LTSPM299 (Table 2).

Photosynthesis and carbon fixation

Genes for photosynthesis are present in four of the published complete Bradyrhizobium genomes; S23321 (free-living), S58, BTAi1 and ORS278 (Aeschynomene host) (Fig. 4b). These include genes for bacteriochlorophyll (bchCXYZ/FNBHLM), carotenoids (crtEF), light harvesting polypeptides (pucBAC/pufBA) and reaction centre subunits (puhA/pufLM)30,31. They are not present in the soybean-nodulating isolates (USDA 110, USDA 6 and E109), G22 or BF49 genomes or the four isolates from forest soils (LTSP849, LTSP857, LTSP885 and LTSPM299) (Table 2). Many heterotrophic bacteria including Bradyrhizobium can fix carbon using the Calvin-Benson-Bassham cycle (CBB cycle). The significance of this is unclear in most cases although the photosynthetic Aeschynomene-nodulating strain ORS278 has been shown to require an active CBB cycle for symbiotic nitrogen fixation32. The first stage of the CBB cycle is catalysed by Ribulose-1,5-bisphosphate carboxylase oxygenase (RuBisCo) which is present in both G22 and BF49 genomes, consistent with other Bradyrhizobium isolates (Table 2). Transketolase is an important enzyme in both the CBB cycle and pentose phosphate pathway, catalysing the interconversion of sugars33,34. Genes for the transketolase enzyme are present in both G22 and BF49 consistent with all other complete Bradyrhizobium isolates which have been published.

Carbon metabolism

The principal components analysis (PCA) biplot (Fig. 5A) shows that there is a separation of the time points and USDA 6 from G22 and BF49 across PC1 which accounts for 44.86% of the variation. Across PC2, G22 and BF49 separate and PC2 accounts for 30.52% of the total variation. The first two principal components were visualised as together they accounted for 75.38% of the variation. The third PC accounted for 7.5%. Figure 5B shows the 95 substrates colour coded according to category. More carboxylic acids and amino acid substrates are associated with the separation across PC1; USDA 6 from G22 and BF49. Carbohydrates tend to have a negative direction across PC2 being more closely associated with BF49. One carboxylic acid, malonic acid, was the only substrate which has a positive direction for PC2 and negative for PC1 associating more closely with G22.

Figure 5
figure 5

Principal components analysis biplot showing PC1 and PC2 accounting for 75.38% of the variation.

(A) shows individual replicates across the 5 time points and (B) shows the 95 substrates classified into 6 categories.

Malonic acid can be found in plant tissues, including legumes being first characterised from alfalfa leaves in 1925 and has been found to be in very high concentrations in soybeans35,36,37. This pathway was found to be closely associated with the symbiotic nitrogen fixation pathway in Rhizobium leguminosarium bv trifolii38. Malonic acid is activated before being broken down into acetate and carbon dioxide through decarboxylation39. It is often converted to malonyl-CoA by a CoA transferase, which is present in G22 (Phosphoribosyl-dephospho-CoA transferase) and the malonyl-CoA is then decarboxylated by malonate decarboxylase35,39. Malonate decarboxylase has four subunits; alpha, beta, gamma and delta; all of which are present in G22. The grassland isolate also contains two mad genes; madL and madM. These genes have previously been reported to be part of the malonate decarboxylase operon as transport proteins thought to be involved in malonate uptake35. The grassland isolate G22 also contains a malonyl CoA acyl carrier protein transacylase and triphosphoribosyl-dephospho-CoA synthetase. All genes involved in malonate decarboxylation are absent in the bare fallow isolate BF49. Malonate transport and utilisation genes are also present in other Bradyrhizobium isolates including BTAi1, ORS278, USDA 110, S23321 and all four of the forest strains (LTSP849, LTSP857, LTSP885 and LTSPM299). When grown in malonic acid, only G22 was able to metabolise it whereas BF49 and USDA 6 were not able to utilise this carbon source to the same extent.

The highest loadings and therefore the substrates which make the largest contribution to PC1 were L-pyroglutamic acid, L-leucine and D-galacturonic acid. L-pyroglutamic acid is an amino acid which is also known as 5-oxoproline. It is involved in the glutathione pathway and is converted to L-glutamic acid by the enzyme 5-oxoprolinase (EC 3.5.2.9)40,41. This gene is present in USDA 6 but is absent in both G22 and BF49. USDA 6 recorded a larger OD across all time points for this substrate than G22 and BF49. L-leucine is also an amino acid and is involved in numerous pathways including valine, leucine and isoleucine biosynthesis and degradation40,41. USDA 6 was able to utilise this substrate to a much greater extent than G22 and BF49 and the main difference in genes involved in L-leucine metabolism was the presence of leucine transaminase (EC 2.6.1.6) in USDA 6 and absence in G22 or BF49. D-galacturonic acid is a carbohydrate involved in numerous pathways including pentose and glucoronate interconversions, starch and sucrose metabolism and amino sugar and nucleotide sugar metabolism40,41. It is metabolised by USDA 6 more readily than G22 and BF49. This is likely due to the presence of pectinase (EC 3.2.1.15) and urinate dehydrogenase (EC 1.1.1.203) in USDA 6 but not in G22 or BF49. OD curves for the substrates discussed can be found in Supplementary Information S1.

Plasmid

Plasmid replication genes repABC and parAB are only found on the G22 plasmid and not on either the G22 or BF49 chromosomes. The trb operon for conjugal transfer consists of 12 genes; traI, trbBCDEJKLFGHI42. From this operon, 10 out of the 12 genes are present on the G22 chromosome; only trbH and traI are missing. Conjugal transfer genes trbBCDEFGIJL are absent from BF49. Conjugative transfer DNA nicking endonuclease genes traR/traO are present only in the G22 plasmid. The other genes which were unique to the plasmid were two genes involved in purine utilisation (yagS and a putative xanthine transporter) and one for osmotic stress (opgC). The purine utilisation genes, yagS, encodes a periplasmic aromatic aldehyde oxidoreductase. YagS is usually part of an operon yagTSRQ which encodes a molybdenum-containing iron sulphur flavoprotein, where YagS is the FAD-containing subunit. The role of this protein has been suggested to be detoxification of aromatic aldehydes43,44. The flavoprotein produced from yagTSRQ shows homology with xanthine dehydrogenase43. The osmotic stress gene, opgC, is involved in the synthesis of osmoregulated periplasmic glucans (OPGs). The exact role of OPGs is not understood however they have been shown to potentially play a role in the interaction between bacteria and their eukaryotic host45,46. The OpgC protein has been examined in Rhodobacter sphaeroides and was shown to encode a succinyltransferase homolog involved in the succinyl modification of OPGs45,46. The other genes contained on the plasmid are also on the chromosome of both G22 and BF49 including genes for DNA ligases, DNA repair, cAMP signalling and RNA processing and modification.

Conclusions

The strains described here are the first to be isolated and genome-sequenced from European soil and are unique compared to other completed Bradyrhizobium genomes due to the absence of previously characterised genes and gene clusters for symbiosis, nitrogen fixation and photosynthesis. They are also distinct from the North American forest isolates as G22 and BF49 contain genes for denitrification. They represent a major group, likely to play a key role in denitrification. The presence or absence of the terminal denitrification gene, nosZ, may determine whether the end product of denitrification is the potent greenhouse gas, nitrous oxide or the less problematic nitrogen gas. The carbon metabolism analysis shows that G22 and BF49 show different metabolic profiles over time and this are also distinct from a nodulating strain, USDA 6. The genomes and carbon metabolism analysis indicate that the free-living soil Bradyrhizobium have the potential to carry out many degradative and transformative functions in soil; the marked differences between two isolates from comparable soils that have undergone different management indicates that they form part of an extremely heterogeneous group.

Methods

Isolation and identification

Bradyrhizobium were isolated from the permanent grassland and bare fallow plots in the Highfield experiment at Rothamsted Research12. Serially diluted soil samples were plated onto modified arabinose gluconate (MAG) agar plates incubated at 28 °C47. Colonies forming after 7 days were picked, DNA extracted using MicroLYSIS-Plus using manufacturer’s instructions (Microzone, UK) and identified by PCR by the production of a 1360 bp 16S rRNA fragment using custom-designed Bradyrhizobium specific primers: Bradj16S70F (5′-GCGGGCGTAGCAATACGTCAGC-3′) and Bradj16S1430R (5′-GCCGGCTGCCTCCCTTGCGGGTTA-3′). Each PCR mixture (20 μl) consisted of 1x NH4 reaction buffer, 0.5 mM of each dNTP, 2.5 mM MgCl2, 0.1 μM of each primer, 1 U Biotaq polymerase (Bioline, UK) and 10 ng of genomic DNA. The PCR conditions were as follows: 95 °C 2 min, 30 cycles of 95 °C 15 sec, 68 °C 15 sec and 72 °C 1 min and extension at 72 °C for 10 min. The PCR products were examined on a 1.5% agarose gel stained with ethidium bromide (0.2 μg ml−1) at 100 V for 60 min. PCR products were purified using the Wizard SV gel and PCR clean up system (Promega, USA), sequenced as described in Mauchline et al. (2014)48 and identification confirmed using BLAST and searching the NCBI and RDP databases. Two isolates, one from grassland soil and one from bare fallow soil were chosen at random from the generated Bradyrhizobium culture collection to have their genomes sequenced de novo.

De novo genome sequencing

DNA was extracted from isolates grown in MAG broth incubated at 28 °C and shaken at 100 rpm using the GenElute Bacterial Genomic DNA kit (Sigma Aldrich, USA). Extracted DNA was quantified using Qubit Fluorometric Quantification (Life Technologies, USA). Two sequencing platforms were used; Ion Torrent PGM™ (Life Technologies, USA) (located at Rothamsted Research, UK) and Illumina HiSeq 2000 (Illumina Inc, USA) (located at BGI, the Beijing Genomics Institute, China). For Ion Torrent sequencing, two barcoded, unamplified sequencing libraries were constructed using an Ion Plus Fragment Library Kit (Life Technologies, USA) using 1 μg template DNA. Libraries were pooled and sequence template generated using an Ion PGM™ Template OT2 400 Kit. Sequencing was performed using the Ion PGM™ Sequencing 400 Kit and an Ion 318™ Chip. The G22 sample generated 3,652,179 reads and the BF49 sample generated 3,129,252 reads. Illumina sequencing was performed by BGI and 40 ug of genomic DNA was used. A 6 kb mate library with 90 bp read length and 50X coverage was created using the Illumina HiSeq 2000 sequencing platform resulting in 2,997,699 paired reads for G22 and 3,026,005 paired reads for BF49.

De novo genome assembly and annotation

The sequence data was assembled using SOAPdenovo2 assembler and gap closer49 using a maximum read length of 615 bp and the average insert size of 6 kb. A range of assemblies were produced using kmer values; 61–83, 81, 83, 85, 87, 89, 91. The genomes were manually curated using Geneious (Biomatters Ltd v8.1.5). Gap closing was carried out by PCR. Any remaining contigs from the 61–83 kmer reference assembly over 500 bp were investigated while those under were discarded. The G22 assembly had 19 contigs over 500 bp unplaced within the chromosome, with some containing annotated genes. No contigs over 500 bp remained after the completion of the final BF49 reference and so we can be confident that all genes are present in the assembly. The genomes were uploaded into RAST (Rapid Annotation using Subsystem Technology) for annotation50,51. The default parameters were used which were as follows: Classic RAST annotation scheme, RAST gene caller and Release70 FIGfam version. Automatically fix errors and backfill gaps were selected. The sequences have been deposited in the ENA database [Study ID: PRJEB10689, Sample ID G22 and unplaced contigs: ERS955657, G22 plasmid: ERS955536, Sample ID BF49: ERS954959]. The raw sequence data was also uploaded to ENA [G22: ERR1110561, BF49: ERR1110562] for Illumina and Ion Torrent [G22: ERR1110625, BF49: ERR1110597].

Genome comparisons

Genome annotations were downloaded from RAST and examined manually using KEGG40,41. OrthoVenn was used to assess gene orthology using an e-value of 1e−1052. BRIG (BLAST Ring Image Generator) was used to compare the genomes with other Bradyrhizobium isolates (NCBI blast version 2.2.31)53. Large-Scale Genome Alignment Tool (LASTZ) was used to assess genome-wide sequence similarity in Geneious54.

16S rRNA phylogeny analysis

The 16S rRNA gene sequences for G22 and BF49 were compared with other Bradyrhizobium sequences in the NCBI database. All Bradyrhizobium 16S sequences from the NCBI RefSeq database and the four Bradyrhizobium isolates from North American forest soils2 were downloaded. These sequences were aligned using MUSCLE using 8 iterations55. The aligned region was extracted (1220 bp) and a phylogenetic tree was created using the neighbour joining clustering method with 1000 bootstraps. A 75% support threshold was used for drawing the phylogenetic tree. Accession numbers for the sequences used in this analysis can be found in Supplementary Information S2.

Core genome phylogenetic analysis

OrthoVenn was used to identify genes which were present in all 9 genomes and were considered the “core genome”. The inflation value was set to 1.5 and an e-value of 1e−10. These genes were uploaded into the call SNPs and infer phylogeny (CSI) tool56 hosted by the Center for Genomic Epidemiology57 using the default options using B. diazoefficiens USDA 110 as the reference sequence. Default options were as follows: minimum of 10x depth at SNP positions, 10% relative minimum depth at SNP positions, a minimum of 10 bp distance between SNPs, minimum SNP quality score of 30, minimum read mapping quality score of 25 and minimum z-score of 1.96. Altered FastTree was selected. The core genome phylogenetic tree (Newick file) was visualised in Geneious.

Biolog carbon assays

Isolates G22, BF49 and B. japonicum USDA 6 were grown in MAG broth, incubated at 28 °C and shaken at 100 rpm. Cell density was estimated from 1 ml of culture stained with 0.05% methylene blue using a haemocytometer. The cultures were diluted to a cell density of 106 μl−1. The diluted cultures were centrifuged at 14000 × g for 1 minute, the supernatant was removed and the cells re-suspended in sterile deionised water. Each well of a Biolog GN2 MicroPlate™ was inoculated with 140 μl (108 cells/ 140 μl) of bacterial culture. Three replicate plates per isolate were used. The optical density (OD) was read using a Varioscan SkanIt plate reader (Thermo Fisher Scientific Inc.) at 590 nm and 25 °C every 24 hours for a total of 98 hours (0, 24, 48, 72 and 98 hours). The plates were incubated at 25 °C and shaken at 100 rpm. Full list of substrates can be seen in Supplementary Information S3.

Statistical analyses

The Biolog data was analysed using principal components analysis (PCA) using the inbuilt function, prcomp, in R (version 3.2.2). For visualisation of the PCA, biplots were drawn using PC1 and PC2 and the loadings matrix was extracted (see Supplementary Information S3). This identified specific substrates which were associated with the isolates and also substrates making the greatest contribution to the principal component. These pathways were then examined in the genomes. The substrates were grouped into carbohydrates, carboxylic acids, amines and amides, amino acids, polymers and miscellaneous according to categories identified in previously published work58.

Additional Information

How to cite this article: Jones, F. P. et al. Novel European free-living, non-diazotrophic Bradyrhizobium isolates from contrasting soils that lack nodulation and nitrogen fixation genes – a genome comparison. Sci. Rep. 6, 25858; doi: 10.1038/srep25858 (2016).