Polycyclic aromatic hydrocarbons (PAHs) are ubiquitous soil pollutants with environmental and public health concerns [1]. PAHs released into the environment by industrial activities can be removed through physical, chemical, and biological processes [2]. Microorganisms that use PAHs as carbon sources play essential roles in natural attenuation of pollutants in contaminated ecosystems [3]. Numerous PAH catabolic pathways are described in bacteria, archaea, and fungi, including both aerobic and anaerobic processes [1, 3]. Several studies have evidenced an increased microbial degradation of PAHs in the presence of plants [4, 5]. The discovery of this “rhizosphere effect” has promoted the development of rhizoremediation strategies to treat PAH-contaminated soils [6, 7]. However, the literature shows conflicting results with no rhizosphere effect or even a negative influence on PAH-degradation efficiency [8, 9]. These discrepancies may reflect the use of different model plants, soil types, indigenous microorganisms [10], and types of PAHs [11]. Such variations may also result from temporal variations in rhizospheric processes during plant development, whose determinants must be identified to better control rhizoremediation processes.

We previously showed that phenanthrene degradation was slowed down during the early development of ryegrass [12], a plant commonly used in PAH rhizoremediation studies [13]. Differences in PAH degradation with or without plants could be explained by the selection of two distinct PAH-degrading bacterial populations. Indeed, active bacterial populations were different in soil supplemented or not with root exudates [14] and in planted or bare soil microcosms [12]. Plant roots exude labile carbon sources that can be preferentially degraded by rhizospheric microbial communities to the detriment of PAH dissipation [9]. Therefore, investigation of PAH-degraders in the rhizosphere of contaminated soils is needed to decipher the influence of plants and ultimately improve PAH biodegradation efficiency.

One challenge is to specifically target microorganisms actively degrading PAHs in polluted soils without cultivation bias. DNA-stable isotope probing (DNA-SIP) is a powerful technique for linking function with the identity of uncultured microorganisms in complex microbiota. Approaches combining DNA-SIP and metagenomics are increasingly used to investigate microbial communities involved in anthropogenic compound biodegradation [15,16,17]. To date, DNA-SIP has been successfully applied to identify soil bacteria that metabolize PAHs such as naphthalene [18,19,20,21], phenanthrene [14, 19, 22,23,24,25,26,27], anthracene [25, 28, 29], fluoranthene [25], pyrene [30, 31], and benzo[a]pyrene [32]. DNA-SIP has also been used to explore PAH-degraders in the presence of purified root exudates [14] and to identify degraders of root exudates in the rhizosphere [33]. However, we still lack a comprehensive view of the major microbial actors of PAH degradation and their catabolic pathways in contaminated soils, and of how these change in the presence of plants [34].

In this study, we investigated PAH-degrading bacteria of a historically polluted soil, using 13C-labeled phenanthrene (PHE) as a metabolic tracer. We combined DNA-SIP with metagenomics to assess for the first time the influence of plant rhizosphere on the diversity, identity, and metabolic functions of bacteria actively involved in phenanthrene degradation. Our objectives were to determine the diversity and metabolic properties of active soil PAH-degrading bacteria both in bare and planted soils. We hypothesized first that plants would select for contrasting PAH-degrading bacteria with altered PAH-biodegradation pathways leading to an increase in biodegradation efficiency in the rhizosphere. Second, we hypothesized that the PAH-degrading microbial guild is composed of diverse bacteria possessing complete catabolic pathways and acting in parallel.

Material and methods

Soil sample and phenanthrene spiking

Soil was collected from a former coking plant site (NM soil, Neuves-Maisons, France), dried at room temperature, sieved to 2 mm and stored in the dark at room temperature until experimental set-up. PAH contamination dates back ca. 100 years and reaches 1260 mg kg−1 (sum of the 16 US-EPA PAHs) with 7% of phenanthrene (90 mg kg−1). Other soil characteristics were detailed elsewhere [35]. For spiking, two batches of soil were recontaminated at 2 500 mg kg−1 respectively with [U-13C14]-labeled phenanthrene (Sigma-Aldrich Isotec, St Louis, USA) and unlabeled 12C-PHE (Fluka). Ten grams of dry soil were mixed with 2.5 ml of PHE stock solutions (10 mg ml−1 in hexane). After complete solvent evaporation (48 h under a fumehood), aliquots of 1.1 g recontaminated soil were mixed with 9.9 g of non-recontaminated soil (1:10 dilution) to obtain a final concentration of fresh PHE of 250 mg kg−1. This spiked-soil was immediately used for SIP incubations.
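The two-step spiking arithmetic described above can be checked with a short calculation. This is only a sketch using the masses and volumes quoted in the text; the function and variable names are illustrative, and the calculation ignores the native phenanthrene already present in the soil and the mass of the added PHE itself.

```python
# Spiking arithmetic for the two-step recontamination (values quoted in the text).
STOCK_MG_PER_ML = 10.0   # PHE stock solution in hexane
STOCK_VOLUME_ML = 2.5    # volume added to the concentrated batch
BATCH_SOIL_G = 10.0      # dry soil in the concentrated batch
ALIQUOT_G = 1.1          # recontaminated soil per microcosm
CLEAN_SOIL_G = 9.9       # non-recontaminated soil per microcosm


def concentration_mg_per_kg(mass_mg: float, soil_g: float) -> float:
    """Convert a mass of PHE (mg) in a mass of soil (g) to mg per kg of soil."""
    return mass_mg / (soil_g / 1000.0)


# Step 1: concentrated batch (25 mg PHE in 10 g soil).
batch_conc = concentration_mg_per_kg(STOCK_MG_PER_ML * STOCK_VOLUME_ML, BATCH_SOIL_G)

# Step 2: 1:10 dilution of the concentrated batch in non-recontaminated soil.
phe_in_aliquot_mg = batch_conc * ALIQUOT_G / 1000.0
final_conc = concentration_mg_per_kg(phe_in_aliquot_mg, ALIQUOT_G + CLEAN_SOIL_G)

print(f"concentrated batch: {batch_conc:.0f} mg/kg")
print(f"final spiked soil:  {final_conc:.0f} mg/kg")
```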

SIP incubations

SIP incubations were performed using previously described two-compartment microcosms [12]. Eight ryegrass seedlings (Lolium multiflorum, Italian ryegrass, Podium variety, LG seeds, France) were allowed to grow for 21 days, until roots reached the bottom of the first compartment, which contained 30 g of non-recontaminated NM soil maintained at 80% of the soil water-holding capacity, in a growth chamber (22/18 °C day/night, 80% relative humidity, ca. 250 µmol photons m−2 s−1, 16 h photoperiod). A second compartment containing 10 g of 12C or 13C PHE-spiked soil, moistened at 80% of the soil water-holding capacity, was then appended below the first one to allow root colonization in the growth chamber. After 10 days, soil from the second compartment was retrieved in a glass Petri dish. Roots were removed using a brush and tweezers. Aliquots of soil were collected in glass vials for organic extraction and isotopic analysis; the rest was stored in a plastic vial and immediately frozen in liquid nitrogen for further nucleic acid extraction. All samples were stored at −80 °C until analysis. Non-vegetated (“bare”) microcosms were handled in the same way, except that no ryegrass was planted in the first compartment. Independent triplicates were performed for the four conditions (i.e. 12C-bare, 12C-planted, 13C-bare, and 13C-planted), for a total of 12 microcosms. Aerial plant biomass was harvested, ground in liquid nitrogen, dried at 60 °C, and stored at room temperature. To capture the initial PHE concentration, an additional set of second compartments was prepared and killed within 2 h.

Organic extraction and phenanthrene quantification

Lyophilized soil was pulverized in a mixer mill (MM200, Retsch). Soil samples (250 mg) were mixed with 1 g activated copper and 2 g anhydrous Na2SO4 and extracted with dichloromethane (DCM) using a high-pressure, high-temperature automated extractor (Dionex ASE350) as described elsewhere [36]. Post-extraction soil residues were recovered for isotopic analysis. DCM extracts were analyzed by gas chromatography coupled to mass spectrometry (GC-MS) to quantify 13C-PHE concentration [37]. The 13C content (coming from the added 13C-PHE) was deduced from these data using the formula 13Csample = α × PHEsample, where PHEsample is the 13C-PHE content (mg kg−1) and α is the mass proportion of carbon atoms in the PHE molecule (0.9434). The proportion of 13C remaining compared to the amount added initially was expressed as 100 × 13Csample D10/13Csample D0, where D0 is the initial measurement and D10 the measurement after 10 days.
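As a worked illustration of the formula above, the sketch below recomputes α from the molecular formula of phenanthrene (C14H10) and expresses the 13C remaining at D10 as a percentage of D0. The atomic masses are standard values rather than figures from the source, and the two PHE concentrations are hypothetical placeholders, not measured data.

```python
# Standard atomic masses (g/mol); assumed here, not taken from the source text.
C_MASS = 12.011
H_MASS = 1.008


def carbon_mass_fraction(n_carbon: int, n_hydrogen: int) -> float:
    """Mass proportion of carbon in a hydrocarbon CxHy."""
    carbon = n_carbon * C_MASS
    return carbon / (carbon + n_hydrogen * H_MASS)


# Phenanthrene is C14H10; this gives approximately the 0.9434 quoted in the text.
alpha = carbon_mass_fraction(14, 10)


def c13_content(phe_mg_per_kg: float) -> float:
    """13Csample = alpha x PHEsample (mg of carbon per kg of soil)."""
    return alpha * phe_mg_per_kg


def percent_remaining(phe_d10: float, phe_d0: float) -> float:
    """100 x 13Csample(D10) / 13Csample(D0)."""
    return 100.0 * c13_content(phe_d10) / c13_content(phe_d0)


# Hypothetical GC-MS readings (mg/kg), for illustration only.
print(f"alpha = {alpha:.4f}")
print(f"remaining = {percent_remaining(200.0, 250.0):.1f} %")
```

Because α appears in both numerator and denominator, the percentage depends only on the ratio of the two PHE measurements; α matters when the absolute 13C content is compared with the isotopic data.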

Isotopic analysis

Isotopic composition of soil samples, post-extraction soil residues, dry DCM extracts, and aerial plant biomass was determined at the INRA PTEF (Champenoux, France) using an elemental analyzer (vario ISOTOPE cube, Elementar, Hanau, Germany) interfaced in line with a gas isotope ratio mass spectrometer (IsoPrime 100, Isoprime Ltd, Cheadle, UK). Carbon isotopic composition was expressed as δ13C (‰) versus Vienna PeeDee Belemnite (V-PDB). The ratio of 13C versus 12C in samples was expressed as Rsample = RV-PDB × (1 + δ13Csample/1000), where RV-PDB = 13CV-PDB/12CV-PDB = 0.0112375. The proportion of 13C in a sample compared to the amount initially added from 13C-PHE was calculated as (Rsample D10 − Rmatrix)/(Rsample D0 − Rmatrix), where Rmatrix is the natural 13C/12C ratio of the NM soil, DCM extracts or soil residues (average 0.01085).
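The conversion from δ13C to an isotope ratio, and the matrix-corrected proportion of 13C remaining, can be sketched as follows. The constants are those given in the text; the function names and the example δ13C values are hypothetical, and the subtraction of Rmatrix in numerator and denominator is assumed from the definition of Rmatrix as the natural background ratio.

```python
R_VPDB = 0.0112375   # 13C/12C ratio of the V-PDB standard, as quoted in the text
R_MATRIX = 0.01085   # average natural 13C/12C ratio of the matrix, from the text


def delta_to_ratio(delta_permil: float) -> float:
    """Rsample = R_VPDB x (1 + delta13C / 1000)."""
    return R_VPDB * (1.0 + delta_permil / 1000.0)


def fraction_remaining(delta_d10: float, delta_d0: float,
                       r_matrix: float = R_MATRIX) -> float:
    """Excess 13C at D10 relative to the excess 13C at D0.

    The natural background (r_matrix) is subtracted from both ratios so that
    only the 13C coming from the added tracer is compared.
    """
    excess_d10 = delta_to_ratio(delta_d10) - r_matrix
    excess_d0 = delta_to_ratio(delta_d0) - r_matrix
    return excess_d10 / excess_d0


# Hypothetical delta13C values (per mil) for one sample type at D0 and D10.
example = fraction_remaining(delta_d10=400.0, delta_d0=500.0)
print(f"fraction of added 13C remaining: {example:.3f}")
```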

DNA extraction and isopycnic separation

DNA was extracted from 0.5 g of soil using the Fast DNA Spin Kit for Soil (MP Biomedicals, France). Five replicate extractions were pooled for each sample. DNA was quantified using the Quant-iT Picogreen dsDNA assay kit (Invitrogen). Isopycnic separation of 12C- and 13C-labeled DNA (“light” and “heavy” DNA, respectively) was performed as described previously [38]. DNA (3.9 µg) was mixed in 5.1 ml tubes with CsCl solution and gradient buffer to a final density of 1.725 g ml−1. Ultracentrifugation was performed in a vertical rotor (VTi 65.2, Beckman) at 15 °C and 176,985 × g for 40 h (INRA, Champenoux). Thirteen fractions (ca. 400 µl) were collected per tube and weighed on a digital balance (precision 10^−4 g) to confirm gradient formation. DNA was precipitated with 800 µl polyethylene glycol 6000 (1.6 M) and 2 µl polyacryl carrier (Euromedex) overnight at room temperature, recovered by centrifugation for 45 min at 13,000 × g, and washed once with 500 µl 70% (v/v) ethanol. Pellets were dried for 2 min in a vacuum concentrator (Centrivap Jouan RC1010, ThermoScientific), resuspended in 30 µl molecular-biology grade water (Gibco, Life Technologies) and stored at −20 °C.

Real-time quantitative PCR

Bacterial 16S, archaeal 16S, and fungal 18S rRNA gene copies were quantified using the primer pairs 968F/1401R [39], 571F/910R [40], and FF390R/Fung5F [41, 42], respectively. Real-time quantitative PCR (qPCR) reactions (20 µl) were performed as described previously [43] on a CFX96 real-time system (BioRad) and contained 1X iQ SybrGreen Super Mix (BioRad), 12 µg bovine serum albumin, 0.2 µl dimethyl sulfoxide, 40 µg of T4 bacteriophage gene 32 product (MP Biomedicals, France), 1 µl of template (DNA or a 10-fold dilution series of linearized standard plasmid, from 10^8 to 10^2 gene copies per µl) and 8 pmol of each primer. Reactions were heated at 95 °C for 5 min, followed by 45 cycles of 30 s at 95 °C, 30 s at the annealing temperature (56 °C for bacterial 16S rRNA, 60 °C for archaeal 16S rRNA or 50 °C for 18S rRNA), 30 s at 72 °C and 10 s at 82 °C to capture the fluorescence signal while dissociating primer dimers. Dissociation curves were obtained by heating reactions from 50 to 95 °C. Fractions identified as containing “heavy” DNA (fractions 8, 7, and 6 with buoyant density from 1.713 to 1.727 g ml−1) were pooled and quantified by Picogreen assay.
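The text does not detail how gene copies were derived from the plasmid dilution series. A common approach, sketched here as an assumption rather than as the authors' procedure, is to regress the quantification cycle (Cq) on log10 copy number and invert the fitted line. The Cq values below are synthetic and perfectly linear, chosen only to make the example reproducible.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# 10-fold standard series from 10^8 to 10^2 copies per µl (as in the text).
log_copies = [8, 7, 6, 5, 4, 3, 2]
# Synthetic Cq values (hypothetical, not measured data).
cq_values = [13.44, 16.76, 20.08, 23.40, 26.72, 30.04, 33.36]

slope, intercept = fit_line(log_copies, cq_values)

# Amplification efficiency derived from the slope of the standard curve.
efficiency = 10 ** (-1.0 / slope) - 1.0


def copies_per_ul(cq: float) -> float:
    """Invert the standard curve to estimate gene copies per µl of template."""
    return 10 ** ((cq - intercept) / slope)


print(f"slope = {slope:.2f}, efficiency = {efficiency:.1%}")
print(f"Cq 20.08 -> {copies_per_ul(20.08):.3g} copies/µl")
```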

16S rRNA gene amplicon sequencing and analysis

Fragments of 430 bp covering the V3/V4 region of bacterial 16S rRNA genes were amplified from 12 samples of “heavy” DNA (pool of fractions 6–8) recovered from 13C-SIP and 12C-control samples and sequenced as described previously [12] with a dual-index paired-end strategy [44]. Amplicons were obtained by 28 cycles of PCR using Accuprime Super Mix (Invitrogen) on 1 µl template DNA, purified using the UltraClean-htp 96 Well PCR Clean-Up kit (MOBIO) and quantified by Picogreen assay. An equimolar pool at 10 nM was purified using Nucleospin PCR Clean-Up kit (Macherey-Nagel) and sequenced on a single lane of Illumina Miseq PE250 at the Georgia Genomics Facility (Athens, GA, USA). Paired-end reads were trimmed to a minimum Qscore of 20, joined with Pandaseq [45] and filtered for length in the 400–450 bp range with no ambiguous bases. Sequence data were analyzed as described previously [12] in QIIME v1.9 [46] with chimera detection and clustering in Operational Taxonomic Units (OTUs) at 97% using UCHIME and USEARCH v6.1, respectively [47, 48], followed by taxonomy assignment using the RDP classifier [49] with the Greengenes database v13_8 [50]. After removal of chloroplasts and mitochondria OTUs, datasets were rarefied to the lowest number of sequences per sample (9532 sequences). To identify 13C-labeled OTUs, we used an analytical approach that falls into the Method 2 of the “Heavy-SIP” category [51], whereby labeled taxa were identified as OTUs present in the “heavy” fractions of the 13C-SIP treatment and at a significantly lower abundance or absent in “heavy” fractions of the 12C-control treatment. To this end, a subset of data were produced to keep only OTUs (i) represented by at least 5 sequences in each triplicate 13C-SIP sample and (ii) with an average abundance higher in 13C-SIP samples compared to 12C-controls. 
This subset was then log-transformed and compared using Welch’s test with Benjamini-Hochberg correction of the p-value performed in R v3.1.3 [52], separately for bare and planted samples.
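The selection of candidate 13C-labeled OTUs and the multiple-testing correction described above can be sketched in plain Python. The OTU names, read counts and p-values are hypothetical; the Welch test itself is not reimplemented here (the source ran it in R), so the p-values passed to the Benjamini-Hochberg function stand in for its output.

```python
from statistics import mean


def candidate_otus(counts_13c, counts_12c, min_reads=5):
    """Keep OTUs with >= min_reads in every 13C replicate and a higher
    mean abundance in 13C-SIP samples than in 12C-controls."""
    kept = []
    for otu, heavy in counts_13c.items():
        control = counts_12c.get(otu, [0] * len(heavy))
        if min(heavy) >= min_reads and mean(heavy) > mean(control):
            kept.append(otu)
    return kept


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical rarefied read counts per OTU (three replicates each).
counts_13c = {"otu1": [10, 12, 8], "otu2": [4, 20, 30], "otu3": [6, 7, 8]}
counts_12c = {"otu1": [1, 0, 2], "otu2": [0, 0, 0], "otu3": [9, 9, 9]}
kept = candidate_otus(counts_13c, counts_12c)

# Hypothetical Welch-test p-values for four candidate OTUs.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(kept, [round(q, 3) for q in q_values])
```

In this toy input, otu2 is dropped because one replicate has fewer than 5 reads, and otu3 because it is more abundant in the 12C-controls.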

Shotgun metagenomic sequencing and analysis

“Heavy” DNA recovered from the six 13C-SIP samples was sequenced on 3 lanes of Illumina MiSeq PE300 at the Georgia Genomics Facility (Athens, GA, USA). Adapter removal and quality filtering were performed on raw reads using Trimmomatic v0.33 [53] with the following parameters: remove adapters (ILLUMINACLIP:adapters.fa:2:30:10), trim 5′- or 3′-bases if phred Qscore <25 (LEADING:25 TRAILING:25), trim the read when the average quality within a 4-base sliding window is <25 (SLIDINGWINDOW:4:25) and discard reads shorter than 100 bp (MINLEN:100). Based on the FastQC report, paired and unpaired reads 2 obtained after Trimmomatic were further cropped at 250 and 230 bp, respectively. Unassembled DNA sequences were uploaded to MG-RAST [54] and compared to the Subsystems database (November 2017) with default parameters. Proportions in functional profiles were compared using Welch’s two-sided test with Benjamini-Hochberg correction. Raw metagenomic reads were assembled separately for each sample using SPAdes v3.7.0 [55], with the --meta option and testing kmer sizes of 21, 33, 55, 77, 99 and 127. Taxonomic assignment of assembled contigs longer than 5 kb was performed using PhyloPythiaS [56]. Genomic features on assembled contigs were predicted and annotated using Prokka v1.12 [57]. Predicted genomic features were also screened for functional genes encoding enzymes potentially involved in aerobic degradation of aromatic compounds using AromaDeg [58], with minimum BLAST homology of 50% and minimum alignment length of 150 bp. Sequences of selected AromaDeg enzyme families were further aligned using MAFFT [59] with the L-INS-i method. Alignments were manually edited in JalView [60] and used for maximum-likelihood phylogenetic tree construction in MEGA v6 [61] after selection of the best protein model. Genes encoding putative carbohydrate active enzymes (CAZymes) were annotated using dbCAN [62]. 
The GhostKOALA annotation server [63] was used to assign KEGG orthologies to genes in the metagenomes and reconstruct metabolic pathways.

Data access

16S rRNA gene amplicon sequences are available at NCBI under BioProject ID PRJNA485442 (BioSamples: SAMN09791490-SAMN09791501). Raw and assembled shotgun metagenomics data are available on MG-RAST under study name RHIZORG_WGS and RHIZORG_ASSEMBLED, respectively.

Results

Fate of phenanthrene

The fate of spiked PHE was followed in total soil, organic extracts, and soil residues after extraction using both δ 13C analyses (AE-IRMS) and direct 13C-PHE measurements (GC-MS) (Table 1). No significant decrease in 13C content was observed in total soil over the 10-day period, indicating no/low 13C-CO2 loss through mineralization. The proportion of 13C remaining in DCM extracts (containing PAHs and thus 13C-PHE) decreased ca. 25% in bare soil and only 10% in planted soil (Welch t-test P = 0.06). Consistency between AE-IRMS and GC-MS results indicates that most of the DCM-extractable 13C was 13C-PHE, without major contributions of 13C-labeled degradation by-products/metabolites. At day 0, 3.5% of the spiked 13C was already non-extractable with DCM and recovered in soil residues, suggesting a rapid sequestration of PHE on soil particles. After 10 days, this non-extractable fraction reached 13 and 5.8% in bare and planted microcosms, respectively. This increase likely reflects the incorporation of 13C in microbial biomass and the production of 13C-labeled hydrophilic intermediates that were not extracted with DCM. Aerial plant biomass was only weakly enriched in 13C (0.048% 13C content increase compared to 12C-controls).

Table 1 Proportion of 13C remaining (after 10 days incubation, D10) compared to the amount added at D0

Taxonomic characterization of active PHE degraders

Community genomic DNA extracted from 12C-controls and 13C-SIP incubations was separated by isopycnic centrifugation and fractionated. Quantification of bacterial 16S rRNA genes showed a 6-fold increase in fractions 6–8 (buoyant density 1.713–1.727 g ml−1) of 13C-SIP bare microcosms compared to 12C-controls (Figure S1A). This increase was only 2-fold in planted microcosms (Figure S1B), suggesting a lower incorporation of 13C within bacterial biomass in the presence of ryegrass. Archaeal and fungal rRNA genes were not enriched in any fractions from 13C-SIP compared to 12C-controls (not shown), suggesting that Archaea and Fungi did not metabolize a significant proportion of PHE in the tested conditions. Fractions 6 to 8 from each sample (13C-SIP and 12C-controls) were pooled (hereafter “heavy DNA”) and analyzed by high-throughput sequencing of the 16S rRNA genes, using a Heavy-SIP analytical approach [51]. Although other DNA-SIP approaches could have provided a higher sensitivity (i.e. a lower rate of false negatives), Method 2 of the Heavy-SIP approach was predicted to achieve a specificity comparable to HR-SIP and qSIP and largely insensitive to the atom % excess of DNA and the number of 13C-incorporating OTUs [51, 64]. Combined with the low variation observed between replicates, this ensures a high confidence in the 13C-labeled taxa we identified. We obtained a total of 405 345 quality-filtered paired-end 16S rDNA sequences from the 12 samples, ranging from 9 532 to 52 544 sequences per sample. Based on the rarefied dataset, sequences were clustered in 5690 OTUs at 97% identity. The taxonomic affiliation of OTUs differed between heavy DNA fractions of 13C-SIP incubations and 12C-controls in both bare and planted conditions (Fig. 1a). We detected 130 and 73 OTUs significantly enriched in 13C-SIP incubations compared to 12C-controls in bare and planted conditions, respectively, with 40 OTUs shared between the two conditions (Fig. 1b). 
These active PHE-degrading OTUs were affiliated to Actinobacteria, Alpha-, Beta-, and Gammaproteobacteria in both conditions, while Firmicutes were only enriched in heavy DNA fractions from bare soil. Arthrobacter, unknown Sphingomonadaceae and Gammaproteobacteria “PYR10d3” were present at a similar level in both conditions. PHE-degrading Sphingomonas and Alcaligenaceae OTUs were favored in bare soil compared to planted soil. Conversely, Sphingobium and unknown Micrococcaceae were more represented in planted soil. In each condition, one genus dominated the PHE-degrading population. Namely, Sphingomonas dominated in bare soil (43% of total sequences and 61% of total OTUs), while Sphingobium prevailed in planted soil (28% of total sequences, 41% of total OTUs).

Fig. 1

a Taxonomic profiling of heavy DNA from 12C-controls and 13C-SIP incubations, in bare or planted microcosms. Values are means of three independent microcosms. b Abundance of taxonomic groups significantly enriched in heavy DNA fractions from 13C-SIP conditions compared to the 12C-controls in bare (white) and planted conditions (gray). Values are depicted on a log-scale and are mean ± s.e.m. (n = 3). Values in brackets beside each bar represent the number of OTUs

Functional profiling of 13C-labeled metagenomes

A total of ~90 million quality-filtered paired-end reads representing 20 Gb sequence data were obtained for the six heavy DNA samples from 13C-SIP incubations (Table S1). MG-RAST analysis showed that 14 out of 28 functional categories were differentially represented in the two conditions (Fig. 2a). The greatest differences between bare and planted conditions were found for the two categories: “Carbohydrates” (8.63 and 10.61% in bare and planted soil 13C-metagenomes, respectively; q-value = 0.016) and “Metabolism of aromatic compounds” (4.66 and 2.45% in bare and planted soil 13C-metagenomes, respectively; q-value = 0.015). Within the category “Carbohydrates” (Fig. 2b), the abundance of gene sequences affiliated to the metabolism of polysaccharides, monosaccharides, fermentation, di- and oligosaccharides, central carbohydrates, and aminosugars, were all significantly over-represented in the planted soil 13C-metagenomes. Within the category “Metabolism of aromatic compounds” (Fig. 2c), sequences matching anaerobic degradation genes were detected in identical relative abundance between the two conditions. Genes belonging to the three other sub-categories, i.e. peripheral pathways for catabolism of aromatic compounds, metabolism of central aromatic intermediates, and other, were significantly over-represented in bare soil 13C-metagenomes.

Fig. 2

Functional analysis of shotgun metagenomic paired-end reads from bare and planted 13C-SIP experiments (mean ± SD, n = 3) based on subsystem categories in MG-RAST. Relative abundance of the 28 functional categories (a), the 12 carbohydrate metabolism categories (b), and the 4 categories concerning the metabolism of aromatic compounds (c), with some sub-categories shown when significant differences were obtained between bare and planted conditions. Relative abundances were expressed based on the total number of hits provided by MG-RAST analysis, i.e. 5,394,642, 5,631,202, and 4,264,598 for triplicates of bare soil metagenomes, and 4,060,525, 5,619,868, and 4,759,581 for triplicates of planted soil metagenomes. Asterisks indicate statistically significant differences (q-values < 0.05 after Benjamini–Hochberg correction) between bare and planted conditions

Genes encoding enzymes potentially involved in aerobic degradation of aromatic compounds were detected in assembled metagenomes (12,492 contigs having a size >5 kb) using AromaDeg, and their prevalence was calculated relative to the single-copy gene recA (Table 2). A significantly lower prevalence of genes encoding benzoate oxygenases, biphenyl oxygenases, extradiol dioxygenases of the vicinal oxygen chelate superfamily, homoprotocatechuate oxygenases and salicylate oxygenases was found in 13C-enriched metagenomes from the planted condition compared to bare soil. We further analyzed selected members of two enzyme families involved in the first steps of PAH degradation, namely the biphenyl/naphthalene family of Rieske non-heme iron oxygenases and extradiol dioxygenases of the vicinal oxygen chelate family (EXDO I). The biphenyl/naphthalene family comprises most of the dioxygenases reported to date to activate PAHs for further aerobic degradation, whereas the EXDO I family comprises enzymes that cleave the ring of pre-activated mono- or polyaromatic derivatives [58]. We notably detected 66 and 20 open reading frames (ORFs) encoding oxygenases of the biphenyl/naphthalene family from Proteobacteria (Clusters XXIV and XXVI) and Actinobacteria (Clusters I, II, and V), respectively, as well as 46 ORFs encoding proteobacterial EXDOs preferring bicyclic substrates related to dihydroxynaphthalene dioxygenases (Cluster XII). Phylogenetic analysis of biphenyl/naphthalene oxygenases from Proteobacteria (Fig. 3a) revealed that the vast majority (56/66) was closely related to known sequences from Sphingomonas, Sphingobium and Novosphingobium with equal distributions between bare and planted soil. A few additional sequences were related to Cycloclasticus, Burkholderia, or Acidovorax spp. No sequences were affiliated to Pseudomonas. Within Actinobacteria (Fig. 3b), 12 biphenyl/naphthalene dioxygenases were closely related to sequences from Arthrobacter phenanthrenivorans (Cluster II) and Arthrobacter keyseri (Cluster V). In planted conditions only, four additional sequences related to Mycobacterium and Terrabacter were detected in Cluster V, as well as 2 more divergent sequences. In Cluster XII of the EXDO I family (Fig. 4), most detected sequences (32 out of 46, representing 70%) grouped with known proteins from Sphingomonas, Sphingobium and Novosphingobium. Among these, a relatively divergent clade of 20 sequences emerged, with more detected members in bare soil (14 sequences) compared to planted soil. Finally, a few additional EXDO I Cluster XII sequences were related to Sphingopyxis, Pseudomonas, Acidovorax or Burkholderia.
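The normalization of gene prevalence to the single-copy gene recA, mentioned at the start of this paragraph, amounts to dividing each gene-family count by the number of recA copies found in the same metagenome. The sketch below assumes this simple ratio; the family names come from the text, but the counts are hypothetical and do not reproduce Table 2.

```python
def prevalence_relative_to_reca(gene_counts, reca_count):
    """Express each gene-family count as copies per recA copy.

    Since recA is treated as a single-copy gene, the ratio approximates the
    average number of gene copies per genome in the metagenome.
    """
    if reca_count <= 0:
        raise ValueError("recA count must be positive")
    return {family: n / reca_count for family, n in gene_counts.items()}


# Hypothetical ORF counts for one assembled metagenome (illustration only).
bare_counts = {
    "biphenyl oxygenases": 30,
    "EXDO I": 45,
    "salicylate oxygenases": 15,
}
bare_prevalence = prevalence_relative_to_reca(bare_counts, reca_count=60)

for family, value in bare_prevalence.items():
    print(f"{family}: {value:.2f} copies per recA")
```

Normalizing in this way makes samples with different sequencing depth or assembly size comparable, which a raw ORF count would not.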

Table 2 Normalized prevalence of genes encoding enzymes for aerobic bacterial degradation of aromatics in 13C-labeled metagenomes, relative to recA
Fig. 3

Maximum-likelihood (ML) phylogenetic reconstructions of dioxygenases (alpha-subunit of Rieske non-heme iron oxygenases) of the biphenyl/naphthalene family from a) Proteobacteria (AromaDeg Clusters XXIV and XXVI) and b) Actinobacteria (Clusters I, II, and V), identified in 13C-enriched assembled metagenomes from bare (orange; NPA, NPB, and NPC) and planted (green; RGA, RGB, and RGC) SIP microcosms. Identical sequences were collapsed and numbers of individual sequences in each condition are indicated. Sequences from reference strains are included with their accession number and experimentally validated substrate (Pht: phthalate; Non: unknown; Paa: polycyclic aromatic hydrocarbons (Actinobacteria); Pap: polycyclic aromatic hydrocarbons (Proteobacteria); Nah: naphthalene; Bph: biphenyl; Phn: phenanthrene; DbtA: dibenzothiophene; Dnt: dinitrotoluene). The LG+G+I (G = 1.62, I = 0.02) and LG+G (G = 0.57) substitution models were respectively used for a, b) after evaluating the best model in MEGA6. ML bootstrap support (100 re-samplings) are given. Bars represent fraction of sequence divergence

Fig. 4

Maximum-likelihood (ML) phylogenetic reconstruction of extradiol dioxygenases of the vicinal oxygen chelate family (EXDO I) from AromaDeg Cluster XII, identified in 13C-enriched assembled metagenomes from bare (orange; NPA, NPB, and NPC) and planted (green; RGA, RGB, and RGC) SIP microcosms. Identical sequences were collapsed and numbers of individual sequences in each condition are indicated. Sequences from reference strains are included with their accession number and experimentally validated substrate (Dhb: 2,3-dihydroxybiphenyl; Dhn: Dihydroxynaphthalene; Dhp: 2,3-Dihydroxybiphenyl and dihydroxylated polycyclic aromatic hydrocarbons (probably dihydroxyphenanthrene); Dhe: 2,3-Dihydroxy-1-ethylbenzene; Thn: 1,2-Dihydroxy-5,6,7, 8-tetrahydronaphthalene; DbtC, 2,3-Dihydroxybiphenyl, probably dihydroxydibenzothiophene and dihydroxylated polycyclic aromatic hydrocarbons; Non: unknown). The LG+G+I substitution model was used (G = 1.09, I = 0.09) after evaluating the best model in MEGA6. ML bootstrap support (100 re-samplings) are given. The bar represents fraction of sequence divergence

Reconstruction of phenanthrene degradation pathways

Combining results of the above AromaDeg analysis and the GhostKOALA annotation pipeline, we reconstructed both the o-phthalate/protocatechuate pathway leading to 3-oxoadipate and the naphthalene pathway leading to salicylate (Fig. 5; see the complete pathway information in supplemental Figure S2). GhostKOALA failed to recognize genes involved in the first steps of PHE degradation that were detected with AromaDeg, likely because the KEGG database only contains nahAc and nidA genes from Pseudomonas and Mycobacterium, respectively. Salicylate could be converted to gentisate or catechol, which are further degraded through ortho- and meta-cleavage pathways leading to intermediates of the TCA cycle. All genes involved in these pathways were present in both conditions. Sequences affiliated to Sphingomonadales dominated the early steps of degradation leading to 1-hydroxy-2-naphthaldehyde and pyruvate, as well as later reactions converting naphthalene-1,2-diol to salicylaldehyde and pyruvate. Genes involved in the downstream conversion to gentisate were mainly detected from Betaproteobacteria (including Alcaligenaceae) and Sphingomonadales. Genes annotated in the meta- and ortho-cleavage pathways for catechol utilization were taxonomically more diverse, with sequences affiliated to Alpha-, Beta-, and Gammaproteobacteria, Actinobacteria and Firmicutes. Overall, Actinobacteria and Firmicutes had higher contributions to the phthalate and protocatechuate pathway than to the naphthalene and salicylate pathway.

Fig. 5

Reconstruction of phenanthrene metabolic pathways. Red arrows represent genes described in Figs. 3 and 4 and enzyme families described in AromaDeg. Black arrows are genes found in both metagenomes using GhostKOALA (for details about reaction identifiers with enzyme names, EC number, gene names and identification numbers of each reaction see the supplementary Figure S2). The number of genes identified in the two conditions is added in orange (bare 13C-SIP: B) and green (planted 13C-SIP: P) above bar graphs with the taxonomic affiliation of identified genes. Note that 32.1% (114,211/355,456) and 29.8% (225,994/758,006) of the entries could be assigned to known functions for bare and planted 13C-SIP metagenomes, respectively

Focus on Sphingomonas and Sphingobium metagenomes

We further focused on the two dominant PHE-degrading populations of Sphingomonas and Sphingobium identified in bare and planted soil through SIP (Fig. 1b). After taxonomic affiliation of contigs larger than 5 kb, the best metagenome assemblies were obtained for the Sphingomonas population in bare soil and the Sphingobium population in planted soil, with maximum contig size larger than 1 Mb (Supplementary Table S2). The Sphingobium metagenome from bare soil was more fragmented (average size 71 kb, maximum size 278 kb). The single-copy RecA protein of 13C-enriched metagenomes of Sphingomonas was identical in the 3 bare soil microcosms and had its best blastp hit (91% identity) against Sphingomonas sp. MM-1 [65], while that of Sphingobium was identical in the 3 planted microcosms and had its best blastp hit (100% identity) against Sphingobium herbicidovorans NBRC 16415 [66] and Sphingobium sp. MI1205 [67]. Functional gene annotation revealed a large arsenal of dioxygenases and monooxygenases with some that could have potential activity on aromatic compounds in Sphingomonas metagenomes from bare soil (Supplementary Table S3), often grouped in genomic regions. The potential for aromatic compounds degradation was much more restricted in Sphingobium metagenomes from planted soil. We then hypothesized that the greater success of Sphingobium PHE-degraders compared to Sphingomonas in planted conditions might not be directly due to aromatic compound catabolism, but rather to a more efficient use of plant-derived carbon sources. To test this hypothesis, we screened the metagenomes for genes encoding carbohydrate active enzymes (CAZymes) in the groups carbohydrate esterases (CE), glycoside hydrolases (GH) and polysaccharide lyases (PL) (Table 3). Similar numbers of CAZymes were detected in Sphingomonas (36–47 total CAZymes) and Sphingobium (46–48 total CAZymes). 
However, a larger diversity of CAZy families was found in Sphingobium (27 different families) than in Sphingomonas (19–21) metagenomes. Families detected only in Sphingobium metagenomes include enzymes potentially involved in plant cell wall breakdown, including the degradation of xylan (CE6, CE7, GH10, GH115, GH43_12, and GH67), pectin (PL1_2) and other cell wall compounds (GH16), as well as in the use of disaccharides (maltose, trehalose) that are found in root exudates (GH65). Furthermore, the prevalence of three additional families also potentially involved in complex carbohydrate breakdown was higher in Sphingobium metagenomes compared to Sphingomonas (GH13, GH3, PL22).

Table 3 Detection of genes encoding putative carbohydrate active enzymes (CAZymes)


Phenanthrene is often highly concentrated in PAH-contaminated environments and is a model for research on PAH catabolism [1]. We used DNA-SIP to investigate the diversity and metabolic potential of microorganisms involved in phenanthrene degradation in historically contaminated soil. We further assessed the influence of ryegrass, a plant commonly used in phytoremediation studies on PAH-contaminated soils [2, 11, 68]. To our knowledge, this study is the first to use DNA-SIP combined with metagenomics to assess the influence of plants on bacteria actively involved in phenanthrene degradation in polluted soils. We showed a decreased dissipation of phenanthrene in the ryegrass rhizosphere, corroborating previous results [12, 14]. The rhizosphere environment is richer in nutrients relative to the surrounding bulk soil due to root exudates, which are comprised of an array of organic compounds such as carbohydrates, amino acids, proteins, flavonoids, aliphatic acids, organic acids, and fatty acids [10, 33]. Excessive nutrient availability can inhibit the biodegradation of pollutants [69,70,71]. Thus, rhizospheric bacteria may preferentially use labile carbon sources from exudates, leading to slower phenanthrene degradation. Our observations are most relevant to early plant establishment. The impact of plants may be quite different upon maturation. Indeed, there might be a shift between initial negative priming followed by positive priming effect, as previously described [72, 73]. Since rhizosphere processes are dynamic, PAH degradation could be enhanced after a longer period [4]. The phenanthrene losses we observed (20 and 10% in bare and planted soil in 10 days, respectively) were lower than previous reports, e.g. 68% in 12 days after addition of concentrated ryegrass root exudates to the same NM soil [14] or 60–70% in 9 days in slurries of soil from other locations [22, 25]. This discrepancy might partly be due to differences in the initial PHE concentrations (e.g. 
1 mg kg−1 in ref. [22], 10 mg kg−1 in ref. [25], 250 mg kg−1 in ref. [14], and the present study). Furthermore, it might be linked to the more realistic conditions of the present experimental setup within a genuine plant rhizosphere, allowing constant input of rhizodeposits during the 10-day time course. The limited dissipation of 13C-PHE after 10 days minimized the risk of cross-feeding, thereby reducing the potential 13C-labeling of microorganisms other than primary degraders of phenanthrene in the NM soil.
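To put the reported losses on a common scale, the percentages can be converted into apparent rate constants. The sketch below assumes simple first-order dissipation, C(t) = C0·exp(−k·t). This is an assumption made for illustration only, not a kinetic model fitted in this study, and the function names are hypothetical. The input values (20% and 10% loss in 10 days) are those stated above.

```python
import math


def first_order_rate(fraction_lost, days):
    """Apparent first-order rate constant k (per day) from a single time point.

    Assumes C(t) = C0 * exp(-k * t), so k = -ln(1 - fraction_lost) / t.
    """
    if not 0.0 <= fraction_lost < 1.0:
        raise ValueError("fraction_lost must be in [0, 1)")
    if days <= 0:
        raise ValueError("days must be positive")
    return -math.log(1.0 - fraction_lost) / days


def half_life(k):
    """Half-life (days) implied by a first-order rate constant k (per day)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return math.log(2.0) / k


if __name__ == "__main__":
    # Losses reported in the text: 20% (bare) and 10% (planted) after 10 days.
    for label, lost, days in [("bare soil", 0.20, 10), ("planted soil", 0.10, 10)]:
        k = first_order_rate(lost, days)
        print(f"{label}: k = {k:.4f} per day, half-life = {half_life(k):.1f} days")
```

Under this assumption, the apparent rate constant in bare soil is roughly twice that in planted soil. A single time point cannot confirm first-order behaviour, so these values are only a rough basis for comparison.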

Similar active 13C-PHE-degrading taxa were detected in bare and planted soil, but their relative abundance varied. 13C-enriched OTUs affiliated to Arthrobacter, unclassified Sphingomonadaceae and Gammaproteobacteria “PYR10d3” were present at a similar level in both conditions. Members of the Actinobacteria and Sphingomonadaceae are considered potent PAH-degraders in soil and sediments [24, 74,75,76]. Representatives of the Arthrobacter genus degrade many organic pollutants [77]. Using DNA-SIP, some of its species were previously shown to be the dominant phenanthrene degraders in soil supplemented with root exudates [14] and in activated sludge [27]. Interestingly, an Arthrobacter oxydans strain was previously isolated from the NM soil for its ability to degrade phenanthrene [78]. The PYR10d3 candidate order is a separate branch in Gammaproteobacteria originally identified in a SIP experiment using 13C-pyrene as substrate [30]. Its abundance increased with higher concentrations of petroleum hydrocarbons in a soil located close to a petrochemical plant [79].

Firmicutes affiliated to Paenibacillaceae were only active in bare soil, whereas a previous study identified them as phenanthrene degraders in soil supplemented with root exudates [14]. Paenibacillus spp. were previously enriched from hydrocarbon-contaminated sediment and salt marsh rhizosphere using either naphthalene or phenanthrene as the sole carbon source [80]. Our finding could indicate that Paenibacillaceae members possess various physiological traits allowing adaptation to a wide range of ecological niches.

The greatest influence of ryegrass on active PHE-degraders was found for Sphingomonas and Sphingobium-related 13C-enriched OTUs, which dominated in bare and planted soil, respectively. Although these two genera naturally feature high GC content genomes that would shift the migration of unlabeled DNA towards denser fractions compared to low GC content genomes, they were detected only in low proportions in the sequenced heavy DNA samples (pool of fractions 6–8) of the 12C-controls. This suggests that DNA of unlabeled sphingomonads equilibrated in intermediate fractions (e.g. fractions 9–10), preventing bias in the detection of a significant enrichment in 13C-SIP incubations. Sphingomonads, which belong to the Sphingomonadaceae family within the Alphaproteobacteria, are known to utilize both substituted and unsubstituted mono- and poly-aromatic hydrocarbons with up to 5 rings [28] and have been widely identified as phenanthrene degraders [25, 81]. PAH-degrading sphingomonads are common Gram-negative, aerobic, chemoheterotrophic bacteria adapted to oligotrophic environments [81]. They have evolved original strategies to enhance PAH bioavailability, e.g. a hydrophobic and negatively charged cell surface, production of a specific sphingoglycolipid, formation of biofilms due to sphingan exopolysaccharide production, presence of high-affinity uptake systems, and a chemotactic response towards PAHs [81]. Sphingomonas spp. are commonly encountered in PAH-contaminated environments as phenanthrene degraders [11, 23, 24, 74, 81,82,83,84,85] and the amount of phenanthrene available in soils influences the diversity of Sphingomonas [74]. Similarly, representatives of Sphingobium can degrade environmental pollutants such as PAHs [86,87,88,89].

In bare soil, 13C-enriched Sphingomonas OTUs coincided with a greater proportion of active Alcaligenaceae OTUs, while in rhizospheric soil Sphingobium OTUs coincided with Micrococcaceae. Alcaligenes representatives were previously identified during bioremediation of creosote-contaminated soil [90], and many studies showed PAH degradation by various Alcaligenes isolates [91,92,93]. Micrococcaceae are well known PAH degraders [94, 95] and their abundance was favored in planted compared to bare aged-PAH contaminated soil [35].

To date, the impact of plant rhizosphere on functional diversity was mostly assessed in pristine soils [96, 97]. Our study is the first to highlight the differences in metagenomes of PAH-degrading bacteria from bulk and rhizospheric soil in polluted environments. The 13C-enriched metagenomes from planted soil showed that ryegrass rhizosphere selected for an active population with specific functions compared to bare soil, i.e. a lower proportion of genes involved in aromatic compound utilization together with a higher and diversified capability for carbohydrate degradation. This confirms our first hypothesis that root exudates would favor the development of PAH-degrading bacteria with specific functional traits at the genome level. Here, the presence of ryegrass benefits bacteria that are able to use a larger diversity of carbon sources including carbohydrates from exudates. This is reminiscent of the higher transcription of genes related to carbon and amino acid utilization shown in the willow rhizosphere [98]. Moreover, the AromaDeg analysis confirmed the lower proportion of genes potentially involved in the first steps of aerobic degradation of aromatic compounds in planted soil metagenomes.

We further hypothesized that the PAH-degrading microbial guild is composed of diverse bacteria possessing complete catabolic pathways and acting in parallel. Reconstruction of complete catabolic pathways (Fig. 5) showed that several routes are used to mineralize phenanthrene both in bare and planted soil, since genes assigned to the o-phthalate/protocatechuate, gentisate and catechol ortho- and meta-cleavage pathways were detected. Our data further revealed that the complete mineralization of phenanthrene is achieved through combined activity of taxonomically diverse co-occurring bacteria, acting sequentially rather than in parallel. Hence, active PHE-degrading communities act as consortia. This strategy might limit the competition for substrates between different degrading populations and overall increase PAH dissipation efficiency. Previous studies reported synergistic effects and enhancement of PAH degradation for bacterial consortia compared to pure cultures, potentially due to individual strains having complementary degradation pathways [99]. Construction of consortia by mixing several known PAH-degraders has failed to maximize cooperation among different species [100], indicating a common genomic evolution among partners of the consortia. This bacterial cooperative mutualism was recently demonstrated between 2 mutant strains of Pseudomonas putida having incomplete but complementary toluene degradation pathways, resulting in a cross-feeding consortium [101]. Our findings fit the Black Queen theory [102] in which one individual produces a by-product that will enhance the fitness of other individuals able to use that product [103]. This novel view on PAH-degradation processes, based on bacterial interactions and metabolic cooperation, opens up new perspectives in microbial ecology.

Within the consortium, Sphingomonadales were the major taxa performing the first steps of phenanthrene degradation. They appear to degrade phenanthrene preferentially through the lower meta-cleavage pathway, as previously shown for Sphingobium chungbukense [104]. These results suggest that autochthonous Sphingomonadales could play a critical role to initiate in-situ PAH remediation in historically polluted soils by increasing PAH bioavailability and opening new substrate niches for other members of the consortium. The dominant PHE-degraders Sphingomonas and Sphingobium OTUs were active both in bare and planted soils but in different proportions (Fig. 1b), leading to a lower phenanthrene degradation rate in the rhizosphere where Sphingobium dominated. Previous studies did not evidence a specificity of Sphingomonas and Sphingobium spp. for environments with low and high nutrient levels, respectively. For example, Vinas et al. [90] detected Sphingomonas in soil bioremediation treatments both with and without nutrient amendment. Recently, comparative genomics of Novosphingobium strains showed that phylogenetic relationships were less likely to describe functional similarities in metabolic traits, than the habitats from which they were isolated [105]. Thus, catabolic differences among Sphingomonads appear strain-specific. Some Sphingomonad strains can utilize various mono-, oligo-, and polysaccharides [81, 106] rather than being specialists in the degradation of aromatic compounds. The greater success of 13C-enriched Sphingobium OTUs in planted soil, together with a larger diversity of CAZymes than the Sphingomonas metagenome from bare soil, suggests a similar behavior. In the conditions tested, PHE-degrading Sphingobium representatives likely took advantage of labile carbon compounds from root exudates, outcompeting the less versatile PHE-degrading Sphingomonas.


Improving soil PAH-rhizoremediation strategies depends on a better understanding of the factors involved in the variability of rhizospheric processes. Microbial diversity, activity and metabolism are key parameters that control PAH-degradation. This study showed that active phenanthrene degraders are diverse in aged-polluted soil and could act as a cooperative consortium whereby different taxa perform successive metabolic steps. Members of the Sphingomonadales were the dominant PHE-degraders identified through DNA-SIP, and the main actors of the first steps in the degradation pathways. Hence, they likely play a crucial role to initiate in-situ PAH remediation. Plant establishment, at least initially, reduces PAH degradation compared to bare soil due to differences in the PAH degrading bacterial consortia and their associated metabolic pathways. In particular, plants induced a drastic shift in the taxonomic composition of PHE-degrading Sphingomonadales, favouring the growth of Sphingobium populations with a more diverse repertoire of carbohydrate-active enzymes potentially targeting plant root material, to the detriment of less versatile Sphingomonas representatives that prevailed in bare soil. These findings pave the way for future studies of soils featuring contrasting physico-chemical characteristics, origin, and pollution history, as well as other plant species to deepen our comprehension of microbial cooperative interactions needed for organic pollutant degradation.