Cross-feeding niches among commensal leaf bacteria are shaped by the interaction of strain-level diversity and resource availability

Leaf microbiomes play crucial roles in plant health, making it important to understand the origins and functional relevance of their diversity. High strain-level leaf bacterial genetic diversity is known to be relevant for interactions with hosts, but little is known about its relevance for interactions with the multitude of diverse co-colonizing microorganisms. In leaves, nutrients like amino acids are major regulators of microbial growth and activity. Using metabolomics of leaf apoplast fluid, we found that different species of the plant genus Flaveria considerably differ in the concentrations of high-cost amino acids. We investigated how these differences affect bacterial community diversity and assembly by enriching leaf bacteria in vitro with only sucrose or sucrose + amino acids as possible carbon sources. Enrichments from F. robusta were dominated by Pantoea sp. and Pseudomonas sp., regardless of carbon source. The latter was unable to grow on sucrose alone but persisted in the sucrose-only enrichment thanks to exchange of diverse metabolites from Pantoea sp. Individual Pseudomonas strains in the enrichments had high genetic similarity but still displayed clear niche partitioning, enabling distinct strains to cross-feed in parallel. Pantoea strains were also closely related, but individuals enriched from F. trinervia fed Pseudomonas more poorly than those from F. robusta. This can be explained in part by the plant environment, since some cross-feeding interactions were selected for, when experimentally evolved in a poor (sucrose-only) environment but selected against in a rich (sucrose + amino acids) one. Together, our work shows that leaf bacterial diversity is functionally relevant in cross-feeding interactions and strongly suggests that the leaf resource environment can shape these interactions and thereby indirectly drive bacterial diversity.

Pseudomonas Fr +CA_2 in monoculture, or both strains growing in co-culture. A) Heatmap of taken up compounds. Each column represents a different repetition (N=6). The strains and peaks have been clustered by Euclidean distance. The yellow bars on the right indicate the area (log10) of the peak in the spent media of Pantoea Fr-CA_6 before growth of the Ps. Each peak is identified with a unique number; for information of each one (ionization mode, retention time, m/z value and putative annotation) see Supplementary Table 6. B) Venn Diagram showing the sets of compounds taken up by each strain alone and when co-cultured. An explanation for easier interpretation of the different areas of the diagram is provided. In both panels A and B only metabolites with significant uptake are shown (log2 FC <-2, FDR<0.05).
Supplementary Figure 9 Supplementary Figure 9. Compounds taken up from the spent media of Pantoea Fr-CA_6 by either P. siliginis Fr+CA_3:mOrange2 and P. siliginis Fr +CA_18 in monoculture, or both strains growing in co-culture. A) Heatmap of taken up compounds. Each column represents a different repetition (N=6). The strains and peaks have been clustered by Euclidean distance. The yellow bars on the right indicate the area (log10) of the peak in the spent media of Pantoea Fr-CA_6 before growth of the Ps. Each peak is identified with a unique number; for information of each one (ionization mode, retention time, m/z value and putative annotation) see Supplementary Table 6. B) Venn Diagram showing the sets of compounds taken up by each strain alone and when co-cultured. An explanation for easier interpretation of the different areas of the diagram is provided. In both panels A and B only metabolites with significant uptake are shown (log2 FC <-2, FDR<0.05). siliginis Fr +CA_2 in monoculture, or both strains growing in co-culture. A) Heatmap of taken up compounds. Each column represents a different repetition (N=6). The strains and peaks have been clustered by Euclidean distance. The yellow bars on the right indicate the area (log10) of the peak in the spent media of Pantoea Fr-CA_6 before growth of the Ps. Each peak is identified with a unique number; for information of each one (ionization mode, retention time, m/z value and putative annotation) see Supplementary Table 6. B) Venn Diagram showing the sets of compounds taken up by each strain alone and when co-cultured. An explanation for easier interpretation of the different areas of the diagram is provided. In both panels A and B only metabolites with significant uptake are shown (log2 FC <-2, FDR<0.05).

Supplementary Figure 10
Supplementary Figure 11 Supplementary Figure 11. Compounds taken up from the spent media of Pantoea Fr-CA_6 by either P. siliginis Fr-CA_5:mTagBFP2 and P. siliginis Fr +CA_18 in monoculture, or both strains growing in co-culture. A) Heatmap of taken up compounds. Each column represents a different repetition (N=6). The strains and peaks have been clustered by Euclidean distance. The yellow bars on the right indicate the area (log10) of the peak in the spent media of Pantoea Fr-CA_6 before growth of the Ps. Each peak is identified with a unique number; for information of each one (ionization mode, retention time, m/z value and putative annotation) see Supplementary Table 6. B) Venn Diagram showing the sets of compounds taken up by each strain alone and when co-cultured. An explanation for easier interpretation of the different areas of the diagram is provided. In both panels A and B only metabolites with significant uptake are shown (log2 FC <-2, FDR<0.05). .

Supplementary Methods
Recovery and metabolomic analysis of leaf apoplast fluid from lab plants To obtain the apoplast samples, well developed leaves were sampled and placed in a 60-cc syringe, which was filled with sodium phosphate (100 mM, pH 6.5). The plunger was pushed until the 50-cc mark to eject air, then pulled until the 55-cc mark and released back to the 50-cc mark; this was repeated several times until the leaves lost buoyancy. To recover the apoplast fluid wash (AFW), the leaves were placed on a sheet of parafilm, rolled around a 15-mL tube and placed inside a 50 mL tube. After three minutes of centrifugation at 2500 x g, the recovered AFW was transferred to a clean 1.5 mL tube. After storage at -20 °C, the samples were spiked with an internal standard of deuterated amino acids and subjected to metabolomic profiling via untargeted UHPLC-HRMS (Supp Table 10).

Plant sampling and preparation of leaf extracts
Cuttings from F. robusta and F. trinervia were grown in pots as described earlier and kept in our lab at

Characterization of culture-independent bacterial diversity in enrichments
The bacterial communities in the twelfth passage were characterized by 16S rRNA gene amplicon sequencing. Cell pellets were resuspended in 600 µL SDS extraction buffer (SDS extraction buffer, The purified DNA was stored at −20 °C until further amplification. Amplification of the 16S V3-V4 region was carried out in a two-step PCR; on the first step, the samples were amplified using the bacterial primers B341F and B806R(1,2) (Supp Table 9  Amplicon sequencing data was split on the indices and the adapters were trimmed from the read ends using Cutadapt. Th data was ust r d int a i n s qu n ing variants " Vs" using the R package dada2 (version 1.18.0: (4), setting the quality filter at (truncLen=c(200,200)). Next, the sequences were dereplicated to remove redundant reads, followed by denoising using the error rate and calling of ASVs in the forward and reverse reads. When merging the forward and reverse reads, only overlapping regions were kept. Chimeric sequences were removed and obtained a sequence table with the merged data. For taxonomy assignment, the Silva database (v132) was used.

Characterization of culturable bacterial diversity in original leaf extracts and in enrichments
Isolates were recovered from the initial leaf extract glycerol stocks and from the glycerol stocks from the twelfth enrichment passage of each condition. Serial dilutions from the glycerol stocks were prepared in 1x PBS and spread in R2A plates. From each condition, 25 random isolates were selected and re streaked to recover pure cultures (150 isolates total). To minimize isolate adaptation to plating, no further cultivation was performed and pure isolates were grown in R2A broth for 48 hours to prepare 20% glycerol stocks. All isolates were identified via DNA extraction and Sanger sequencing of the 16S rRNA gene. Liquid cultures in R2A broth of each isolate were pelleted by centrifugation at 20000 x g for 10 min and stored at -20 °C. DNA extraction and purification was done as described in the previous section. Whole 16S rRNA gene was amplified with the universal primers 8F and 1492R (Supp Table 9). The PCR mastermix contained: 8 µL KAPA 5x GC buffer, 0.32 µM

Whole genome assembly, annotation, ANI calculation and single nucleotide polymorphism detection
For each isolate, the raw forward and reverse reads were trimmed using Trimmomatic (version 0.39), which removed the Illumina adaptersand trimmed at quality below 15 in a 4-base wide sliding window.
The assembly of the genomes was done using SPADES (3.14.1) with the default parameters in "is at " de. Average nucleotide identity between the different isolates of each genera was calculated in Kbase (9). First, the final scaffolds from the SPADES assembly were imported into Kbase the average nucleotide identity was calculated using the FastANI app.
To do so, we used SNIPPY (version 4.6.0) with d au t ara t rs in "Mu ti" d to map the raw sequencing reads from each of the three Pseudomonas strains. This resulted in one variant table for each strain including the chromosome location, the gene it was in (if any) and the predicted effect. An R script was then used to parse these tables to determine which variants were shared or unique for each P. siliginis isolate. Unique variants were extracted for each isolate and were manually filtered for potentially disruptive mutations (disruptive inframe insertions/deletions, frameshift variants, missense variants, stop gains and initiator codon variants). The resulting lists are provided in Supplementary   Table 11. These lists were used to make word clouds (http://www.wordclouds.com) where the word size represents the relative frequency of disruptive variants in a particular gene product. All scripts and step-by-step instructions are available on Figshare.
Assessment of the isolates carbon preferences and cross-feeding potential

Evaluation of the isolates carbon preference
To test metabolic dependencies among the communities, we grew all 150 isolates individually in S-CA

Production of spent media and evaluation of cross-feeding interactions
To generate spent media Pa Fr-CA_6 was grown in 250 mL flasks with 80 mL of S-CA (sucrose-only) media at 26 °C and 220 rpm. After 48 hours, the cells were separated from the spent media by centrifuging at 5000 x g for 5 minutes and filter sterilizing twice through a 0.22 µm PES filter. Spent media were subjected to untargeted metabolomics (see details in the Metabolomics section) or used further to assay growth of Pseudomonas. The Pseudomonas isolates were grown in R2A broth for 24 hours at 28 °C and 220 rpm; the cells were washed twice and resuspended in 1x PBS to an OD600nm of 0.2. In 96-well plates, 180 µL of the sterile spent media and 20 µL of the bacterial suspension were mixed (final OD600nm 0.02). As a negative control, the isolates were also inoculated in full strength S-CA media. After or during 48 hours of incubation at 28 °C and 220 rpm, growth was recorded (OD600nm). To measure taken up compounds, the plate was centrifuged at 5000 x g for 5 min and the supernatants were transferred to a clean plate and stored at -20 °C until injection in the UHPLC-MS.
The same procedure was followed to test the cross-feeding potential of several Agrobacterium, Bacillus and Curtobacterium isolates from the leaf extracts of F. robusta and F. trinervia (FrLE and FtLE). Spent media from these isolates was produced in S-CA medium and tested as growth medium for several Pseudomonas strains. The metabolite uptake from FrLE_12 and FrLE_1 was also determined by UHPLC-HRMS.
Competition between Pseudomonas strains while cross-feeding

Tagging of Pseudomonas strains with fluorescent proteins and competition assay
To test whether faster growing Pseudomonas isolates would outcompete slow growing isolates either when growing along with Pantoea or in its spent media, we tagged the Pseudomonas Fr-CA_5 and Stereomicroscope (Zeiss). To rule out the selected colonies were not E. coli, a PCR using the specific primers FWD_uidA and REV_uidA (Supp Table 9) was performed. Additionally, the presence of the fluorescent protein coding sequence was checked with the primers FWD_Tn5/7_gt and REV_Tn5/7_g (Supp Table 9). In both cases, the Mastermix consisted of: 1x Buffer B (Biodeal), 0.4 mM dNTPs (Carl targeting the transposon, the cycling was the same, except for the annealing temperature, which was set at 56.5 °C. Once the colonies were confirmed to be fluorescent Pseudomonas, they were grown in liquid LB + Gen + Cam at 37 °C and 200 rpm to cure them from the plasmid. The efficacy of this last step was confirmed via PCR with the primers FWD_Tn7_gt, Rev_Tn7_gt and Tn7_gt (Supp Table 9) targeting the plasmid backbone. The Mastermix was prepared as detailed before, except the n ntrati n a h ri r was 0.12 μM and the annealing temperature was set to 62 °C. To and grown in R2A r th und r th sa nditi ns as r . CFU's w r unt d a t r 24 h und r UV light to differentiate each strain.

Competition experiments using tagged Pseudomonas strains
Using the tagged strains, three competition assays were established (see Supp Table 1  Serial dilutions were prepared and plated in LB agar. The plates were incubated for 48 h after which the colonies were counted.

Cross-feeding potential of diverse Pantoea isolates towards Pseudomonas
Pantoea isolates from other enrichments (Fr +CA, Ft -CA and Ft +CA) were tested for their crossfeeding potential towards Pseudomonas. The same procedure described for Exp. 3 was followedabove to obtain the spent media and prepare the Pseudomonas precultures was followed.
The assays were carried out in triplicates in 96-well plates by inoculating 180 µL of spent media with 20 µL of the corresponding bacterial suspension. The plates were incubated at 28 °C and 220 rpm in an orbital shaker. Growth (OD600nm) was determined after 48 hours. Samples for metabolomics were taken at the end of the experiment as described earlier.
To test for inhibitory effects, the spent media of Pantoea isolates Pa Ft-CA_14 and Pa  H I . C u n w was swit h d at 0.5 in r wast t th M and at 11.5 in again a k t th waste, to prevent source contamination. For monitoring two full scan modes were selected with the wing ara t rs. P arity: sitiv ; s an rang : 80 t 1200 /z; r s uti n: 70,000; GC targ t: 3 × 106; a i u IT: 200 s. G n ra s ttings: sh ath gas w rat : 60; au i iary gas w rat 20; sw gas w rat : 5; s ray v tag : 3.0 kV; a i ary t ratur : 360 °C; -lens RF level: 50; au i iary gas h at r t ratur : 400 °C; a quisiti n ti ra : 0.5 - 11.5 in. F r n gativ d , a values were kept instead of the spray voltage whi h was s t t 3.3 kV.

Metabolomics data analysis
The raw data files were converted into mzML format in MSConvert (ProteoWizard, Version 3.0.19246-075ea16f5). To reduce the size of the files, we filtered them by peak picking with the Vendor algorithm, followed by threshold peak filtering using absolute intensity at 1.0E2. Next, we imported them into MzMine (version 2.40.1) for processing. The spectrums were first separated into positive and negative ionization modes. The baseline was corrected at 1.0 m/z bin width and using the asymmetric correction method. The peaks were then selected by centroid detection and by setting the noise level at 1.0E2. The chromatograms were built with the ADAP Chromatogram builder. The settings to build the chromatograms were slightly modified for each run by verifying the right integration of the deuterated amino acids peaks. The minimum group size in a number of scans was set to 2-3, the group intensity threshold was either 1.0E4 or 6.0E4, the minimum highest intensity was set at the same value as the group intensity and the m/z tolerance was set at 0.001 m/z. The generated chromatograms were deconvoluted with the Wavelets (ADAP) algorithm, setting a signal to noise threshold of 10.0, a minimum feature height of 1.0E4, a coefficient/area threshold of 50, peak duration range 0-1 and a RT wavelet range of 0-0.1. Furthermore, the peaks were aligned together using the Join Aligner with m/z tolerance of 0.001, retention time tolerance of 0.1 min and setting the weight for m/z and RT at 75 ppm and 25 ppm respectively. Finally, the matrix of negative and positive mode peaks, with their corresponding ID, m/z, retention time and area were exported into R (version 3.6.0). Using an in-house script, uncommon (in less than 50% of the samples in the run) and small peaks (area <104) were filtered out. In the case of the AFW samples, the areas were corrected by the infiltration ratio.
To get a first overview of the distribution of the peaks across plant species, we conducted a principal coordinate analysis, constrained by plant species using Euclidean distance on the corrected peak matrix with the R packages phyloseq and vegan. Before plotting the data, the areas under the curve a h ak w r trans r d y g g2 = g2*{[ +sqrt 2+a2 ]/2}, wh r "a" is a nstant with a default value of 1. We used this transformation as it has been shown to emphasize biological variation over technical variation in metabolomic data (Parsons et al., 2007). To compare the amino acids, the peak matrix was first annotated against a custom library of 570 compounds including all proteinogenic amino acids (MSMLS, Sigma Aldrich) which had been developed in the same equipment and UHPLC-MS method we used. This was carried out in R with an in-house script (See data availability) by setting the m/z tolerance at 0.002 and the RT tolerance at 0.2 min. The area of each amino acid was corrected towards its deuterated internal standard as follows: C rr t d ar a = r a a in a id in sa * ̅ ar a d ut rat d a in a id in a sa s in th run ar a d ut rat d a in a id in sa Next, we calculated the concentration of each amino acid, based on the known concentrations of the standard (Supp Table 10).
To find out which compounds had been taken up via cross-feeding, the spent media was analyzed before and after growth of the consumer isolates. To focus exclusively on uptake of large peaks, only peaks present in high amounts in the spent media (before growth of the consumers) were considered.
This cutoff was defined for each dataset and ranged between 1.0E4 and 1.0E5. The log2 fold change, p-value and FDR were calculated for each peak. Metabolites with a fold change lower than -2 (after growth / before growth) and an FDR <0.05 were defined as significantly taken up. Venn Diagrams were created using the online tool Venny (version 2.1.0) and heatmaps were built in R with the package ComplexHeatmaps (11). The resulting peak tables were annotated against the custom library as described above.

Supplementary Notes
Analysis of reasons for differences in Pantoea cross-feeding Pantoea did not inhibit Pseudomonas growth (Supp Fig 17) and metabolome analysis suggested only minor differences in their exudates (see CAP analysis in and principal coordinates in Supp Fig 17 p value=0.7713) with several compounds previously identified as taken up (hypoxanthine and spermidine) present in all (Supp Table 8). We did identify a few compounds that were taken up uniquely from either of the Pantoea Fr spent media that were not abundant in the other Pantoea strains ( Supp Fig 18 and Supp Table 8). These are good metabolite candidates for what drove Pseudomonas growth, although we could not yet annotate them.

Detailed analysis of experimental evolution of cross-feeding
In the S-CA communities with Pantoea Fr-CA_6 (Supp Fig 19a), P. siliginis Fr-CA:5mBFP2 maintained a stable growth throughout the passages and the continuous passaging did not lead to major changes in the community, as the final intensity growth in S+CA ( ̅ =43393) resembled that of the first passage of the community exposed to S+CA (Fig 19b, ̅ =46279, p value=0.0877). Growth of P. siliginis Fr+CA_3:mOrange2 increased in three of the replicates over time (Fig 19a). Evolving to be better in S-CA positively affected growth in +CA growth, leading to higher intensities at the 4 th passage after the switch ( ̅ =66032) versus in the community exposed to S+CA (Fig 19b, ̅ =44939, p   value=0.0312). In the S+CA communities with Pantoea Fr-CA_6, P. siliginis Fr-CA:5mBFP2 was stable throughout the passages but decreased when switched to a S-CA (Fig 19b), reaching generally lower values ( ̅ =22958) than those measured in the original S-CA communities (Fig 19a, ̅ =32097, p value=0.0044). On the other hand, P. siliginis Fr+CA_3:mOrange2 apparently evolved a better response to a lack of amino acids, as the final intensity after the switch to S-CA (Fig 19b, ̅ =43964) was slightly higher than that in the original S-CA community, although this difference was not significant (Fig 19a, ̅ =24007, p value=0.069).
In the co-cultures with Pantoea Fr+CA_20 in absence of CA, P. siliginis Fr-CA:5mBFP2 experienced an increase towards the end of the experiment, suggesting a development of better cross-feeding interactions (Fig 19c). However, the passaging did not improve the overall cross-feeding interaction with this Pantoea isolate, as the intensity reached after the switch ( ̅ =32388) was lower than the intensity in the beginning of the S+CA community (Fig 19d, ̅ =45663, p value=0.0007). The intensity of P. siliginis Fr+CA_3:mOrange2 in general decreased over time when together with Pantoea Fr+CA_20and the addition of CA did not have an effect of half of the replicates. However, in replicate 2 changes must have occurred because when CA were added, the intensity increased to quite higher levels (61806) than those in the first passage of the S+CA community (Fig 19d ̅ =36152). Coculturing with Pantoea Fr+CA_20 in the S+CA communities seemed to have a negative effect on the interactions with both P. siliginis Fr-CA_5:mBFP2 and P. siliginis Fr+CA_3:mOrange2 as evidenced by the low values achieved after the switch to S-CA (Fig 19d, ̅ =4194 and ̅ =4196 respectively)and much lower than those in the S-CA communities (Fig 19c ̅ =8111 and ̅ =18620, p<0.0001).