Spontaneously established syntrophic yeast communities improve bioproduction

Nutritional codependence (syntrophy) has underexplored potential to improve biotechnological processes by using cooperating cell types. So far, design of yeast syntrophic communities has required extensive genetic manipulation, as the co-inoculation of most eukaryotic microbial auxotrophs does not result in cooperative growth. Here we employ high-throughput phenotypic screening to systematically test pairwise combinations of auxotrophic Saccharomyces cerevisiae deletion mutants. Although most coculture pairs do not enter syntrophic growth, we identify 49 pairs that spontaneously form syntrophic, synergistic communities. We characterized the stability and growth dynamics of nine cocultures and demonstrated that a pair of tryptophan auxotrophs grow by exchanging a pathway intermediate rather than end products. We then introduced a malonic semialdehyde biosynthesis pathway split between different pairs of auxotrophs, which resulted in increased production. Our results report the spontaneous formation of stable syntrophy in S. cerevisiae auxotrophs and illustrate the biotechnological potential of dividing labor in a cooperating intraspecies community.

Samples within experimental screen data had at least 4 replicates, which were either pinned in one output plate (e.g. co-cultures) or across two different plates (e.g. monocultures) (Extended Data Figure 2). Allocating replicates across multiple plates was a consequence of pinning microbial arrays with 96-well plating pads. Although operating with 8×12 arrays enabled rapid microplate preparation, the pinning procedure introduced the risk of inter-plate positional effects (i.e., sample replicates growing differently between plates). Beyond a certain threshold, sample variation between plates is unacceptable, as it becomes impossible to know which plate is showcasing the condition's "real" growth. This spatial, systematic bias can result in data skew, which can increase the risk of statistical errors during hit selection. To mitigate the impact of plate positional effects in our experimental screen data, we applied median absolute deviation (MAD) and a distance-based algorithm to categorise ∆OD600 replicate distributions (see Extended Data Figures 3 and 4). This quality control step was used to remove culture conditions exhibiting inconsistent growth.
MAD is a robust measure of data variability in a univariate sample and was calculated for each monoculture and co-culture. MAD values were aggregated and analysed as a population distribution, and outliers (values beyond Q3 + 1.5×IQR) were sent to a distance-based classification algorithm (db_characterize) for pattern recognition (Extended Data Figure 4a). db_characterize helps locate assay-specific spatial bias by classifying replicate distributions, in which distances (defined as the difference in OD600) were measured between all replicate points. A threshold of 1.5 units was applied to categorise every replicate-connection as being either "close" (distance under 1.5) or "far" (distance higher than 1.5). Culture conditions received a mix of "close" and "far" labels, which were then compiled into an "inlier: outlier" ratio (Extended Data Figure 4b). These ratios translated into specific univariate patterns, and thus enabled automatic pattern recognition of replicate distributions.
Growth complementation assays in which all conditions displayed "tight" or "one anomaly" patterns were kept in the dataset for downstream analysis. In contrast, conditions with a "pairs" or "undefined distribution" pattern were discarded. The "pairs" pattern was always attributed to monocultures that grew differently between two plates, likely due to plate positional biases rather than bimodal growth behaviour.
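The quality-control step described above can be sketched in a few lines of Python. This is a minimal illustration, not the published db_characterize code: the function names, the use of the standard library, and in particular the mapping from "close"/"far" counts to pattern labels are assumptions made here (the rules shown are for four replicates, where six pairwise connections exist).

```python
from itertools import combinations
from statistics import median, quantiles


def mad(values):
    """Median absolute deviation of a univariate sample."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


def mad_outlier_fence(mad_values):
    """Upper fence (Q3 + 1.5 x IQR) over the population of MAD values."""
    q1, _, q3 = quantiles(mad_values, n=4)
    return q3 + 1.5 * (q3 - q1)


def classify_replicates(values, threshold=1.5):
    """Label a replicate distribution from its pairwise OD600 distances.

    Returns (pattern, n_close, n_far). The count-to-pattern rules below are
    assumptions for this sketch, not the published rule set.
    """
    n = len(values)
    distances = [abs(a - b) for a, b in combinations(values, 2)]
    n_close = sum(d < threshold for d in distances)
    n_far = len(distances) - n_close
    if n_far == 0:
        pattern = "tight"                    # every replicate agrees
    elif n >= 3 and n_far == n - 1:
        pattern = "one anomaly"              # one replicate far from the rest
    elif n == 4 and n_close == 2:
        pattern = "pairs"                    # two plates, two growth levels
    else:
        pattern = "undefined distribution"
    return pattern, n_close, n_far
```

Only conditions whose MAD exceeds the fence would be passed to classify_replicates. Counts alone cannot separate every arrangement (four evenly spaced replicates also give three "close" and three "far" connections), so the label should be read as a first-pass categorisation.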

Step 4. Assay design, quality assessments, and isolating auxotrophs
In a typical activation-based screen, unknown samples are assayed alongside positive and negative controls. These controls define the upper and lower bounds of assay activity, and samples then signal within this fixed activity range (i.e. within the window of separation between bounds). Activity ranges within the presented co-culture screen were defined by both prototrophs (fixed positive controls) and test strains in monoculture (assay-specific negative controls) (Extended Data Figure 4d). Prototrophic strains grew unabated in minimal media (100% assay activity), while test strains were expected to have stunted growth (0% assay activity).
It is worth noting that complementary growth is relative to the strains involved, and 0% assay activity is not identical to null OD600. Strains in monoculture could grow to negligible or moderate amounts, yet their average growth was still defined as baseline activity. Monoculture growth was equated to background noise so that any additional growth among co-cultures would translate as assay activity between 0-100%. Although competitive behaviour between strains could cause co-cultures to grow less than monocultures, these consortia are not indicative of cross-feeding and thus were ignored. The described assay identifies syntrophic relationships among co-cultures when 1) prototrophic references grow well and 2) all other monocultures grow poorly (Extended Data Figure 4e).
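As an illustration of the activity range, the sketch below rescales a co-culture measurement so that the test-strain monoculture mean corresponds to 0% and the prototroph mean to 100%. The linear rescaling and the function name percent_activity are assumptions made for this example; the text only states where the two bounds lie.

```python
from statistics import mean


def percent_activity(coculture_od, monoculture_ods, prototroph_ods):
    """Express co-culture growth as a percentage of the assay window.

    0 % corresponds to the mean monoculture growth (assay-specific negative
    control) and 100 % to the mean prototroph growth (positive control).
    Linear rescaling is an assumption of this sketch.
    """
    baseline = mean(monoculture_ods)
    ceiling = mean(prototroph_ods)
    window = ceiling - baseline
    if window <= 0:
        raise ValueError("monoculture grows as well as the prototroph; no assay window")
    return 100.0 * (coculture_od - baseline) / window
```

A negative result would indicate a co-culture growing less than its monoculture, which the screen ignores.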
Incubating hundreds of distinct consortia in parallel can produce a wide range of growth activity. Unlike traditional high-throughput screening libraries (e.g., collections of small molecules), microbial libraries can evolve over time and thus require active management. Some of our library strains grew unexpectedly (i.e., monocultures of leaky auxotrophs grew in minimal media) despite pre-screen efforts to curate only auxotrophs. High-growing monocultures are flagged in the presented analytical pipeline, as they cause growth complementation assays to have inadequate windows of separation between negative and positive controls (Extended Data Figure 4e).
We isolated auxotrophs in experimental screen data by conducting assay quality assessments. The Z-Factor is a screening window coefficient that measures effect size; it is a dimensionless statistic commonly used to evaluate assay quality in high-throughput screening (Extended Data Figures 3 and 4e) 8. Windows of separation were calculated for each assay, in which variance and dynamic range were measured among test monocultures and prototrophs (producing a value of at most 1). Test strains that earned a Z-Factor of 0.50-1.00 were considered excellent quality, while a Z-Factor below 0.50 was interpreted as inadequate (i.e. too much signal overlap between negative and positive controls). Since test strains were considered negative controls, assay quality (i.e. separation between lower and upper bounds) was directly affected by monoculture growth. As a consequence, isolating strains with a Z-Factor above 0.50 selected for both high-quality assays and low-growing monocultures (Extended Data Figures 4d-e). High-growing monocultures were therefore flagged and removed from downstream consideration.
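For reference, the standard Z-Factor is 1 − 3(σpositive + σnegative) / |µpositive − µnegative|. The sketch below applies it with test-strain monocultures as negative controls and prototrophs as positive controls. The function name and the use of the sample standard deviation are assumptions of this example, and the replicate values are invented for illustration.

```python
from statistics import mean, stdev


def z_factor(negative, positive):
    """Screening window coefficient for one assay.

    negative: replicate growth values of the test-strain monoculture
    positive: replicate growth values of the prototroph
    """
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        return float("-inf")  # controls are indistinguishable
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation


# Hypothetical replicates: a tight auxotroph and a leaky one.
prototroph = [4.0, 4.1, 4.0, 4.1]
tight_auxotroph = [0.1, 0.2, 0.1, 0.2]
leaky_auxotroph = [2.0, 3.0, 2.5, 3.5]

passes = z_factor(tight_auxotroph, prototroph) >= 0.50
flagged = z_factor(leaky_auxotroph, prototroph) < 0.50
```

Because the monoculture supplies both the lower bound and part of the variance term, a leaky or noisy monoculture lowers the score on both counts.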

Step 5. Creating comparison groups (adding features to DataFrames)
Up to this point, samples were viewed as stand-alone observations in the dataset, with no data structure for connecting co-cultures (e.g. XY) to corresponding monocultures (e.g. X and Y). Culture conditions were assigned 'Assay ID' labels to bundle and filter raw data based on growth complementation assays. Experimental screen data was divided into smaller data frames (i.e., comparison.data) based on the test strain that each assay contained (Extended Data Figure 3). Each comparison DataFrame contained one of 92 distinct assays (e.g. X, Y, and XY), each of which was evaluated separately in downstream steps.
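A minimal sketch of this bundling step is shown below. Plain dictionaries stand in for the DataFrames, and the record layout, the function name build_comparison_groups, and the "X+Y" format of the Assay ID are all assumptions made for illustration.

```python
from collections import defaultdict


def build_comparison_groups(records):
    """Bundle replicate measurements into one group per co-culture assay.

    Each record is (strains, delta_od), where strains is a tuple holding one
    strain name (monoculture) or two strain names (co-culture).
    """
    by_condition = defaultdict(list)
    for strains, delta_od in records:
        by_condition[tuple(sorted(strains))].append(delta_od)

    comparison_data = {}
    for condition, values in by_condition.items():
        if len(condition) != 2:
            continue  # monocultures are attached to their assays below
        x, y = condition
        assay_id = f"{x}+{y}"  # hypothetical Assay ID format
        comparison_data[assay_id] = {
            "X": by_condition.get((x,), []),
            "Y": by_condition.get((y,), []),
            "XY": values,
        }
    return comparison_data
```

Keying each group by an Assay ID lets every later step operate on one X, Y, XY bundle at a time.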

Step 6. Removing unsuitable growth complementation assays
Assays that failed to meet set criteria were removed from analysis. Steps 2-4 produced lists of flagged samples (i.e. co-cultures or monocultures with unacceptable variance, plate positional bias, median growth, or Z-Factors). These "flagged-for-removal" tables were referenced in a sequence of filtering steps to remove unsuitable assays from comparison.data (Extended Data Figure 3). Exclusion criteria included:
1. Assays with culture conditions found in flagged.t0s (i.e. outlier OD600 at Time: 0).
2. Assays with culture conditions labelled as "pairs" or "undefined" from db.characterize.
3. Assays with an "infeasible" activity range (i.e. monoculture earning a Z-Factor < 0.50).
4. Assays with inadequate sample size (i.e. culture conditions with fewer than 3 replicates).
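The four exclusion criteria can be applied in a single pass, as in the sketch below. The data layout (one dictionary of replicate lists per assay), the argument names, and the choice to treat a missing Z-Factor as a failure are assumptions of this example rather than details of the published pipeline.

```python
def filter_assays(assays, flagged_t0, patterns, z_factors,
                  min_replicates=3, min_z=0.50):
    """Split assays into kept and removed according to the four criteria.

    assays:     {assay_id: {condition_name: [replicate values]}}
    flagged_t0: set of condition names with an outlier OD600 at time 0
    patterns:   {condition_name: replicate pattern label}
    z_factors:  {assay_id: Z-Factor of the assay's monoculture}
    """
    bad_patterns = {"pairs", "undefined", "undefined distribution"}
    kept, removed = {}, {}
    for assay_id, conditions in assays.items():
        reasons = []
        if any(name in flagged_t0 for name in conditions):
            reasons.append("outlier OD600 at time 0")
        if any(patterns.get(name) in bad_patterns for name in conditions):
            reasons.append("inconsistent replicate pattern")
        # An assay without a recorded Z-Factor is treated as failing.
        if z_factors.get(assay_id, float("-inf")) < min_z:
            reasons.append("infeasible activity range")
        if any(len(values) < min_replicates for values in conditions.values()):
            reasons.append("inadequate sample size")
        if reasons:
            removed[assay_id] = reasons
        else:
            kept[assay_id] = conditions
    return kept, removed
```

Returning the reasons alongside the removed assays keeps a record of why each assay was excluded.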

Step 7. Characterising complementary effects in each consortium
Screening algorithms with inadequate sorting criteria will struggle to differentiate biological significance from false positives. In the presented co-culture screen, not all instances of complementary growth in consortia are representative of metabolic cross-feeding. Synergy (i.e. AB > A + B) is associated with, but not sufficient for, syntrophy. Categorising assays only by the presence/absence of synergy fails to distinguish between 1) cross-feeding behaviour and 2) one strain augmenting the growth of a self-sufficient strain (Extended Data Figure 3). Therefore, the novelty of a consortium depends on how its constituent strains grew in monoculture. The presented analytical pipeline differentiates among experimental screen data by adding descriptive annotations to each assay, which helps contextualise, compare, and rank complementary behaviour between co-cultures and monocultures (Extended Data Figure 3).
Assays were broken down into their primary components: monoculture X, monoculture Y, and co-culture XY. Unequal variance (Welch's) t-tests determined whether the difference in growth between co-cultures and the corresponding monocultures was significant (Extended Data Figure 3). Tests considered ∆OD600 sample distributions and produced p-values in relation to the following null and alternative hypotheses:
H0: µco-culture = µmonoculture
H1: The co-culture grew significantly differently from its monoculture
Two p-values were generated per assay, each of which compared a co-culture with one of its monocultures. Since larger p-values from Welch's t-tests correspond to less confidence in samples having different means, the larger of the two p-values identifies the monoculture whose growth behaviour was most similar to that of its co-culture. The median of this more 'likened' monoculture thus served as a growth benchmark for evaluating consortia performance (Extended Data Figure 3). Fold difference ratios (co-culture over monoculture) were calculated to define the magnitude of difference (i.e., statistical effect size) between sample medians. Consortia were further annotated by their presence/absence of synergy, and by the difference in growth between co-cultures and more 'likened' monocultures (Extended Data Figure 3).

Step 8. Defining a "Hit Threshold"
Growth complementation assays were mapped onto a volcano plot to visualize the −log10(p-value) and log2(Fold Difference) of each co-culture (p-values were corrected for multiple testing using the Benjamini-Hochberg method). Consortia that differed the most from their monocultures populated the top-right corner of the plot, and assays failing to earn p-values < 0.05 or fold difference ratios > 1.50 were not further considered (Figure 1b). Remaining consortia could be filtered by their metadata annotations, such as their presence of synergy and differences in OD600 between co-cultures and "more-likened" monocultures.
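The annotation and hit-calling logic can be sketched as follows. This is an illustration under stated assumptions: the two Welch's t-test p-values are taken as inputs (they could be obtained, for example, from scipy.stats.ttest_ind with equal_var=False), synergy is evaluated on medians, the y-coordinate is taken as −log10 of the corrected p-value, and all function names are hypothetical.

```python
import math
from statistics import median


def annotate_assay(x, y, xy, p_x, p_y):
    """Annotate one assay; p_x and p_y are Welch p-values (XY vs X, XY vs Y)."""
    # The larger p-value marks the monoculture most similar to the co-culture.
    likened, p_value = (x, p_x) if p_x >= p_y else (y, p_y)
    benchmark = median(likened)
    fold = median(xy) / benchmark if benchmark > 0 else math.inf
    synergy = median(xy) > median(x) + median(y)  # AB > A + B, on medians
    return {"p_value": p_value, "fold_difference": fold, "synergy": synergy}


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_hits(annotations, alpha=0.05, min_fold=1.50):
    """Return volcano coordinates (log2 fold, -log10 q) for assays passing both cut-offs."""
    ids = list(annotations)
    adjusted = benjamini_hochberg([annotations[a]["p_value"] for a in ids])
    hits = {}
    for assay_id, q in zip(ids, adjusted):
        fold = annotations[assay_id]["fold_difference"]
        if q < alpha and fold > min_fold:
            y_coord = -math.log10(q) if q > 0 else math.inf
            hits[assay_id] = (math.log2(fold), y_coord)
    return hits
```

Selecting the larger p-value makes the benchmark conservative: a consortium must outgrow the monoculture it most resembles before it is counted as a hit.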

Supplementary Table 1. 62 strains from the primary screen that grow well in SC but not in SM.
Strains were manually annotated based on any explicit linkages to amino acid or nucleotide biosynthesis (e.g., direct).

Condition
Associated product

Supplementary Table 6. Script for pinning microbial arrays with the Singer-ROTOR (manual mode).
A Singer-ROTOR was equipped with 96-density plating pads and was scripted to route liquid culture between source plates (a, b, x, y, z) and target plates (c1, c2, c3, n1, n2). A microbial library of 96 strains was distributed into output plates before test strains were overlaid. Output wells were pinned twice (receiving approximately 4 µl of liquid culture from source destinations). The 'Source Position' and 'Target Position' columns correspond to the 1:4 optional placement for pins when transferring liquid from 96- to 384-well microplates (see Extended Data Figure 2 for more details). Note: Test strain plates (x, y, z) were duplicated (e.g., x.1 and x.2) to avoid back-contamination of pins (i.e., avoiding "dirty" pins re-entering source wells, and thereby compromising future target destinations). The script's pinning itinerary was designed to consume minimal plating pads while generating target plate conditions.