Introduction

Eukaryotic cells grow and divide using a highly conserved and integrated network of positive and negative controls that ensure genomic integrity and maintain cell size within reasonable bounds. Proper control of the cell division cycle is essential for competitive fitness, embryonic development and maturation, and tissue homeostasis. Failure in these control mechanisms may result in cell death, developmental defects, tissue dysplasia, or cancers. One of the foremost model organisms for unraveling the molecular mechanisms of cell cycle control is the budding yeast Saccharomyces cerevisiae. Several hundred yeast mutants, generated in dozens of research laboratories over the past 40 years, have led to the discovery and characterization of many genes and proteins that regulate progression through the cell cycle1. Because of the intense labor involved in these experiments, individual laboratories have tended to focus on small numbers of genes and proteins involved in subsections of the extensive network of gene/protein interactions that control cell cycle events. This reductionist approach was necessary in the early stages of identifying and characterizing the molecular regulatory system, but it carries with it the danger of missing higher levels of network organization and their phenotypic consequences2,3,4.

In contrast to a detailed, reductionist experimental approach, which builds a regulatory network from the bottom up, a systems-level approach seeks to provide a more global and less biased view of regulatory networks. Systems biologists can uncover key regulatory interactions and network architectures that bottom-up practitioners may have missed5,6. Unfortunately, the top-down, pan-genome approach, while good for generating hypotheses, is usually poor for testing hypotheses because the experiments are mostly correlative, and the data is often plagued by problems of accuracy and reproducibility. Combining a variety of ‘omics’ studies may help to overcome these challenges, but it is often difficult to integrate disparate data sets into a single network model7,8,9,10. Ideally, one should combine top-down and bottom-up data, but huge discrepancies of scale between these two data types present barriers to integrating and understanding the hypotheses derived from each approach11,12,13,14,15,16,17.

To mitigate these problems, many researchers, including ourselves, have developed detailed mathematical models that integrate top-down and bottom-up approaches in order to describe the molecular mechanisms that underlie cell cycle regulation in budding yeast4,17,18,19,20,21,22. The governing equations of the model are simulated on a computer, and the model (the ‘wiring diagram’ of molecular interactions) is adjusted until it generates dynamic behaviors that reflect the documented molecular changes and general network behaviors observed in cells (e.g., cell viability, timing of cell cycle events, cell size at birth, and response to DNA damage or chromosome misalignment at mitosis)23,24,25,26. Often, the documented data is missing detailed molecular information, such as protein concentrations and rate constants of crucial reactions, but fitting the model (i.e., fine-tuning the parameter values) to extensive sets of phenotypic data usually introduces strong constraints on these unknown parameters19,27. In this way, mathematical models can refine our understanding of the molecular mechanisms underlying cell cycle progression and test if the proposed network architecture and kinetic rate-constant estimates are consistent with both bottom-up and top-down observations.

One of the major problems when developing large mathematical models of the cell cycle has been the lack of consistent data sets. It has been challenging to compare data collected on cell cycle mutants that often have different genetic backgrounds, whose phenotypes are usually descriptive rather than quantitative, and whose phenotypes are assessed under inconsistent conditions. These problems leave the modeler with the difficult task of curating, interpreting and consolidating inconsistent, and sometimes unreliable experimental results.

A particularly pernicious example of this problem is the use of the ‘synthetic lethal’ (SL) phenotype of double-mutant yeast cells in the development and calibration of mathematical models of the budding yeast cell cycle. Synthetic lethality arises when viable yeast strains carrying deletions of two different genes are crossed to produce inviable, double-mutant progeny (i.e., gene1Δ and gene2Δ mutant strains are viable separately, but the double-mutant gene1Δ gene2Δ strain is inviable). Because they impose strong constraints on the control system, SL gene combinations are exceptionally useful in deducing the network wiring diagram and estimating the rate constants in the mathematical model. On the other hand, if the incomplete or inaccurate identification of SL combinations of genes can wreak havoc on a model by forcing the modeler to make adjustments that are unwarranted. Problems arise because the experimental identification of SL gene combinations is plagued by false-positives and false-negatives and by the fact that some synthetic-lethal interactions are dependent on the genetic background of the parental strain. Hence, for the purpose of modeling cell cycle control in budding yeast, it is crucial to have a reliable, well documented, independently confirmed set of SL gene combinations observed in a uniform genetic background.

We have addressed this problem by reconsidering the identification of SL gene combinations of ‘cell-cycle control’ genes in budding yeast through a disciplined construction of replicate double-mutant strains based on a synthetic gene array (SGA) technology28 pioneered by Tong and Boone29 and the E-MAP28 workflow described by Schuldiner30.

We focused on a set of only 36 cell cycle genes, most of which are functionally well-characterized (Fig. 1). This list comprises all the nonessential genes included in a recent mathematical model of the yeast cell cycle (herein referred to as the ‘Kraikivski’ model)19, as well as genes whose protein products have redundant functions or interact with the proteins represented in the model. In order to estimate the reproducibility of our data, we performed four different crosses to produce each of the 630 double mutants and we tested both mating types. We managed to produce and characterize a library of 6589 genetically distinct yeast strains. The unprecedented number of biological replicates included in this library and the variability of the phenotypic data it produced are raising new modeling challenges.

Fig. 1: Genes used in this experiment.
figure 1

List of 36 cell cycle regulator genes used in the crosses.

We first analyze the variability of SL screen results in our library and compare it with previously published SL interactions listed on The Saccharomyces Genome Database (SGD)31, and then we validate our conclusions by tetrad analysis (TA). We generate lists of ‘high confidence’ and ‘low confidence’ SL interactions. Next, we compare these high-confidence SL interactions with the predictions of our most recent and extensive mathematical model of budding-yeast cell-cycle controls19. We find that, in its present state, the model’s predictions of SL interactions are not very accurate because the predictions were based on parameter values estimated from a collection of SL gene combinations that misidentified some crucial genetic interactions. From our new collection of high-confidence and low-confidence SL gene combinations we reparametrize the model to get much better agreement with the data. We expect this newly parametrized version of the model will give more reliable predictions about the phenotypes of other types of budding yeast mutants as well.

Furthermore, we characterized the growth rates of our mutants under six different media conditions expected to differentially influence cell cycle progression, providing quantitative fitness data that can be used in the future development of more refined and stochastic models of the cell cycle.

Results

Identification of SL interactions

To assess all possible combinations of 36 cell cycle knockouts across multiple biological replicates, we generated 8 sets of independent parent lines to be used in 4 crosses. To avoid suppressor mutations—a feature of the commercial yeast haploid gene deletion collections—we generated 110 parent strains by tetrad dissection of commercial heterozygous diploid gene deletion strains (either before or after switching the kanMX marker to natMX), and we generated four parent strains by de novo gene deletion in BY4741 or BY4742. Neither the commercial SSA1/ssa1∆ strain nor any diploids produced by crosses with any de novo ssa1∆ mutant parent were able to sporulate, indicating that two copies of this HSP70 chaperone gene is essential for meiosis. Interestingly this was not the case for the Ssa1 co-chaperone, Ydj1. We also generated SGA haploid selection marker strains by mating and tetrad dissection of the aforementioned strains with the SGA strain developed by the Boone lab29 or by de novo gene deletion in that strain (55 and 70 parent strains, respectively). Each set of parent strains carried at least two differently marked deletions in all or most of the 36 genes for each of two different markers. According to the workflow described in Fig. 2, single mutant parent strains with opposite markers were crossed, and both MATa and MATα using SGA29 haploid selection markers. Overall we performed four series of 36*36 crosses and collected both mating types for each crosses. Out of the 10,368 cell lines that experimental design could generate, we obtained 6589 mutants corresponding to 36*35/2 = 630 double-mutant combinations.

Fig. 2: Simplified representation of experimental workflow.
figure 2

Example of a single cross plate where two MATa ‘bait’ strains in which the gene of interest (GOI) was knocked out (KO) with a kanamycin resistant marker (kanMX, also confers resistance to G418) are each crossed to the 36 MATa ‘hit’ strains in which the gene of interest was knocked out with a nourseothricin-resistant marker (natMX). Heterozygous diploids were selected for on media containing both antibiotics, and then sporulated on standard sporulation media. The sporulated colonies were pinned onto a series of specialized SGA media that select for MATa and MATα haploid progeny. Positions on the double mutant haploid plates that would have resulted in ‘monogenic crosses’ (where the same gene of interest was knocked out in both parents) or ‘single parent crosses’ (where one of the parent positions was empty) were monitored for potential false-negatives. The double mutant haploid progeny were used to identify synthetic lethal interactions and then pinned in quadruplicate on a fresh YPD plate. WT controls were added, and the resulting master plate was pinned onto six different media types for phenotyping. Phenotyping plates were imaged every 12 h to monitor growth rates.

Examining these 6589 mutants, we first flagged potential SL interactions by scoring each cross as ‘growth’ or ‘non-growth’, i.e., each double-mutant haploid colony as ‘present’ or ‘absent’ on double mutant haploid selection plates (Fig. 3).

Fig. 3: Binary assessment of colony growth for double mutants in all four sets of crosses.
figure 3

Figures on the left were derived from MATa progeny. Figures on the right were derived from the sister MATα progeny. MATa parents are organized along the x-axis and MATα parents are organized along the y-axis alphabetically by the gene that was knocked out. Rows or columns shaded light gray indicate positions on the plate that should have been empty, because the parent was never generated. The diagonal in each heat map indicates a cross between two parents in which the same gene was knocked out. These should not result in growth under selection. Red cells indicate unexpected growth and are an indication of the false negative rate. Rows and columns shaded dark gray indicate parents that were never generated or were excluded from the analysis, because at least one-third of the progeny resulting from that parent failed to grow. Duplicates of the same gene/marker combination within the same cross are not shown. Total number of crosses (excluding monogenic) = 7350. a, b Cross 1. c, d Cross 2. e, f Cross 3. g, h Cross 4.

No combination of genes produced the same results in every cross. In fact, the results among biological replicates varied considerably (Fig. 3, Table S4). Hence, we set a threshold for defining likely SL interactions. If evidence for synthetic lethality was observed four or more times irrespective of which parent strain the deletions were derived from, we flagged the combination as ‘likely SL’ in Fig. 4. The complete results for all progeny are compiled in Fig. 5.

Fig. 4: Curation of synthetic lethal interactions.
figure 4

For previously documented SL interactions, it is noted whether they were manually curated or derived from a high-throughput (HTP) screen. The ‘SL score’ highlights the number of crosses in our screen that support synthetic lethality out of the total number of replicates. Double mutant combinations are reported as synthetic lethal (SL), viable (V), or having reduced viability (RV). More than one reported TA result indicates that the TA was repeated for two or more crosses. Results marked with an asterisk are lower confidence due to poor spore viability. Cases where spore viability was too low to determine synthetic lethality are marked MD to indicate a possible meiotic defect. Rows shaded orange mark likely synthetic lethal interactions. Rows shaded blue mark likely viable interactions. Instances where the model does not support the ‘high confidence’ interactions are bolded.

Fig. 5: Likely synthetic lethal interactions determined by compiling data from all crosses.
figure 5

a Table documenting how many crosses supported synthetic lethality (no growth of the double mutant progeny). Synthetic lethal interactions that we designate as ‘high-confidence’ in Fig. 4 are outlined in black. b Venn-Diagram comparing observed SL interactions with those that have been previously documented and/or predicted by the Kraikivski model. Note: clb2∆ clb5∆ is excluded as these genes are linked.

A threshold of four was selected, because it ensures that the interaction was seen in at least two of the independent sets of crosses performed. This threshold also provided for the highest level of agreement between our screen and previously published results (discussed in the following section). For our set of 630 combinations, we observed 29 that exhibited synthetic lethality in at least four biological replicates.

Comparing the results of our screen with previously reported SL interactions

In Fig. 4, we compare our results to the 36 SL gene combinations documented on SGD for the 36 genes in our data set (excluding several curation errors which are listed in the supplement) and to the predictions of Kraikivski’s published model19. There are 58 lines in Fig. 4, referring to 58 (out of 630 possible) gene combinations for which one or more of the following statements is/are true:

  1. 1.

    the combination is documented as SL on SGD,

  2. 2.

    the combination is observed in our screen as likely SL, and

  3. 3.

    the combination has been predicted to be SL by Kraikivski’s model.

A Venn diagram indicating the overlap of these 58 gene combinations is provided in Fig. 5b.

Of the 36 gene combinations documented as SL on SGD, 16 were observed as likely SL in our screen (Fig. 5b). Meanwhile, 13 of our observed SL gene combinations are not listed on SGD. Hence, the overlap between the previously published SL interactions, and the combinations in our screen that exhibited synthetic lethality in at least four replicates is only ~50%. Dropping the threshold for likely SL interactions in our screen from four to three would have resulted in the identification of only one additional previously documented SL interaction (lte1∆ sic1∆), while adding 13 SL interactions that are not supported by the literature. Increasing the threshold to 5 would have excluded an additional 13 SL interactions that have been previously observed.

As a check on these comparisons, we performed TA on at least one cross for all of the combinations listed in Fig. 4 except for one (cdh1∆ ssa1∆) from which we failed to recover tetrads. Of the 13 SL gene combinations that we observed for the first time in this study, 6 were not SL by TA (bck2Δ cdh1Δ, bub2Δ cdc55Δ, bub3Δ swi4Δ, cdc55Δ nrm1Δ, cdc55Δ swi4Δ, and swi6Δ whi3Δ). The other seven (bub1Δ swi6Δ, bub3Δ swi6Δ, cdc55∆ cdh1∆, cdc55∆ clb5∆, cdc55∆ lte1∆, cd55∆ whi3∆, and cdh1∆ swi6∆) exhibited variable results or low spore viability regardless of genotype in at least one of the crosses, complicating the interpretation of the results. Of the 20 ‘documented’ SL interactions that were not observed in our screen, 17/20 tested by TA were viable. The other three (lte1Δ sic1Δ, lte1Δ ydj1Δ, and msn5Δ swi6Δ) varied by replicates or exhibited low spore viability overall, and thus remain ambiguous.

In summary, our screen identified 13 new potential SL interactions, but none of these were definitively validated by TA. Our screen failed to validate 20 previously published SL interactions. By TA, we determined that at least 17 of these are likely not SL. Of the 16 double-mutant combinations that were both ‘documented’ SL on SGD and ‘likely’ SL according to our screen, TA confirmed that nine combinations are indeed inviable. The other seven remain ambiguous.

Based on these comparisons, we re-classify the 58 gene combinations in Fig. 4 as ‘high-confidence synthetic-lethal’ combinations (shaded orange), ‘high-confidence viable’ double-mutants (shaded blue), and ‘uncertain’ (unshaded). Of those that remain uncertain, for five gene combinations which all include swi4Δ or swi6Δ (bub3Δ swi6Δ, clb5Δ swi6, msn5Δ swi4Δ, msn5Δ swi6Δ, and swi4Δ swi6Δ) additional replicates were attempted, but no tetrads were recovered.

Some of the variability observed between replicate tetrad analyses of the same genotype, as well as apparent meiotic defects, may be the result of chromosome loss. For instance, Bub1 and Bub3, which are involved in regulating the SAC and tension sensing in spindles32,33, exhibited unusual behavior in halo assays indicative of chromosome loss (see additional data).

Using our screen to refine a previously published model of the cell cycle

In addition to SGD, we compared our ‘likely’ SL interactions with those that were predicted by Kraikivski’s model19. Of the 22 predicted SL gene combinations in Fig. 4, 10 are both documented and confirmed by our screen, two (bck2Δ swi6Δ and cln3Δ swi4Δ) were documented but not observed by us, and one (cdh1Δ swi6Δ) was observed by us but not documented on SGD. Nine predicted SL gene combinations were neither observed by us nor documented on SGD. We tested eight of these by TA and found six to be viable, while two (cdh1∆ lte1∆ and cdh1∆ ydj1∆) remain uncertain (Fig. 4). Five of these ‘orphan’ predictions involve cdh1∆, suggesting an overemphasis of Cdh1 activity in the model. We tested four of these combinations by TA and found that two were viable (cdh1Δ clb5Δ and cdh1Δ cln3Δ), while two remain uncertain (cdh1Δ lte1Δ and cdh1Δ ydj1Δ). Six SL gene combinations that were both documented and observed by us were not analyzed in Kraikivski’s model.

In summary, Kraikivski’s model makes 37 predictions (22 SL + 15 V) concerning the genetic interactions listed in Fig. 4. Of these predictions, 16 are consistent with our ‘high-confidence’ SL/V phenotypes, 7 are inconsistent (bolded in Fig. 4), and 14 are ambiguous. Hence, the accuracy of the published model is ~50%, comparable to the agreement between our screen and the literature.

The limited accuracy of the model’s predictions is likely due to the fact that the parameter values in the model were estimated by fitting the model to ‘documented’ SL gene combinations that are themselves unreliable. To correct this problem, we have re-parametrized the model in light of the ‘high confidence’ SL and viable (V) interactions (shaded orange and blue, respectively in Fig. 4), allowing for some flexibility for the uncertain interactions.

In reparameterizing the model, we had two intentions: (a) to maximize the number of correctly explained mutant phenotypes in Fig. 4, and (b) to simulate correctly those mutant strains with well-characterized phenotypes that were previously explained by the model. Guided by these two criteria, we manually adjusted 13 parameter values in the model (Fig. S3), as follows:

  • Because of the central roles played by SBF, MBF, and Cln3 in the START transition of the budding yeast cell cycle, we addressed our new results suggesting a viable phenotype for swi4Δ cln3Δ double-mutant cells in opposition to previous reports that swi4Δ cln3Δ is a SL strain25. To ‘rescue’ swi4Δ cln3Δ cells, we significantly increased the activation of MBF (Swi6:Mbp1) by Bck2 (the only activator of MBF in the absence of Cln3), while simultaneously increasing the inactivation of MBF by Clb2 and decreasing slightly the activation of MBF by Cln3, in order to keep the level of MBF activity similar to that of the previous model, thus minimizing the perturbations to all other mutants that were previously explained by the model. Because Ydj1 is a regulator of Cln3 activity, the phenotype of swi4Δ ydj1Δ agreed with new data too.

  • The viability of swi6Δ clb2Δ suggests that Swi4 alone (without Swi6) can successfully initiate the Start transition, and then the cell cycle can be completed without Clb2 (with Clb1 alone). To correctly simulate this mutant, we had to significantly increase the weight of Swi4 in the transcriptional regulation of the Start transition.

  • We also made adjustments to account for the five mutant strains involving cdh1∆ that our original model did not predict correctly (Table S6). In the model, cell division (upon exit from mitosis) is determined by Clb2 activity dropping below a certain threshold, which is in turn governed by Cdh1 (involved in Clb2 degradation during telophase) and Sic1 (an inhibitor of Clb2-dependent kinase activity as cells return to G1). Hence, the inviability of cdh1Δ sic1Δ cells is the crucial mutant defining the cell-cycle exit threshold, and it was correctly predicted by the original model. In this double-mutant, Clb2-dependent kinase activity is down-regulated in anaphase only by Cdc20-dependent degradation of Clb2. (In reality, of course, Clb2 activity depends on many upstream regulators—such as Ydj1, Clb5, Ssa1, Cln3, and Swi6—that affect cell mass at division.) Our new assessment of SL interactions allows for better ‘tuning’ of the parameters that govern Clb2 regulation by Cdh1, Cdc20, and Sic1. In addition, when originally constructing and parametrizing our model, we did not have many lte1∆ mutant strains to constrain Lte1-related parameters, so we adjusted parameters to correctly explain lte1∆ mutants.

The revised model is provided as SBML34 file in the Figshare dataset (see Data availability). The 2015 model is robust enough that the dynamics of WT and most mutants is the same for both models. Only some mutants show the different phenotype and distinct time course of cell cycle variables (Fig. S4). We have updated the list of mutant phenotypes predicted by the revised model and compared it with the predictions of the 2015 model (Fig. S5). We simulated the phenotype of 38 new mutants included in our experimental design. The predicted viability of 12 mutants changed as a result of the parameter adjustment and 15 mutants exhibit inconsistencies between the predicted and observed viability 321 simulated mutants. Most of these discrepancies involve specific types of mutations (multiple gene copies, GAL-inducible genes, and destruction box deletions) that are known to be less reproducible than simple deletions. In addition, the predicted viability of these mutants can be very sensitive to the parameters introduced in the model to capture the quantitative effects of these mutations.

Predictions of the newly parametrized model are given in the last column of Fig. 4. Our expertize in cell cycle regulation and mutant behavior allowed us to make these parameter adjustments manually; however, computational algorithms for reparameterization may be required if a larger number of novel mutant phenotypes becomes available in the future.

Inherent limitations of synthetic lethality screens

The SGA process relies on efficient production of double mutant haploid progeny from crosses. Mating defects, low sporulation efficiencies, meiotic defects causing poor spore viability, poor spore germination, or technical problems with the pinning process can prevent the transfer of double mutant cells during haploid selection, resulting in false positives (i.e., poorer growth than there should be29,35). Genetic interactions resulting in reduced fitness are also subject to significant selection for genetic mishaps that improve fitness, resulting in false negatives (i.e., better growth than there should be29,35). Genetic mishaps resulting in false negatives can include spontaneous mutation to introduce a suppressor mutation36, or disomy. Disomy can result from chromosome nondisjunction during sporulation, or gene conversion resulting in escape of heterozygous diploids from haploid selection35,37. False negatives can also result from contamination from outside sources or cross-contamination during replica-pinning.

Following the presence or absence of colonies throughout the SGA process, we found that all crosses produced diploids (see Additional data). Therefore, failure to mate did not produce any false positives. False positives can also result from inefficient pinning or systematic problems with the parents resulting in overall low viability. Parent lines that resulted in fewer than 12/36 viable haploid progeny were excluded from the analysis, but some false positives likely persisted. For instance, seven of the SL gene combinations observed in our screen involved cdc55∆, which was problematic in most genetic contexts due to inconsistent pinning (cells were very dry and clumpy and did not adhere well to pins). By TA, we identified 6 out of 29 SL gene combinations observed in our screen to be definitive false positives.

Our experimental design makes it possible to get rough estimates of false negative rates by monitoring positions on each plate that should have been empty for growth. We designed our screen such that ‘hit’ strains were arrayed the same way (alphabetically by the gene knocked out) for every cross, leaving empty spaces for any parent that was not generated (Fig. 2). In this way, some positions in the cross had only one parent crossed to an empty position (‘single parent’ in Fig. 2), and some positions had two parents that were mutant for the same allele (‘monogenic cross’ in Fig. 2). Neither of these ‘crosses’ should result in colonies during the final round of double-mutant haploid selection, as they will not contain both of the antibiotic resistance markers. Colonies at ‘single parent’ or ‘monogenic cross’ positions are indicative of a false-negative event (red cells in Fig. 3).

To estimate the contribution of contamination to false negatives, we identified colonies in empty plate positions. All plates were devoid of contaminating colonies in empty positions (Table S3). In positions containing only one parent strain, only 3/570 positions on the haploid progeny plates had any contaminating colonies (Table S3). Therefore, contamination is a negligible contributor to observed false negatives.

To identify false negatives arising from genetic mishaps, we identified colonies produced by crosses between two strains carrying deletions of the same gene. Out of 202, 77 monogenic crosses resulted in progeny on the final haploid selection plates (Table S3), indicating a coarsely estimated false-negative rate of 38%.

These false-negative events occurred more frequently for MATαATcurred more MATa progeny (Table S3). This is to be expected, because MATa progeny can escape selection for MATαATogeny can escape selection for expecteSTE3pr-LEU2 and leu2∆0, but gene conversion cannot occur between STE2pr-SpHIS3 (Schizosaccharomyces pombe orthologue) and his3∆1 to allow MATα cells to escape MATa selection29,35. If MATa progeny persist through the MATα selection due to gene conversion, they can mate with the neighboring MATα progeny producing diploids that are heterozygous for both markers.

Although few SGA or E-MAP studies report them, it is well-established that these screens have high, but variable, false-negative and false-positive rates from 17% to 70%30,37,38,39,40 and 5% to 90%39,41,42,43,44, respectively. The false-positive and -negative rates observed in our study are thus in the normal range for large genetic screens.

Quantifying fitness and genetic interactions across six media types

As the most extreme genetic interaction, synthetic lethality has a powerful influence on models of cell-cycle regulating genes. However, due to the limitations of synthetic lethality screens more accurate models call for more nuanced phenotypic markers.

To identify interactions between the 36 genes that do not result in synthetic lethality, we monitored the growth rate of the viable double mutants over a time course. Each mutant was assigned a fitness score according to how the growth rate compared with wild-type controls on the same plate. Using this approach, we identified ~100 gene combinations that were not SL but had fitness scores more than six standard deviations below wild-type under normal growth conditions (Fig. 6).

Fig. 6: Comparison of fitness scores for double mutants in all four sets of crosses on YPD media.
figure 6

Figures on the left were derived from MATa progeny. Figures on the right were derived from the sister MATα progeny. MATa parents are organized along the x-axis and MATα parents are organized along the y-axis alphabetically by the gene that was knocked out. Rows and columns shaded dark gray indicate crosses that were not performed or were excluded from the analysis. a, b Cross 1. c, d Cross 2. e, f Cross 3. g, h Cross 4. Fitness heatmaps for the remaining five media types are available is Figs. S3S7. Duplicates of the same gene/marker combination within the same cross are not shown.

More importantly, by comparing fitness scores of the double mutant progeny with those of their single mutant parents, we calculated genetic interaction (GI) scores for all viable mutants. GI scores45,46,47 are a function of the parent and progeny fitness and illustrate the direction (positive or negative) and the magnitude of the interaction for each of the ~600 viable gene combinations. Non-zero GI scores indicate a possible epistatic relationship. Negative GI scores suggest that the genes involved may have redundant functions, while positive GI scores indicate that one mutation may have a rescuing effect over the other.

As with synthetic lethality, we observed a considerable amount of variability in fitness scores and GI scores for mutants of the same genotype in different crosses (biological replicates, Fig. 6). To identify trends within the variability, GI scores for a given genotype were sorted into different bins, and the bin that contained the largest number of biological replicates was used to determine a consensus GI score which is represented in Fig. 7 and Fig. 8. From the distribution of overall GI scores for a given media, we flagged those with a consensus score at the extreme positive and negative ends. Those gene combinations with consensus GI scores in the top or bottom 5% of all GI scores are reported in Fig. 8.

Fig. 7: Genetic interaction scores on each media type determined by compiling data from all crosses.
figure 7

Heat maps show the distribution of binned genetic interaction (GI) scores for each mutant combination. Brighter green and darker red squares correspond to higher positive and lower negative GI scores, respectively. Gray squares denote gene combinations for which three or fewer crosses were generated. Histograms show the overall distribution of GI scores for each media type. The dotted red lines distinguish the lower and upper 5% of interactions. a YPD, b YPD + raffinose, c YPD + galactose, d YPD + benomyl, e YPD + camptothecin, f YPD + hydroxyurea.

Fig. 8: Genetic interaction score outliers for each media type.
figure 8

Consensus genetic interaction (GI) scores are reported for each gene combination that had a score in the top or bottom 5% of overall GI scores for one or more media types. The number reported reflects the midpoint of the bin occupied by the largest number of biological replicates. Red and green shading highlights negative and positive scores, respectively. The top 2.5% and 5% are shaded light and dark green, respectively. The bottom 2.5% and 5% are shaded light and dark red, respectively. Interactions which we determined to be very likely synthetic lethal are shown in bold. ‘ND’ indicates GI scores that were not determined.

To further identify GIs among our set of cell cycle regulator genes that may not be apparent under standard growth conditions, we also calculated fitness and GI scores for all double mutant progeny and single mutant parents in the presence of two different carbon sources and in the presence of three checkpoint activating drugs (Figs. S6–S10).

YPDextrose served as a control (mass doubling time ~100 min). Mass doubling times are longer on YPGalactose (~150 min) and even longer on YPRaffinose (~200 min)48. Slower growth rates can enable positive regulators to build up such that mutants which would normally grow very slowly due to the stochasticity of cell cycle transitions can exhibit some level of rescue on YPG or YPR23,49.

The distribution of GI scores that we observed was comparable for YPD and YPG, but the GI scores occupied a narrower range for mutants grown on YPR (Fig. 7), suggesting that the very slow growth rate provided by YPR might allow the growth of mutants that have more extreme phenotypes on YPD to normalize on YPR.

The drugs Benomyl (Ben), camptothecin (CPT), and hydroxyurea (HU) activate checkpoints50,51,52,53,54,55,56,57,58,59,60. Mutants defective in these checkpoints will rush through the cell cycle and accumulate genetic/chromosome defects leading to slower growth due to decreased viability. We expect known checkpoint mutants to exhibit reduced fitness under these conditions, but interactions with other cell cycle regulators (including other checkpoint genes) can enhance or suppress the checkpoint defects32,61,62,63,64,65,66,67.

In several cases, gene combinations that had a GI score within the normal distribution on YPD, showed a much more extreme GI on one or more of the other five media types (Fig. 8). For instance, the GI score for fkh1∆ fkh2∆ on YPD was negative, but not remarkably so. However, on YPD + Ben and YPD + CPT, the consensus GI scores for fkh1∆ fkh2∆ were in the lower 5% and 2.5%, respectively. Fkh1 and Fkh2 both promote the transition from G2 to M, so the double mutant is likely to cause stalling at G2. Ben prevents spindle assembly while activating the spindle assembly checkpoint, so that cells move forward to M phase despite not properly forming a mitotic spindle. CPT causes DNA damage during M phase. So, cells that make it to M phase in an fkh1∆ fkh2∆ mutant would likely arrest in the presence of Ben or CPT, thus exacerbating the mutant phenotype.

Relative GI scores for a family of gene combinations also reflect the role of those genes within the cell-cycle regulatory network. For instance, Bub1 and Bub3 function along with Mad1, Mad2, and Mad3 to arrest cells in metaphase in response to defective attachments of kinetochores to spindle microtubules—a mechanism called the spindle assembly checkpoint (SAC)32,33. However, Bub1 and Bub3 also have a role in tension sensing in spindles independent of their role in the SAC32. This can be seen in the observation that bub1/3 mutants have lower GI scores in benomyl than mad1–3 mutants (Fig. 8).

Interestingly, several other mutants did not show reduced fitness in benomyl but did display a chromosome loss phenotype (Fig. 8). These mutants were also SL or synthetic sick with bub1/3 mutants. Clb5 is one such mutant and has previously been predicted to have a role in tension sensing32,33, which the GI suggests works independently of Bub1/Bub3. Interestingly, although Sic1 works to inhibit CDK-Clb68,69,70, including Clb5, the sic1∆ phenotypes were similar to those of clb5 mutants. Since Sic1 is important for suppressing CDK/Clb activity and is activated by the mitotic exit network, we hypothesize that elevated CDK/Clb may prolong anaphase resulting in spindle positioning defects, or defects in SAC silencing.

Although slow growth of swi6∆ mutants made it difficult to assess halos, like clb5∆ and sic1∆ mutants, they also appeared to increase chromosome loss. However, unlike Clb5 and Sic1, Swi6 has no direct role in mitosis. Nevertheless, reduced viability in bub1/3 swi6 double mutants suggests some interaction. We propose that reduced activity of the MBF and SBF at START perturbs expression of proteins important for spindle function or chromosome cohesion, exacerbating the chromosome segregation defects of the bub1/3 mutants.

It is important to note that not all of the gene combinations that we identified as ‘high-confidence’ SL had remarkably negative GI scores in our screen. There are two plausible explanations for this discrepancy. First, the use of in-plate wild-type controls prohibited the use of antibiotics in the phenotyping screen, so false negatives (growth where growth is not expected) due to contamination are more likely. Second, for gene combinations that are truly SL, any living colonies are necessarily the result of false negatives due to genetic mishaps. These gene combinations are thus more prone to result in outliers with higher than expected GI scores and should be interpreted with caution.

Discussion

The selective pressure applied by SL screens leads to genetic mishaps that rescue mutants that would otherwise be lethal29,35; conversely, the low fitness of many of the single mutant parents can add up and result in offspring with such a low fitness that they may be interpreted as SL. These false-negative and false-positive events lead to very high levels of variability (see Table S5 for an example). We accounted for this variability by probing a relatively small number of genes with an unprecedented number of biological replicates. While E-MAP screens generally incorporate four biological replicates24, and SGA screens rely on technical replicates alone29, most of the GIs tested in this study included between 8 and 16 independent biological replicates (Table S4). We also compared our results with previous publications and resolved discrepancies via TA in order to generate a list of ‘high confidence’ SL interactions which informed a new iteration of a previously published cell cycle model.

Variability in SL screens is a major challenge for modelers. The ~100 tetrad analyses performed in this study demonstrate an unexpectedly high level of variation even among low-throughput, manual experiments. For this reason, synthetic lethality may not be the best marker for parameterizing models. In addition, models based on synthetic lethality are inherently deterministic; yet, it is well-known that many of the processes governing progression through the cell cycle are stochastically regulated. Modeling stochasticity will require a more fine-grained dataset that provides quantitative phenotypes based on parameters such as growth rate, rather than deterministic phenotypes such as lethality or checkpoint arrest.

The scope of the 2015 Kraikivski model represents a subset of the genes involved in cell proliferation71,72,73. It would have been desirable to add new genes to the model so that the revised model could predict the phenotype of all the mutants included in our experiment. Some of the genes are not immediately related to the cycle regulatory network. Adding them to the 2015 model would require an extensive modeling effort. In addition, accounting for the variability of synthetic lethality data and the fitness estimates would require a very substantial modification of the 2015 modeling framework that can only predict viability and is unable to account for growth rates.

The results presented here demonstrate that quantitative cell phenotyping can be readily performed in a high-throughput workflow. By comparing colony sizes over time, we generated a quantitative picture of growth rates for over 6500 mutants. This more sensitive approach enabled us to identify interesting GIs with less extreme phenotypes than synthetic lethality (e.g., whi3∆ ydj1∆) and gene combinations that provided a rescue effect (e.g., bub3∆ cdh1∆). We also show that our workflow can be expanded to include different test conditions. By quantitatively phenotyping our mutants on six different media types, we demonstrate that our approach is sensitive enough to capture environmental variability. Data for the ~40,000 gene-by-media combinations is available in the supplement and can be used to develop more elaborate models of cell cycle regulatory control.

This potential for rapid generation of complex datasets capturing cellular response to multifactorial perturbations makes it considerably more challenging to build and update models that can explain the data and make testable, novel predictions. At one end of the spectrum are approaches that can use gene expression measurements to predict growth rates74,75. These methods can be trained rapidly but do not provide insights into underlying molecular mechanisms. In contrast, detailed, kinetic models such as the Kraikivski model encompass a wealth of molecular mechanisms. They can make predictions on the effect of multigene perturbations. However, they are challenging to parameterize, and it can be expensive to update and expand them in the face of new datasets. In order for the analysis of data and modeling to keep up with the rapid acceleration of data production it will be necessary to develop more scalable frameworks. In order for the analysis of data to keep up with the rapid acceleration of data production, it will be necessary to adopt more scalable modeling frameworks73,76.

Methods

Experimental workflow

To generate the double mutants, we used a modified epistasis miniarray profile (E-MAP) workflow30. The E-MAP workflow is a modification of the synthetic genetic array (SGA) protocol30. In a typical SGA screen, a single query strain is crossed to all viable deletion strains (over 4000)29,41. The query strain includes a set of reporter genes that allow selection of haploid progeny of one mating type or another. E-MAP screens use the same series of selection conditions, but generally involve a few hundred deletion strains crossed to produce every possible combination of double-gene deletions30.

Our experimental design most closely follows the E-MAP approach but with a few significant differences. First, we focused on a set of only 36 cell cycle genes. Second, we used eight sets of parent strains in four sets of crosses, increasing the number of biological replicates to eight from four in a standard E-MAP or one in a standard SGA (which use technical replicates29,30):

(1) MATa/genei::kanMX

(5) MATa/genei::kanMX/SGA

(2) MATa/genei::natMX

(6) MATa/genei::natMX/SGA

(3) MATα/genei::kanMX

(7) MATα/genei::kanMX/SGA

(4) MATα/genei::natMX

(8) MATα/genei::natMX/SGA

genei::kanMX refers to genei knocked-out with a kanamycin-resistance marker, genei::natMX refers to genei knocked-out with a nourseothricin-resistance marker, and ‘SGA’ refers to the haploid-selection markers can1Δ::STE2pr-Sphis5 and lyp1Δ::STE3pr-LEU2 used in SGA screens. Details for how each of the parent strains were generated can be found in the online supplement. These parent strains were confirmed by PCR and used in four sets of crosses:

Cross 1: Strain(1, genei) × Strain(8, genek)

Cross 2: Strain(2, genei) × Strain(7, genek)

Cross 3: Strain(3, genei) × Strain(6, genek)

Cross 4: Strain(4, genei) × Strain(5, genek)

From these crosses, we selected double-mutant progeny of both mating types, further increasing the biological replicates to 16. (SGA and E-MAP screens select only MATa progeny29,30). All media were standard recipes for SGA29 (see online supplement). Mating type of the double mutant progeny and the single mutant parents was confirmed via Halo assays77 (See online supplement). With this protocol, we generate (in principle) 16 biological replicates of each double-mutant, geneiΔ genekΔ. In certain cases, two parents of the same genotype were generated independently, such that the total number of biological replicates is up to 20 for some double mutant combinations.

We measured colony growth rates on three different growth media (YPD, YPG, and YPR) and in the presence of three different checkpoint activating drugs: Benomyl (Ben), Camptothecin (CPT), and Hydroxyurea (HU). Ben disrupts attachment of kinetochores to the mitotic spindle and activates the spindle assembly checkpoint (SAC; dependent on Bub1,3 and Mad1–3)50,78,79. CPT inhibits topoisomerase resulting in DNA entanglements and double strand breaks upon chromosome segregation80,81. HU inhibits ribonucleotide reductase82, which leads to replication fork stalling83.

We first derived 384 template arrays from 96 array of the haploid progeny (which were in 96 array) using a Rotor HDA (Singer Instruments, Somerset, UK). The 96 progeny array and the 384 template array were grown on YPD + G418(600 µg/ml)/nat(150 µg/ml). Rows A, B, I and J were left empty for in-plate wild-type controls colonies. At the same time, we set up YPD plates with the wild-type parent strains BY4741 and BY4742 arrayed at 384 density, occupying positions in rows A, B, I, and J. We incubated both sets of plates at 30 °C for 2 days.

We then replica pinned the wild-type controls onto new YPD plates (using a new source plate whenever the colonies began to look depleted). After visual inspection of the plates to ensure even transfer of the wild-type controls, we replica pinned the set of double mutant colonies to the templates. Plates were imaged after 12, 24, 36, 48, and 60 h of growth at 30 °C (Fig. S1A).

We imaged all diploid selection plates, final haploid progeny selection plates, halo assay plates, and phenotyping plates using the Phenobooth (Singer Instruments, Somerset, UK) imaging platform and software. To maintain consistency, all images were collected in the same order at the same resolution and camera settings, and were batch processed to crop the image, perform background subtraction and colony identification whenever possible. We then exported the raw colony size data for analysis.

The yeast mutants are available upon request to the corresponding author under the terms of the OpenMTA.

Data analysis

Plate-to-plate variation was accounted for by normalizing colony size using in-plate wild-type controls. Edge-effects were accounted for by adjusting the growth rates such that the mean growth rates of edge-adjacent colonies and internal colonies were comparable (Fig. S1B and Fig. S2). Jack-knife filtering was used in a small number of cases to remove colonies that behaved as outliers within quadruplicates (four technical replicates).

Growth rates, fitness scores, and GI scores45,46,47 were calculated using a linear model for growth rate according to the following equations:

$$\begin{array}{*{20}{l}} {{\mathrm{Growth}}\,{\mathrm{rate}}\,\left( {r} \right):} \hfill & {{s}_{\mathrm{t}} = {r} \cdot {t} + {s}_0} \hfill \\ {{\mathrm{Fitness}}\,{\mathrm{score}}\,\left( {W} \right):} \hfill & {{W} = {r}_{{\mathrm{mutant}}}/{r}_{{\mathrm{WT}}}} \hfill \\ {{\mathrm{Genetic}}\,{\mathrm{interaction}}\,{\mathrm{score}}\,\left( \varepsilon \right):} \hfill & {\varepsilon = {W}_{{\mathrm{AB}}} - {W}_{\mathrm{A}}{W}_{\mathrm{B}}} \hfill \end{array},$$

WAB = fitness score for the double-mutant progeny, WA = fitness score for the MATa parent, WB = fitness score for the MATα mutant, st = colony size at time t, and s0 = colony size at time 0.

A histogram binning procedure was used to estimate the mode for GI scores across biological replicates (up to 20 independent crosses). The ‘consensus’ GI score reported in Figs. 7 and 8 is the midpoint of the bin containing the maximum number of values (additional details in the Supplement).

Reporting summary

Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.