Systematic perturbation of yeast essential genes using base editing

Base editors derived from CRISPR-Cas9 systems and DNA editing enzymes offer an unprecedented opportunity for the precise modification of genes, but have yet to be used at a genome-scale throughput. Here, we test the ability of an editor based on a cytidine deaminase, the Target-AID base editor, to systematically modify genes genome-wide using the set of yeast essential genes. We tested the effect of mutating around 17,000 individual sites in parallel across more than 1,500 genes in a single experiment. We identified over 1,100 sites at which mutations have a significant impact on fitness. Using previously determined and preferred Target-AID mutational outcomes, we predicted the protein variants caused by each of these gRNAs. We found that gRNAs with significant effects on fitness are enriched in variants predicted to be deleterious by independent methods based on site conservation and predicted protein destabilization. Finally, we identify key features to design effective gRNAs in the context of base editing. Our results show that base editing is a powerful tool to identify key amino acid residues at the scale of proteomes.

constructions. An alternative approach would be to use base editors, which allow the introduction [...] intronic sequences (Figure 1A, Figure S1). Because all essential genes have the same fitness effects when deleted (Giaever et al. 2002), focusing on these genes allowed us to limit the variation in fitness that could be due to the relative importance of individual genes for growth rather than to the importance of specific positions.

To ensure we could predict gRNA mutational outcomes with accuracy, we included in the library only gRNAs with one to two nucleotides with a high probability of being edited based on the known activity window of Target-AID in yeast (Nishida et al. 2016). We could then predict mutagenesis outcomes for gRNAs computationally. We took into account that Target-AID produces both C-to-G and C-to-T mutations in yeast, with a 1.5- to 2-fold preference for C-to-G (Nishida et al. 2016; Després et al. 2018). We also extended the analysis to include other point mutants at possible secondary editing sites within the activity window (see methods). As such, we could associate most gRNAs targeting protein-coding DNA to a primary C-to-G and C-to-T outcome (C-to-G #1 and C-to-T #1), as well as to possible secondary outcomes if applicable (C-to-G #2 and C-to-T #2). We did not consider gRNAs that did not target between the 0.5th and 75th percentile of the length of annotated genes to limit position biases that could influence the efficiency of stop-codon generating guides (Doench et al. 2014; Michel et al. 2017).

The gRNA library was cloned into a high-throughput co-selection base editing vector (Després et al. 2018). We performed pooled mutagenesis followed by bulk competition (Figure S2) to identify mutations with significant fitness effects.
As the relative abundance of each gRNA in the extracted plasmid pool depends on the abundance of the subpopulation of cells bearing these gRNAs, any fitness effect caused by the mutation they induce will influence their relative abundance. Variation in plasmid abundance was measured using targeted next-generation sequencing of the variable gRNA locus on the base editing vector in a manner similar to GeCKO approaches (Sanjana et al. [...]

[...] Tetrad dissection of a heterozygous deletion mutant bearing an empty vector results in only two viable spores, while the wild-type copy in the same vector restores growth. Dissection of the two heterozygous mutants bearing a plasmid with the most probable single mutant based on the known activity window of Target-AID shows both mutations are lethal.

After applying a stringent filtering threshold based on barcode read count at the mutagenesis step (Figure S2), we identified a total of ~17,000 gRNAs for which we could evaluate fitness effects. Replicate data for gRNAs passing the minimal read count selection criteria show high correlation across experimental time points (Figure S3) and cluster by experimental step (Figure S4), showing that the approach is reproducible. Using the distribution of abundance variation of non-functional gRNAs with synthesis errors as a null distribution (see methods), we identified 1,118 gRNAs across 605 genes or loci with significant negative effects (GNE) on cell survival or proliferation at an estimated 5% false positive rate. GNEs are distributed evenly across the yeast genome (Figure 1B and 1C), suggesting no inherent bias against specific regions.

An example of barcode abundance variation through time for all gRNAs (both GNEs and NSGs) targeting GLN4 is shown in Figure 1D. GLN4 is an essential gene coding for a glutamine tRNA synthetase.
To confirm the deleteriousness of the predicted mutations, we transformed a centromeric plasmid bearing a wild-type or mutated copy of the gene under the control of its native promoter (Ho et al. 2009) into a heterozygous deletion background (Giaever et al. 1999). Following dissection, spore survival was compared between the wild-type and mutated copies of GLN4 (Figure S5). Using this approach, we confirmed the strong fitness effect of the best scoring GNE for GLN4, as the most probable mutations generated are in fact lethal (Figure 1D).
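The computational prediction of C-to-G and C-to-T outcomes described above can be illustrated with a minimal Python sketch. It is not the authors' script: the function name `predict_outcomes`, its arguments, and the example codons are hypothetical, and it handles a single edited base in a single codon. The only biology it relies on is the standard genetic code and the fact that a cytosine edited on the non-coding strand appears as a change of the complementary guanine on the coding strand.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code, with codons enumerated in TCAG order ("*" is a stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def predict_outcomes(codon, pos, targets_coding_strand):
    """Predict the residue encoded after a C-to-G or C-to-T edit.

    codon: coding-strand codon (5'->3'); pos: 0-based index of the edited base.
    If the edited C lies on the coding strand, the codon shows C->G or C->T.
    If it lies on the non-coding strand, the codon shows the complementary
    change on the coding strand: G->C (C-to-G) or G->A (C-to-T).
    """
    expected = "C" if targets_coding_strand else "G"
    if codon[pos] != expected:
        raise ValueError(f"expected {expected} at index {pos} of {codon}")
    if targets_coding_strand:
        new_bases = {"C-to-G": "G", "C-to-T": "T"}
    else:
        new_bases = {"C-to-G": "C", "C-to-T": "A"}
    return {
        name: CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
        for name, base in new_bases.items()
    }


# Hypothetical examples (codons chosen for illustration only):
print(predict_outcomes("GGT", 0, targets_coding_strand=False))  # Gly -> Arg / Ser
print(predict_outcomes("TCA", 1, targets_coding_strand=True))   # Ser -> stop / Leu
print(predict_outcomes("TGG", 1, targets_coding_strand=False))  # Trp -> Ser / stop
```

A full implementation would also need to map the activity window of each gRNA onto the reading frame and enumerate secondary editing sites, which this sketch leaves out.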

Comparison of GNE induced mutations with variant effect predictions
If GNEs indeed induce specific deleterious mutations, these mutations should be predicted to be more deleterious than those of Non-Significant gRNAs (NSG). We tested two recently published resources for variant effect prediction: Envision (Gray et al. 2018) and Mutfunc (Wagih et al. 2018). Envision is based on a machine learning approach that leverages large-scale saturated mutagenesis data of multiple proteins to perform quantitative predictions of missense mutation effects on protein function. The lower the Envision score, the higher the effect on protein function. Mutfunc aggregates multiple types of information such as residue conservation through the use of SIFT (Ng and Henikoff 2003) as well as structural constraints to provide a binary prediction of variant effect based on multiple quantitative and qualitative values. Mutations with a low SIFT score have a lower chance of being tolerated, while those with a positive ∆∆G are predicted to destabilize protein structure or interactions. Both Envision and the Mutfunc aggregated SIFT data cover the majority of the most probable mutations generated by the gRNA library (Figure S6A). The structural modeling information had much lower coverage, covering at best around 12% of the most probable mutations (Figure S6B [...]

As expected, mutations generated by GNEs showed significantly lower SIFT scores (Figure 2A) and showed enrichment for strong effects predicted by SIFT and Envision. Indeed, all four most probable substitutions created by GNEs are about twice as likely to be predicted to have a large deleterious effect by Envision or a very low chance of being tolerated as predicted by SIFT compared to those of NSGs. The high homogeneity of Envision scores across the proteome makes them harder to interpret.
As such, the shift in score values is more subtle but supports that GNE mutations are generally more likely to be deleterious as well (Figure S6C, Figure S7A). Mutations with destabilizing effects as predicted by structural data also appeared to be enriched for the most probable mutations, but low residue coverage limits the strength of this association.

Sensitive sites provide new biological insights
Because our screen specifically targeted essential genes, many gRNAs cause mutations in highly conserved regions with high functional importance. To illustrate this, we focus on the highest scoring GNE targeting GLN4, a tRNA synthetase, shown in Figure 1D. The gRNA 33725 mutates a glycine at position 267 into either an arginine or a serine. Glycine 267 is part of the "HIGH" motif, which is characteristic of class I tRNA synthetases, involved in ATP binding and catalysis, and highly conserved through evolution (Eriani et al. 1990). As expected, the region around the "HIGH" motif shows both a low evolutionary rate based on inter-species comparisons and a much lower variant density in yeast populations compared to other domains of Gln4 (Figure S3B), showing conservation on both short and long timescales. Surprisingly, mutagenesis experiments in the bacterial homolog MetRS concluded that mutating this residue from glycine to alanine did not significantly alter catalysis, while mutating it to proline had a strong disruptive effect (Schmitt et al. 1995). We found that mutating Gly 267 to either Arg or Ser was enough to cause protein loss of function (Figure 1D). Other sensitive sites identified in GLN4 by our screen are also clustered in regions with slow evolutionary rates. Interestingly, one of these mutations affects residue R568, which has been hypothesized to play a conserved role from bacteria to yeast in the anticodon and glutamine recognition process (Grant et al. 2013).

Since Target-AID can only generate a limited range of amino acid substitutions from a specific coding sequence, we investigated whether any of these mutational patterns were enriched in GNEs (Figure 3A, source data in Supplementary tables S2, S3, and S4). We found several deviations from random expectations in both C-to-G and C-to-T mutation ratios as well as in mutation combination ratios.
Three out of four of the mutation pair patterns involving glycine were enriched in GNEs. For example, the Glycine to Arginine or Serine substitution pattern (as exemplified by guide 33725 targeting GLN4) is the second most enriched pattern, being almost four-fold overrepresented in GNE outcomes. This pattern is consistent with the fact that Arginine has properties highly dissimilar to those of Glycine (Sneath 1966), making these substitutions highly deleterious. Furthermore, as Glycine residues are often important components of cofactor binding motifs (e.g., phosphates) (Copley and Barton 1994), this observation might reflect a tendency for GNEs to alter these sites. Interestingly, genes for which more than one GNE was detected were enriched for molecular function terms linked to cofactor binding (Supplementary table 5). This suggests that the GNEs might indeed have a tendency to affect protein function through mechanisms other than protein or interaction interface destabilization. These protein properties depend on many residues, making them more robust to single amino acid substitutions, whereas cofactor binding may depend specifically on a handful of residues, making these sites critical for function.

As expected, there is a strong enrichment for patterns that result in mutation to stop codons: both C-to-G patterns (Tyrosine to stop and Serine to stop) but only one C-to-T pattern (Tryptophan to stop) were significantly overrepresented. Substitutions to a stop codon in one outcome also drove enrichment in the other: for example, the enrichment of Serine to stop (C-to-G) appears to be the cause of the Serine to Leucine (C-to-T) overrepresentation. Both mutation pairs involving mutating a Tryptophan to a stop via a C-to-T mutation were enriched: this is not surprising, as the alternative mutations Tryptophan to Serine or Cysteine are also highly disruptive (Sneath 1966). Changes between similar amino acids, which are expected to be tolerable, were also generally depleted in GNE (ex.: [...]

The precise targeting of our method also allows us to investigate amino acid residues with known functional annotations such as post-translational modifications. We found no significant enrichment for gRNAs mutating directly annotated PTMs (ratio GNE PTM = 19/1118, ratio NSG PTM = 243/15536, Fisher's exact test p=0.71). This is consistent with the hypothesis that many PTM sites may have little functional importance (Landry et al. 2009) and thus their mutations may largely have no detectable effects.
The same was also observed for gRNAs mutating residues near known PTMs that could disturb recognition sites (ratio GNE nearPTM = 130/1118, ratio NSG nearPTM = 1698/15536, Fisher's exact test p=0.43). However, GNEs that do target annotated PTM sites might provide additional evidence supporting the importance of these sites in particular. For example, the best scoring GNE in the well-studied transcriptional regulator RAP1 is predicted to mutate residue T486. This threonine has been reported as phosphorylated in two previous studies (Albuquerque et al. 2008; Holt et al. 2009), but the functional importance of this phosphorylation has not been explored yet. Residue T486 is located in a disordered region in the DNA binding domains (Konig et al. 1996), which is part of the only RAP1 fragment essential for cell growth (Graham et al. 1999; Wu et al. 2018).

Because the available wild-type RAP1 plasmid (see methods) does not complement the gene deletion growth phenotype, we used a different strategy for confirmation that relied on CRISPR-mediated knock-in (see methods and Figure S8). While we could not confirm that the two most likely mutations predicted to be caused by the GNE had a detectable fitness effect in these conditions, we found that phosphomimetic mutations at this position were lethal (Figure 3C and D) but most other amino acids were well tolerated. This suggests that the constitutive phosphorylation of this residue would be highly deleterious. We could also confirm deleterious effects for GNE induced mutations targeting residues R523 and A540, while mutations at residue A510 had no detectable effect on fitness (Figure 3C and D). As we only tested progeny survival on rich media and at a permissive temperature, whereas the screen was performed in synthetic media at 30°C, these mutants might still affect cell phenotype but in an environment-dependent manner.
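The PTM enrichment comparisons above rely on Fisher's exact test on a 2x2 table. A minimal, standard-library sketch of a two-sided version is given below as an illustration. It assumes that "19/1118" means 19 of the 1,118 GNEs target an annotated PTM (and likewise 243 of 15,536 NSGs), and it uses the common two-sided definition that sums the probabilities of all tables no more likely than the observed one. The function name is hypothetical, and the test software actually used by the authors is not stated in this section.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probability of every table with the same margins
    whose probability does not exceed that of the observed table. Integer
    arithmetic keeps the comparison between tables exact.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalized probability of the table whose top-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        w for w in (weight(x) for x in range(lowest, highest + 1))
        if w <= observed
    )
    return extreme / comb(row1 + row2, col1)


# Assumed layout: rows are GNE and NSG, columns are PTM and non-PTM targets.
p_ptm = fisher_exact_two_sided(19, 1118 - 19, 243, 15536 - 243)
print(f"direct PTM sites: p = {p_ptm:.2f}")
```

If the counts in the text are laid out differently (for example, if the denominators exclude the PTM-targeting gRNAs), the table passed to the function has to be adjusted accordingly.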

Data from the original Target-AID study (Nishida et al. 2016) suggest that the most prevalent outcome for an edited site is a C-to-G transversion. Our data support this observation, as gRNAs which would lead to a C-to-G mutation at the highest activity site of the editing window have the highest GNE detection rate (Figure 4B). It was also suggested that Target-AID could edit multiple nucleotides within the activity window during mutagenesis. Our data support this observation, as gRNAs for which two outcomes have the potential to generate a stop codon are markedly more efficient than those with only one stop codon outcome (Figure 4C). This finding also extends to gRNAs that do not generate stop codons (Figure S9A).

We observed that the targeted strand relative to transcription greatly influenced editing efficiency (Figure 4D). This strand effect can be explained by multiple factors. First, there are multiple outcomes leading to mutation to a stop codon starting from a TGG codon (shown in Figure 4A). This codon is the only one that can be targeted on the non-coding strand to generate a stop codon. Second, repair efficiency has been shown to be higher for the transcribed strand in yeast (Reis et al. 2012). Finally, as the non-coding strand is the one which is transcribed, a deamination event there might lead to consequences at the protein level more rapidly because it does not need DNA replication to be present on both strands. gRNAs that do not generate stop codons also have a higher chance of having a fitness effect if they target the non-coding strand (Figure S9B), but we did not observe any effects of the chromosomal strand on efficiency (Figure S9C).

One other parameter with a high impact on mutagenesis rate is the predicted melting temperature of the RNA-DNA duplex formed by the gRNA sequence and its target DNA sequence (Figure 4E).
The distribution of the melting temperature shows a clear shift between stop codon generating gRNAs that have an effect on fitness and those that do not. gRNAs with low values have a lower chance of being detected as having effects, while gRNAs with higher values are enriched for GNE (Figure 4F). This observation also extends to gRNAs that do not generate stop codons (Figure S9D, E). This enrichment cannot be attributed to technical biases in library preparation or high-throughput sequencing that would tend to lower their abundance, as melting temperature shows practically no correlation with read count at every time point (Figure S10). Furthermore, this effect is not caused by target position bias within target genes or a strong correlation between GC content and the targeted position (Figure S11). As binding energy can differ drastically even within groups of gRNAs with similar GC content (Figure S9F), this could provide a useful criterion to help select efficient gRNAs.

We tested whether the Target-AID base editor is amenable to genome-wide mutagenesis. Using the yeast essential genes as test cases, we identified hundreds of gRNAs targeting residues with significant effects on cellular fitness when mutated. The precision and traceability of Target-AID genome editing allowed us to predict the mutational outcomes of GNEs and to confirm their effects using orthogonal approaches. We used these data to investigate which factors influence base editing efficiency and found multiple gRNA and target properties that affect mutagenesis and [...]

[...] 2013) and applied several selection criteria. Since the screen was to be performed in the BY4741 strain, all gRNAs (unique seed sequence, no NAG site) within the database were aligned to the reference genome of that strain using Bowtie (Langmead et al. 2009). Only gRNAs with a single perfect alignment were kept for subsequent steps. To select gRNAs amenable to Target-AID base editing, we selected gRNAs with cytosines within the highest activity window of the editor (positions -17 to -19 starting from the PAM). To limit the total number of possible mutational outcomes, gRNAs with three cytosines within the window were removed, as well as those with two cytosines at the highest activity positions. Next, we filtered out any gRNA containing a BsaI restriction site to prevent errors during the library cloning step.

The list of essential genes (n=1156; Giaever et al. 2002) was used to discriminate between gRNAs targeting essential or non-essential genes (retrieved from http://www-sequence.stanford.edu/group/yeast_deletion_project/Essential_ORFs.txt).
Among non-essential genes, data from Qian et al. (2012) were used to create categories of fitness effects. If the fitness score (averaged across media and replicates) of a gene was below 0.75, it was categorized as "high effect" on fitness. We excluded auxotrophic marker genes as well as CAN1, LYP1, and FCY1 because those could be used as co-selection markers (Després et al. 2018). Gene deletions with an averaged fitness score between 0.999 and 1.001 were categorized as having "no detectable effect" on fitness. We selected gRNAs targeting essential and high effect genes, as well as gRNAs targeting a set of 38 randomly chosen no effect genes. To further limit the space of gRNAs examined, only gRNAs mapping from the 0.5th to the 75th percentile of coding sequences were chosen. We also added gRNAs targeting all known yeast [...] Figure S1.
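The sequence-level selection criteria listed in this section can be summarized as a single filter. The sketch below is an illustration under stated assumptions rather than the pipeline that was actually run: the function name and arguments are hypothetical, the spacer is assumed to be 20 nt long and written 5' to 3' so that position -1 is the PAM-proximal base, and the "highest activity positions" are assumed here to be -18 and -17, which the text does not specify. Alignment uniqueness (Bowtie) and gene categories are outside its scope.

```python
# BsaI recognition site and its reverse complement.
BSAI_SITES = ("GGTCTC", "GAGACC")


def passes_filters(spacer, relative_position,
                   window=(-19, -18, -17), top_positions=(-18, -17)):
    """Return True if a 20-nt spacer passes the sequence-level filters.

    spacer: gRNA spacer, 5'->3'; position -1 is the base next to the PAM.
    relative_position: target position as a fraction of the coding sequence.
    top_positions: ASSUMED highest activity positions (not given in the text).
    """
    spacer = spacer.upper()
    if len(spacer) != 20:
        raise ValueError("expected a 20-nt spacer")

    def base_at(position):
        # Position -20 is index 0 and position -1 is index 19.
        return spacer[len(spacer) + position]

    cytosines = [p for p in window if base_at(p) == "C"]
    if not cytosines:
        return False  # nothing editable in the activity window
    if len(cytosines) >= 3:
        return False  # too many possible outcomes
    if all(base_at(p) == "C" for p in top_positions):
        return False  # two cytosines at the highest activity positions
    if any(site in spacer for site in BSAI_SITES):
        return False  # would interfere with Golden Gate cloning
    return 0.005 <= relative_position <= 0.75


print(passes_filters("ATCA" + "A" * 16, 0.30))  # single C at -18: kept
print(passes_filters("ACCC" + "A" * 16, 0.30))  # three Cs in window: removed
```

Keeping the window and the top positions as parameters makes it easy to test other assumptions about the activity profile of the editor.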

Library construction
The plasmids, oligonucleotides, and media used in this study are presented as supplementary tables S6, S7 and S8 respectively. The oligo pool was synthesized by Arbor Biosciences (Michigan, USA) and was cloned into the pDYSCKO vector using Golden Gate Assembly ([...] Cohen 1980) using a standard chemical transformation protocol and plated on ampicillin selective media to select for transformants. Serial dilutions of cells after outgrowth were plated and then used to calculate the total number of clones produced by the cloning reaction. Quality control of the assembly was performed by Sanger sequencing ~10 clones per assembly reaction. Cells were scraped from plates by adding ~5 ml of sterile water, incubating a few minutes at room temperature, and then using a glass rake to resuspend colonies. The resuspended cells were then pooled together in a single flask per reaction, which was then used to make glycerol stocks of the library and cell pellets for plasmid extraction. The Qiagen Midi-Prep kit (Qiagen, Germany) was used to extract plasmid DNA from cell pellets by following the manufacturer's instructions. The DNA concentration of each eluate was then measured using a NanoDrop (Thermofisher, Massachusetts, USA), and a normalized master library for yeast transformation was assembled by combining equal quantities of each assembly pool.

Target-AID mutagenesis and competition screening
The mutagenesis protocol is an upscaled version of our previously published method and is shown in Figure S2. Transformants were scraped by spreading 5 ml sterile water on plates and then resuspending cells using a glass rake. All plates were pooled together in the same flask, and the OD of the yeast resuspension was measured using a Tecan Infinite F200 plate reader (Tecan, Switzerland). Pellets corresponding to about 6 x 10^8 cells were washed twice with SC-UL without a carbon source and then used to inoculate two 100 ml SC-UL + 2% glucose cultures at 0.6 OD to generate replicates A and B. Cells were allowed to grow for 8 hours before 1 x 10^9 cells were pelleted and used to inoculate a 100 ml SC-UL + 5% glycerol culture. After 24 hours, 5 x 10^8 cells were pelleted and either put in SC-UL + 5% galactose for mutagenesis or SC-UL + 5% glucose for a mock induction control. Target-AID expression (from pKN1252) was induced for 12 hours before 1 x 10^8 cells were pelleted and used to inoculate a canavanine (50 μg/ml) co-selection culture in SC-ULR. After 16 hours of incubation, 5 x 10^7 cells of each culture were used to inoculate 100 ml SC-UR, which was grown for 12 hours before 5 x 10^7 cells were used to inoculate a final 100 ml SC-UR culture, which was grown for another 12 hours. Cell pellets were washed with sterile water between each step, and all incubations occurred at 30°C with agitation.

Large-scale screen sequencing data analysis
The custom Python scripts used to analyze the data will be made available on GitHub (https://docker.pkg.github.com/Landrylab); packages and software used are presented in Supplementary table 9. Raw sequencing files have been deposited on the NCBI SRA, accession number PRJNA552472. Briefly, reads were separated into three subsequences for alignment: the P5 barcode, the gRNA, and the P7 barcode. Each of these was aligned using Bowtie (Langmead et al. 2009) to an artificial reference genome containing either the barcodes or gRNA sequences flanked by the common amplicon sequences. The gRNA sequences were aligned with either 0 or 1 mismatch allowed, and misalignment position and type were stored. Information on barcode and gRNA alignment for each read was stored and combined to generate a barcode count per library table, a list of mismatches in alignments for each gRNA in each library, as well as mismatch types and counts for the same gRNA across all libraries.

Synthesis error within oligonucleotide libraries is one of the major limits of current large-scale genome editing screening methods. These errors can introduce gRNA sequences that cannot perform mutagenesis because the gRNA sequence does not match a site in the genome. We refer to those gRNAs as SE gRNAs. In our experiment, the stringent selection criteria used to select gRNAs limited the risk of off-target effects even for gRNAs with one mismatch, minimizing the risk that a synthesis error gRNA could lead to editing at another site in the genome. We therefore decided to use highly abundant SE gRNAs as negative controls to obtain a null distribution of abundance variation for gRNAs with no fitness effects.
To differentiate synthesis errors from sequencing errors, we used the mismatch type and count table to assess whether a particular mismatched gRNA constitutes too large a fraction of the reads associated with a gRNA to be simply a repeated sequencing error. For each error, we tested if:

$$\frac{\text{reads with the mismatch}}{\text{reads associated with the gRNA}} > 0.075$$

and, if so, discarded the reads associated with the specific mismatch alignment. This threshold was obtained by iteratively testing different threshold values in an effort to maximize the gain in gRNA counts while minimizing the noise added by incorrect assignments. Read counts per library for abundant (read count > 1,000) SE gRNAs were kept to serve as negative controls when measuring fitness effects, resulting in a set of 1,032 abundant SE gRNAs. gRNAs absent from more than half of the libraries (4,446 out of 39,989) were removed from the analysis before gRNA abundance calculations.
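The separation of synthesis errors from sequencing errors can be sketched as follows. This is only one reading of the procedure: the denominator of the fraction (all reads assigned to the gRNA, perfect and single-mismatch alignments together), the data layout, and all names are assumptions made for illustration. Only the thresholds of 0.075 and 1,000 reads come from the text.

```python
def classify_mismatches(mismatch_counts, total_reads, max_fraction=0.075):
    """Split single-mismatch alignments of one gRNA into two groups.

    mismatch_counts: hypothetical dict {mismatch label: read count}.
    total_reads: ASSUMED to be all reads assigned to this gRNA.
    A mismatch carrying more than max_fraction of the reads is treated as a
    synthesis error (SE); the rest are treated as sequencing errors.
    """
    synthesis, sequencing = {}, {}
    for label, count in mismatch_counts.items():
        if count / total_reads > max_fraction:
            synthesis[label] = count
        else:
            sequencing[label] = count
    return synthesis, sequencing


def corrected_count(total_reads, synthesis):
    """Read count of the gRNA after discarding synthesis-error reads."""
    return total_reads - sum(synthesis.values())


def abundant_se_controls(se_read_counts, min_reads=1000):
    """Keep SE gRNAs with more than min_reads reads as negative controls."""
    return {name: n for name, n in se_read_counts.items() if n > min_reads}


# Hypothetical counts for one gRNA with two observed mismatch types.
synthesis, sequencing = classify_mismatches({"A5G": 100, "C12T": 5}, 1000)
print(synthesis, sequencing, corrected_count(1000, synthesis))
```

Reads from mismatches classified as sequencing errors stay with the intended gRNA, which is how allowing one mismatch increases usable read counts without adding many incorrect assignments.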

Detecting mutations with high fitness effects
Barcode sequencing competition experiments use DNA barcodes to measure the relative abundance of many different subpopulations of cells grown in the same pool (Robinson et al. 2014). Since each gRNA is linked to its possible mutagenesis outcomes, we can use relative gRNA abundance to detect mutations with significant fitness effects. To do so, the log2 of the relative abundance of a barcode after mutagenesis is compared with its abundance at the end of the screen, yielding a ∆log2 value for each gRNA.

For each gRNA, the measured fitness effect is the product of the effect of the mutational outcomes on growth and of the mutation rate within the cell subpopulation bearing this particular gRNA. Relative counts will also vary stochastically because of variation in sequencing coverage depending on the time point and replicate. To reduce the impact of these effects, a minimal read count at the end of the galactose induction step was used to filter out low abundance gRNAs. We found that a minimal read threshold of n=54 provided a good tradeoff between the number of gRNAs eligible for analysis and inter-replicate correlation.

Using the distribution of ∆log2 values, we calculated a z-score for each gRNA in both replicates. We then averaged z-scores between replicates and compared the score distributions between SE and non-SE gRNAs. This revealed the presence of a left-skewed tail in the z-score distribution of valid gRNAs, which is absent in the SE gRNAs. Because the number of SE gRNAs is smaller than that of functional gRNAs by almost two orders of magnitude, an empirical type I error (false positive) threshold based solely on a weighted SE z-score distribution was not practical. To resolve this, we fitted a left-skewed Gumbel distribution to the SE gRNA z-score distribution and used it to approximate the type I error rate as a function of the z-score.
We set a significance threshold such that all gRNAs at z-scores for which the estimated false positive rate is below or equal to 5% are considered GNEs.
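The scoring steps described in this section can be sketched with the Python standard library. Several points are assumptions rather than the authors' implementation: ∆log2 is taken as log2 of the final relative abundance minus log2 of the relative abundance after mutagenesis (so that depleted gRNAs score negative), the left-skewed Gumbel distribution is fitted by the method of moments instead of whichever fitting routine was actually used, and the type I error rate at a given z-score is approximated by the fitted null cumulative distribution. All function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def delta_log2(freq_after_mutagenesis, freq_end):
    """Assumed definition: negative values mean depletion during the screen."""
    return math.log2(freq_end) - math.log2(freq_after_mutagenesis)


def z_scores(values):
    """Standardize the delta log2 values of one replicate."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def fit_gumbel_left(sample):
    """Method-of-moments fit of a left-skewed (minimum) Gumbel distribution.

    For this distribution, mean = loc - gamma * scale and
    variance = (pi * scale) ** 2 / 6.
    """
    scale = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    loc = statistics.fmean(sample) + EULER_GAMMA * scale
    return loc, scale


def null_cdf(z, loc, scale):
    """Probability that a null (SE) gRNA scores at or below z."""
    return 1.0 - math.exp(-math.exp((z - loc) / scale))


def z_threshold(rate, loc, scale):
    """z-score at which the fitted null CDF equals `rate`."""
    return loc + scale * math.log(-math.log(1.0 - rate))


# Hypothetical replicate-averaged z-scores of SE gRNAs used as the null.
se_z = [-1.9, -1.1, -0.6, -0.3, 0.0, 0.2, 0.4, 0.7, 0.9, 1.2]
loc, scale = fit_gumbel_left(se_z)
cutoff = z_threshold(0.05, loc, scale)
print(f"gRNAs with an averaged z-score <= {cutoff:.2f} would be called GNEs")
```

A maximum-likelihood fit, for example with `scipy.stats.gumbel_l.fit`, would be a natural alternative to the moment estimates used here.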

Complementation assays
Experiments were performed in heterozygous deletion mutants from the YKO project heterozygous deletion strain set (Dharmacon, Colorado, USA). For each gene, a single colony streaked from the glycerol stock was used to prepare competent cells using the previously described lithium acetate protocol. To generate mutant alleles of the genes of interest, we performed site-directed mutagenesis on the appropriate MoBY collection plasmid (Ho et al. 2009 [...]

[...] (Tarassov et al. 2008). The mDHFR[1,2]-FLAG cassette was amplified using gene-specific primers and previously described reaction parameters (Tarassov et al. 2008). Cells were transformed with the cassette using the previously described transformation protocol and were plated on YPD+Nourseothricin (YPD+Nat in Media table). Positive clones were identified by colony PCR and successful fragment fusion was confirmed by Sanger sequencing (CHUL sequencing platform). We then mated the confirmed clones with strain Y8205 (Matα can1::STE2pr-his5 lyp1::STE3prLEU2 ∆ura3 ∆his3 ∆leu2, kindly gifted by Charlie Boone) by inoculating a 4 ml YPD culture with overnight starter cultures of both strains and letting the culture grow overnight. Cells were then streaked on YPD+Nat and diploid cells were identified by colony PCR using mating type diagnosis primers (Huxley et al. 1990).

To create heterozygous deletion mutants of the target gene, we amplified a modified version of the URA3 cassettes that could then be targeted with the CRISPR-Cas9 system to integrate our mutations of interest using homologous recombination at the target locus. The oligonucleotides we used differ from those commonly used in that they amplify the cassette without the two LoxP sites present at both ends. We found it necessary to remove those sites as one common mutational outcome after introducing a double-stranded break in the URA3 cassette was inter- [...]

[...] biases. Spearman rank correlation between replicate averaged read count and predicted gRNA/DNA duplex melting temperature is shown across timepoints. The minimal read count after galactose induction, which served as a filtering criterion, is shown on the galactose subpanels. gRNAs for which no reads were detected in one of the time points were included when computing the correlation but are not shown on the graphs because of log scaling.
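The Spearman rank correlation used in this legend is the Pearson correlation computed on ranks, with tied values sharing their average rank. A small standard-library sketch follows. The function names and the example values are hypothetical, and the software used for the published figure is not specified here.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical melting temperatures and replicate-averaged read counts.
melting_temp = [48.2, 51.0, 53.7, 55.1, 58.9]
read_count = [210, 0, 340, 95, 180]
print(round(spearman(melting_temp, read_count), 3))
```

Because ranks are used, gRNAs with zero reads at a time point can be included in the calculation even though they cannot be drawn on a log-scaled axis.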