Introduction

In model organisms, genetic screens have long been used to characterize gene functions, to define gene networks, and to identify the mechanism-of-action of drugs1,2,3,4. The genetic relationships identified by such screens have been shown to involve positive and negative feedbacks, backups and cross-talks that would have been extremely difficult to discover using other approaches5. Currently, the large majority of reported screens in model organisms and in mammalian-cell systems have used gene-deletion libraries and/or methodologies to inactivate gene functions, such as short-interfering RNA, CRISPR-Cas9 or transposon-mediated mutagenesis6,7. While powerful, such approaches usually identify loss-of-function phenotypes, and only rarely uncover separation-of-function or gain-of-function mutations. Gene overexpression screens have successfully identified gain-of-function alleles, but these screens often involve non-physiological protein levels. This limitation is significant because such separation- or gain-of-function mutations – which can arise spontaneously or via the action of genotoxic agents – can dramatically affect cell functions or cellular response to chemicals, and can have profound impacts on human health and disease8,9. Suppressor screens, either based on lethal genetic deficiencies and/or the use of drugs, have also facilitated the characterization of functionally relevant protein domains and sites of post-translational protein modification through the identification of relevant single nucleotide DNA variants (SNVs)10.

In their simplest experimental setup, suppressor screens based on point-mutagenesis rely on four tools: (i) a genetically amenable organism or cell; (ii) a selectable phenotype; (iii) a method to create a library of mutants; and (iv) a method to identify mutations driving the suppressor phenotype amongst all the mutations in the library. Reflecting their relative amenability, these screens have mostly been carried out in microorganisms, either bacteria or yeasts, both of which benefit from the ability to survive in a stable haploid state. Despite not being strictly essential for such studies, a haploid state facilitates the identification of loss-of-function or separation-of-function recessive alleles, which would be masked in a heterozygous diploid cell state11. While the first three tools mentioned above are often readily available to a researcher, the lack of fast and efficient methods to bridge the knowledge-gap between phenotype and genotype has discouraged the widespread implementation of suppressor screens based on point-mutagenesis. Indeed, until recently, recessive suppressor alleles could only be identified by labor-intensive methods involving genetic mapping and cloning in yeast, whereas the natural diploid state of mammalian cells largely precluded straightforward SNV suppressor screens in such systems.

Here, we describe an approach to overcome the above limitations that is based on sequencing of genomic DNA extracted from various independent suppressor clones, followed by bioinformatic analysis. With small adaptations, this method can be applied to both the budding yeast Saccharomyces cerevisiae and other haploid model organisms, as well as to haploid mammalian cells (Fig. 1). To highlight the utility of this approach, we describe its application to study resistance to the anti-cancer drugs camptothecin or olaparib, leading to the identification of various mutations in yeast TOP1 and in mouse Parp1, respectively. Importantly, we establish that drug target identification and mechanisms of drug resistance can be unveiled without a priori knowledge of the drug target. Furthermore, if a sufficient number of chemical-genetic suppressors is screened, this method also allows identification of functional protein domains required to drive drug sensitivity and resistance.

Figure 1

Experimental workflow for a suppressor screen. The typical workflow of a suppressor screen using S. cerevisiae (left) or mouse embryonic stem cells (right) is depicted. Details of differences between the two systems are illustrated where appropriate. Variation in mutation numbers for an organism can be due to the choice of background strain, mutagenizing agent and other experimental factors.

Results

Identification of TOP1 mutations conferring camptothecin resistance

To demonstrate the utility of the procedure described above, we sought to identify mutations imparting resistance to camptothecin, a DNA topoisomerase 1 inhibitor12,13,14. To do this, we employed yeast strains carrying mutations inactivating pathways required for camptothecin resistance. These specific mutations (rad50S, sae2-F267A, rtt107∆, tof1∆, sae2∆mre11-H37R-tel1∆) were chosen for their ability to induce camptothecin hypersensitivity15,16,17,18. To maximize the variety of potential mutations driving drug resistance, two of the five strains used were mutagenized with ethyl methane sulfonate (EMS), an alkylating agent that induces SNVs19, before plating them in the presence of camptothecin. In all cases, camptothecin-resistant colonies were readily detectable after 2–3 days of growth at 30 °C.

Genomic DNA sequencing of the resistant clones highlighted TOP1 as the gene carrying the largest number of unique mutations in our dataset, as expected for the drug target. The second most mutated gene, PDR1, which is known to regulate a pathway of multi-drug resistance20, carried 11 unique mutations, 10 of which did not co-occur with mutations in TOP1. By contrast, all the mutations found in the third most mutated gene (GLT1) co-occurred with mutations in either TOP1 or PDR1, suggesting that GLT1 mutations do not drive camptothecin resistance per se (Fig. 2a and data not shown). It is possible that the unexpectedly high frequency of GLT1 mutations could arise from them somehow enhancing cell survival to EMS, as these mutations are only found in EMS-mutagenized samples. Globally, out of the 251 yeast strains sequenced, 191 contained one or more mutations in TOP1 (Fig. 2b, light yellow). Furthermore, by manual inspection, we found that 27 additional strains carried mutations in TOP1 (Fig. 2b, dark yellow); these mutations were not detected automatically because the strains were either not pure clones or carried large (>25 bp) deletions in TOP1 (Fig. 2b and Supplementary Figure S1). To the list of TOP1-mutated camptothecin-resistant strains, we added another 38 strains bearing TOP1 mutations that we had identified in previous, published screens16,21, bringing the total number of TOP1 mutants analyzed to 256 (191 + 27 + 38).
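
The co-occurrence reasoning used above to separate likely drivers (TOP1, PDR1) from likely passengers (GLT1) can be illustrated with a short sketch. This is not the analysis code used in the study: the function name, the input layout and the toy clone data are assumptions made for illustration, and the sketch counts clones rather than unique mutations.

```python
from collections import defaultdict


def cooccurrence_summary(clone_to_genes, driver_genes):
    """For each gene, return (clones mutated, clones mutated without any OTHER driver gene).

    clone_to_genes: dict mapping a clone name to the set of genes mutated in it.
    driver_genes:   genes already suspected of driving resistance.
    """
    drivers = set(driver_genes)
    total = defaultdict(int)
    independent = defaultdict(int)
    for genes in clone_to_genes.values():
        genes = set(genes)
        for gene in genes:
            total[gene] += 1
            # "Independent" means no other suspected driver is mutated in this clone.
            if not genes & (drivers - {gene}):
                independent[gene] += 1
    return {gene: (total[gene], independent[gene]) for gene in total}


# Hypothetical toy data, not taken from the screen.
clones = {
    "c1": {"TOP1"},
    "c2": {"TOP1", "GLT1"},
    "c3": {"PDR1"},
    "c4": {"PDR1", "GLT1"},
    "c5": {"TOP1", "PDR1"},
}
summary = cooccurrence_summary(clones, ["TOP1", "PDR1"])
for gene, (n_mutated, n_independent) in sorted(summary.items()):
    print(gene, n_mutated, n_independent)
```

In this toy example the gene that is never mutated in the absence of a suspected driver has an independent count of zero, which is the pattern described for GLT1.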

Figure 2

Screening for camptothecin resistance highlights TOP1 and its functional domains. (a) Genes found mutated in screens for camptothecin resistance sorted by the number of non-synonymous, independent mutations identified in each gene. (b) Fraction of strains identified as mutated in TOP1 by initial analysis (light yellow) and upon manual inspection of the TOP1 gene (dark yellow); blue represents clones where no TOP1 mutation was identified. (c) Mutation types identified in EMS-mutagenized and non-mutagenized samples. Location of nonsense (d), frameshift (e) and missense (f) mutations in the TOP1 gene with respect to the primary protein sequence. Mutations identified in EMS-treated samples are colored red, while those from non-mutagenized samples are colored green.

Missense, nonsense and frameshift TOP1 mutations were roughly equally represented in the non-mutagenized samples. However, where samples had been mutagenized with EMS the vast majority of mutations were nonsense or missense base substitutions (Fig. 2c). In the few cases in which the same suppressor clone contained missense and nonsense mutations in TOP1, the suppressive effect was attributed to the gained STOP codon.

When the positional distribution of each mutation type was plotted, nonsense and frameshift mutations were shown to be quite evenly distributed along the length of the TOP1 open reading frame (Fig. 2d and e). The prediction is that such mutations either result in null alleles – as the prematurely-terminated messenger RNA (mRNA) would be degraded by nonsense-mediated decay mechanisms22 – or would give rise to an unstable protein or a truncated version that could retain partial activity. Since the Y727 residue is essential for the catalytic activity of Top1, truncation before this residue is predicted to produce a non-functional protein23,24. As might be expected, the distribution of nonsense mutations loosely correlated with them arising from codons in the open reading frame that only required one nucleotide change to change them to a STOP codon (Supplementary Figure S1). Notably, the observed enrichment of frameshifts near the 5′ end of the TOP1 transcript was localized to an 8-nucleotide homopolymeric adenine tract that is presumably particularly susceptible to mutagenesis (Supplementary Figure S1).
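
The notion of codons that require only one nucleotide change to become a STOP codon can be made concrete with a small sketch. It assumes the standard stop codons (TAA, TAG, TGA) and considers every possible single-base substitution; the function names and the toy sequence are illustrative and are not part of the published workflow.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def stop_neighbors(codon):
    """Return the stop codons reachable from `codon` by exactly one base substitution."""
    codon = codon.upper()
    hits = set()
    for i, base in enumerate(codon):
        for alt in BASES:
            if alt == base:
                continue
            mutant = codon[:i] + alt + codon[i + 1:]
            if mutant in STOP_CODONS:
                hits.add(mutant)
    return hits


def nonsense_prone_codons(orf):
    """List (codon_number, codon) for sense codons one substitution away from a stop."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    prone = []
    for start in range(0, usable, 3):
        codon = orf[start:start + 3]
        if codon not in STOP_CODONS and stop_neighbors(codon):
            prone.append((start // 3 + 1, codon))
    return prone


# Hypothetical toy open reading frame: ATG CAA GGG TGG TAA
toy_orf = "ATGCAAGGGTGGTAA"
print(nonsense_prone_codons(toy_orf))
```

Applied to a real open reading frame, the positions returned by such a function could be compared with the observed distribution of nonsense mutations.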

In striking contrast to the situation with nonsense or frameshift mutation, missense mutations were localized to specific regions of the TOP1 protein-coding sequence, overlapping with known functional domains of Top1. Indeed, the vast majority of mutations identified localized within three distinct regions of the larger DNA binding and catalytic domain, while a minority was located in the smaller C-terminal domain, essential for catalysis (Fig. 2f).

Functional consequences of the amino acid residue changes induced by missense mutations were assessed by using PROVEAN and PredictProtein25,26. These tools use chemical properties of amino acid residues and phylogenetic conservation to predict whether or not a particular substitution is likely to be functionally tolerated by the protein analyzed. Both these methods suggested that the vast majority of the TOP1 mutations we identified in camptothecin-resistant strains were likely to produce deleterious effects (PROVEAN score < −2.5; PredictProtein score > 50) (Fig. 3a). Notably, missense mutations located in the C-terminal domain of Top1 affected both conserved and non-conserved residues and were primarily positioned in the vicinity of the catalytic residue Y727, although three substitutions were closer to the C-terminus of the protein (Fig. 3b).
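
As an illustration of how the two thresholds quoted above can be combined, the following sketch labels a substitution according to whether one, both or neither predictor calls it deleterious. The cut-offs are those stated in the text; the function name, the labels and the example scores are hypothetical.

```python
PROVEAN_CUTOFF = -2.5        # scores below this are considered deleterious
PREDICTPROTEIN_CUTOFF = 50   # scores above this are considered deleterious


def classify_substitution(provean_score, predictprotein_score):
    """Combine the two predictor calls into a single descriptive label."""
    calls = (
        provean_score < PROVEAN_CUTOFF,
        predictprotein_score > PREDICTPROTEIN_CUTOFF,
    )
    if all(calls):
        return "deleterious (both predictors)"
    if any(calls):
        return "deleterious (one predictor)"
    return "tolerated (both predictors)"


# Hypothetical scores for three unnamed variants.
examples = {
    "variant_A": (-6.1, 82),
    "variant_B": (-1.0, 64),
    "variant_C": (0.4, 12),
}
for name, (provean, predictprotein) in examples.items():
    print(name, classify_substitution(provean, predictprotein))
```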

Figure 3

Mutations conferring camptothecin resistance affect key functional residues in TOP1. (a) Computational predictions for consequences of TOP1 missense mutations as predicted by PredictProtein (y-axis; a score above 50 is considered deleterious) and PROVEAN (x-axis; a score below −2.5 is considered deleterious). Datapoints are colored by the domain that the missense mutations occur in: red and orange denote mutations falling within the Lip1 and Lip2 domains, respectively, blue represents the catalytic domain before the Linker, green the C-terminal catalytic domain. (b) Multi-species alignment of the Top1 protein catalytic domain, with the tyrosine residue critical for catalysis (Y727) being highlighted in orange and mutations identified in this screen being highlighted in boldface red. (c) Model of the Top1 mode of DNA binding. The protein wraps around double-stranded DNA with its two Lip domains, like a grasping hand. (d) Missense mutations identified in the DNA-binding clamp and active site region with respect to the Lip DNA-binding regions and the active site. The region critical for DNA binding and catalysis is highlighted in yellow, and mutations are represented by blue squares. Below is a depiction of the sequence and loop structure of the Lip1 and Lip2 regions in yellow, with specific mutations indicated in light blue.

Top1 binds to DNA via a clamp-like mechanism in which DNA binding stimulates a conformational change in the protein. Thus, opposable “lip” domains encircle the DNA, stabilizing binding through establishing non-covalent protein-DNA and lip-lip interactions (Fig. 3c)27,28. Approximately two thirds of the missense suppressor mutations identified in the DNA binding domain clustered within the Lip1 and Lip2 regions, highlighting their importance for Top1 function (Fig. 3d; the Lip2 domain also contains an active-site residue, R420). Remaining mutations clustered between amino acid residues 500 and 600, which encompass the end of the DNA binding/catalytic domain and the base of the coiled-coil linker domain. In this region two other active site residues (R517 and H558) are located (Fig. 3d).

Collectively, these results showed that even with no a priori knowledge, our approach for identifying drug-resistant strains and associated mutations would have identified Top1 as the likely target of camptothecin and would have highlighted the critical Top1 domains functionally relevant for Top1 activity and drug hypersensitivity.

Identification of Parp1 mutations conferring olaparib resistance

Based on a similar approach to that described above, we recently identified genes whose mutation in haploid mammalian cells causes resistance to the anti-metabolite drug 6-thioguanine29. To further highlight the wider applicability of our approach in mammalian cell systems, we carried out a screen to identify mutations that allow haploid mouse cells to survive in the presence of the anti-cancer agent olaparib, a potent poly ADP-ribose polymerase (PARP) small-molecule inhibitor30,31. Thus, wild-type, haploid mouse embryonic stem cells (mESCs) were mutagenized by using EMS, and mutant libraries were screened for resistance to olaparib (Fig. 1a). Forty-five olaparib-resistant clones were isolated and subjected to whole-exome sequencing.

Analysis of the ensuing sequence data for putative acquired mutations revealed Parp1 as the most mutated gene in the dataset, with 25 different mutations detected (Fig. 4a, Supplementary Table S3). Globally, 40 out of the 45 clones harbored Parp1 mutations (Fig. 4b, Supplementary Table S3). Further manual examination of the aligned sequencing data from the five remaining clones revealed that four of these also likely carried mutations affecting PARP1 (Supplementary Figure S2). Two of these clones (A7, B7) likely carry the R138C missense mutation identified in another clone (Supplementary Figure S2, Fig. 4c), while two other clones (A9, H10) harbored nonsense mutations at codon 341 (Supplementary Figure S2). Importantly, mutations in the second and third most mutated genes (Ttn and Plch1 with 9 and 5 different mutations, respectively) never occurred in the absence of Parp1 mutations, while Parp1 mutations also occurred in the absence of Ttn or Plch1 mutations. These data thus highlighted how such an analysis would have identified PARP1 as the likely prime driver of olaparib sensitivity without any knowledge about the drug’s mechanism-of-action (see below for further discussion).

Figure 4

A screen for olaparib resistance indicates critical functional residues in PARP1. (a) Genes found mutated in the olaparib-resistant clones sorted by the number of non-synonymous, independent mutations per gene. (b) Fraction of strains identified by analysis workflow to carry a mutation in Parp1. (c) Locations of mutations with respect to the protein sequence. The lower panel shows the distribution of missense mutations, while the upper panel contains all mutations likely to lead to a loss of PARP1 protein: nonsense, frameshift and splice acceptor/donor nucleotide mutations as well as mutations of the start codon. (d) The likelihood of Parp1 missense mutations being deleterious as predicted by PredictProtein (y-axis; a score above 50 is considered deleterious) and PROVEAN (x-axis; a score below −2.5 is considered deleterious). Datapoints are colored by the domain the missense mutations occur in: light blue, dark blue and green are the three zinc fingers, respectively, purple denotes the BRCT domain, while red indicates the WGR domain.

Of the Parp1 mutations we detected, more than half introduced premature termination codons, altered splice acceptor/donor sites, or caused frameshifts, which would presumably lead to the production of aberrant mRNAs subject to nonsense-mediated decay and/or the generation of unstable, truncated PARP1 protein. As we previously noted for premature-termination mutations in yeast TOP1, these mutations did not cluster in any particular domain(s) of the Parp1 open reading frame (Fig. 4c). Furthermore, similar to what we observed in yeast, EMS treatment resulted in an overrepresentation of single nucleotide variants, compared to frameshift mutations (Fig. 4c).

Strikingly, missense mutations detected in Parp1 were more frequent in the N-terminal part of the protein, where the DNA-binding domains reside, while no such mutations were observed in the catalytic domain (Fig. 4c). Analysis of PARP1 protein levels indicated that, other than G400R and A610V, which resulted in complete loss or marked reduction of the PARP1 protein product, none of the identified missense mutations affected PARP1 protein stability (Fig. 5a, Supplementary Figure S3). Computational predictions for the likely consequences of these remaining missense mutations on protein structure and function suggested that all of them were functionally deleterious (Fig. 4d). Since all these missense mutations localized within domains known to be involved in DNA binding, we examined their locations relative to the DNA-protein interface as defined by previously published PARP1 structures32,33 (Fig. 5b). Notably, most of the missense mutations affecting residues in the DNA binding domains clustered at the DNA-protein interface, and did so in proximity to residues that make key DNA contacts33.

Figure 5

Missense mutations interfere with PARP1 DNA binding. (a) Olaparib resistant clones were assayed for the presence of PARP1 protein. A representative selection is shown, along with the Parp1 mutation carried by each clone: unknown indicates that no mutation could be automatically detected. Upon further, manual inspection, the mutations identified in brackets were discovered. More clones can be seen in Supplementary Figure 3a. (b) Location of missense mutations that do not lead to a loss of PARP1 protein are indicated on partial PARP1 structures of the ZN2 domain by itself (PDB code 3ODC) and the PARP1 protein except ZN2 and the BRCT domain (PDB code 4DQY). (c) Clones carrying mutations that do not lead to loss of/reduction in PARP1 protein were assayed for the protein’s ability to bind a double-stranded DNA oligonucleotide. One experiment each was quantified. All blots can be viewed in Supplementary Figure 3b. (d) Cells with either wild type or mutant PARP1 protein were assayed for their ability to PARylate. Cells were either left untreated or treated with hydrogen peroxide (H2O2). Wild type cells were also treated with olaparib overnight before H2O2 treatment. (e) Summary of olaparib resistant clone analysis. The inner circle shows the mutations that have been identified in every clone (dark yellow signifies mutations identified by manual inspection of the data). The outer circle shows the respective effect on PARP1 protein accounting for the mechanism of olaparib resistance.

Without any a priori knowledge about how olaparib causes cell toxicity, the above data would have suggested that such toxicity is largely driven by a mechanism connected to PARP1 DNA binding. To test this idea, we assessed the missense mutations identified in the DNA binding domains of PARP1 for their potential effects on the ability of PARP1 to bind a double-stranded DNA oligonucleotide. Significantly, this analysis revealed that all the point mutants that did not reduce PARP1 levels showed reduced levels of DNA binding when compared to the wild-type PARP1 protein (Fig. 5c, Supplementary Figure S3). Consistent with PARP1 DNA binding triggering its auto-modification by poly ADP-ribose, we found that the PARP1 S568F mutation, which impairs DNA binding, did not exhibit evidence of PARylation when cells were treated with hydrogen peroxide (Fig. 5d). These findings were therefore in accord with the fact that toxicities of PARP inhibitors such as olaparib are linked to their ability to trap PARP1 on DNA by blocking its catalytic activity30.

The last clone without an assigned mutation driving resistance to olaparib (C1) may also carry one or more mutations in the non-exonic regions of the Parp1 gene, or epigenetic modifications altering Parp1 expression, since we could not detect the presence of PARP1 in protein extracts (Fig. 5a). Taken together, these results are consistent with a model in which olaparib resistance can arise either from loss of PARP1 or from its decreased ability to bind DNA (Fig. 5e).

Discussion

Various approaches have been described for systematic identification of genetic and chemo-genetic interactions. Until recently, this search has been largely conducted using approaches based on gene inactivation, either in arrayed or pooled assay formats. While these approaches have played crucial roles in determining gene-gene and gene-drug interactions, their limited power of resolution does not in general provide information regarding the functional protein domains relevant for the identified interaction. While transposon-based mutagenesis and dense CRISPR gRNA approaches have recently been shown to provide some information at the domain level, these approaches are only applicable to loss-of-function mutations, and are biased towards C-terminal domains of proteins34. In contrast, SNV-based approaches can provide a higher level of resolution, and in many cases produce unanticipated results10,35,36,37. Lack of rapid and facile procedures to bridge the phenotype-to-genotype gap has until recently, however, precluded the use of these techniques on a high-throughput scale.

The approach we have described allows the identification of SNVs driving drug resistance or resistance to essentially any selective growth condition in a systematic and unbiased way (other than any bias imposed by the mutagenic agent of choice). Importantly, this approach can equally be applied to yeast and to more complex eukaryotes, bringing the power of high-resolution haploid genetic screens to mammalian systems. It should however be noted that certain limitations of haploid genetics, such as the inability to formally distinguish between recessive and dominant alleles, will also apply to mammalian systems. While we acknowledge that we have carried out our screens with strong selectable cell-viability phenotypes, we envisage applicability in more complex scenarios, for example involving FACS-based selection or cell migration, motility or attachment assays. Highlighting this potential, our results show how, with no previous knowledge, Top1 and PARP1 would have been identified as the most likely targets for the drugs camptothecin and olaparib, respectively.

Toxicity to PARP inhibitors was initially linked to the involvement of PARP1 in the repair of single-strand DNA breaks38,39, but more recent data challenged this view40. The fact that loss of PARP1 drives resistance to PARP inhibitors in wild-type genetic backgrounds41 indeed suggests that inhibition of PARP1 catalytic activity — and not the accumulation of unrepaired DNA lesions — is the major effector of toxicity in such genetic backgrounds. Indeed, recent findings suggest that PARP1 trapping onto DNA, caused by inhibition of its catalytic activity, is the main cause of toxicity31. Our data further support this model, as all the suppressor clone variants which we identified that did not result in loss of PARP1 protein negatively affected its binding to DNA. Preventing PARP1 binding to DNA thus appears to be sufficient to circumvent the toxicity of PARP inhibitors, at least in homologous-recombination proficient cells; and the fact that we identified no mutants specifically defective in catalytic function reinforces the idea of trapped PARP1 as the main cytotoxic lesion for olaparib in wild-type mammalian cells. Further work in a diploid setting will be required to demonstrate whether the DNA-binding deficient alleles isolated are recessive, and that the PARP1-olaparib complex indeed acts as a dominant negative entity. This may not only be important for our understanding of how PARP inhibitors function but also for understanding mechanisms of intrinsic or evolved tumor resistance towards such clinical agents. In particular, it will be interesting to determine whether olaparib resistance in patients with BRCA1/2 mutant cancers can be driven by loss of PARP binding to DNA.

As exemplified by our analyses of Top1 and PARP1, the level of detail on critical functional domains and residues increases with the number of samples sequenced. Because of their genome size, screens based in mammalian systems require greater sequencing power than screens conducted in simpler organisms such as yeast. Moreover, as compared to yeasts, the more complex genome architecture in mammalian systems – where there is more intergenic DNA, a larger number of genes and an abundance of intronic sequences – increases the chances of isolating variants affecting protein levels, rather than protein function. One solution to bypass such issues will be to run two-tiered screens, initially using whole exome sequencing on a subset of suppressors in order to identify top gene hits driving resistance, and then using targeted exome sequencing to test the rest of the samples, either through analysis of various individual clones or bulk sequencing of the resistant population. In addition, we can envision alternative scenarios where a gene identified in an initial screen could be marked/tagged in a way to allow selection of mutations that affect protein function but not protein levels. This approach can also be combined with CRISPR-Cas9-mediated in vivo targeted mutagenesis, via a library of gRNAs directed towards the exonic regions of the gene (Pettitt et al., BioRxiv https://doi.org/10.1101/203224). We anticipate that such developments, along with expected further increases in sequencing throughput and associated cost reductions, will pave the way for hitherto unprecedented genetic analyses on comprehensive and systematic scales.

Methods

Yeast suppressors of camptothecin sensitivity

S. cerevisiae strains used were derived from W303. All gene deletions were introduced by using one-step gene disruption, and were confirmed by PCR and whole-genome sequencing. Full genotypes of strains are described in Supplementary Table 1. Standard growth conditions (1% yeast extract, 2% peptone, 2% glucose, 40 mg/l adenine) were used. Strains YFP1001 and YFP1073 were mutagenized by adding 4.5% ethyl methane sulfonate (EMS) to liquid cultures in logarithmic growth-phase, pelleted by centrifugation and then resuspended in 50 mM K-phosphate buffer for 10 minutes, followed by EMS inactivation with 1 volume of 10% sodium thiosulfate. Suppressors were obtained by plating each strain on 10 YPD plates supplemented with 5 µg/ml of camptothecin (approximately 10⁷ cells per plate). Resistant colonies were picked after 2–3 days of growth at 30 °C and isolated by streaking on YPD plates. Suppression was confirmed by retesting camptothecin sensitivity of the isolated strains. Confirmed suppressors were processed for DNA extraction shortly thereafter, in parallel with 2–3 colonies of the initial strain (Fig. 1a).

Mouse embryonic stem cell suppressors of olaparib sensitivity

Haploid mouse AN3-12 embryonic stem cells (mESCs)42,43 were used for all the experiments and were free from mycoplasma. Cells were grown in DMEM high glucose (Sigma) supplemented with glutamine, fetal bovine serum, streptomycin, penicillin, non-essential amino acids, sodium pyruvate, 2-mercaptoethanol and Leukemia inhibitory factor (LIF). All plates and flasks were gelatinized before cell seeding.

Cell sorting for DNA content was performed on mESCs by using a MoFlo flow sorter (Beckman Coulter) after staining with 15 μg/ml Hoechst 33342 (Invitrogen). The 1n peak was purified to enrich for haploid mESCs.

Mutagenesis with EMS was performed as described previously44 with the following adjustments: after cell sorting, haploid-enriched cells were grown in DMEM plus LIF for overnight EMS treatment. After EMS treatment, cells were cultured for five passages in DMEM plus LIF and plated into 6-well plates at a density of 5 × 10⁵ cells per well. Cells were then treated with 6 μM of olaparib (AZD2281; Stratech Scientific Ltd.) for 6 days, supplying new medium with olaparib daily. Cells were then grown for another four days without olaparib until mESC colonies could be isolated.

Genomic DNA isolation

S. cerevisiae DNA isolation

Resistant colonies were inoculated in 1.8 ml of YPAD in 96-deep-well plates and grown for 48 hours. Pelleted cells were re-suspended in 500 μl of spheroplasting solution (1 M sorbitol, 0.1 M EDTA, 14 mM 2-mercaptoethanol, 1 mg/ml RNAse A, containing 5 mg/ml zymolyase) and incubated for 2 hours at 37 °C. Spheroplasts were subsequently re-suspended in 200 μl of lysis buffer (80% ATL buffer [QIAGEN #19076], 10% Proteinase K [QIAGEN #19133] and 10% RNAse A [10 mg/ml]) and incubated overnight (>16 h) at 56 °C. Genomic DNA was extracted from the resulting solution by using the Corbett X-Tractor Gene Robot with the following buffers: AL [QIAGEN #19075; diluted 50% with ethanol], DXW [QIAGEN #950154], DXF [QIAGEN #950163], and E [QIAGEN #950172] (Fig. 1a).

Mouse genomic DNA isolation

mESC clones were grown in 12-well plates. After trypsinization and resuspension in 200 μl PBS and 200 μl Buffer AL [QIAGEN], a proteinase K [QIAGEN, 20 μl] and RNase [QIAGEN, 0.4 mg] digestion step was performed (incubating for 10 min at 56 °C). After adding 200 μl of 96–100% ethanol, the solutions were applied to QIAamp Mini spin columns, and the QIAamp DNA Blood Mini Kit [QIAGEN] manufacturer’s protocol was followed from that point onwards. Genomic DNA was eluted from the columns using 200 μl distilled water. A second elution was performed if the yield of the genomic DNA obtained was lower than 2 μg. Genomic DNA was stored at −20 °C short-term before sequencing (Fig. 1a).

Illumina library preparation and sequencing

Extracted DNA was tested for total volume, concentration and total amount by using gel electrophoresis and the Quant-iT™ PicoGreen® dsDNA Assay Kit (ThermoFisher Scientific). Genomic DNA – 500 µg (yeast) or 1–3 μg (mouse) – was fragmented to an average size of 100–400 bp (mouse) or 400–600 bp (yeast) by using a Covaris E210 or LE220 device (Covaris, Woburn, MA, USA), size-selected and subjected to DNA library creation via established Illumina paired-end protocols. Adaptor-ligated libraries were amplified and indexed via PCR. A portion of each library was used to create an equimolar pool comprising 45 indexed libraries for mouse samples, and 96 indexed libraries for yeast samples. For mouse whole-exome sequencing, pools were hybridized to SureSelect RNA baits (Mouse_all_exon; Agilent Technologies) (Fig. 1b).

Mouse libraries were sequenced at 15 samples per lane. Yeast libraries were sequenced at up to 96 samples per lane. Libraries were sequenced by using the HiSeq 2500 (Illumina) to generate 75 (mouse), or 100/125 (yeast) base paired-end reads according to the manufacturer’s recommendations.

Analysis of DNA sequence data to identify suppressor mutations

Alignment of DNA sequencing data

Sequencing reads were aligned to the appropriate reference genome using BWA aln (v0.5.9-r16)45. The S. cerevisiae S288c assembly (R64-1-1) from the Saccharomyces Genome Database was obtained from the Ensembl genome browser. For mouse samples, the Mus musculus GRCm38 (mm10) assembly was used. Where appropriate, all lanes from the same library were merged into a single BAM file, and PCR duplicates were marked by using Picard Tools (Picard version 1.128). The quality of the sequencing data post-alignment was assessed by using SAMtools stats and SAMtools flagstat (1.1+htslib-1.1), plot-bamstats, bamcheck and plot-bamcheck46 (Fig. 1c).

Variant calling, consequence annotation and filtering

SNVs and small insertions/deletions (INDELs) were identified using SAMtools mpileup (v1.3)46, followed by BCFtools call (v1.3)46. The following parameters were used for SAMtools mpileup: -g -t DP,AD -C50 -pm3 -F0.2 -d10000. Parameters for BCFtools call were: -vm -f GQ. All variants were annotated by using the Ensembl Variant Effect Predictor (VEP, v82)47. To exclude low-quality calls, variants were filtered by using VCFtools vcf-annotate (v0.1.12b)48 with options -H -f +/q=25/SnpGap=7/d=5, and custom filters were written to exclude variants with a Genotype Quality (GQ) score of less than 10. In the case of whole-exome sequencing data, variants called outside of targeted regions were excluded. INDELs were left-aligned using BCFtools norm46.
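The custom GQ filter described above can be illustrated with a minimal Python sketch. This is not the authors' code (which is available in the repository listed under Code availability); it assumes a single-sample VCF in which GQ is stored in the FORMAT column, and it assumes that records lacking a GQ value are discarded. The function names and the example records are hypothetical.

```python
def genotype_quality(vcf_line, sample_index=0):
    """Return the GQ value of one sample in a VCF data line, or None if absent."""
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")             # column 9 holds the FORMAT keys
    sample_values = fields[9 + sample_index].split(":")
    if "GQ" not in format_keys:
        return None
    idx = format_keys.index("GQ")
    if idx >= len(sample_values) or sample_values[idx] == ".":
        return None
    return float(sample_values[idx])


def filter_by_gq(vcf_lines, min_gq=10):
    """Keep header lines and data lines whose GQ is at least min_gq."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):                   # header and meta lines pass through
            kept.append(line)
            continue
        gq = genotype_quality(line)
        # Assumption: a record without a usable GQ is treated as low quality.
        if gq is not None and gq >= min_gq:
            kept.append(line)
    return kept


# Hypothetical toy records, for illustration only.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "chrI\t100\t.\tA\tG\t50\tPASS\t.\tGT:GQ\t1/1:42",
    "chrI\t200\t.\tC\tT\t30\tPASS\t.\tGT:GQ\t0/1:7",
]
FILTERED_VCF = filter_by_gq(EXAMPLE_VCF)
```

In this toy input, the record at position 200 has a GQ of 7 and is removed, while the two header lines and the record at position 100 are retained.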

Removal of background mutations

Variants that confer resistance must be absent from the initial strain or cell line, because it is sensitive to the drug used. BEDTools intersect was therefore used to remove variants present in any S. cerevisiae control sample, eliminating from the dataset any variation of the background strain relative to the reference genome. Variant calls from mouse samples were filtered by removing all variants identified in sequencing data of three olaparib-sensitive AN3-12 clones using VCFtools vcf-isec48; INDELs were further verified by using the microassembly-based variant caller Scalpel49. To address the high false-positive rate in INDEL variant calls, only INDELs that were identified by both variant callers and passed the filters were retained.
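The logic of background removal and INDEL concordance can be sketched with plain set operations. The sketch below is illustrative only: it assumes variants are matched on exact chromosome, position, reference and alternative allele, which may differ from how BEDTools intersect or vcf-isec match records, and all function names and example variants are hypothetical.

```python
def remove_background(sample_variants, control_variant_sets):
    """Drop every variant that is also present in at least one control sample."""
    background = set().union(*control_variant_sets)
    return {v for v in sample_variants if v not in background}


def is_indel(variant):
    """A variant is treated as an INDEL when REF and ALT differ in length."""
    _chrom, _pos, ref, alt = variant
    return len(ref) != len(alt)


def keep_concordant_indels(primary_calls, second_caller_indels):
    """Keep all SNVs, but keep INDELs only if the second caller also reports them."""
    return {
        v for v in primary_calls
        if not is_indel(v) or v in second_caller_indels
    }


# Hypothetical toy data: variants are (chrom, pos, ref, alt) tuples.
SAMPLE = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),    # also present in a control, so background
    ("chr2", 300, "AT", "A"),   # INDEL confirmed by the second caller
    ("chr2", 400, "G", "GCC"),  # INDEL not confirmed, so discarded
}
CONTROLS = [{("chr1", 200, "C", "T")}, set(), {("chr9", 1, "A", "C")}]
SECOND_CALLER_INDELS = {("chr2", 300, "AT", "A")}

CANDIDATES = remove_background(SAMPLE, CONTROLS)
FINAL = keep_concordant_indels(CANDIDATES, SECOND_CALLER_INDELS)
```

Applying the two steps in this order leaves one SNV and one concordant INDEL in the toy example.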

Variant prioritization

Variants were prioritized by their Ensembl VEP47 predicted consequence: we retained variants predicted to cause a frameshift, a premature stop codon, a missense mutation, a lost start/stop codon, a synonymous mutation, an in-frame insertion or deletion, and in the case of mouse data, those annotated to affect splice donor/acceptor bases. Genes were prioritized by ranking them by the number of distinct mutations identified in each gene.
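The prioritization scheme can be summarised as a consequence filter followed by a per-gene count of distinct mutations. The following is a minimal sketch under stated assumptions: the consequence names follow current Sequence Ontology terms as reported by VEP and may not match the exact terms emitted by VEP v82, ties are broken alphabetically by gene name, and the gene and variant identifiers are hypothetical.

```python
from collections import defaultdict

# Consequence classes retained for both organisms (assumed SO term names).
RETAINED = {
    "frameshift_variant",
    "stop_gained",
    "missense_variant",
    "start_lost",
    "stop_lost",
    "synonymous_variant",
    "inframe_insertion",
    "inframe_deletion",
}
# Additional classes retained for mouse data only.
SPLICE = {"splice_donor_variant", "splice_acceptor_variant"}


def rank_genes(records, organism):
    """Rank genes by the number of distinct retained mutations.

    records: iterable of (gene, variant_id, consequence) tuples.
    Returns a list of (gene, count) sorted by descending count, then gene name.
    """
    allowed = RETAINED | (SPLICE if organism == "mouse" else set())
    distinct = defaultdict(set)
    for gene, variant_id, consequence in records:
        if consequence in allowed:
            distinct[gene].add(variant_id)   # a set counts each mutation once
    return sorted(
        ((gene, len(variants)) for gene, variants in distinct.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy annotations, for illustration only.
EXAMPLE_RECORDS = [
    ("GENE_A", "v1", "missense_variant"),
    ("GENE_A", "v2", "stop_gained"),
    ("GENE_A", "v1", "missense_variant"),    # same mutation seen in a second clone
    ("GENE_B", "v3", "splice_donor_variant"),
    ("GENE_B", "v4", "intron_variant"),
    ("GENE_C", "v5", "frameshift_variant"),
]
MOUSE_RANKING = rank_genes(EXAMPLE_RECORDS, "mouse")
YEAST_RANKING = rank_genes(EXAMPLE_RECORDS, "yeast")
```

Counting distinct mutations rather than total observations means that a mutation recovered in several clones contributes only once to the rank of its gene.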

Missense mutations identified in genes of interest were ranked by using predictions of PROVEAN/SIFT25,50,51,52 and PredictProtein26. Scores below −2.5 for PROVEAN, above 50 for PredictProtein and below 0.05 for SIFT indicate likely deleterious consequences to protein function.
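The three thresholds can be applied as in the sketch below. The cut-offs are those stated above; ordering mutations by the number of tools that predict a deleterious effect is an assumption made for illustration, and the function names, mutation labels and scores are hypothetical.

```python
def deleterious_calls(provean=None, sift=None, predictprotein=None):
    """Return, per available tool, whether the score passes the deleterious cut-off."""
    calls = {}
    if provean is not None:
        calls["PROVEAN"] = provean < -2.5
    if sift is not None:
        calls["SIFT"] = sift < 0.05
    if predictprotein is not None:
        calls["PredictProtein"] = predictprotein > 50
    return calls


def rank_missense(mutations):
    """Order mutation names by how many tools call them deleterious.

    mutations: dict mapping a mutation name to a dict of scores keyed by
    'provean', 'sift' and/or 'predictprotein'.
    Assumption: ties are broken alphabetically by mutation name.
    """
    def support(name):
        return sum(deleterious_calls(**mutations[name]).values())

    return sorted(mutations, key=lambda name: (-support(name), name))


# Hypothetical scores, for illustration only.
EXAMPLE_SCORES = {
    "mutA": {"provean": -4.1, "sift": 0.01, "predictprotein": 72},
    "mutB": {"provean": -1.0, "sift": 0.20, "predictprotein": 10},
    "mutC": {"provean": -3.0, "sift": 0.30},
}
MISSENSE_ORDER = rank_missense(EXAMPLE_SCORES)
```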

Analysis of olaparib-resistant cell lines

Molecular modeling

Molecular models were generated by using PyMOL. Crystal structure data were obtained from RCSB Protein Data Bank (PDB). The codes for PARP1 structures were 4DQY (human PARP1 without ZN2 or BRCT domain) and 3ODC (human PARP1 ZN2).

Antibodies for immunoblotting

Anti-PARP1 (Cell Signaling, 9542; 1:1000 in TBST 5% milk), anti-HDAC1 (Abcam, ab19845; 1:1000 in TBST 1% BSA), anti-PAR (Trevigen, 4336-BPC-100; 1:1000 in PBST [0.05% Tween-20] 5% milk), and anti-β-actin (Cell Signaling, 4970; 1:1000 in TBST 5% BSA) were incubated with western-blot membranes at 4 °C overnight with gentle rocking.

DNA binding assays

DNA binding of PARP1 was assayed by using a 26-bp palindromic DNA duplex (5′-GCCTACCGGTTCGCGAACCGGTAGGC-3′)33 immobilized on Dynabeads™ M-280 Streptavidin (Invitrogen, 10 mg/ml). Individual incubations used 500 µg of protein extract and a buffer containing HEPES (10 mM, pH 7.4), MgCl2 (1.5 mM), 25% glycerol, KCl (200 mM), EDTA (0.2 mM), Roche protease inhibitor cocktail (0.7X), DTT (0.5 mM) and AEBSF (0.1 mM). Experiments were repeated at least twice.
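The palindromic nature of the oligonucleotide, meaning that the sequence equals its own reverse complement and can therefore self-anneal into a duplex, can be checked with a short Python sketch. The sequence is taken from the text above; the function names are illustrative.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence is identical to its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


OLIGO = "GCCTACCGGTTCGCGAACCGGTAGGC"   # sequence used in the DNA binding assay
OLIGO_LENGTH = len(OLIGO)
OLIGO_IS_PALINDROMIC = is_palindromic(OLIGO)
```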

PARylation assay

Cells were treated with 6 μM olaparib overnight, followed by 1 mM H2O2 for 10 min in the dark, then washed with ice-cold PBS and collected. Cells were lysed in Laemmli buffer (120 mM Tris-HCl pH 6.8, 4% SDS, 20% glycerol), and lysates were separated on a 4–20% Tris-Glycine gradient gel and transferred onto a PVDF membrane. Membranes were immunoblotted with the appropriate antibodies. Experiments were repeated at least twice.

Data availability

The datasets generated and/or analysed during the current study are available in the European Nucleotide Archive (ENA) under the accession codes PRJEB2977 (yeast) and PRJEB13612 (mouse). Accession codes for specific samples are detailed in Supplementary Table 2. Source data for Figs 2 and 3 are provided in Supplementary Table S2; source data for Figs 4 and 5 are provided in Supplementary Table S3 and Supplementary Figure S3.

Code availability

Custom code used to analyse sequencing data and to draw figures is available in the following repository: http://github.com/fabiopuddu/Herzog2018.