Detection of functional protein domains by unbiased genome-wide forward genetic screening

Herzog, Mareike; Puddu, Fabio; Coates, Julia; Geisler, Nicola; Forment, Josep V.; Jackson, Stephen P.

doi:10.1038/s41598-018-24400-4

Published: 18 April 2018

Scientific Reports volume 8, Article number: 6161 (2018)

Abstract

Establishing genetic and chemo-genetic interactions has played key roles in elucidating mechanisms by which certain chemicals perturb cellular functions. In contrast to gene disruption/depletion strategies to identify mechanisms of drug resistance, searching for point-mutational genetic suppressors that can identify separation- or gain-of-function mutations has been limited. Here, by demonstrating its utility in identifying chemical-genetic suppressors of sensitivity to the DNA topoisomerase I poison camptothecin or the poly(ADP-ribose) polymerase inhibitor olaparib, we detail an approach allowing systematic, large-scale detection of spontaneous or chemically-induced suppressor mutations in yeast or haploid mammalian cells in a short timeframe, and with potential applications in other haploid systems. In addition to applications in molecular biology research, this protocol can be used to identify drug targets and predict drug-resistance mechanisms. Mapping suppressor mutations on the primary or tertiary structures of protein suppressor hits provides insights into functionally relevant protein domains. Importantly, we show that olaparib resistance is linked to missense mutations in the DNA binding regions of PARP1, but not in its catalytic domain. This provides experimental support to the concept of PARP1 trapping on DNA as the prime source of toxicity to PARP inhibitors, and points to a novel olaparib resistance mechanism with potential therapeutic implications.

Introduction

In model organisms, genetic screens have long been used to characterize gene functions, to define gene networks, and to identify the mechanism-of-action of drugs^1,2,3,4. The genetic relationships identified by such screens have been shown to involve positive and negative feedbacks, backups and cross-talks that would have been extremely difficult to discover using other approaches⁵. Currently, the large majority of reported screens in model organisms and in mammalian-cell systems have used gene-deletion libraries and/or methodologies to inactivate gene functions, such as short-interfering RNA, CRISPR-Cas9 or transposon-mediated mutagenesis^6,7. While powerful, such approaches usually identify loss-of-function phenotypes, and only rarely uncover separation-of-function or gain-of-function mutations. Gene overexpression screens have successfully identified gain-of-function alleles, but these screens often involve non-physiological protein levels. This limitation is significant because such separation- or gain-of-function mutations – which can arise spontaneously or via the action of genotoxic agents – can dramatically affect cell functions or cellular response to chemicals, and can have profound impacts on human health and disease^8,9. Suppressor screens, either based on lethal genetic deficiencies and/or the use of drugs, have also facilitated the characterization of functionally relevant protein domains and sites of post-translational protein modification through the identification of relevant single nucleotide DNA variants (SNVs)¹⁰.

In their simplest experimental setup, suppressor screens based on point-mutagenesis rely on four tools: (i) a genetically amenable organism or cell; (ii) a selectable phenotype; (iii) a method to create a library of mutants; and (iv) a method to identify mutations driving the suppressor phenotype amongst all the mutations in the library. Reflecting their relative amenability, these screens have mostly been carried out in microorganisms, either bacteria or yeasts, both of which benefit from the ability to survive in a stable haploid state. Despite not being strictly essential for such studies, a haploid state facilitates the identification of loss-of-function or separation-of-function recessive alleles, which would be masked in a heterozygous diploid cell state¹¹. While the first three tools mentioned above are often accessible to a researcher, the lack of fast and efficient methods to bridge the knowledge-gap between phenotype and genotype has discouraged the widespread implementation of suppressor screens based on point-mutagenesis. Indeed, until recently, recessive suppressor alleles could only be identified by labor-intensive methods involving genetic mapping and cloning in yeast, whereas the natural diploid state of mammalian cells largely precluded straightforward SNV suppressor screens in such systems.

Here, we describe an approach to overcome the above limitations that is based on sequencing of genomic DNA extracted from various independent suppressor clones, followed by bioinformatic analysis. With small adaptations, this method can be applied to both the budding yeast Saccharomyces cerevisiae and other haploid model organisms, as well as to haploid mammalian cells (Fig. 1). To highlight the utility of this approach, we describe its application to study resistance to the anti-cancer drugs camptothecin or olaparib, leading to the identification of various mutations in yeast TOP1 and in mouse Parp1, respectively. Importantly, we establish that drug target identification and mechanisms of drug resistance can be unveiled without a priori knowledge of the drug target. Furthermore, if a sufficient number of chemical-genetic suppressors is screened, this method also allows identification of functional protein domains required to drive drug sensitivity and resistance.

Results

Identification of TOP1 mutations conferring camptothecin resistance

To demonstrate the utility of the procedure described above, we sought to identify mutations imparting resistance to camptothecin, a DNA topoisomerase 1 inhibitor^12,13,14. To do this, we employed yeast strains carrying mutations inactivating pathways required for camptothecin resistance. These specific mutations (rad50S, sae2-F267A, rtt107∆, tof1∆, sae2∆mre11-H37R-tel1∆) were chosen for their ability to induce camptothecin hypersensitivity^15,16,17,18. To maximize the variety of potential mutations driving drug resistance, two of the five strains used were mutagenized with ethyl methane sulfonate (EMS), an alkylating agent that induces SNVs¹⁹, before plating them in the presence of camptothecin. In all cases, camptothecin-resistant colonies were readily detectable after 2–3 days of growth at 30 °C.

Genomic DNA sequencing of the resistant clones highlighted TOP1 as the gene carrying the largest number of unique mutations in our dataset, as expected given that it encodes the drug target. The second most mutated gene, PDR1, which is known to regulate a pathway of multi-drug resistance²⁰, carried 11 unique mutations, 10 of which did not co-occur with mutations in TOP1, whereas all the mutations found in the third most mutated gene (GLT1) co-occurred with mutations in either TOP1 or PDR1, suggesting that GLT1 mutations do not drive camptothecin resistance per se (Fig. 2a and data not shown). It is possible that the unexpectedly high frequency of GLT1 mutations could arise from them somehow enhancing cell survival to EMS, as these mutations are only found in EMS-mutagenized samples. Globally, out of the 251 yeast strains sequenced, 191 contained one or more mutations in TOP1 (Fig. 2b, light yellow). Furthermore, by manual inspection, we found that 27 additional strains carried mutations in TOP1 (Fig. 2b, dark yellow); these mutations were not detected automatically because the strains were either not pure clones or carried large (>25 bp) deletions in TOP1 (Fig. 2b and Supplementary Figure S1). To the list of TOP1-mutated camptothecin-resistant strains, we added another 38 strains bearing TOP1 mutations that we had identified in previous, published screens^16,21, bringing the total number of TOP1 mutants analyzed to 256.

Missense, nonsense and frameshift TOP1 mutations were roughly equally represented in the non-mutagenized samples. However, where samples had been mutagenized with EMS the vast majority of mutations were nonsense or missense base substitutions (Fig. 2c). In the few cases in which the same suppressor clone contained missense and nonsense mutations in TOP1, the suppressive effect was attributed to the gained STOP codon.

When the positional distribution of each mutation type was plotted, nonsense and frameshift mutations were shown to be quite evenly distributed along the length of the TOP1 open reading frame (Fig. 2d and e). The prediction is that such mutations either result in null alleles – as the prematurely-terminated messenger RNA (mRNA) would be degraded by nonsense-mediated decay mechanisms²² – or would give rise to an unstable protein or a truncated version that could retain partial activity. Since the Y727 residue is essential for the catalytic activity of Top1, truncation before this residue is predicted to produce a non-functional protein^23,24. As might be expected, the distribution of nonsense mutations loosely correlated with them arising from codons in the open reading frame that only required one nucleotide change to change them to a STOP codon (Supplementary Figure S1). Notably, the observed enrichment of frameshifts near the 5′ end of the TOP1 transcript was localized to an 8-nucleotide homopolymeric adenine tract that is presumably particularly susceptible to mutagenesis (Supplementary Figure S1).
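The idea of codons that need only one nucleotide change to become a STOP codon can be illustrated with a short sketch. This is not the authors' code; it assumes the standard genetic code written in the DNA alphabet, and the function names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet
BASES = "ACGT"


def one_step_from_stop(codon):
    """Return True if one base substitution turns this sense codon into a STOP."""
    codon = codon.upper()
    if codon in STOP_CODONS:
        return False  # already a STOP codon, not a candidate for a nonsense mutation
    for i, base in enumerate(codon):
        for alt in BASES:
            if alt != base and codon[:i] + alt + codon[i + 1:] in STOP_CODONS:
                return True
    return False


def stop_prone_codon_positions(orf):
    """Return 1-based codon indices of an ORF that are one substitution from STOP."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    return [i // 3 + 1 for i in range(0, usable, 3)
            if one_step_from_stop(orf[i:i + 3])]


ALL_CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STOP_PRONE_CODONS = sorted(c for c in ALL_CODONS if one_step_from_stop(c))
```

Comparing the positions returned by `stop_prone_codon_positions` for an open reading frame with the observed nonsense mutations gives a rough picture of how much of their distribution is explained by codon composition alone. The sketch treats all substitutions as equally likely, which ignores any bias of the mutagen.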

In striking contrast to the situation with nonsense or frameshift mutations, missense mutations were localized to specific regions of the TOP1 protein-coding sequence, overlapping with known functional domains of Top1. Indeed, the vast majority of mutations identified localized within three distinct regions of the larger DNA binding and catalytic domain, while a minority was located in the smaller C-terminal domain, essential for catalysis (Fig. 2f).

Functional consequences of the amino acid residue changes induced by missense mutations were assessed by using PROVEAN and PredictProtein^25,26. These tools use chemical properties of amino acid residues and phylogenetic conservation to predict whether or not a particular substitution is likely to be functionally tolerated by the protein analyzed. Both these methods suggested that the vast majority of the TOP1 mutations we identified in camptothecin-resistant strains were likely to produce deleterious effects (PROVEAN score < −2.5; PredictProtein score > 50) (Fig. 3a). Notably, missense mutations located in the C-terminal domain of Top1 affected both conserved and non-conserved residues and were primarily positioned in the vicinity of the catalytic residue Y727, although three substitutions were closer to the C-terminus of the protein (Fig. 3b).

Top1 binds to DNA via a clamp-like mechanism in which DNA binding stimulates a conformational change in the protein. Thus, opposable “lip” domains encircle the DNA, stabilizing binding through establishing non-covalent protein-DNA and lip-lip interactions (Fig. 3c)^27,28. Approximately two thirds of the missense suppressor mutations identified in the DNA binding domain clustered within the Lip1 and Lip2 regions, highlighting their importance for Top1 function (Fig. 3d; the Lip2 domain also contains an active-site residue, R420). Remaining mutations clustered between amino acid residues 500 and 600, which encompass the end of the DNA binding/catalytic domain and the base of the coiled-coil linker domain. In this region two other active site residues (R517 and H558) are located (Fig. 3d).

Collectively, these results showed that even with no a priori knowledge, our approach for identifying drug-resistant strains and associated mutations would have identified Top1 as the likely target of camptothecin and would have highlighted the critical Top1 domains functionally relevant for Top1 activity and drug hypersensitivity.

Identification of Parp1 mutations conferring olaparib resistance

Based on a similar approach to that described above, we recently identified genes whose mutation in haploid mammalian cells causes resistance to the anti-metabolite drug 6-thioguanine²⁹. To further highlight the wider applicability of our approach in mammalian cell systems, we carried out a screen to identify mutations that allow haploid mouse cells to survive in the presence of the anti-cancer agent olaparib, a potent poly ADP-ribose polymerase (PARP) small-molecule inhibitor^30,31. Thus, wild-type, haploid mouse embryonic stem cells (mESCs) were mutagenized by using EMS, and mutant libraries were screened for resistance to olaparib (Fig. 1a). Forty-five olaparib-resistant clones were isolated and subjected to whole-exome sequencing.

Analysis of the ensuing sequence data for putative acquired mutations revealed Parp1 as the most mutated gene in the dataset, with 25 different mutations detected (Fig. 4a, Supplementary Table S3). Globally, 40 out of the 45 clones harbored Parp1 mutations (Fig. 4b, Supplementary Table S3). Further manual examination of the aligned sequencing data from the five remaining clones revealed that four of these also likely carried mutations affecting PARP1 (Supplementary Figure S2). Two of those five (A7, B7) likely carry the R138C missense mutation identified in another clone (Supplementary Figure S2, Fig. 4c), while two other clones (A9, H10) harbored nonsense mutations at codon 341 (Supplementary Figure S2). Importantly, mutations in the second and third most mutated genes (Ttn and Plch1 with 9 and 5 different mutations, respectively) never occurred in the absence of Parp1 mutations, while Parp1 mutations also occurred in the absence of Ttn or Plch1 mutations. These data thus highlighted how such an analysis would have identified PARP1 as the likely prime driver of olaparib sensitivity without any knowledge about the drug’s mechanism-of-action (see below for further discussion).
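The co-occurrence reasoning used here (a candidate gene is unlikely to drive resistance if it is never mutated without the top hit) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: the input is assumed to be a list of (clone, gene) pairs, the function names are hypothetical, and the toy clone names and records are invented for the example.

```python
from collections import defaultdict


def genes_by_clone(records):
    """Group (clone, gene) pairs into a mapping of clone -> set of mutated genes."""
    clone_genes = defaultdict(set)
    for clone, gene in records:
        clone_genes[clone].add(gene)
    return dict(clone_genes)


def clones_with_gene_but_not_driver(clone_genes, gene, driver):
    """List clones mutated in `gene` that carry no mutation in `driver`."""
    return sorted(
        clone for clone, genes in clone_genes.items()
        if gene in genes and driver not in genes
    )


# Hypothetical toy data, not taken from the study.
toy_records = [
    ("clone1", "Parp1"),
    ("clone2", "Parp1"), ("clone2", "Ttn"),
    ("clone3", "Parp1"), ("clone3", "Plch1"),
]
toy = genes_by_clone(toy_records)
```

An empty result for `clones_with_gene_but_not_driver(toy, "Ttn", "Parp1")` corresponds to the pattern described above for Ttn and Plch1, whereas the reverse query returns the clones in which the driver is mutated on its own.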

Of the Parp1 mutations we detected, more than half led to premature termination codons, splice acceptor/donor, or frameshift mutations, which would presumably lead to the production of aberrant mRNAs subject to nonsense-mediated decay and/or the generation of unstable, truncated PARP1 protein. As we previously noted for premature-termination mutations in yeast TOP1, these mutations did not cluster in any particular domain(s) of the Parp1 open reading frame (Fig. 4c). Furthermore, similar to what we observed in yeast, EMS treatment resulted in an overrepresentation of single nucleotide variants, compared to frameshift mutations (Fig. 4c).

Strikingly, missense mutations detected in Parp1 were more frequent in the N-terminal part of the protein, where the DNA-binding domains reside, while no such mutations were observed in the catalytic domain (Fig. 4c). Analysis of PARP1 protein levels indicated that other than G400R and A610V, which resulted in complete loss or marked reduction of the PARP1 protein product, no other of the identified missense mutations impacted on PARP1 protein stability (Fig. 5a, Supplementary Figure S3). Computational predictions for the likely consequences of these remaining missense mutations on protein structure and function suggested that all of them were functionally deleterious (Fig. 4d). Since all these missense mutations localized within domains known to be involved in DNA binding, we examined their locations relative to the DNA-protein interface as defined by previously published PARP1 structures^32,33 (Fig. 5b). Notably, most of the missense mutations affecting residues in the DNA binding domains clustered at the DNA-protein interface, and did so in proximity to residues that make key DNA contacts³³.

Without any a priori knowledge about how olaparib causes cell toxicity, the above data would have suggested that such toxicity is largely driven by a mechanism connected to PARP1 DNA binding. To test this idea, we assessed the missense mutations identified in the DNA binding domains of PARP1 for their potential effects on the ability of PARP1 to bind a double-stranded DNA oligonucleotide. Significantly, this analysis revealed that all the point mutants that did not reduce PARP1 levels showed reduced levels of DNA binding when compared to the wild-type PARP1 protein (Fig. 5c, Supplementary Figure S3). Consistent with PARP1 DNA binding triggering its auto-modification by poly ADP-ribose, we found that the PARP1 S568F mutation, which impairs DNA binding, did not exhibit evidence of PARylation when cells were treated with hydrogen peroxide (Fig. 5d). These findings were therefore in accord with the fact that toxicities of PARP inhibitors such as olaparib are linked to their ability to trap PARP1 on DNA by blocking its catalytic activity³⁰.

The last clone without an assigned mutation driving resistance to olaparib (C1) may also carry one or more mutations in the non-exonic regions of the Parp1 gene, or epigenetic modifications altering Parp1 expression, since we could not detect the presence of PARP1 in protein extracts (Fig. 5a). Taken together, these results are consistent with a model in which olaparib resistance can arise either from loss of PARP1 or from its decreased ability to bind DNA (Fig. 5e).

Discussion

Various approaches have been described for systematic identification of genetic and chemo-genetic interactions. Until recently, this search has been largely conducted using approaches based on gene inactivation, either in arrayed or pooled assay formats. While these approaches have played crucial roles in determining gene-gene and gene-drug interactions, their limited power of resolution does not in general provide information regarding the functional protein domains relevant for the identified interaction. While transposon-based mutagenesis and dense CRISPR gRNA approaches have recently been shown to provide some information at the domain level, these approaches are only applicable to loss-of-function mutations, and are biased towards C-terminal domains of proteins³⁴. In contrast, SNV-based approaches can provide a higher level of resolution, and in many cases produce unanticipated results^10,35,36,37. Lack of rapid and facile procedures to bridge the phenotype-to-genotype gap has until recently, however, precluded the use of these techniques on a high-throughput scale.

The approach we have described allows the identification of SNVs driving drug resistance or resistance to essentially any selective growth condition in a systematic and unbiased way (other than any bias imposed by the mutagenic agent of choice). Importantly, this approach can equally be applied to yeast and to more complex eukaryotes, bringing the power of high-resolution haploid genetic screens to mammalian systems. It should however be noted that certain limitations of haploid genetics, such as the inability to formally distinguish between recessive and dominant alleles, will also apply to mammalian systems. While we acknowledge that we have carried out our screens with strong selectable cell-viability phenotypes, we envisage applicability in more complex scenarios, for example involving FACS-based selection or cell migration, motility or attachment assays. Highlighting this potential, our results show how, with no previous knowledge, Top1 and PARP1 would have been identified as the most likely targets for the drugs camptothecin and olaparib, respectively.

Toxicity to PARP inhibitors was initially linked to the involvement of PARP1 in the repair of single-strand DNA breaks^38,39, but more recent data challenged this view⁴⁰. The fact that loss of PARP1 drives resistance to PARP inhibitors in wild-type genetic backgrounds⁴¹ indeed suggests that inhibition of PARP1 catalytic activity — and not the accumulation of unrepaired DNA lesions — is the major effector of toxicity in such genetic backgrounds. Indeed, recent findings suggest that PARP1 trapping onto DNA, caused by inhibition of its catalytic activity, is the main cause of toxicity³¹. Our data further support this model, as all the suppressor clone variants which we identified that did not result in loss of PARP1 protein negatively affected its binding to DNA. Preventing PARP1 binding to DNA thus appears to be sufficient to circumvent the toxicity of PARP inhibitors, at least in homologous-recombination proficient cells; and the fact that we identified no mutants specifically defective in catalytic function reinforces the idea of trapped PARP1 as the main cytotoxic lesion for olaparib in wild-type mammalian cells. Further work in a diploid setting will be required to demonstrate whether the DNA-binding deficient alleles isolated are recessive, and that the PARP1-olaparib complex indeed acts as a dominant negative entity. This may not only be important for our understanding of how PARP inhibitors function but also for understanding mechanisms of intrinsic or evolved tumor resistance towards such clinical agents. In particular, it will be interesting to determine whether olaparib resistance in patients with BRCA1/2 mutant cancers can be driven by loss of PARP binding to DNA.

As exemplified by our analyses of Top1 and PARP1, the level of detail on critical functional domains and residues increases with the number of samples sequenced. Because of their genome size, screens based in mammalian systems require greater sequencing power than screens conducted in simpler organisms such as yeast. Moreover, as compared to yeasts, the more complex genome architecture in mammalian systems – where there is more intergenic DNA, a larger number of genes and an abundance of intronic sequences – increases the chances of isolating variants affecting protein levels, rather than protein function. One solution to bypass such issues will be to run two-tiered screens, initially using whole exome sequencing on a subset of suppressors in order to identify top gene hits driving resistance, and then using targeted exome sequencing to test the rest of the samples, either through analysis of various individual clones or bulk sequencing of the resistant population. In addition, we can envision alternative scenarios where a gene identified in an initial screen could be marked/tagged in a way to allow selection of mutations that affect protein function but not protein levels. This approach can also be combined with CRISPR-Cas9-mediated in vivo targeted mutagenesis, via a library of gRNAs directed towards the exonic regions of the gene (Pettitt et al., BioRxiv https://doi.org/10.1101/203224). We anticipate that such developments, along with expected further increases in sequencing throughput and associated cost reductions, will pave the way for hitherto unprecedented genetic analyses on comprehensive and systematic scales.

Methods

Yeast suppressors of camptothecin sensitivity

S. cerevisiae strains used were derived from W303. All gene deletions were introduced by using one-step gene disruption, and were confirmed by PCR and whole-genome sequencing. Full genotypes of strains are described in Supplementary Table 1. Standard growth conditions (1% yeast extract, 2% peptone, 2% glucose, 40 mg/l adenine) were used. Strains YFP1001 and YFP1073 were mutagenized by adding 4.5% ethyl methane sulfonate (EMS) to liquid cultures in logarithmic growth-phase, pelleted by centrifugation and then resuspended in 50 mM K-phosphate buffer for 10 minutes, followed by EMS inactivation with 1 volume of 10% sodium thiosulfate. Suppressors were obtained by plating each strain on 10 YPD plates supplemented with 5 µg/ml of camptothecin (approximately 10⁷ cells per plate). Resistant colonies were picked after 2–3 days of growth at 30 °C and isolated by streaking on YPD plates. Suppression was confirmed by retesting camptothecin sensitivity of the isolated strains. Confirmed suppressors were processed for DNA extraction shortly thereafter, in parallel with 2–3 colonies of the initial strain (Fig. 1a).

Mouse embryonic stem cell suppressors of olaparib sensitivity

Haploid mouse AN3-12 embryonic stem cells (mESCs)^42,43 were used for all the experiments and were free from mycoplasma. Cells were grown in DMEM high glucose (Sigma) supplemented with glutamine, fetal bovine serum, streptomycin, penicillin, non-essential amino acids, sodium pyruvate, 2-mercaptoethanol and Leukemia inhibitory factor (LIF). All plates and flasks were gelatinized before cell seeding.

Cell sorting for DNA content was performed on mESCs by using a MoFlo flow sorter (Beckman Coulter) after staining with 15 μg/ml Hoechst 33342 (Invitrogen). The 1n peak was purified to enrich for haploid mESCs.

Mutagenesis with EMS was performed as described previously⁴⁴ with the following adjustments: after cell sorting, haploid-enriched cells were grown in DMEM plus LIF for overnight EMS treatment. After EMS treatment, cells were cultured for five passages in DMEM plus LIF and plated into 6-well plates at a density of 5 × 10⁵ cells per well. Cells were then treated with 6 μM of olaparib (AZD2281; Stratech Scientific Ltd.) for 6 days, supplying new medium with olaparib daily. Cells were then grown for another four days without olaparib until mESC colonies could be isolated.

Genomic DNA isolation

S. cerevisiae DNA isolation

Resistant colonies were inoculated in 1.8 ml of YPAD in 96-deep-well plates and grown for 48 hours. Pelleted cells were re-suspended in 500 μl of spheroplasting solution (1 M sorbitol, 0.1 M EDTA, 14 mM 2-mercaptoethanol, 1 mg/ml RNAse A, containing 5 mg/ml zymolyase) and incubated for 2 hours at 37 °C. Spheroplasts were subsequently re-suspended in 200 μl of lysis buffer (80% ATL buffer [QIAGEN #19076], 10% Proteinase K [QIAGEN #19133] and 10% RNAse A [10 mg/ml]) and incubated overnight (>16 h) at 56 °C. Genomic DNA was extracted from the resulting solution by using the Corbett X-Tractor Gene™ Robot with the following buffers: AL [QIAGEN #19075; diluted 50% with ethanol], DXW [QIAGEN #950154], DXF [QIAGEN #950163], and E [QIAGEN #950172] (Fig. 1a).

Mouse genomic DNA isolation

mESC clones were grown in 12-well plates. After trypsinization and resuspension in 200 μl PBS and 200 μl Buffer AL [QIAGEN], a proteinase K [QIAGEN, 20 μl] and RNase [QIAGEN, 0.4 mg] digestion step was performed (10 min at 56 °C). After adding 200 μl of 96–100% ethanol, the solutions were applied to QIAamp Mini spin columns, and the QIAamp DNA Blood Mini Kit [QIAGEN] manufacturer’s protocol was followed from that point onward. Genomic DNA was eluted from the columns using 200 μl distilled water. A second elution was performed if the yield of the genomic DNA obtained was lower than 2 μg. Genomic DNA was stored at −20 °C short-term before sequencing (Fig. 1a).

Illumina library preparation and sequencing

Extracted DNA was tested for total volume, concentration and total amount by using gel electrophoresis and the Quant-iT™ PicoGreen® dsDNA Assay Kit (ThermoFisher Scientific). Genomic DNA (500 µg for yeast or 1–3 μg for mouse) was fragmented to an average size of 100–400 bp (mouse) or 400–600 bp (yeast) by using a Covaris E210 or LE220 device (Covaris, Woburn, MA, USA), size-selected and subjected to DNA library creation via established Illumina paired-end protocols. Adaptor-ligated libraries were amplified and indexed via PCR. A portion of each library was used to create an equimolar pool comprising 45 indexed libraries for mouse samples, and 96 indexed libraries for yeast samples. For mouse whole-exome sequencing, pools were hybridized to SureSelect RNA baits (Mouse_all_exon; Agilent Technologies) (Fig. 1b).

Mouse libraries were sequenced at 15 samples per lane. Yeast libraries were sequenced at up to 96 samples per lane. Libraries were sequenced by using the HiSeq 2500 (Illumina) to generate 75 (mouse), or 100/125 (yeast) base paired-end reads according to the manufacturer’s recommendations.

Analysis of DNA sequence data to identify suppressor mutations

Alignment of DNA sequencing data

Sequencing reads were aligned to the appropriate reference genome using BWA aln (v0.5.9-r16)⁴⁵. The S. cerevisiae S288c assembly (R64-1-1) from the Saccharomyces Genome Database was obtained from the Ensembl genome browser. For mouse samples, the Mus musculus GRCm38 (mm10) assembly was used. Where appropriate, all lanes from the same library were merged into a single BAM file, and PCR duplicates were marked by using Picard Tools (Picard version 1.128). The quality of the sequencing data post-alignment was assessed by using SAMtools stats and SAMtools flagstat (1.1+htslib-1.1), plot-bamstats, bamcheck and plot-bamcheck⁴⁶ (Fig. 1c).

Variant calling, consequence annotation and filtering

SNVs and small insertions/deletions (INDELs) were identified using SAMtools mpileup (v.1.3)⁴⁶, followed by BCFtools call (v.1.3)⁴⁶. The following parameters were used for SAMtools mpileup: -g -t DP,AD -C50 -pm3 -F0.2 -d10000. Parameters for BCFtools call were: -vm -f GQ. All variants were annotated by using the Ensembl Variant Effect Predictor (VEP) v82⁴⁷. To exclude low quality calls, variants were filtered by using VCFtools vcf-annotate (v.0.1.12b)⁴⁸ with options -H -f +/q=25/SnpGap=7/d=5, and custom filters were written to exclude variants with a Genotype Quality (GQ) score of less than 10. In the case of whole-exome sequencing data, variants called outside of targeted regions were excluded. INDELs were left-aligned using BCFtools norm⁴⁶.
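The custom Genotype Quality filter mentioned above is not shown in the text. As a minimal sketch of what such a filter might look like, the following assumes single-sample, tab-separated VCF records whose FORMAT column includes a GQ key. The function names are hypothetical, and dropping records that lack a GQ value is an assumption of this sketch rather than a documented choice of the authors.

```python
def genotype_quality(vcf_line, sample_index=0):
    """Return the GQ value of one sample in a VCF record, or None if absent."""
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")            # column 9 is FORMAT
    sample_values = fields[9 + sample_index].split(":")
    if "GQ" not in format_keys:
        return None
    idx = format_keys.index("GQ")
    if idx >= len(sample_values) or sample_values[idx] == ".":
        return None                               # trailing or missing value
    return float(sample_values[idx])


def filter_by_gq(vcf_lines, min_gq=10):
    """Keep header lines and records whose GQ is at least `min_gq`."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)                     # headers pass through unchanged
            continue
        gq = genotype_quality(line)
        if gq is not None and gq >= min_gq:       # assumption: drop records without GQ
            kept.append(line)
    return kept
```

The default threshold of 10 mirrors the cut-off stated in the text; a production filter would more likely rely on an established VCF parser than on manual string splitting.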

Removal of background mutations

Variants that confer resistance are absent in the initial strain/cell line, as it is sensitive to the drug used. Bedtools intersect was therefore used to remove variants present in any S. cerevisiae control samples to eliminate variation of the background relative to the reference genome from the dataset. Variant calls from mouse samples were filtered by removing all variants identified in sequencing data of three olaparib-sensitive AN3-12 clones using VCFtools vcf-isec⁴⁸; INDELs were further verified by using the microassembly-based variant caller Scalpel⁴⁹. To address the high false positive rate in INDEL variant calls, only the INDELs that were identified by both variant callers and have passed the filters were retained.
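The two filtering ideas in this section, subtracting variants seen in drug-sensitive controls and keeping only INDELs reported by both callers, reduce to set operations. The sketch below is a simplification and not the pipeline used in the study: it matches variants by exact chromosome, position, reference and alternate allele, whereas tools such as Bedtools intersect operate on positional overlap. All function names are hypothetical.

```python
def variant_key(chrom, pos, ref, alt):
    """Normalize a variant into a hashable key for exact-match comparison."""
    return (str(chrom), int(pos), ref.upper(), alt.upper())


def remove_background(sample_variants, control_variant_sets):
    """Drop variants that appear in any of the control (drug-sensitive) samples."""
    background = set().union(*control_variant_sets)
    return set(sample_variants) - background


def concordant_indels(caller_a_indels, caller_b_indels):
    """Keep only INDELs reported by both variant callers."""
    return set(caller_a_indels) & set(caller_b_indels)
```

For example, a resistant clone's call set would be passed to `remove_background` together with one set per control sample, and its INDEL subset would then be intersected with the calls of the second caller.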

Variant prioritization

Variants were prioritized by their Ensembl VEP⁴⁷ predicted consequence: we retained variants predicted to cause a frameshift, a premature stop codon, a missense mutation, a lost start/stop codon, a synonymous mutation, an in-frame insertion or deletion, and in the case of mouse data, those annotated to affect splice donor/acceptor bases. Genes were prioritized by ranking them by the number of distinct mutations identified in each gene.
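Ranking genes by their number of distinct mutations can be sketched as follows. This is an illustrative reconstruction, not the code from the authors' repository: records are assumed to be (gene, mutation, consequence) tuples, the consequence labels are assumed to be the Sequence Ontology terms reported by Ensembl VEP, and the function name and toy mutation identifiers are hypothetical.

```python
from collections import defaultdict

# Consequence classes retained in the text, written as VEP-style terms (assumed mapping).
RETAINED_CONSEQUENCES = {
    "frameshift_variant", "stop_gained", "missense_variant",
    "start_lost", "stop_lost", "synonymous_variant",
    "inframe_insertion", "inframe_deletion",
    "splice_donor_variant", "splice_acceptor_variant",  # mouse data only in the text
}


def rank_genes(records, retained=RETAINED_CONSEQUENCES):
    """Rank genes by the number of distinct retained mutations, highest first."""
    distinct = defaultdict(set)
    for gene, mutation, consequence in records:
        if consequence in retained:
            distinct[gene].add(mutation)  # the same mutation in two clones counts once
    return sorted(
        ((gene, len(mutations)) for gene, mutations in distinct.items()),
        key=lambda item: (-item[1], item[0]),  # ties broken alphabetically
    )
```

Counting distinct mutations rather than mutated clones means that a single variant shared by many sibling clones does not inflate a gene's rank.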

Missense mutations identified in genes of interest were ranked by using predictions of PROVEAN/SIFT^25,50,51,52 and PredictProtein²⁶. Scores below −2.5 for PROVEAN, above 50 for PredictProtein and below 0.05 for SIFT indicate likely deleterious consequences to protein function.
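The three cut-offs above can be applied together in a small helper. This is a sketch that only encodes the thresholds stated in the text; the function name and return format are hypothetical, and the scores themselves would come from the respective prediction tools.

```python
def deleterious_calls(provean=None, predictprotein=None, sift=None):
    """Return a per-tool verdict (True = likely deleterious) for the scores supplied."""
    calls = {}
    if provean is not None:
        calls["PROVEAN"] = provean < -2.5          # below -2.5 is deleterious
    if predictprotein is not None:
        calls["PredictProtein"] = predictprotein > 50  # above 50 is deleterious
    if sift is not None:
        calls["SIFT"] = sift < 0.05                # below 0.05 is deleterious
    return calls
```

Returning one verdict per tool, rather than a single combined flag, keeps disagreements between predictors visible when ranking missense mutations.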

Analysis of olaparib resistant cell lines

Molecular modeling

Molecular models were generated by using PyMOL. Crystal structure data were obtained from RCSB Protein Data Bank (PDB). The codes for PARP1 structures were 4DQY (human PARP1 without ZN2 or BRCT domain) and 3ODC (human PARP1 ZN2).

Antibodies for immunoblotting

Anti-PARP1 (Cell Signaling, 9542; 1:1000 in TBST 5% milk), anti-HDAC1 (Abcam ab19845; 1:1000 in TBST 1% BSA), anti-PAR (Trevigen 4336-BPC-100; 1:1000 in PBST [0.05% Tween-20] 5% milk), and anti-β-actin (Cell Signaling #4970; 1:1000 in TBST 5% BSA) were incubated with western-blot membranes at 4 °C overnight with gentle rocking.

DNA binding assays

DNA binding of PARP1 was assayed by using a 26-bp palindromic DNA duplex (5′GCCTACCGGTTCGCGAACCGGTAGGC3′³³) immobilized on Dynabeads™ M-280 Streptavidin (Invitrogen, 10 mg/ml). Individual incubations used 500 µg of protein extract and a buffer containing 10 mM HEPES (pH 7.4), MgCl₂ (1.5 mM), 25% glycerol, KCl (200 mM), EDTA (0.2 mM), Roche protease inhibitor cocktail (0.7X), DTT (0.5 mM) and AEBSF (0.1 mM). Experiments were repeated at least twice.

PARylation assay

Cells were treated with 6 μM olaparib overnight, followed by 1 mM H₂O₂ for 10 minutes in the dark, then washed with ice-cold PBS and collected. Cells were lysed in Laemmli buffer (120 mM Tris-HCl pH 6.8, 4% SDS, 20% glycerol), and lysates were separated on a 4–20% Tris-Glycine gradient gel followed by transfer onto a PVDF membrane. Membranes were immunoblotted with the appropriate antibodies. Experiments were repeated at least twice.

Data availability

The datasets generated and/or analysed during the current study are available in the European Nucleotide Archive (ENA) under the accession codes PRJEB2977 (yeast) and PRJEB13612 (mouse). Access codes for specific samples are detailed in Supplementary Table 2. Source data for Figs 2 and 3 are provided in Supplementary Table S2; source data for Figs 4 and 5 are provided in Supplementary Table S3 and Supplementary Figure S3.

Code availability

Custom code used to analyse sequencing data and to draw figures is available in the following repository: http://github.com/fabiopuddu/Herzog2018.

References

Hillenmeyer, M. E. et al. The chemical genomic portrait of yeast: uncovering a phenotype for all genes. Science (New York, N.Y.) 320, 362–365 (2008).
Parsons, A. B. et al. Exploring the mode-of-action of bioactive compounds by chemical-genetic profiling in yeast. Cell 126, 611–625 (2006).
Nijman, S. M. B. Functional genomics to uncover drug mechanism of action. Nat. Chem. Biol. 11, 942–948 (2015).
van Leeuwen, J. et al. Exploring genetic suppression interactions on a global scale. Science (New York, N.Y.) 354, aag0839–aag0839 (2016).
Forsburg, S. L. The art and design of genetic screens: yeast. Nature reviews. Genetics 2, 659–668 (2001).
Grimm, S. The art and design of genetic screens: mammalian culture cells. Nature reviews. Genetics 5, 179–189 (2004).
Horner, V. L. & Caspary, T. Creating a ‘hopeful monster’: mouse forward genetic screens. Methods Mol. Biol. 770, 313–336 (2011).
Trahey, M. & McCormick, F. A cytoplasmic protein stimulates normal N-ras p21 GTPase, but does not affect oncogenic mutants. Science (New York, N.Y.) 238, 542–545 (1987).
Ingram, V. M. Gene mutations in human haemoglobin: the chemical difference between normal and sickle cell haemoglobin. Nature 180, 326–328 (1957).
Rolef Ben-Shahar, T. et al. Eco1-dependent cohesin acetylation during establishment of sister chromatid cohesion. Science (New York, N.Y.) 321, 563–566 (2008).
Carette, J. E. et al. Haploid genetic screens in human cells identify host factors used by pathogens. Science (New York, N.Y.) 326, 1231–1235 (2009).
Hsiang, Y. H., Hertzberg, R., Hecht, S. & Liu, L. F. Camptothecin induces protein-linked DNA breaks via mammalian DNA topoisomerase I. The Journal of biological chemistry 260, 14873–14878 (1985).
Bjornsti, M.-A., Benedetti, P., Viglianti, G. A. & Wang, J. C. Expression of human DNA topoisomerase I in yeast cells lacking yeast DNA topoisomerase I: restoration of sensitivity of the cells to the antitumor drug camptothecin. Cancer Res. 49, 6318–6323 (1989).
Pommier, Y., Sun, Y., Huang, S.-Y. N. & Nitiss, J. L. Roles of eukaryotic topoisomerases in transcription, replication and genomic stability. Nature reviews. Molecular cell biology 17, 703–721 (2016).
Rouse, J. Esc4p, a new target of Mec1p (ATR), promotes resumption of DNA synthesis after DNA damage. The EMBO journal 23, 1188–1197 (2004).
Puddu, F. et al. Synthetic viability genomic screening defines Sae2 function in DNA repair. The EMBO journal 34, 1509–1522 (2015).
Huertas, P., Cortés-Ledesma, F., Sartori, A. A., Aguilera, A. & Jackson, S. P. CDK targets Sae2 to control DNA-end resection and homologous recombination. Nature 455, 689–692 (2008).
Mimitou, E. P. & Symington, L. S. Ku prevents Exo1 and Sgs1-dependent resection of DNA ends in the absence of a functional MRX complex or Sae2. EMBO J 29, 3358–3369 (2010).
Coulondre, C. & Miller, J. H. Genetic studies of the lac repressor. IV. Mutagenic specificity in the lacI gene of Escherichia coli. J. Mol. Biol. 117, 577–606 (1977).
Reid, R. J., Kauh, E. A. & Bjornsti, M.-A. Camptothecin sensitivity is mediated by the pleiotropic drug resistance network in yeast. The Journal of biological chemistry 272, 12091–12099 (1997).
Puddu, F. et al. Chromatin determinants impart camptothecin sensitivity. EMBO reports e201643560, https://doi.org/10.15252/embr.201643560 (2017).
Mitchell, P. & Tollervey, D. An NMD pathway in yeast involving accelerated deadenylation and exosome-mediated 3′–5′ degradation. Molecular Cell 11, 1405–1413 (2003).
Lynn, R. M., Bjornsti, M.-A., Caron, P. R. & Wang, J. C. Peptide sequencing and site-directed mutagenesis identify tyrosine-727 as the active site tyrosine of Saccharomyces cerevisiae DNA topoisomerase I. Proc Natl Acad Sci USA 86, 3559–3563 (1989).
Eng, W. K., Pandit, S. D. & Sternglanz, R. Mapping of the active site tyrosine of eukaryotic DNA topoisomerase I. The Journal of biological chemistry 264, 13373–13376 (1989).
Kumar, P., Henikoff, S. & Ng, P. C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nature protocols 4, 1073–1081 (2009).
Yachdav, G. et al. PredictProtein–an open resource for online prediction of protein structural and functional features. Nucleic acids research 42, W337–43 (2014).
Redinbo, M. R., Stewart, L., Kuhn, P., Champoux, J. J. & Hol, W. G. Crystal structures of human topoisomerase I in covalent and noncovalent complexes with DNA. Science (New York, N.Y.) 279, 1504–1513 (1998).
Stewart, L., Redinbo, M. R., Qiu, X., Hol, W. G. & Champoux, J. J. A model for the mechanism of human topoisomerase I. Science (New York, N.Y.) 279, 1534–1541 (1998).
Forment, J. V. et al. Genome-wide genetic screening with chemically mutagenized haploid embryonic stem cells. Nat. Chem. Biol. 13, 12–14 (2017).
Lord, C. J. & Ashworth, A. PARP inhibitors: Synthetic lethality in the clinic. Science (New York, N.Y.) 355, 1152–1158 (2017).
Murai, J. et al. Trapping of PARP1 and PARP2 by Clinical PARP Inhibitors. Cancer Res. 72, 5588–5599 (2012).
Langelier, M.-F., Planck, J. L., Roy, S. & Pascal, J. M. Crystal structures of poly(ADP-ribose) polymerase-1 (PARP-1) zinc fingers bound to DNA: structural and functional insights into DNA-dependent PARP-1 activity. The Journal of biological chemistry 286, 10690–10701 (2011).
Langelier, M.-F., Planck, J. L., Roy, S. & Pascal, J. M. Structural basis for DNA damage-dependent poly(ADP-ribosyl)ation by human PARP-1. Science (New York, N.Y.) 336, 728–732 (2012).
Michel, A. H. et al. Functional mapping of yeast genomes by saturated transposition. eLife 6, E3179 (2017).
Hardy, C. F., Dryga, O., Seematter, S., Pahl, P. M. & Sclafani, R. A. mcm5/cdc46-bob1 bypasses the requirement for the S phase activator Cdc7p. Proc Natl Acad Sci USA 94, 3151–3155 (1997).
Sandrock, T. M., O'Dell, J. L. & Adams, A. E. Allele-specific suppression by formation of new protein-protein interactions in yeast. Genetics 147, 1635–1642 (1997).
Zhao, X., Chabes, A., Domkin, V., Thelander, L. & Rothstein, R. The ribonucleotide reductase inhibitor Sml1 is a new target of the Mec1/Rad53 kinase cascade during growth and in response to DNA damage. The EMBO journal 20, 3544–3553 (2001).
Farmer, H. et al. Targeting the DNA repair defect in BRCA mutant cells as a therapeutic strategy. Nature 434, 917–921 (2005).
Bryant, H. E. et al. Specific killing of BRCA2-deficient tumours with inhibitors of poly(ADP-ribose) polymerase. Nature 434, 913–917 (2005).
Helleday, T. The underlying mechanism for the PARP and BRCA synthetic lethality: clearing up the misunderstandings. Mol Oncol 5, 387–393 (2011).
Pettitt, S. J. et al. A genetic screen using the PiggyBac transposon in haploid cells identifies Parp1 as a mediator of olaparib toxicity. PloS one 8, e61520 (2013).
Leeb, M. & Wutz, A. Derivation of haploid embryonic stem cells from mouse embryos. Nature 479, 131–134 (2011).
Elling, U. et al. Forward and reverse genetics through derivation of haploid mouse embryonic stem cells. Cell Stem Cell 9, 563–574 (2011).
Munroe, R. & Schimenti, J. Mutagenesis of mouse embryonic stem cells with ethylmethanesulfonate. Methods Mol. Biol. 530, 131–138 (2009).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England) 25, 1754–1760 (2009).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics (Oxford, England) 25, 2078–2079 (2009).
McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics (Oxford, England) 26, 2069–2070 (2010).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics (Oxford, England) 27, 2156–2158 (2011).
Narzisi, G. et al. Accurate de novo and transmitted indel detection in exome-capture data using microassembly. Nature methods 11, 1033–1036 (2014).
Ng, P. C. & Henikoff, S. Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet 7, 61–80 (2006).
Ng, P. C. & Henikoff, S. Predicting deleterious amino acid substitutions. Genome research 11, 863–874 (2001).
Ng, P. C. & Henikoff, S. SIFT: Predicting amino acid changes that affect protein function. Nucleic acids research 31, 3812–3814 (2003).

Acknowledgements

We thank Carmen Diaz Soria for help with the yeast screens; Josef Penninger for the gift of the AN3-12 mESC line; Carla Daniela Robles Espinoza and Martin del Castillo Velasco-Herrera for critical reading of the manuscript; James Hewinson for assistance with the submission for sequencing samples; and all the members of the SPJ laboratory for helpful discussions. This work was supported by Cancer Research UK [Programme Grant C6/A18796, Institute Core Funding C6946/A24843]; the Wellcome Trust [101126/Z/13/Z (Strategic Award - COMSIG), Investigator Award 206388/Z/17/Z, Institute Core Funding WT203144, PhD Fellowship 098051 to M.H.]; and the European Molecular Biology Organization [Long-Term Fellowship ALTF1287-2011 to F.P.].

Author information

Josep V. Forment
Present address: AstraZeneca, Oncology DNA damage response group, Hodgkin Building, 310 Cambridge Science Park, Milton Road, CB4 0WG, Cambridge, UK
Mareike Herzog and Fabio Puddu contributed equally to this work.

Authors and Affiliations

The Wellcome/CRUK Gurdon Institute and Department of Biochemistry, University of Cambridge, Tennis Court Road, CB2 1QN, Cambridge, UK
Mareike Herzog, Fabio Puddu, Julia Coates, Nicola Geisler, Josep V. Forment & Stephen P. Jackson

Contributions

The project was conceived by J.V.F., F.P., M.H. and S.P.J. Yeast mutagenesis, suppressor screening, and DNA extractions were carried out by F.P. and N.J.G. Mouse haploid cell purification, EMS mutagenesis, and screening were carried out by J.V.F.; DNA extractions and PARP assays were carried out by J.C. Analysis of yeast and mouse whole genome/exome sequencing data was carried out by F.P. and M.H., respectively. The manuscript was largely written by M.H., F.P., and S.P.J., with contributions made by all the other authors.

Corresponding authors

Correspondence to Josep V. Forment or Stephen P. Jackson.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Figures and Legends

Supplementary Table 1

Supplementary Table 2

Supplementary Table 3

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Herzog, M., Puddu, F., Coates, J. et al. Detection of functional protein domains by unbiased genome-wide forward genetic screening. Sci Rep 8, 6161 (2018). https://doi.org/10.1038/s41598-018-24400-4

Received: 09 January 2018
Accepted: 22 March 2018
Published: 18 April 2018
DOI: https://doi.org/10.1038/s41598-018-24400-4
