Abstract
Covalent chemistry represents an attractive strategy for expanding the ligandability of the proteome, and chemical proteomics has revealed numerous electrophile-reactive cysteines on diverse human proteins. Determining which of these covalent binding events affect protein function, however, remains challenging. Here we describe a base-editing strategy to infer the functionality of cysteines by quantifying the impact of their missense mutation on cancer cell proliferation. The resulting atlas, which covers more than 13,800 cysteines on more than 1,750 cancer dependency proteins, confirms the essentiality of cysteines targeted by covalent drugs and, when integrated with chemical proteomic data, identifies essential, ligandable cysteines in more than 160 cancer dependency proteins. We further show that a stereoselective and site-specific ligand targeting an essential cysteine in TOE1 inhibits the nuclease activity of this protein through an apparent allosteric mechanism. Our findings thus describe a versatile method and valuable resource to prioritize the pursuit of small-molecule probes with high function-perturbing potential.
Data availability
Proteomics data have been deposited to the ProteomeXchange Consortium (PXD038232, PXD038239 and PXD041314). Sequencing data have been deposited in the National Center for Biotechnology Information Sequence Read Archive (PRJNA905477). Processed screen data and proteomics data are provided as Supplementary Data. Databases used in this study include the DepMap (https://depmap.org/portal/, 21Q3), UniProt (https://www.uniprot.org/, release 2016-07), AlphaFold (https://alphafold.ebi.ac.uk/, 2022), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/, v.2023-04), GENCODE (http://www.gencodegenes.org/, 2020), Ensembl (https://rest.ensembl.org/, 2023-04) and APPRIS (https://appris.bioinfo.cnio.es/#/, 2020). Source data are provided with this paper.
Code availability
Custom code used in the analysis is available on GitHub (https://github.com/cravattlab/Cys_editing).
References
Schreiber, S. L. et al. Advancing biological understanding and therapeutics discovery with small-molecule probes. Cell 161, 1252–1265 (2015).
Tsherniak, A. et al. Defining a cancer dependency map. Cell 170, 564–576.e16 (2017).
Ghandi, M. et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).
Schreiber, S. L. A chemical biology view of bioactive small molecules and a binder-based approach to connect biology to precision medicines. Isr. J. Chem. 59, 52–59 (2019).
Scott, D. E., Coyne, A. G., Hudson, S. A. & Abell, C. Fragment-based approaches in drug discovery and chemical biology. Biochemistry 51, 4990–5003 (2012).
Brenner, S. & Lerner, R. A. Encoded combinatorial chemistry. Proc. Natl Acad. Sci. USA 89, 5381–5383 (1992).
Backus, K. M. et al. Proteome-wide covalent ligand discovery in native biological systems. Nature 534, 570–574 (2016).
Weerapana, E. et al. Quantitative reactivity profiling predicts functional cysteines in proteomes. Nature 468, 790–795 (2010).
Arkin, M. R. & Wells, J. A. Small-molecule inhibitors of protein–protein interactions: progressing towards the dream. Nat. Rev. Drug Discov. 3, 301–317 (2004).
Wakefield, A. E., Kozakov, D. & Vajda, S. Mapping the binding sites of challenging drug targets. Curr. Opin. Struct. Biol. 75, 102396 (2022).
Bar-Peled, L. et al. Chemical proteomics identifies druggable vulnerabilities in a genetically defined cancer. Cell 171, 696–709.e23 (2017).
Vinogradova, E. V. et al. An activity-guided map of electrophile-cysteine interactions in primary human T cells. Cell 182, 1009–1026.e29 (2020).
Kuljanin, M. et al. Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries. Nat. Biotechnol. 39, 630–641 (2021).
Maurais, A. J. & Weerapana, E. Reactive-cysteine profiling for drug discovery. Curr. Opin. Chem. Biol. 50, 29–36 (2019).
Abbasov, M. E. et al. A proteome-wide atlas of lysine-reactive chemistry. Nat. Chem. 13, 1081–1092 (2021).
Spradlin, J. N., Zhang, E. & Nomura, D. K. Reimagining druggability using chemoproteomic platforms. Acc. Chem. Res. 54, 1801–1813 (2021).
Lu, W. et al. Fragment-based covalent ligand discovery. RSC Chem. Biol. 2, 354–367 (2021).
Cross, D. A. E. et al. AZD9291, an irreversible EGFR TKI, overcomes T790M-mediated resistance to EGFR inhibitors in lung cancer. Cancer Discov. 4, 1046–1061 (2014).
Ostrem, J. M., Peters, U., Sos, M. L., Wells, J. A. & Shokat, K. M. K-Ras(G12C) inhibitors allosterically control GTP affinity and effector interactions. Nature 503, 548–551 (2013).
Lanman, B. A. et al. Discovery of a covalent inhibitor of KRAS G12C (AMG 510) for the treatment of solid tumors. J. Med. Chem. 63, 52–65 (2020).
Kavanagh, M. E. et al. Selective inhibitors of JAK1 targeting an isoform-restricted allosteric cysteine. Nat. Chem. Biol. 18, 1388–1398 (2022).
Feldman, H. C. et al. Selective inhibitors of SARM1 targeting an allosteric cysteine in the autoregulatory ARM domain. Proc. Natl Acad. Sci. USA 119, e2208457119 (2022).
Gaudelli, N. M. et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature 551, 464–471 (2017).
Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016).
Landrum, M. J. et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42, D980–D985 (2014).
Nishimasu, H. et al. Engineered CRISPR-Cas9 nuclease with expanded targeting space. Science 361, 1259–1262 (2018).
Richter, M. F. et al. Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity. Nat. Biotechnol. 38, 883–891 (2020).
Thuronyi, B. W. et al. Continuous evolution of base editors with expanded target compatibility and improved activity. Nat. Biotechnol. 37, 1070–1079 (2019).
Vogl, D. T. et al. Selective inhibition of nuclear export with oral Selinexor for treatment of relapsed or refractory multiple myeloma. J. Clin. Oncol. 36, 859–866 (2018).
Shi, J. et al. Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nat. Biotechnol. 33, 661–667 (2015).
Zhang, F. et al. Pyridinylquinazolines selectively inhibit human methionine aminopeptidase-1 in cells. J. Med. Chem. 56, 3996–4016 (2013).
Clark, K. L., Halay, E. D., Lai, E. & Burley, S. K. Co-crystal structure of the HNF-3/fork head DNA-recognition motif resembles histone H5. Nature 364, 412–420 (1993).
Parolia, A. et al. Distinct structural classes of activating FOXA1 alterations in advanced prostate cancer. Nature 571, 413–418 (2019).
Adams, E. J. et al. FOXA1 mutations alter pioneering activity, differentiation and prostate cancer phenotypes. Nature 571, 408–412 (2019).
Arruabarrena-Aristorena, A. et al. FOXA1 mutations reveal distinct chromatin profiles and influence therapeutic response in breast cancer. Cancer Cell 38, 534–550.e9 (2020).
Lardelli, R. M. et al. Biallelic mutations in the 3′ exonuclease TOE1 cause pontocerebellar hypoplasia and uncover a role in snRNA processing. Nat. Genet. 49, 457–464 (2017).
Lardelli, R. M. & Lykke-Andersen, J. Competition between maturation and degradation drives human snRNA 3′ end quality control. Genes Dev. 34, 989–1001 (2020).
Son, A., Park, J.-E. & Kim, V. N. PARN and TOE1 constitute a 3′ end maturation module for nuclear non-coding RNAs. Cell Rep. 23, 888–898 (2018).
Lazear, M. R. et al. Proteomic discovery of chemical probes that perturb protein complexes in human cells. Mol. Cell 83, 1725–1742.e12 (2023).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Fairman, J. W. et al. Structural basis for allosteric regulation of human ribonucleotide reductase by nucleotide-induced oligomerization. Nat. Struct. Mol. Biol. 18, 316–322 (2011).
Litman, R. et al. BACH1 is critical for homologous recombination and appears to be the Fanconi anemia gene product FANCJ. Cancer Cell 8, 255–265 (2005).
White, M. E. H., Gil, J. & Tate, E. W. Proteome-wide structural analysis identifies warhead- and coverage-specific biases in cysteine-focused chemoproteomics. Cell Chem. Biol. 30, 828–838.e4 (2023).
Boatner, L. M., Palafox, M. F., Schweppe, D. K. & Backus, K. M. CysDB: a human cysteine database based on experimental quantitative chemoproteomics. Cell Chem. Biol. 30, 683–698.e3 (2023).
Benns, H. J. et al. CRISPR-based oligo recombineering prioritizes apicomplexan cysteines for drug discovery. Nat. Microbiol 7, 1891–1905 (2022).
Sánchez-Rivera, F. J. et al. Base editing sensor libraries for high-throughput engineering and functional analysis of cancer-associated single nucleotide variants. Nat. Biotechnol. 40, 862–873 (2022).
Kim, Y. et al. High-throughput functional evaluation of human cancer-associated mutations using base editors. Nat. Biotechnol. 40, 874–884 (2022).
Hanna, R. E. et al. Massively parallel assessment of human variants with base editor screens. Cell 184, 1064–1080.e20 (2021).
Lue, N. Z. et al. Base editor scanning charts the DNMT3A activity landscape. Nat. Chem. Biol. 19, 176–186 (2023).
Békés, M., Langley, D. R. & Crews, C. M. PROTAC targeted protein degraders: the past is prologue. Nat. Rev. Drug Discov. 21, 181–200 (2022).
Huang, T. P., Newby, G. A. & Liu, D. R. Precision genome editing using cytosine and adenine base editors in mammalian cells. Nat. Protoc. 16, 1089–1128 (2021).
Clement, K. et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 37, 224–226 (2019).
Rodriguez, J. M. et al. APPRIS 2017: principal isoforms for multiple gene sets. Nucleic Acids Res. 46, D213–D217 (2018).
Wagner, E., Clement, S. L. & Lykke-Andersen, J. An unconventional human Ccr4-Caf1 deadenylase complex in nuclear Cajal bodies. Mol. Cell. Biol. 27, 1686–1695 (2007).
Acknowledgements
We thank J. Doench (Broad Institute) and J. Luo (National Institutes of Health, NIH) for helpful discussions regarding CRISPR library cloning. This work was supported by the NIH (grant nos. R35 CA231991 awarded to B.F.C., U01 AI142756 awarded to D.R.L., RM1 HG009490 awarded to D.R.L., R35 GM118062 awarded to D.R.L., R35 GM118069 awarded to J.L.-A.), the Damon Runyon Cancer Research Foundation (DRG: grant no. 2406-20 awarded to H.L.), the Jane Coffin Childs Fund (awarded to K.E.D.), the Mark Foundation for Cancer Research (H.L.) and the Howard Hughes Medical Institute (D.R.L.).
Author information
Contributions
H.L. and B.F.C. conceived the study and wrote the manuscript. H.L., T.M., J.R.R., D.O., S.J.W., K.E.D. and E.N. performed the experiments. H.L., K.T.Z., T.P.H. and D.R.L. contributed to the design and interpretation of base-editing experiments. H.L., T.M., B.M., J.L.-A., G.M.S., B.L., S.L.S. and B.F.C. contributed to data analysis and interpretation. All authors edited and approved the manuscript. B.F.C. supervised the study.
Ethics declarations
Competing interests
B.F.C. is a founder and scientific advisor to Vividion Therapeutics. D.R.L. is a consultant and/or equity owner for Prime Medicine, Beam Therapeutics, Pairwise Plants, Chroma Medicine, and Nvelop Therapeutics, companies that use or deliver genome editing or epigenome engineering agents. The other authors declare no competing interests.
Peer review
Peer review information
Nature Chemical Biology thanks the anonymous reviewers for their contribution to the peer review of this work.
Extended data
Extended Data Fig. 1 Base editing assigns essentiality to cysteines targeted by anti-cancer drugs.
a, b, Heatmap showing the mutagenesis effects of Adenine Base Editors (ABE) (a) and Cytosine Base Editors (CBE) (b) on different amino acid residues. The total percentage is normalized to 100% per row. c, Schematic showing the design of lentivirus vectors delivering base editor libraries used in this study. CDA: cytidine deaminase; TadA: deoxyadenosine deaminase; UGI: uracil glycosylase inhibitor; PuroR: puromycin resistance gene. d, Waterfall plots for ABE (left) and CBE (right) libraries showing the dropout significance calculated per EGFR residue between day 16 vs day 1 from saturated scanning experiments using PC14 cells (cutoff: LFC < −0.5; p < 0.05). Data represent average values of two independent experiments. One-sided p values were calculated using randomization tests (no multiple comparison adjustments; see source code on GitHub).
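The randomization test named in panel d can be illustrated with a minimal sketch. It is not the authors' implementation, which is in the GitHub repository listed under Code availability. The function names, the use of a background pool of sgRNA LFC values, and the add-one correction are all assumptions made for illustration; only the cutoffs (LFC < −0.5; p < 0.05) come from the caption.

```python
import random
from statistics import mean


def randomization_p_value(residue_lfcs, background_lfcs, n_resamples=10_000, seed=0):
    """One-sided p value: is the mean LFC for a residue lower than chance?

    residue_lfcs    -- LFC values of the sgRNAs assigned to one residue
    background_lfcs -- hypothetical pool of LFC values used as the null
    """
    rng = random.Random(seed)
    observed = mean(residue_lfcs)
    k = len(residue_lfcs)
    # Count resampled means that are at least as negative as the observed mean.
    hits = sum(
        mean(rng.sample(background_lfcs, k)) <= observed
        for _ in range(n_resamples)
    )
    # Add-one correction keeps the estimate strictly above zero (an assumption).
    return (hits + 1) / (n_resamples + 1)


def is_dropout_hit(lfc, p_value, lfc_cutoff=-0.5, p_cutoff=0.05):
    """Apply the cutoffs stated in the caption (LFC < -0.5; p < 0.05)."""
    return lfc < lfc_cutoff and p_value < p_cutoff
```

The smallest attainable p value is 1 / (n_resamples + 1), so the number of resamples sets the resolution of the test.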
Extended Data Fig. 2 Global discovery of essential cysteines in cancer dependency proteins.
a, Scatter plot showing the dependencies included in the base-editing screens. b, Barplots summarizing the number of designed sgRNAs per cysteine. (Left) all or (right) ligandable cysteines in the dependency proteins. c, Plot comparing the essential cysteine hit rate using sgRNA targeting cysteines in dependency proteins or non-targeting sgRNA data randomly resampled from the library 10 times. The p value was calculated using two-sided Student’s t test. d, Scatter plot showing the correlation between cysteine dropouts observed with the ABE or the CBE library for Common Essential proteins in PC14 and KMS26. The Pearson correlation and two-sided p value are shown. e, f, Density scatter plots comparing the evolutionary conservation and significance of dropout of cysteines using the ABE (e) or CBE (f). Pearson correlations and two-sided p values are shown. g, Barplot showing the distribution of missense mutations for essential cysteines in orthologous proteins. h, i, Scatter plots comparing the day 16 vs day 1 LFC values of cysteines in Common Essential dependencies in KMS26 versus PC14 cells. The essential cysteines are highlighted for (h) ABE and (i) CBE libraries. j, k, Dot plots comparing the dropouts of cysteines from Strongly Selective dependencies using (j) PC14 or (k) KMS26 cells. Each pair of points compares the same cysteine. The cysteines were included based on the first essentiality filter (FDR < 10%, LFC < −0.6) only using data from one cell line without considering the dependency difference between the two cell lines. A second selectivity filter (greater dropout in the more dependent cell line, LFC difference > 0.3) was applied, and cysteines passing this filter are shown in red. Only Strongly Selective proteins with gene-level dependency scores that are sufficiently different (CERES difference > 0.8) between PC14 and KMS26 were included. The p values were calculated using two-sided paired Student’s t test based on all cysteines that pass the first filter.
For c, d, e, f, h, i, j, k, the base-editing dropout screen data represent two independent experiments in PC14 and KMS26.
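The two-step filtering described for panels j and k can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the record type, field names and site labels are hypothetical, the first filter is assumed to use the more dependent cell line, and the LFC difference is assumed to mean the LFC in the less dependent line minus the LFC in the more dependent line. The thresholds (FDR < 10%, LFC < −0.6, LFC difference > 0.3, CERES difference > 0.8) are taken from the caption.

```python
from dataclasses import dataclass


@dataclass
class CysteineRecord:
    site: str                # hypothetical label, e.g. "GENE1_C10"
    lfc_dependent: float     # day 16 vs day 1 LFC in the more dependent line
    lfc_other: float         # LFC in the less dependent line
    fdr: float               # FDR in the more dependent line, as a fraction
    ceres_difference: float  # absolute gene-level CERES difference


def passes_essentiality(rec, fdr_cutoff=0.10, lfc_cutoff=-0.6):
    """First filter: significant dropout in a single cell line."""
    return rec.fdr < fdr_cutoff and rec.lfc_dependent < lfc_cutoff


def passes_selectivity(rec, lfc_gap=0.3):
    """Second filter: greater dropout in the more dependent cell line."""
    return (rec.lfc_other - rec.lfc_dependent) > lfc_gap


def classify(records, ceres_cutoff=0.8):
    """Label cysteines that pass the first filter; skip everything else."""
    labels = {}
    for rec in records:
        if rec.ceres_difference <= ceres_cutoff:
            continue  # gene-level dependency is not different enough
        if not passes_essentiality(rec):
            continue
        labels[rec.site] = "selective" if passes_selectivity(rec) else "essential_only"
    return labels
```

Cysteines labeled "selective" would correspond to the points shown in red in panels j and k under these assumptions.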
Extended Data Fig. 3 Base editing reveals essentiality of ligandable cysteine regions in Strongly Selective cancer dependency proteins.
a, b, Histogram showing the nucleotide substitution efficiencies for 70 sgRNAs using (a) ABE8e-NG or (b) evoCDA-NG in PC14 cells. The relative residue mutation frequency five days after lentivirus infection was quantified using targeted genomic PCR and amplicon sequencing. The percentage of gene editing was estimated by the ratio of reads that contain nucleotide substitutions over total aligned reads. Data represent average values of two independent experiments. c, Scatter plots for the EGFR_C797 region showing the correlation between base editing dropout data acquired herein and gene disruption dropout data (DepMap). The two independent experiments are shown in blue and red. Pearson correlations and associated two-sided p values are shown. d, Bar graph comparing the editing efficiencies among the sgRNAs that did or did not create significant base editing-knockout correlations. Each point represents a different sgRNA designed for FOXA1 or TOE1. Data represent average values of two independent experiments. The p value was calculated using two-sided Student’s t test. e, The fragment electrophile ligandability profile for cysteines in JAK1, showing ligandability of C817. Note that C810 and C817 are on the same tryptic peptide, so their ligandability cannot be distinguished by cysteine-directed MS-ABPP. However, other studies have verified that C817 is the liganded cysteine in JAK1 (ref. 21). f, Scatter plots for the JAK1_C817 region showing the correlation between base editing dropout data acquired herein and gene disruption dropout data (DepMap). The two independent experiments are shown in blue and red. Pearson correlations and associated two-sided p values are shown. g, The fragment electrophile ligandability profile for cysteines in FOXA1, showing ligandability of C258.
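The editing percentage defined in panels a and b (reads containing nucleotide substitutions over total aligned reads) can be illustrated with a toy calculation. This sketch assumes reads are already aligned to the reference amplicon, have the same length as the reference, and contain no indels; the function name and the optional window argument are hypothetical and do not reflect the authors' pipeline.

```python
def percent_edited(reference, aligned_reads, window=None):
    """Percentage of aligned reads carrying at least one substitution.

    reference     -- reference amplicon sequence
    aligned_reads -- reads aligned to the reference (same length, no indels)
    window        -- optional (start, end) half-open range to restrict counting
    """
    if not aligned_reads:
        raise ValueError("no aligned reads")
    start, end = window if window is not None else (0, len(reference))
    # A read counts as edited if any base in the window differs from the reference.
    edited = sum(
        any(read[i] != reference[i] for i in range(start, end))
        for read in aligned_reads
    )
    return 100.0 * edited / len(aligned_reads)
```

Restricting the count to a window around the protospacer is one way to avoid scoring sequencing errors far from the target site as edits; whether the study applied such a window is not stated in the caption.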
Extended Data Fig. 4 Functional characterization of covalent ligands targeting an essential cysteine C80 in TOE1.
a, The fragment electrophile ligandability profile for cysteines in TOE1, showing ligandability of C80. b, Quantification of TOE1_C80 engagement by the indicated compounds as measured by cysteine-directed ABPP performed in Ramos cells (20 µM, 3 h). Data are average values ± SEM normalized to DMSO from n = 2 independent experiments. c, Western blot showing recombinantly expressed FLAG-tagged TOE1 in MCC142 cells. GAPDH was used as a loading control. This was repeated in 3 independent experiments with similar results. d, Schematic summarizing the TOE1 deadenylation assay workflow. e, Alignment of TOE1 orthologous proteins in a panel of selected vertebrate species. Only the aligned sequences near human TOE1_C80 are shown. f, Comparison of deadenylase activity of recombinant FLAG-WT- and C80S-TOE1 proteins immunoprecipitated from MCC142 cells. Data represent two independent experiments. g, Quantification of time-dependent TOE1 deadenylation activity as shown in Fig. 4h. Data represent 2 independent experiments. h, Quantification of TOE1_C80 engagement by the indicated compounds (20 µM, 6 h) as measured by monitoring the C80-containing tryptic peptide in IP–MS experiments from FLAG-WT-TOE1-expressing MCC142 cells. Data are average values ± SEM normalized to DMSO from n = 2 independent experiments. i, Schematic for growth competition assay. Briefly, the TOE1-dependent cell line MCC142 was spin-infected with lentiviral vectors expressing codon-optimized FLAG-WT- or C80S-TOE1. After antibiotic (G418) selection, cells were mixed at a 1:1 ratio and treated with WX-02-33 (10 µM) or DMSO and grown for 8 days, after which genomic DNA was extracted and targeted amplicon sequencing was used to quantify the relative frequency of C80S vs WT. j, Relative allele frequency of MCC142 cells expressing WT- or C80S-TOE1 treated with 10 µM WX-02-33 or DMSO. The p value was calculated using two-sided Student’s t test based on n = 3 independent experiments.
Extended Data Fig. 5 Prioritizing essential cysteines with ligandability potential by quantitative cysteine reactivity profiling in native and denatured proteomes.
a, (from left to right) Density scatter plots comparing the effect of i) denaturants urea vs SDS on IA-DTB labeling in PC14 cells (left) and KMS26 cells (left middle); and ii) SDS (right middle) or urea (right) on IA-DTB labeling in PC14 vs KMS26 cells. Each point represents a quantified IA-DTB labeled cysteine. Pearson correlations and associated two-sided p values were calculated based on data from 2 biologically independent samples. b, The reactivity in denatured/native proteomes (left) and significance of dropout (right) for cysteines in UTP15. c, Structure of UTP15 in complex with NOC4L and RPS18 as part of the human small subunit processome (PDB: 7MQ8). The side chain of essential, unreactive (buried) C169 is highlighted in red. d, The reactivity in denatured/native proteomes (left) and significance of dropout (right) for cysteines in RRM1. e, Structure of RRM1 bound to effector metabolite TTP at the specificity site (S site) and substrate GDP at the catalytic site (C site) (PDB: 3HND). f, The reactivity in denatured/native proteomes (left) and significance of dropout (right) for cysteines in EEF1A1. g, Structure of EEF1A1 in complex with tRNA as part of the human ribosome complex (PDB: 6ZMO). h, The reactivity in denatured/native proteomes (left) and significance of dropout (right) for cysteines in BRIP1. i, Alignment of BRIP1 and ERCC2 protein sequences. j, Crystal structure of ERCC2 (homologous to BRIP1) in complex with DNA (PDB: 6RO4). ERCC2_A600, the residue corresponding to BRIP1_C761, and the ATP-binding pocket are highlighted. For b, d, f, h, the ABPP data of native versus denatured proteomes represent average values from 6 independent experiments. For b, d, f, h, the one-sided p values were first estimated by randomization tests of dropout data from two independent base-editing experiments and then adjusted by the Benjamini-Hochberg procedure to calculate the FDR.
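The Benjamini-Hochberg adjustment mentioned for panels b, d, f and h follows a standard recipe, sketched here in plain Python. The function name is an assumption and the code is a generic textbook version, not the authors' implementation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (FDR), in input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A cysteine would then be called significant at, for example, FDR < 10% when its adjusted value falls below 0.10.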
Supplementary information
Supplementary Data 1
Related to Fig. 1 and Extended Data Fig. 1.
Supplementary Data 2
Related to Fig. 2 and Extended Data Fig. 2.
Supplementary Data 3
Related to Fig. 3 and Extended Data Fig. 3.
Supplementary Data 4
Related to Fig. 4 and Extended Data Fig. 4.
Supplementary Data 5
Related to Fig. 5 and Extended Data Fig. 5.
Supplementary Data 6
DNA sequences used in this study.
Source data
Source Data Fig. 1
Statistical source data.
Source Data Fig. 2
Statistical source data.
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Fig. 4
Unprocessed gels.
Source Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 1
Statistical source data.
Source Data Extended Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 3
Statistical source data.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 4
Unprocessed western blots.
Source Data Extended Data Fig. 5
Statistical source data.
About this article
Cite this article
Li, H., Ma, T., Remsberg, J.R. et al. Assigning functionality to cysteines by base editing of cancer dependency genes. Nat Chem Biol 19, 1320–1330 (2023). https://doi.org/10.1038/s41589-023-01428-w
DOI: https://doi.org/10.1038/s41589-023-01428-w