Abstract
Base editing can be applied to characterize single nucleotide variants of unknown function, yet defining effective combinations of single guide RNAs (sgRNAs) and base editors remains challenging. Here, we describe modular base-editing-activity ‘sensors’ that link sgRNAs and cognate target sites in cis and use them to systematically measure the editing efficiency and precision of thousands of sgRNAs paired with functionally distinct base editors. By quantifying sensor editing across >200,000 editor-sgRNA combinations, we provide a comprehensive resource of sgRNAs for introducing and interrogating cancer-associated single nucleotide variants in multiple model systems. We demonstrate that sensor-validated tools streamline production of in vivo cancer models and that integrating sensor modules in pooled sgRNA libraries can aid interpretation of high-throughput base editing screens. Using this approach, we identify several previously uncharacterized mutant TP53 alleles as drivers of cancer cell proliferation and in vivo tumor development. We anticipate that the framework described here will facilitate the functional interrogation of cancer variants in cell and animal models.
Data availability
All source data (including P values) are available in Supplementary Table 10. Processed screening data are available in Supplementary Tables 1, 4 and 5, and primary sequencing data are available at the Sequence Read Archive (SRA) under accession PRJNA746395.
Code availability
Code for analysis and data visualization is available at: https://github.com/schmidt73/base-editing-analysis, https://github.com/Kastenhuber/AMINEsearch and https://github.com/lukedow/BEsensor
References
Gorelick, A. N. et al. Phase and context shape the function of composite oncogenic mutations. Nature 582, 100–103 (2020).
Hyman, D. M. et al. AKT inhibition in solid tumors with AKT1 mutations. J. Clin. Oncol. 35, 2251–2259 (2017).
Vasan, N. et al. Double PIK3CA mutations in cis increase oncogenicity and sensitivity to PI3Kalpha inhibitors. Science 366, 714–723 (2019).
Zafra, M. P. et al. An in vivo Kras allelic series reveals distinct phenotypes of common oncogenic variants. Cancer Discov. 10, 1654–1671 (2020).
Findlay, G. M. et al. Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222 (2018).
Vivanco, I. et al. Differential sensitivity of glioma- versus lung cancer-specific EGFR mutations to EGFR kinase inhibitors. Cancer Discov. 2, 458–471 (2012).
Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016).
Gaudelli, N. M. et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature 551, 464–471 (2017).
Koblan, L. W. et al. Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nat. Biotechnol. 36, 843–846 (2018).
Zafra, M. P. et al. Optimized base editors enable efficient editing in cells, organoids and mice. Nat. Biotechnol. 36, 888–893 (2018).
Kleinstiver, B. P. et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature 523, 481–485 (2015).
Katti, A. et al. GO: a functional reporter system to identify and enrich base editing activity. Nucleic Acids Res. 48, 2841–2852 (2020).
Vakulskas, C. A. et al. A high-fidelity Cas9 mutant delivered as a ribonucleoprotein complex enables efficient gene editing in human hematopoietic stem and progenitor cells. Nat. Med. 24, 1216–1224 (2018).
Nishimasu, H. et al. Engineered CRISPR-Cas9 nuclease with expanded targeting space. Science 361, 1259–1262 (2018).
Hu, J. H. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63 (2018).
Zehir, A. et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. Nat. Med. 23, 703–713 (2017).
Chakravarty, D. et al. A precision oncology knowledge base. JCO Precis. Oncol. 2017, PO.17.00011 (2017).
Chakravarty, D. & Solit, D. B. Clinical cancer genomic profiling. Nat. Rev. Genet. 22, 483–501 (2021).
Dimitrova, N. et al. Stromal expression of miR-143/145 promotes neoangiogenesis in lung cancer development. Cancer Discov. 6, 188–201 (2016).
Lee, K. E. & Bar-Sagi, D. Oncogenic KRas suppresses inflammation-associated senescence of pancreatic ductal cells. Cancer Cell 18, 448–458 (2010).
Arbab, M. et al. Determinants of base editing outcomes from target library analysis and machine learning. Cell 182, 463–480.e430 (2020).
Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
Komor, A. C. et al. Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. Sci. Adv. 3, eaao4774 (2017).
Kastenhuber, E. R. & Lowe, S. W. Putting p53 in context. Cell 170, 1062–1078 (2017).
Muller, P. A. & Vousden, K. H. Mutant p53 in cancer: new functions and therapeutic opportunities. Cancer Cell 25, 304–317 (2014).
Vassilev, L. T. et al. In vivo activation of the p53 pathway by small-molecule antagonists of MDM2. Science 303, 844–848 (2004).
Li, W. et al. Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR. Genome Biol. 16, 281 (2015).
Li, W. et al. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol. 15, 554 (2014).
Morris, J. P. T. et al. α-Ketoglutarate links p53 to cell fate during tumour suppression. Nature 573, 595–599 (2019).
Kanda, M. et al. Mutant TP53 in duodenal samples of pancreatic juice from patients with pancreatic cancer or high-grade dysplasia. Clin. Gastroenterol. Hepatol. 11, 719–730 e715 (2013).
Koblan, L. W. et al. Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning. Nat. Biotechnol. 39, 1414–1425 (2021).
Shen, M. W. et al. Predictable and precise template-free CRISPR editing of pathogenic variants. Nature 563, 646–651 (2018).
Song, M. et al. Sequence-specific prediction of the efficiencies of adenine and cytosine base editors. Nat. Biotechnol. 38, 1037–1043 (2020).
Tycko, J. et al. Pairwise library screen systematically interrogates Staphylococcus aureus Cas9 specificity in human cells. Nat. Commun. 9, 2962 (2018).
Marquart, K. F. et al. Predicting base editing outcomes with an attention-based deep learning algorithm trained on high-throughput target library screens. Nat. Commun. 12, 5114 (2021).
Chen, L. et al. Programmable C:G to G:C genome editing with CRISPR-Cas9-directed base excision repair proteins. Nat. Commun. 12, 1384 (2021).
Kurt, I. C. et al. CRISPR C-to-G base editors for inducing targeted DNA transversions in human cells. Nat. Biotechnol. 39, 41–46 (2021).
Zhao, D. et al. Glycosylase base editors enable C-to-A and C-to-G base changes. Nat. Biotechnol. 39, 35–40 (2021).
Hyman, D. M., Taylor, B. S. & Baselga, J. Implementing genome-driven oncology. Cell 168, 584–599 (2017).
Cuella-Martin, R. et al. Functional interrogation of DNA damage response variants with base editing screens. Cell 184, 1081–1097 e1019 (2021).
Hanna, R. E. et al. Massively parallel assessment of human variants with base editor screens. Cell 184, 1064–1080 e1020 (2021).
Xu, P. et al. Genome-wide interrogation of gene functions through base editor screens empowered by barcoded sgRNAs. Nat. Biotechnol. 39, 1403–1413 (2021).
Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882.e1821 (2016).
Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14, 297–301 (2017).
Dixit, A. et al. Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866.e1817 (2016).
Anzalone, A. V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149–157 (2019).
Kim, H. K. et al. Predicting the efficiency of prime editing guide RNAs in human cells. Nat. Biotechnol. 39, 198–206 (2021).
Clement, K. et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 37, 224–226 (2019).
Soto-Feliciano, Y. M. et al. A molecular switch between mammalian MLL complexes dictates response to Menin-MLL inhibition. Preprint at bioRxiv https://doi.org/10.1101/2021.10.22.465184 (2021).
Chen, B. et al. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell 155, 1479–1491 (2013).
Doench, J. G. et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol. 34, 184–191 (2016).
Acknowledgements
We thank D. Solit, N. Schultz, M. Berger and B. Gross for access to MSK-IMPACT data, T. Jacks for sharing KP cells, D. Alonso-Curbelo and D. Bar-Sagi for sharing PDEC cells, M. Paz Zafra for sharing primers to assess tumor purity, T.M. Norman for conceptual advice and L. Cantley for support and mentorship. We gratefully acknowledge the members of the Molecular Diagnostics Service in the Department of Pathology, the Integrated Genomics Operation and Bioinformatics Core (P30 CA008748) and the Marie-Josée and Henry R. Kravis Center for Molecular Oncology. This work was supported by a project grant from the NIH/NCI (R01CA229773-01A1), P01 CA087497 (S.W.L.), an MSKCC Functional Genomics Initiative (FGI) grant (S.W.L.) and an Agilent Technologies Thought Leader Award (S.W.L.). F.J.S.-R. was supported by the MSKCC TROT program (5T32CA160001), a GMTEC Postdoctoral Researcher Innovation Grant, and is an HHMI Hanna Gray Fellow. B.J.D. was supported by an F31 Ruth L. Kirschstein Predoctoral Individual National Research Service Award (F31-CA261061-01). E.R.K. was supported by an F31 Ruth L. Kirschstein Predoctoral Individual National Research Service Award (F31-CA192835) and is currently supported by NCI R35CA197588, awarded to L. Cantley. A.K. was supported by an F31 Ruth L. Kirschstein Predoctoral Individual National Research Service Award (F31-CA247351-02). J.L. was supported by the German Research Foundation (DFG) and the Shulamit Katzman Endowed Postdoctoral Research Fellowship. S.V.P. was supported by the German Academic Scholarship Foundation. F.M.B. was supported by a GMTEC Postdoctoral Fellowship, an MSKCC Translational Research Oncology Training Fellowship (5T32CA160001-08), and a Young Investigator Award from the Edward P. Evans Foundation. K.M.T. is supported by the Jane Coffin Childs Memorial Fund for Medical Research. D.C. and H.Z. acknowledge funding from the MSKCC Marie-Josée and Henry R. Kravis Center for Molecular Oncology for supporting OncoKB. S.W.L. is the Geoffrey Beene Chair of Cancer Biology and an Investigator of the Howard Hughes Medical Institute. L.E.D. is the Burt Gwirtzman Research Scholar in Lung Cancer at Weill Cornell Medicine. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Author information
Contributions
F.J.S.-R., B.J.D., E.R.K. and L.E.D. conceived the project. F.J.S.-R. and B.J.D. performed experiments, analyzed data, and wrote the paper. E.R.K. wrote code related to AMINEsearch and associated computational analyses and wrote the paper. H.S. and Y.-J.H. performed computational analyses and wrote code for sgRNA predictions and analysis. A.K., M.K., V.T., J.L., S.V.P., F.M.B., K.C., S.G., A.N.W., J.M.S. and K.M.T. performed experiments and analyzed data. D.C. and H.Z. assisted with OncoKB analyses. C.S.L. and S.W.L. supervised experimental and computational work. L.E.D. performed and supervised experiments, analyzed data, and wrote the paper.
Ethics declarations
Competing interests
L.E.D. is a scientific advisor for and holds equity in Mirimus Inc. and is a consultant for Volastra Therapeutics and Frazier Healthcare. S.W.L. is an advisor for and has equity in the following biotechnology companies: ORIC Pharmaceuticals, Faeth Therapeutics, Blueprint Medicines, Geras Bio, Mirimus Inc. and PMV Pharmaceuticals. S.W.L. acknowledges receiving funding and research support from Agilent Technologies for the purposes of massively parallel oligo synthesis. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 BE efficiency for mouse sgRNAs in the APS library.
C > T editing efficiency (%) at each APS library mouse target site across base editor enzymes, as indicated. Cas9 and Cas9-NG serve as nuclease controls. Rows denote sgRNAs; columns denote PAM subclass.
Extended Data Fig. 2 Cancer somatic mutation-derived base editing sensor libraries.
(a) Number of unique recurrent SNVs per gene, ordered by mutation frequency of gene. Bars are split to indicate the proportion of SNVs targeted (red) or not (black) in the HBES library. (b) Focality of mutations by cancer gene classification. Number of cumulative mutations observed in recurrent sites with respect to the number of unique SNVs observed per gene. Oncogenes are indicated by red dots and tumor suppressor genes are indicated by blue dots. Mutations in oncogenes tend to be more focal on distinct hotspot sites, with a greater number of recurrent mutations per unique SNV allele (11.1 vs 6.2 mutations per unique recurrent SNV, p = 0.011, two-tailed t-test). (c) Venn diagram of sgRNAs in HBES library compatible with each base editor configuration. (d) Venn diagram of sgRNAs in MBES library compatible with each base editor configuration. (e) SNV-level annotation with each color bar sorted in order of observed mutation frequency (top). SNV characteristics are indicated, including oncogenic function (OncoKB assertion of oncogenic/Likely oncogenic/VUS) and therapeutic implications (OncoKB highest level of evidence for drug sensitivity or resistance) (Chakravarty et al., 2017; Chakravarty & Solit, 2021).
Extended Data Fig. 3 Off-target editing predictions for base editing sensor libraries.
(a) For sgRNAs in HBES library, distribution of potential off-target (OT) sites identified by PAM specificity and extent of mismatch. (b) Number of sgRNAs in HBES library targeting the human genome with 0 (white) and 1 or more (black) predicted OT sites depending on SpCas9 or Cas9-NG PAM specificity. A greater number of sgRNAs have no predicted OT sites when used in conjunction with SpCas9 than with Cas9-NG. p < 2.2e-16, two-sided Fisher’s exact test. (c) For sgRNAs in MBES library, distribution of potential OT sites identified by PAM specificity and extent of mismatch. (d) Number of sgRNAs in MBES library targeting the mouse genome with 0 (white) and 1 or more (black) predicted OT sites depending on SpCas9 or Cas9-NG PAM specificity. A greater number of sgRNAs have no predicted OT sites when used in conjunction with SpCas9 than with Cas9-NG. p < 2.2e-16, two-sided Fisher’s exact test. (e) Distribution of non-target editable bases (C for CBE) within the editing window for HBES library targeting the human genome. (f) Distribution of non-target editable bases (C for CBE) within the editing window for MBES library targeting the mouse genome.
Extended Data Fig. 4 Comparison of editing range (editing window) across FNLS, F2X, and FNLS-NG base editors as a function of dinucleotide context.
Plots represent the mean normalized BE editing efficiency for each base editor (FNLS = yellow, F2X = blue, FNLS-NG = gray) across 5 cell lines (rows) and 4 dinucleotide contexts (columns). Area shaded in grey denotes maximum editing range in each condition where normalized BE is above 30% (dotted line).
Extended Data Fig. 5 Correlation of sgRNA efficiency ranking.
Plots represent correlation of individual sgRNA efficiency rankings between MDA-MB-231 and NIH3T3, KPT1, and PDEC cells, as indicated. To reduce noise created by low-efficiency sgRNAs, only HBES sgRNAs that had >1% activity in the sensor were included. Pearson correlation coefficients are shown; for all comparisons, p < 2.22e-16.
Extended Data Fig. 6 Indel and BE correlation across cell lines.
Correlation of indel and C > T editing frequencies for all sgRNAs in the HBES library across 5 screen cell lines. Pearson correlation coefficients were calculated using the ggpubr (v0.4.0) package in R; the P value represents the significance of a two-sided t-test.
Extended Data Fig. 7 Non-canonical cytosine editing identified by BE Sensor.
(a) Dot plots show percent C > T and C > G editing for individual target cytosines in the HBES library across three BE enzymes (FNLS, F2X and FNLS-NG) and two cell lines (MDA-MB-231 and PC9). Scales on x and y axes are not the same; dotted lines indicate a 1:1 ratio. (b) Ratio of C > G/C > T editing in FNLS-MDA-MB-231 cells transduced with the HBES library classified by dinucleotide context (fill) and trinucleotide context (column). Data include all base editors (FNLS, F2X and FNLS-NG) and are filtered for sgRNAs that show more than 5% C > T editing in the sensor assay. Boxplots show the median and interquartile range (IQR) and whiskers represent 1.5*IQR. Outliers are shown as individual points. ns indicates p > 0.05; P values were determined with a two-sided Wilcoxon signed rank test. A complete list of all comparisons is available in Supplementary Table 10g. (c) Schematic of the (C > G) reporter developed by modifying the GO (C > T) reporter.
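The filtering and ratio calculation described in panel (b) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the record layout, the function name `ratio_by_context` and all numeric values are hypothetical, and only the 5% C > T cutoff comes from the legend.

```python
from statistics import median

# Hypothetical per-cytosine records (values invented for illustration):
# (sgRNA id, dinucleotide context, % C>T editing, % C>G editing)
records = [
    ("sg1", "TC", 40.0, 4.0),
    ("sg2", "TC", 10.0, 2.0),
    ("sg3", "GC", 20.0, 1.0),
    ("sg4", "GC", 3.0, 0.9),  # at or below the 5% C>T cutoff, so excluded
    ("sg5", "AC", 8.0, 0.4),
]

MIN_C_TO_T = 5.0  # legend: keep sgRNAs with more than 5% C>T editing


def ratio_by_context(rows, min_c_to_t=MIN_C_TO_T):
    """Return the median C>G/C>T ratio per dinucleotide context."""
    grouped = {}
    for _, context, c_to_t, c_to_g in rows:
        if c_to_t > min_c_to_t:
            grouped.setdefault(context, []).append(c_to_g / c_to_t)
    return {ctx: median(vals) for ctx, vals in grouped.items()}


summary = ratio_by_context(records)
for ctx, value in sorted(summary.items()):
    print(ctx, round(value, 3))
```

Filtering on C > T editing before dividing avoids unstable ratios from sites with near-zero denominators, which is presumably why the legend applies the cutoff.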
Extended Data Fig. 8 In vivo validation of cancer-associated TP53 missense mutations using BE.
(a) Survival analysis of mice transplanted with F2X-expressing PDECs transduced with specific Trp53-targeting base editing sgRNAs. N = 5 mice per sgRNA per mutation. (b) Frequency of target C > T editing in tumors from transplanted mice. Each individual point represents a single isolated tumor (n = 3+ per sgRNA). Target C > T editing was measured by next-generation sequencing of amplified target loci and data were analyzed using CRISPResso2. Data are presented as mean ± SD. (c) In vivo validation of M237I and C135Y mutations via orthotopic transplantation of FNLS-expressing PDECs transduced with sgRNAs designed to introduce the corresponding mutations in the mouse Trp53 gene (M234I and C132Y, respectively). N = 5 mice per mutation. (d) Representative macroscopic (left) and microscopic (right; H&E) images of pancreatic tumors isolated from mice transplanted with FNLS-expressing PDEC cells transduced with specific Trp53-targeting base editing sgRNAs. (e) Representative Sanger sequencing traces from tumors in (d). Red arrows denote target cytosines that, when mutated to thymine, give rise to the corresponding amino acid changes in the p53 protein. Nucleotide triplets on the right denote the precise mutational events that give rise to mutant p53 proteins. * p ≤ 0.05, ** p ≤ 0.01. P values were calculated using the log-rank test.
Extended Data Fig. 9 Classification of screen hits by OncoKB.
(a) sgRNAs from the MBES proliferation screen were binned by categories: i) all sgRNAs; ii) sgRNAs depleted by <1.5 LFC and exhibiting 20% editing at the sensor; iii) sgRNAs enriched by >1.5 LFC; or iv) sgRNAs enriched >1.5 LFC and exhibiting 20% editing at the sensor, followed by calculation of the percentage of each OncoKB classification. P values indicate two-sided Fisher’s exact test comparison of the frequency of known or likely oncogenic mutations in each subset. (b) Bubble plot comparing sgRNA log fold changes with mean frequency of C > T editing in the sensor target site between days 5 and 30 post-transduction. Bubbles are colored by their OncoKB classification. Size denotes MAGeCK score (see Supplementary Table 6d).
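The binning in panel (a) can be illustrated with a short Python sketch. It rests on two interpretive assumptions that the legend does not spell out: "depleted" is read as a log fold change below -1.5, and "exhibiting 20% editing" is read as at least 20% sensor editing. The records, bin names and helper functions are hypothetical and are not taken from the authors' code.

```python
# Hypothetical sgRNA records (values invented for illustration):
# (sgRNA id, log fold change, % sensor C>T editing, OncoKB classification)
sgrnas = [
    ("sgA", 2.1, 45.0, "Oncogenic"),
    ("sgB", 1.8, 5.0, "VUS"),
    ("sgC", -2.0, 30.0, "VUS"),
    ("sgD", 0.2, 60.0, "Likely oncogenic"),
    ("sgE", 1.6, 25.0, "VUS"),
]

LFC_CUTOFF = 1.5    # assumed symmetric cutoff for enrichment and depletion
EDIT_CUTOFF = 20.0  # assumed to mean "at least 20% sensor editing"


def bin_sgrnas(rows):
    """Assign each sgRNA to the four (overlapping) categories of panel (a)."""
    bins = {"all": [], "depleted_edited": [], "enriched": [], "enriched_edited": []}
    for row in rows:
        _, lfc, edit, _ = row
        bins["all"].append(row)
        if lfc < -LFC_CUTOFF and edit >= EDIT_CUTOFF:
            bins["depleted_edited"].append(row)
        if lfc > LFC_CUTOFF:
            bins["enriched"].append(row)
            if edit >= EDIT_CUTOFF:
                bins["enriched_edited"].append(row)
    return bins


def percent_oncogenic(rows):
    """Percentage of sgRNAs annotated as known or likely oncogenic."""
    if not rows:
        return 0.0
    hits = sum(1 for r in rows if r[3] in ("Oncogenic", "Likely oncogenic"))
    return 100.0 * hits / len(rows)


bins = bin_sgrnas(sgrnas)
for name, rows in bins.items():
    print(name, len(rows), round(percent_oncogenic(rows), 1))
```

Requiring sensor editing in addition to a fold-change cutoff is the point of the comparison: it separates sgRNAs that scored with confirmed editing activity from those that scored without it.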
Extended Data Fig. 10 Expanded base editing predictions.
(a) We used the MSK-IMPACT clinical tumor sequencing dataset and the characteristics of commonly used base editors to inform the design of base editing sensor libraries used in the experiments in Figs. 3–6. These results are available in the Shiny web portal (https://dowlab.shinyapps.io/BEscan/). Using updated and expanded versions of MSK-IMPACT sequencing data, base editing configurations, and AMINEsearch v2, we generated an exploratory set of sgRNA and sensor predictions, which are also available in the Shiny web portal. The more recent version of MSK-IMPACT contains increased numbers of (b) tumors sequenced, (c) total SNVs observed, and (d) candidate unique recurrent SNVs. Compared to the HBES and MBES libraries (v1), the exploratory set (v2) also covers more (e) Cas variants (determining PAM recognition) and base editor variants (determining editing window), which together define (f) base editor configurations with distinct properties. These expanded inputs led to an increase in the exploratory set (v2) compared to the HBES and MBES libraries (v1) with respect to (g) number of sgRNAs designed and (h) unique SNVs targeted by one or more sgRNAs in the database.
Supplementary information
Supplementary Information
Supplementary Figs. 1–19, Discussion and Methods.
Supplementary Table 1
APS library and APS results by editor.
Supplementary Table 2
AMINEsearch inputs, HBES and MBES libraries.
Supplementary Table 3
Off-target analysis of HBES and MBES libraries.
Supplementary Table 4
HBES results by editor and cell line.
Supplementary Table 5
MBES results by editor and cell line.
Supplementary Table 6
BE-Hive comparisons.
Supplementary Table 7
MAGeCK output of PDEC-MBES proliferation screen.
Supplementary Table 8
Expanded AMINEsearch input and output for BEscan.
Supplementary Table 9
gRNAs and primers used in study.
Supplementary Table 10
Source data P values.
Cite this article
Sánchez-Rivera, F.J., Diaz, B.J., Kastenhuber, E.R. et al. Base editing sensor libraries for high-throughput engineering and functional analysis of cancer-associated single nucleotide variants. Nat Biotechnol 40, 862–873 (2022). https://doi.org/10.1038/s41587-021-01172-3
DOI: https://doi.org/10.1038/s41587-021-01172-3