Main

A wide range of human diseases are associated with diverse genetic alterations that may be responsible for initiating, promoting or otherwise modifying the course of a given disease. These alterations can be quite complex; for instance, cancer genomes typically contain a repertoire of single nucleotide variants (SNVs) and large-scale copy number alterations that can impact many genes in different ways depending on the type of alteration, gene function and biological context. While tumor genotype is a well-established determinant of disease initiation, progression and therapy responses, the functional impact conferred by the thousands of unique mutations observed in human tumors remains poorly understood. This presents a major challenge to precision medicine efforts that aim to tailor cancer therapies to patients suffering from cancers harboring specific genetic lesions. Beyond the clinic, understanding the impact that diverse types of mutations have on different residues and protein domains would improve our fundamental understanding of gene and protein function (Fig. 1a).

Fig. 1: High-throughput design and evaluation of a TP53 prime editing sensor library.
figure 1

a, Schematic of our overall approach. We aim to engineer variants observed in patients with high throughput to perform functional screens in diverse contexts, elucidating variant functions to improve our ability to stratify and treat patients. b, Schematic of the sensor framework, which links each pegRNA to its editing outcome at the endogenous locus. c, We used PEGG to generate a TP53 prime editing sensor library targeting >1,000 cancer-associated TP53 variants with a median of 30 pegRNAs per variant. d, Heatmap visualization of the pegRNAs included in the TP53 sensor library, which includes SNVs, indels and silent substitutions. e, Correlation between editing at the sensor and endogenous locus in eight TP53-targeting pegRNA-sensor pairs at day 3 (D3) and day 7 (D7) posttransduction. f, Schematic of the screening protocol. The prime editing sensor library is transduced into cells constitutively expressing PEmax, and screening is performed in the presence or absence of the small molecule Nutlin-3. g, The average correct editing percentage among all pegRNAs in the library (left) or when considering only the most efficient pegRNA for each variant (right) at various timepoints in both conditions for pegRNA-sensor pairs with at least 100 sequencing reads; n = 3 biologically independent replicates. Data are presented as mean values with a 95% confidence interval. h, Rank plot of the correct editing percentage of the most efficient pegRNA per variant, as assayed at the sensor locus, at each timepoint. Source data and code to reproduce this figure can be found at https://github.com/samgould2/p53-prime-editing-sensor/blob/main/figure1.ipynb. AD1/2, activation domain 1/2; BlastR, blasticidin selection marker; CTD, C-terminal domain; DEL, deletions; EIF1α, eukaryotic initiation factor 1 alpha; INS, insertions; nCas9, nicking Cas9; NLS, nuclear localization signal; PRR, proline rich region; Puro, puromycin selection marker; P2A, peptide 2ART, reverse transcriptase; STOP, U6 polyT termination sequence; tevo, tevopreQ1; U6, U6 promoter.

Until recently, approaches for studying genetic variants have been limited to low-throughput, homology-directed repair (HDR)-based methods or high-throughput, nonphysiological gene overexpression systems1,2,3,4,5,6,7. While powerful, the former approach lacks scalability and generality due to the requirements of HDR and its limitation primarily to actively dividing cells. Gene overexpression systems have fewer requirements and are scalable, but fail to physiologically recapitulate the biology driven by these variants due to the absence of endogenous gene regulation mechanisms, many of which are not known for genes of interest. The recent development of precision genome editing tools, including base editing and prime editing, allows variants to be modeled in their native, endogenous genomic context with increased editing efficiency and theoretically higher throughput8,9,10.

Prime editing10 can be used to generate effectively any type of small mutation, including all SNVs and small insertions and deletions (indels). Prime editors are directed to engineer a mutation of interest by the instructions encoded in a prime editing guide RNA (pegRNA), which contains both a protospacer (the ‘search’ sequence) and a 3′ extension sequence (the ‘replace’ sequence that dictates the mutation installed at the site). The modular search-and-replace ability of prime editing has been leveraged to interrogate endogenous variants in high-throughput methods11,12,13. In these approaches, libraries of pegRNAs are delivered transiently or stably to cells expressing prime editors, and the fitness of variants is assessed by determining the relative distribution of endogenous alleles and/or pegRNAs. While powerful, these approaches have important limitations for screening applications, including reliance on a small number of variant-specific pegRNAs with unknown editing performance, inability to quantitatively assess endogenous genome editing at scale, and potential overrepresentation of undesired indels due to using PE3, a prime editor system that uses an additional guide RNA that nicks the nonedited strand to increase editing efficiency10.

With these challenges in mind, we sought to develop an integrative computational and experimental framework for high-throughput design, screening and deconvolution of pegRNA libraries to interrogate a diverse spectrum of genetic variants. This includes pairing each pegRNA with a variant-specific synthetic ‘sensor’ site14 that recapitulates the native architecture of the endogenous target locus. This sensor-based approach links pegRNA identity to editing outcomes for simultaneous high-throughput quantification of pegRNA editing activity and empirical calibration of screening data.

We chose the p53 transcription factor as a prototype to test this approach for investigating the biological impact of specific genetic variants. Notably, TP53 is the most frequently mutated gene in cancer and exhibits extensive allelic variation, leading to the generation of altered proteins that can produce functionally distinct phenotypes. Whether distinct variants of TP53 (and other genes) encode proteins with differing functional activities that influence cancer phenotypes remains controversial and technically challenging to investigate, particularly at scale. Several studies have used orthogonal cDNA-based exogenous overexpression systems to probe the fitness of p53 variants in human, mouse and yeast systems6,7,15,16. However, given the artificial nature of these screens, which rely on expression of variants at supraphysiological levels, we hypothesized that these strategies could misrepresent one or more phenotypes associated with p53 variants. Artifacts that stem from exogenous overexpression systems could be particularly relevant when studying proteins like p53 because p53 functions as a tetramer whose expression and degradation is tightly controlled by the cell17,18,19. Thus, we reasoned that alterations to the stoichiometric balance of p53 via overexpression could lead to erroneous conclusions about the effects of particular p53 variants, including misclassifying certain variants as noncausal or otherwise benign.

To tackle this question, we generated and screened a library of >28,000 pegRNAs targeting >1,000 TP53 variants observed across >40,000 cancer patients20—the largest set of endogenous TP53 variants studied so far. We included SNVs, insertions and deletions observed in patients, putative neutral silent substitutions as controls and a panel of random indels to increase the functional search space. These experiments identified alleles that impact p53 function in mechanistically diverse ways. We discovered that certain types of endogenous variants, particularly those found in the p53 oligomerization domain (OD), display opposite phenotypes when tested with exogenous overexpression systems. Collectively, these results highlight the physiological importance of gene dosage in shaping native protein stoichiometry and protein–protein interactions, and establish a powerful computational and experimental framework for studying diverse types of genetic variants at scale. To ensure widespread accessibility of this resource for the scientific community, we provide a publicly available Python package, Prime Editing Guide Generator (PEGG) (https://pegg.readthedocs.io/en/latest/), as a tool to generate prime editing sensor libraries.

Results

High-throughput design of prime editing sensor libraries

A principal limitation of using prime editing to systematically investigate genetic variants is the inherent variability in editing efficiency among different pegRNAs10,21,22,23. A number of computational tools for pegRNA design have been developed24,25,26,27,28,29,30,31,32,33, including machine-learning algorithms that can nominate sets of pegRNAs predicted to produce high efficiency edits. However, even pegRNAs generated by these predictive algorithms require extensive experimental validation, and their editing activity is not guaranteed to correlate strongly across different cell types. We hypothesized that coupling pegRNAs with ‘sensors’—artificial copies of their endogenous target sites—would allow us to systematically identify high efficiency pegRNAs while controlling for the confounding effects of variable editing efficiency in a screening context (Fig. 1b).

Synthetic sensor-like target sites have been used previously by our group and others to control for base editing gRNA editing efficiencies while defining the relative fitness of variants in genetic screens14,34. Several studies have applied a similar strategy to both base and prime editing technologies to identify features of efficient gRNAs or pegRNAs and train predictive algorithms21,32,33,35,36,37. However, this approach has yet to be applied for high-throughput phenotypic screening of endogenous genetic variants with prime editing, probably due in part to the lower editing efficiency of prime editing relative to base editing. We reasoned that a sensor-based prime editing screening approach could be powerful to discriminate bona fide endogenous variants from undesired editing outcomes that enrich or deplete in a screen. Moreover, the sensor approach would theoretically overcome the limitations of assessing variants at different genetic sites in parallel by eliminating the need to sequence several endogenous loci.

To test this approach, we first needed to build a computational tool capable of designing and ranking pegRNAs for thousands of genetic variants, while automatically generating a paired sensor site. To address this unmet need, we built and publicly released PEGG (Extended Data Fig. 1a)—a Python package that enables high-throughput design of prime editing sensor libraries38 (available at https://pegg.readthedocs.io/en/latest/). PEGG is compatible with a range of mutation input formats, including all of the datasets on the cBioPortal, ClinVar identifiers and custom mutation inputs39,40.

We chose the TP53 tumor suppressor gene as a prototype to establish and credential a scalable prime editing sensor-based screening approach for a number of reasons. First, TP53 is the most frequently mutated gene in human cancer, with ~50% of patients suffering from tumors harboring a mutation within the TP53 gene while the rest often inactivate the p53 pathway through other mechanisms. Second, thousands of unique TP53 mutations have been identified in cancer patients, including eight or so ‘hotspot’ alleles in specific residues that exhibit the highest mutational frequencies19. Although p53 has been studied for decades, there have been few systematic studies, and those have been hampered by reliance on artificial overexpression of mutant p53 proteins, unrepresentative cell lines and/or a limited spectrum of mutations evaluated6,7,15,16. These and other studies have sparked controversy in the field over whether any mutant p53 proteins are endowed with activities that go beyond LOF or dominant negative activity to achieve GOF or neomorphic status. These are important questions that extend beyond TP53 because mutant GOF proteins generated by cancer-associated variants, and the phenotypes they produce, could represent attractive therapeutic targets. Finally, prime editing sensor-based screening could be scaled up and broadly deployed to identify causal genetic variants implicated in cancer and other diseases with a strong genetic association.

With the above goals in mind, we first sought to generate a library of pegRNAs targeting TP53 variants. To generate this library, we selected variants from the MSK-IMPACT database, which uses deep exon sequencing of patient tumor samples to identify cancer-associated variants20. From this database of over 40,000 patients, we chose all observed SNVs in p53, as well as frequently observed insertions and deletions, along with a collection of random indels (Extended Data Fig. 1b). We reasoned that including several pegRNAs with different protospacers and combinations of pegRNA properties for each variant would allow us to scan the pegRNA design space more thoroughly to identify highly efficient guides for robust statistical analysis of variant phenotypes. To accomplish this, we used PEGG to produce 30 pegRNA designs per variant (for pegRNAs with a sufficient number of accessible PAM sequences) with varying reverse transcription template (RTT) (10–30 nucleotides) and primer binding site (PBS) lengths (10–15 nucleotides) coupled to canonical ‘NGG’ protospacers. The generated pegRNA designs were ranked based on a composite ‘PEGG score’ that integrates literature best practices for pegRNA design (Extended Data Fig. 1a and Supplementary Table 1).

PEGG also generated silent substitution variants as neutral internal controls for the screen, and we filtered pegRNAs to exclude protospacers with an MIT specificity score below 50 to reduce the probability of off-target editing41 (Extended Data Fig. 1e). In addition, these pegRNA designs included an epegRNA motif—tevopreQ1—an RNA pseudoknot located at the 3′ end of the pegRNA that improves editing by preventing degradation of the guide22. Even after these relatively stringent filtration steps, we were able to generate pegRNA designs for more than 95% of the input variants, resulting in a library of >28,000 pegRNAs (Fig. 1c,d and Extended Data Fig. 1c,d). Each pegRNA in the library is also paired with a 60-nucleotide long variant-specific synthetic ‘sensor’ that is generated by PEGG and included in the final oligonucleotide design. Every sensor is designed to recapitulate the native endogenous target locus, thereby linking pegRNA identity to editing outcomes (Fig. 1b).

To test the efficacy of using the sensor as a readout of editing at the endogenous locus, we randomly selected eight TP53 variant-specific pegRNA sensors generated during the process of library preparation. We generated lentivirus for each of these prime editing sensor constructs and performed separate transductions into cells expressing PEmax. At 3- (3D) and 7-days posttransduction (D7), we harvested genomic DNA and amplified both the pegRNA–sensor cassette and the endogenous locus targeted by each pegRNA. Analysis of editing at the sensor and endogenous locus revealed a very high correlation between the sensors and endogenous sites (Spearman correlation ≥0.9; Fig. 1e). In general, the prime editing sensor seems to slightly overestimate the editing activity at the endogenous locus, probably in part due to differences in locus chromatin accessibility42, but the ranking of pegRNA editing efficiencies is largely preserved, validating our sensor-based approach.

High-throughput interrogation of TP53 variants

Next, we screened our library of variants in TP53 wild type (WT) A549 lung adenocarcinoma cells stably expressing PEmax21. To measure the prime editing activity of this cell line, we generated and transduced these cells with a modified all-in-one lentiviral version of the fluorescence-based PEAR reporter43, validating that the cells displayed strong editing activity (Extended Data Fig. 2a). We then introduced the lentiviral TP53 sensor library into these cells at a low multiplicity of infection and in triplicate while ensuring a library representation of >1,000× at every step of the sfcreen. At 4 days posttransduction (D4), we split the populations into untreated or Nutlin-3-treatment arms (Fig. 1f). Nutlin-3 is a small molecule that inhibits MDM2 to activate the p53 pathway, which can be used to select for TP53 mutations that promote bypass of p53-dependent cell cycle arrest and apoptosis44. We hypothesized that this treatment group may increase the signal-to-noise ratio between TP53 variants with putative loss-of-function (LOF) or gain-of-function (GOF) activities and benign variants. We allowed the screen to progress for 34 days (D34), harvesting cell pellets from each replicate and treatment arm at several timepoints (Extended Data Fig. 2b). Genomic DNA extracted from each sample was used to amplify the pegRNA–sensor cassettes, which were subjected to next-generation sequencing (NGS) to simultaneously assess enrichment/depletion of pegRNAs and their editing activity and outcomes at the sensor target site (Extended Data Fig. 2c,d).

The average editing efficiency among all pegRNAs in the library increased in a time-dependent manner, peaking at ~8% in the final timepoint. In general, we observed low indel rates and strong correlation in sensor editing among replicates (Fig. 1g and Extended Data Fig. 3a–d). Strikingly, selecting only the most efficient pegRNA design for each variant led to a twofold increase in the average editing efficiency, highlighting the utility of the sensor for systematic empirical identification of high efficiency pegRNAs (Fig. 1g and Extended Data Fig. 3e–g). Cells with higher editing efficiency also exhibited stronger Nutlin-3 bypass in the Nutlin-3-treatment arm (Fig. 1g). Based on the assessment of editing at the sensor locus, we were able to identify active pegRNAs (≥2% editing efficiency) for more than half of the TP53 variants included in the library. This includes highly efficient pegRNAs that install the desired edit with over 20% efficiency for more than 20% of the variants (Fig. 1h). These validated pegRNAs could be further engineered with silent mutations that evade mismatch repair to boost overall editing efficiency21.

The size and diversity of this library also allowed us to examine features of highly efficient pegRNAs that broadly recapitulated previous observations32,33,37,45. Correlation analysis between various pegRNA features and editing efficiency across all timepoints identified the estimated on-target activity of the protospacer (as predicted by Rule Set 2)46 as the single largest determinant of prime editing efficiency (Fig. 2a). In addition, the distance between the edit and the nick introduced by nCas9 was correlated negatively with editing efficiency, while the length of the postedit homology was correlated positively with editing efficiency (Fig. 2a). Thus, edits closer to the nick and with larger postedit homology were more efficient, consistent with previous findings32,33,37,45.

Fig. 2: Identification of features of highly efficient pegRNAs.
figure 2

a, Spearman correlations between various features of pegRNA design and correct editing percentage, assessed for all pegRNA-sensor pairs with sufficient reads. Each dot represents a separate replicate/timepoint. For Doench 2016 score, see ref. 46. b, Relationship between PEGG score and average correct editing percentage at each timepoint and condition is increasing monotonically. c, Representative example of the correlation between PEGG score and editing efficiency for day 25 replicate 1 (D25-REP1) (Untreated). d, Visualization of the protospacer bias in editing efficiency. The number of pegRNAs generated per protospacer at each TP53 exon on the plus (+) or minus (−) strand (top) and the average editing efficiency at each of these protospacers at day 34 (D34) of the untreated condition (bottom). e, Average editing efficiency for SNV-generating pegRNAs in the library as a function of distance to the nick generated by PEmax and PBS length. The location of the ‘NGG’ PAM sequence is highlighted in blue. Protospacer disrupting (locations +1 to +3) and PAM-disrupting variants (locations +5 and +6) tend to be more efficient. f, Feature importance of 20 random forest models trained separately to predict pegRNA efficiency. Each dot represents a different model. Data are presented as mean values with a 95% confidence interval. Source data and code to reproduce this figure can be found at https://github.com/samgould2/p53-prime-editing-sensor/blob/main/figure2.ipynb. NUT, Nutlin-3-treated.

Notably, the PEGG Score, which is a weighted linear combination of pegRNA features based on literature best practices, correlated more strongly with prime editing efficiency than any other single feature, achieving a Spearman correlation of up to 0.4 (Fig. 2a–c). Although this correlation is modest relative to published predictive models32,33,37, the PEGG score is a simple, unbiased and cell type/organism-agnostic prediction of pegRNA activity that could complement machine-learning-based predictions of prime editing activity, which may vary due to training on particular cell types.

To further analyze the differences in prime editing activity among the 173 protospacers spanning the TP53 locus, we visualized the number of pegRNAs that utilized each protospacer and the average editing efficiency at each protospacer (Fig. 2d). This analysis suggests that only a subset of protospacers can be used to generate high efficiency pegRNAs, while other protospacers retain little-to-no editing activity. We also found that pegRNAs that introduce edits that disrupt the protospacer or PAM sequence tend to be more efficient (Fig. 2e). Relative to the nick created by nCas9, SNVs introduced at the +1–3 position, which mutate the protospacer, and at the +5–6 position, which mutate the guanine bases in the NGG PAM, display increased editing activity. In contrast, edits introduced at the +4 position, corresponding to the ‘N’ in the ‘NGG’ PAM sequence, display reduced editing, probably due to their failure to disrupt the PAM sequence (Fig. 2e).

Finally, we trained a random forest regressor to predict pegRNA efficiency (Extended Data Fig. 4a). Even with a restricted set of features, this algorithm was able to predict pegRNA activity with a Spearman correlation of ~0.6, comparable with other, more complex algorithms used to predict PE activity32,33,37 (Extended Data Fig. 4b). Analysis of the relative feature importance of this random forest model was again consistent with previous findings, and highlighted the GC content of the PBS as another important determinant of editing not identified with simple correlation analysis (Fig. 2f). These results demonstrate that large-scale, gene-specific prime editing sensor screening datasets can also provide insight into the determinants of high efficiency prime editing, even though these libraries were not designed with that objective in mind.

Sensor-based calibration identifies pathogenic TP53 variants

To assess the relative fitness conferred by engineered TP53 variants, we used the MAGeCK pipeline to normalize read counts among replicates and quantify the log2 fold change (LFC) and false discovery rate (FDR) of pegRNAs in the library47. While the LFC in pegRNA counts was highly correlated in replicates from the untreated and Nutlin-3-treated arms of the screen, respectively, the correlation among replicates between the two conditions was modest, suggesting that treatment-dependent biological effects were occurring (Extended Data Figs. 3a and 5). We then used the sensor target site as a quantitative proxy for editing efficiency at the endogenous locus to systematically filter pegRNAs based on their empirical editing efficiency and precision (Fig. 3a and Extended Data Fig. 6a–d). As expected, the number of significantly enriched or depleted pegRNAs in ‘sensor-calibrated’ datasets decreased as we increased the editing activity threshold (Fig. 3b). These results demonstrate that our sensor-based approach allows empirical removal of pegRNAs that exhibit potentially spurious enrichment or depletion, and low and/or undesired editing activity, retaining pegRNAs that are more likely to introduce the variants of interest with high efficiency and precision. Based on these results, we decided to focus our statistical analyses on a dataset composed of pegRNAs with ≥10% editing efficiency to minimize the confounding effects of imprecise editing (Fig. 3c–g).

Fig. 3: High-throughput prime editing sensor screens identify pathogenic TP53 variants.
figure 3

a, Schematic of the sensor-calibrated filtration approach, where the editing rate of a pegRNA is determined by the sensor locus and pegRNAs below a given editing threshold are filtered. b, Number of significantly enriching or depleting pegRNAs (FDR < 0.05) as a function of the minimum correct editing percentage threshold at D34 in both conditions. c, LFC of each pegRNA ≥10% editing with at least ten sensor reads at D34 relative to D4 in the untreated condition, with pegRNAs colored by editing efficiency. d, Same as c, but for the Nutlin-treated condition. e, Plot as in a, but with pegRNAs colored by variant type. Enriching pegRNAs with LFC ≥ 2 and FDR < 0.05 labeled. Depleting pegRNAs with FDR < 0.05 labeled. f, Plot as in b, but with pegRNAs colored by variant type. Selected enriching pegRNAs with FDR < 0.05 labeled. Depleting pegRNAs with FDR < 0.05 labeled. g,h, Boxplots of LFC in pegRNAs at D34 in Nutlin-treated condition separated by variant class for pegRNAs ≥10% editing and ≥20% editing. Statistics shown for two-sided t-test with Bonferroni correction. *P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001; ****P ≤ 0.0001; NS, not significant (P > 0.05). g, Nonsense versus missense (P = 0.02996), versus INS (P = 0.05716), versus DEL (P = 1.0), versus silent (P = 0.0002895). h, Nonsense versus missense (P = 0.009422), versus INS (P = 0.02487), versus DEL (P = 1.0), versus silent (P = 0.001592). i, Boxplot of the LFC at D34 in the Nutlin-treated condition for pegRNAs ≥20% editing with annotated residue functions. In all boxplots (gi), boxes indicate the median and interquartile range (IQR) for each sample with whiskers extending 1.5× IQR past the upper and lower quartiles. LFC calculated from the median values across n = 3 biologically independent samples using MAGeCK. j, Spearman correlation between LFC of SNV-generating variants and CADD score at D34 in both conditions, as a function of the minimum correct editing threshold. k, Sensor editing for selected pegRNAs at D34 in the Nutlin-treated condition. Source data and code to reproduce this figure can be found at https://github.com/samgould2/p53-prime-editing-sensor/blob/main/figure3.ipynb.

As hypothesized, the dynamic range in the Nutlin-3-treated arm of the screen was considerably higher than in the untreated arm, with pegRNAs more strongly enriching and depleting in the presence of Nutlin-3 (Fig. 3c,d). Treatment with Nutlin-3 also selected for cells with higher-efficiency editing, improving the resolution of the screen (Fig. 3c,d). Editing of sensor loci continued throughout the screen due to the constitutive expression of PEmax, with sensor editing increasing fourfold on average from D4 to D16, and twofold from D16 to D34. However, sensor editing rates among negatively selected (LFC < −1), unselected (−1 ≤ LFC ≤ 1) and positively selected (LFC > 1) pegRNAs remained constant all throughout the screen (Extended Data Fig. 7). These results indicate that any differences in editing rate among the pegRNAs or cells in the population were unlikely to contribute to the results of the screen and were instead controlled internally.

Several putative pathogenic TP53 variants, including R196P and R267P, were strongly enriched in both treatment arms, with several pegRNAs per variant appearing as top hits (Fig. 3e,f). Given the higher dynamic range of the Nutlin-3 treatment arm, as well as the possibility that this treatment biases towards the discovery of dominant negative TP53 mutations2,6,16, we focused our analyses on this treatment group. Several TP53 variants showed significant enrichment in the Nutlin-3-treatment group, including SNVs and indels in the C-terminal half of the DNA-binding domain (DBD) and the OD (Fig. 3f). This includes several variants at residues 248 and 249, which are known mutational hotspots in p53 and commonly observed in cancer patients and individuals with the Li–Fraumeni cancer predisposition syndrome17. We also identified strongly depleting variants in the DBD that may retain WT p53 transcriptional activity or fail to exert a dominant negative effect on the p53 tetramer (Fig. 3f). Collectively, these results validate the utility of our approach and dataset to accurately identify functionally diverse pathogenic TP53 variants.

Interestingly, the most commonly observed TP53 mutation in human cancer, R175H, did not show strong enrichment despite the existence of several R175H pegRNAs exhibiting ≥10% editing efficiency. In fact, most of the top enriching variants we identified were not in known mutational hotspots19, suggesting that other types of variants can produce stronger phenotypes. These include the top hit—an insertion of a histidine between residues 254 and 255—as well as R196P and several variants in the OD (F328L, N345S, A347P) (Fig. 3f). These observations are consistent with the possibility that a subset of TP53 hotspot mutations are observed in part due to disproportionately high mutagenesis rates at the genomic level due to extrinsic and intrinsic factors, such as tobacco smoke and APOBEC (apolipoprotein B mRNA editing catalytic polypeptide-like) activity, rather than only to the fitness advantage conferred by these variants relative to other TP53 mutations6,19. An alternative explanation to these observations is that the hotspot variants were simply not efficiently engineered by the pegRNAs used in our screen. Although this was true for several hotspot variants, even hotspots with several efficient pegRNA designs (for example, R248Q, Y220C, G245S) were outcompeted by other, less frequently observed variants (Extended Data Fig. 8a–c). This includes rare-variant-encoding pegRNAs that outcompete hotspot variant-encoding pegRNAs with similar or identical empirical editing frequencies (Extended Data Fig. 8a–c). Even at the frequently mutated codon 248, the two most commonly observed substitutions, R248Q and R248W, were outcompeted in the screen by the rarer substitutions R248P and R248G (Extended Data Fig. 8d,e). These results suggest that the spectrum of cancer-associated TP53 mutations is mechanistically diverse and probably arises through the contextual combination of disproportionate mutagenesis rates and phenotypic selection of functionally important codons and their cognate residues.

Bulk quantification of pegRNAs grouped by variant class also revealed that nonsense variants were significantly more enriched compared with missense and silent variants. This is evident at several thresholds of pegRNA activity (Fig. 3g,h). As expected, silent variants tend to deplete, particularly when considering pegRNAs at higher threshold for editing efficiency (≥20%), bolstering our confidence in the fidelity of the screen (Fig. 3h). Using available annotations of p53 residue function48, we also found that, as expected, variants in DNA-binding/contacting residues displayed strong enrichment (including residues 248 and 273) (Fig. 3i). Intriguingly, certain variants involved in tetramerization and transactivation (for example, L22V) were also strongly enriched, despite the low frequency of mutation in these residues (Fig. 3i). Other observations are more difficult to interpret, such as the large variance in the enrichment of mutations that affect residues involved in zinc binding, or the fact that variants in partially exposed residues tended to deplete while those in buried and exposed residues tended to enrich (Fig. 3i). Altogether, these observations suggest that there is a large, underappreciated phenotypic variance in the relative fitness conferred by distinct TP53 variants—not all p53 variants are one and the same, a concept that is likely relevant across many other genes and more broadly in biology.

We also sought to quantify the degree of concordance between our screening results and widely used metrics of variant deleteriousness. To do so, we used the combined annotation-dependent depletion (CADD) score, which integrates evolutionary conservation of residues with other metrics of pathogenicity to generate a CADD score, with higher scoring variants predicted to be more deleterious49. We observed a low correlation between CADD score and enrichment of all SNV-specific pegRNAs. However, the correlation between CADD score and variant-specific fitness increased dramatically when we used sensor target sites to restrict our analysis to variants generated by high efficiency pegRNAs (Fig. 3j). We achieved a Spearman correlation of ~0.3 when considering Nutlin-3-treated pegRNAs with ≥15% editing activity, and >0.4 when considering pegRNAs ≥50% editing (Fig. 3j). Across all minimum pegRNA editing activity thresholds, the CADD score correlated more strongly with fitness of Nutlin-3-treated pegRNAs to the untreated condition. These results emphasize the significant advantage of including sensor target sites in prime editing screening libraries to quantitatively assess pegRNA efficiency and reinforce the ability of Nutlin-3 to effectively pull out genuine LOF and putative neomorphic and separation-of-function TP53 variants (Fig. 3j,k).

Sequencing of endogenous TP53 validates sensor approach

The above results demonstrate that sensor-calibrated quantification of pegRNA enrichment and depletion can be used to identify bona fide pathogenic TP53 variants. However, these analyses do not rule out the possibility that these changes are independent of true editing at the endogenous TP53 locus. To formally test whether prime editing sensor screens can faithfully quantify the effects of variants engineered at endogenous loci, we performed targeted next-generation sequencing of specific regions in exons 6, 7 and 10 of TP53 using genomic DNA extracted from untreated and Nutlin-treated cells from D4 and D34 timepoints (Fig. 4a). We reasoned that sequencing the native TP53 locus would allow us to directly compare the fold change in pegRNA counts with the fold change of variants installed at defined targeted sites within endogenous TP53.

Fig. 4: Sequencing of TP53 validates the prime editing sensor approach.
figure 4

a, Schematic of the sequencing of the TP53 locus. Regions within exons 6, 7 and 10 were sequenced from genomic DNA extracted at D4 and D34 in the untreated and Nutlin-treated arm of the screen. b, Correlation between the LFC in pegRNA counts and the LFC of endogenous variants at the TP53 locus for D34 versus D4 of the untreated arm of the screen at different thresholds of pegRNA editing activity. c, Correlation between the LFC in pegRNA counts and the LFC of endogenous variants at the TP53 locus for D34 versus D4 of the Nutlin-treated arm of the screen at different thresholds of pegRNA editing activity. d, Spearman correlation between the LFC in pegRNA counts and the LFC of endogenous variants at the TP53 locus as a function of the minimum sensor correct editing threshold for D34 untreated (blue) and D34 Nutlin-treated (red) samples. e, The percentage of variants detected (≥500 counts) at the TP53 locus at D34 (untreated) for variants with pegRNAs at different minimum sensor correct editing thresholds. Source data and code to reproduce this figure can be found at https://github.com/samgould2/p53-prime-editing-sensor/blob/main/figure4.ipynb.

Unlike nonsensor-based prime editing screens, which are likely to suffer from extensive noise during enrichment due to the difficulty of designing high efficiency pegRNAs, our sensor-based prime editing screen could be denoised empirically by filtering for pegRNAs that edited their cognate sensor above a given editing threshold. Indeed, we observed no correlation between the LFC in pegRNA counts with the corresponding LFC of endogenous variants engineered in the native TP53 locus when considering all pegRNAs (Fig. 4b–d). However, taking advantage of the sensor site to filter out pegRNAs below a given correct editing threshold dramatically reduced the noise in these data and revealed a strong correlation between the fold change in pegRNA counts and endogenous variant counts (Fig. 4b–c). This correlation increased monotonically as the minimum correct editing threshold increased, reaching a Spearman correlation >0.4 in the untreated arm and >0.5 in the Nutlin-treated arm of the screen when using a minimum correct editing threshold of 50% (Fig. 4d). We note that this is probably an underestimate of correlation on a per variant basis, as several pegRNAs with variable editing efficiencies are being compared directly with a single endogenous genomic site. Importantly, edited exonic sites targeted by top enriching pegRNAs (for example, R196P, A347P, I254_I255insH) were also enriched significantly relative to their WT counterparts and correlated strongly with their respective pegRNA counts. We were also able to detect nearly all of the variants engineered by active pegRNAs (≥500 counts per variant and ≥1% correct sensor editing), and the detectable fraction reached saturation when considering pegRNAs producing ≥14% editing at the sensor site (Fig. 4e). Altogether, these results emphasize the need to integrate quantitative, sensor-like approaches to accurately extract true signal from the high levels of noise that are inherent in large-scale prime editing screens. Indeed, our analysis demonstrates that screening pegRNAs without any empirical quantification of their editing activity invariably leads to spurious conclusions concerning the fitness of the variants that those pegRNAs are intended to engineer.

Functional validation of pathogenic TP53 variants

The above data demonstrate that pegRNA-specific sensor modules can be used to rigorously calibrate screening results to limit the analysis of variant fitness effects only to highly efficient pegRNAs. Though suggestive, these results do not formally prove that top scoring pegRNAs are enriched due to the introduction of defined genetic variants at the endogenous target locus and that these drive the observed biological differences. To test this, we selected a cohort of 29 pegRNAs that significantly enriched or depleted in the screen, or that generated commonly observed ‘hotspot’ mutations (Fig. 5a and Extended Data Fig. 9a). This set of pegRNAs targeted residues that spanned the TP53 locus and also included two control pegRNAs that install silent edits (Fig. 5a). In all, this validation set included low, medium and high efficiency pegRNAs spanning a range of 0–86% correct editing percentages, as measured by their respective sensor sites (Extended Data Fig. 9a). We transduced A549-PEmax cells with lentiviruses encoding individual pegRNAs and allowed editing to occur for 7–10 days, based on the kinetics of editing we observed previously (Fig. 1f). We then mixed each individual population of sequence-verified isogenic A549-PEmax-pegRNA cells with parental TP53 WT A549-PEmax cells and performed longitudinal fluorescence competition assays in the presence or absence of Nutlin-3 (Fig. 5b). We used the red fluorescent protein (RFP) fluorophore linked to each pegRNA vector to track the relative fitness of pegRNA cells (RFP+) compared with parental cells (RFP−) (Fig. 5c). These competition assays proceeded for 2 weeks, with flow cytometry readings taken every 7 days for each replicate (Extended Data Fig. 9b). We then calculated the difference in the RFP+ cell fraction (∆RFP%) for each pegRNA between the profiled timepoints and the initial timepoint in both conditions (Fig. 5d). Consistent with our screening results, a significant fraction of pegRNAs showed enrichment in the presence of Nutlin-3, often reaching complete saturation (Fig. 5d). Overall, there was strong concordance between the enrichment in cells observed in both treatment conditions in the screen and in these competition assays, supporting the reproducibility of the screening results (Fig. 5e,f). Importantly, we observed a significantly strong enrichment of cells expressing pegRNAs designed to engineer variants in the OD, including A347P and N345S (Fig. 5e,f).

Fig. 5: Functional validation of pathogenic TP53 variants identified with prime editing screening.
figure 5

a, LFC of 29 pegRNAs selected for validation at D34 in the Nutlin-treated condition. Variants with insufficient control counts (I195T, E285K, G325A, Y327Ter, A347D) are represented by LFC = 0. b, Schematic of the competition assay methodology. A549-PEmax cells are transduced with individual pegRNAs with an mScarlet fluorescent marker. After 7–10 days, these A549-PEmax-pegRNA cells are mixed with parental, uncolored A549-PEmax cells and split into untreated or Nutlin-treated conditions. Flow readouts of the RFP+ cell fraction are then taken at several timepoints. c, Selected representative competition assays show the change in the RFP+ cell fraction in the presence (orange) or absence (blue) of Nutlin-3. Data are presented as mean values at each timepoint, with a 95% confidence interval. d, ∆RFP% from the D0 to the D7 and D4 timepoints for each variant in the presence or absence of Nutlin-3; n = 3 biologically independent replicates performed per condition. Data are presented as mean values with a 95% confidence interval. e, Scatterplot of the LFC of Nutlin-treated pegRNAs at D34 of the screen, and the corresponding ∆RFP% at D7 in the competition assay. Points colored by endogenous editing percentage, with ‘X’ indicating no endogenous editing data. f, Same as e but for the untreated condition. g, Scatterplot of sensor editing percentage at D16 and the corresponding endogenous editing percentage of pure A549-PEmax-pegRNA cell lines for pegRNAs included in competition assays. Source data and code to reproduce this figure can be found at https://github.com/samgould2/p53-prime-editing-sensor/blob/main/figure5.ipynb. rp, Pearson correlation; rs, Spearman correlation.

The above results indicate that cells harboring a number of pegRNAs designed to engineer diverse types of TP53 variants have an increased fitness. However, these results do not rule out the possibility that these pegRNAs confer increased fitness through indel generation, rather than through their encoded edits. To assess this, we performed targeted NGS of each endogenous target loci in pure A549-PEmax-pegRNA cell lines (that is, before mixing with the parental population). Comparison of the endogenous editing observed in these cell lines with the corresponding sensor editing observed in the untreated arm of the screen at a similar timepoint (D16) revealed a strong correlation, with on-target editing observed for almost all pegRNAs (Fig. 5g).

Another potential application of our approach is to combine high-throughput prime editing with drug treatments to identify variant–drug interactions that could be exploited to develop allele-specific therapies. This is particularly relevant today because recent advances in rational drug design have shown that small molecules targeting specific mutant proteins (including those produced by oncogenic point mutant KRAS alleles) can have therapeutic potential50. To test whether our approach could be used to identify variant-specific therapeutic sensitivities, we tested two small molecules that have been shown to exhibit mutant-p53-specific effects, COTI-2 and PK7088, in cells transduced with lentiviruses encoding R175H- and Y220C-targeting pegRNAs, respectively51,52,53. Both of these treated populations showed depletion in the RFP+ cell fraction (Extended Data Fig. 9c). In particular, the COTI-2 treatment arm showed significant depletion of R175H-pegRNA cells (Extended Data Fig. 9c). These results demonstrate that prime editing sensor screens could be used to systematically identify variant-specific vulnerabilities to diverse therapies, augmenting cDNA-based approaches for performing similar screens54.

Prime editing reveals new pathogenic variants

High-throughput functional genomics approaches have been used previously to investigate TP53 mutations. For instance, Giacomelli et al. performed deep mutational scanning of TP53 variants using exogenous overexpression of mutant TP53 cDNAs in A549 cells in the presence or absence of Nutlin-3, concluding that most TP53 mutations probably arise as a consequence of endogenous mutational processes that select for dominant negative and LOF activity6. A follow-up study integrated this method with HDR-based modeling of six TP53 hotspot mutations in human leukemia cells and concluded that missense mutations in the TP53 DBD act mainly through dominant negative activity2. More recently, Ursu et al. employed a modified version of Perturb-seq to interrogate the transcriptional effects of 200 mutant TP53 cDNAs in A549 cells by single-cell RNA-sequencing, also concluding that most of these disrupt p53 activity through LOF and dominant negative effects16. In contrast, another study used a similar approach in H1299 and HCT116 cell lines to interrogate variants in the TP53 DBD through parallel in vitro and in vivo experiments, concluding that certain hotspot mutations confer a higher proliferative advantage in vivo, probably through GOF mechanisms7. A number of studies in both mice and humans have also demonstrated that certain TP53 variants, including hotspot mutations at residues R175, R248 and R273, can produce phenotypes consistent with neomorphic/GOF activities55. These include promoting aberrant self-renewal of hematopoietic stem cells56, sustaining tumor growth57,58 and promoting metastatic dissemination59,60,61,62, among others55. As such, there is much controversy in the field regarding the precise cellular and molecular activities of cancer-associated TP53 mutations—the most common genetic lesions observed across all types of cancer.

We hypothesized that cDNA screening approaches are biased in favor of detecting dominant negative activities due to their reliance on supraphysiological overexpression of mutant proteins. This is particularly relevant for studying the active p53 transcription factor, which is a tetrameric protein composed of a dimer of dimers18. As such, we postulated that mutant overexpression studies in WT TP53 cells, including A549, may fail to detect mutant allele-specific activities and phenotypes that may be sensitive to gene dosage and protein stoichiometry. To test this hypothesis, we reanalyzed our data to perform comparative bioinformatic analyses with the dataset generated by Giacomelli et al.6, as their experiments were also carried out in WT TP53 A549 cells treated with Nutlin-3. First, we plotted the Z-scores of SNV-generating pegRNAs (≥10% editing) against the Z-scores of the corresponding variants expressed from cDNAs (Fig. 6a). Supporting our hypothesis, variants in the OD of p53 tended to deplete (that is, Z-score <0) when expressed from cDNAs, but often enriched significantly when expressed from the endogenous locus (Fig. 6a,b). To investigate this difference further, we calculated the difference in Z-scores (∆Z-score) between each pegRNA–cDNA pair by subtracting the cDNA Z-score from the prime editing Z-score. This analysis revealed a significantly higher ∆Z-score for endogenous variants in the OD relative to other domains of p53 (Fig. 6c)—a pattern that consistently held at several thresholds of pegRNA activity (Extended Data Fig. 10a) and even when we restricted our analysis solely to the most efficient pegRNA for each variant (Fig. 6d and Extended Data Fig. 10b).

Fig. 6: Comparative analysis of prime editing and cDNA screening datasets of TP53 variants reveals pathogenic variants in the OD.
figure 6

a, Scatterplot of prime editing Z-score for pegRNAs ≥10% editing at D34 in the Nutlin-treated condition, and the corresponding cDNA variant Z-score in the p53-WT background in the presence of Nutlin, colored by p53 domain. b, cDNA (red) and prime editing (blue) Z-scores for pegRNAs/variants located in the OD. The pegRNA with the highest Z-score is labeled. c, The difference in Z-scores between prime editing and cDNA screens (∆Z-score) for pegRNAs ≥10%, separated by p53 domain. OD versus DBD (P = 6.025 × 10−85), versus PRR (P = 1.396 × 10−26), and versus TAD (P = 3.684 × 10−15). d, Boxplot of the Z-scores for variants in the cDNA and prime editing screens (P = 1.192 × 10−6), considering only the most efficient pegRNA for each variant. Statistics for c and d shown for two-sided t-test with Bonferroni correction. *P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001; ****P ≤ 0.0001; NS, not significant (P > 0.05). In all boxplots (c and d), boxes indicate the median and IQR for each sample with whiskers extending 1.5× IQR past the upper and lower quartiles. All Z-scores were calculated from the LFC of each pegRNA, in turn calculated using MAGeCK from the median values across n = 3 biologically independent samples. e, Visualization of the residue-averaged ∆Z-scores on an NMR-structure of p53 OD (PDB: 1OLG). f, Scatterplot of the cDNA Z-score of TP53 variants and the corresponding ∆RFP% of cDNAs tested with competition assays. g, Comparison of the ∆RFP% for cDNAs (D10) and corresponding pegRNAs (D14) for variants tested with competition assays. Points marked with ‘X’ indicate a replicate with an insufficient viable cell count (<500) to determine the RFP+%. In this case, the RFP+% was quantified as unchanged from the previous timepoint for the matched replicate. n = 3 biologically independent replicates performed per condition. Data are presented as mean values with a 95% confidence interval. Source data and code to reproduce this figure can be found at https://github.com/samgould2/p53-prime-editing-sensor/blob/main/figure6.ipynb. EV, empty vector control; WT, wild type TP53 control.

Some of the most impactful variants in the p53 OD are observed frequently in individuals with Li–Fraumeni syndrome, who carry germline TP53 variants that predispose them to cancer. Two independent studies by the Prives and Lozano laboratories recently showed that p53 proteins harboring A347D mutations (in the OD) form stable dimers instead of tetramers, and that these dimeric p53 proteins exhibit neomorphic activities63,64. Visualizing the residue-averaged ∆Z-scores on the structure of the p53 OD65 further highlights the extensive differences in the behavior of endogenous variants as compared with exogenous (cDNA) variants in this domain (Fig. 6e).

To further investigate the phenotypic differences between endogenous and exogenous TP53 variants, we performed fluorescence competition assays with Nutlin-treated A549-PEmax cells transduced with pegRNAs or matched cDNAs representing specific variants spanning the TP53 DBD and OD regions. We also included a number of important controls for each approach, including silent-edit- or no-edit-inducing pegRNAs that targeted the same loci as their matched variant-inducing pegRNAs, as well as empty vector and WT cDNA constructs. We found a strong agreement between the behavior of mutant TP53 cDNAs tested in competition assays and their corresponding Z-scores in the Giacomelli et al. screen6, validating our assay (Fig. 6f). However, comparing the enrichment of cells harboring endogenous variants engineered with pegRNAs relative to those expressing exogenous variant cDNAs revealed large differences in the behavior conferred by endogenous and exogenous TP53 mutations (Fig. 6g and Extended Data Fig. 10c). Variants in the DBD behaved similarly across both systems, with the exception of R110P and L145P, which enriched only in the cDNA group. However, all OD variants failed to enrich when tested with cDNAs, but three-quarters of OD variants displayed strong enrichment when engineered endogenously using prime editing (Fig. 6g). Importantly, all controls behaved as expected, failing to enrich in the competition assays (Fig. 6g). These results support the hypothesis that certain variant-induced phenotypes can be observed accurately only when engineered and expressed in the endogenous genomic context.

Collectively, our observations highlight gene dosage, protein stoichiometry and protein–protein interaction domains as important variables that must be taken into account when studying the enormous diversity of mutant alleles observed in human cancer. Failing to take these considerations into account might lead to the misclassification of bona fide pathogenic variants, including those identified in patients with hereditary cancer predisposition syndromes like Li–Fraumeni.

Discussion

Most genetic variants associated with various human diseases, including cancer, remain uncharacterized40. The development of CRISPR-based precision genome editing tools, including base and prime editing9,10, has opened the door to rigorous experimental interrogation of disease-associated variants with single basepair resolution at unprecedented scale. However, these approaches, particularly prime editing, remain limited by the variance in editing efficiency among different pegRNAs.

Building on our previous work, we developed a prime editing sensor-based framework for engineering and screening variants that overcomes these limitations. Using this sensor-based approach, in which each pegRNA is coupled to an artificial version of its endogenous target site, we simultaneously identify enriched pegRNAs while empirically quantifying their editing efficiencies. This approach allows for the characterization of variants at several target sites while correcting for the potentially confounding differences in editing activity among distinct pegRNAs. To facilitate the creation of similar libraries by other researchers, we also built PEGG—a new computational tool for creating prime editing sensor libraries (https://pegg.readthedocs.io/en/latest/).

As a prototype for the prime editing sensor framework, we generated a library of pegRNAs targeting over 1,000 cancer-associated variants in TP53. We reasoned that p53 would be the most salient prototype for testing the efficacy of our prime editing sensor screening platform because of its central role in cancer development and progression, the various studies that have performed deep mutational scanning of p53 whose data can provide direct sources of comparison and the ongoing controversy in the field as to whether most (if not all) TP53 mutations observed in cancer patients are functionally redundant.

With our sensor-based approach, we were able to systematically search the pegRNA design space and identify high efficiency pegRNAs, allowing us to install over half of the targeted TP53 variants. Although not the focus of the present study, the breadth and depth of our dataset also allowed us to recapitulate many of the previous findings about the factors affecting pegRNA efficiency.

Analysis of the screening data revealed a wide range in the fitness of TP53 variants, challenging the idea that most p53 variants, particularly those in the DBD, are functionally redundant. While we identified strongly enriching pegRNAs, including many that generated commonly observed ‘hotspot’ mutations at DNA-contacting residues, most of the strongly enriching pegRNAs encoded variants that are not located at mutational hotspots. Instead, a number of these mutations are less frequently observed in patients and remain poorly understood, despite the fact that they collectively affect thousands of patients globally every year. These include variants located within the OD of p53, which were functionally validated with follow-up experiments, and some of which were recently shown to be bona fide pathogenic variants in humans with Li–Fraumeni syndrome63,64.

Comparison between our screening data and a previous study that also screened TP53 variants in A549 cells and in the presence of Nutlin-3 but instead used cDNA-based overexpression libraries revealed a statistically significant enrichment of endogenous—but not overexpressed—variants located in the p53 OD. We further validated this finding with functional assays comparing the behavior of cells harboring exogenous or endogenous TP53 variants. This comparison highlights the potential limitations of cDNA-based screening approaches, particularly when studying variants at sites of protein–protein interactions, underscoring the need to study variants in their native context to access their true biology. Together, our data suggest that stoichiometric imbalances produced by cDNA overexpression could lead to the misclassification of genetic variants as noncausal or otherwise benign. Our findings thus offer a cautionary note when using exogenous overexpression systems to interpret pathogenic alleles, and highlight the importance of using strategies like the one described in this work to investigate variants of interest in their native genomic contexts whenever possible.

More broadly, our study provides a conceptual blueprint and a modular set of experimental and computational tools that can be applied to evaluate diverse types of genetic variants in their native endogenous genomic contexts with high-throughput prime editing. For example, the prime editing sensor strategy described here could be used to investigate the influence of endogenous coding variants on drug resistance or other cancer-relevant cellular phenotypes, while maintaining native levels and regulation of the proteins of interest. Alternatively, our approach could be applied to interrogate the effects of noncoding variants at diverse loci, assessing gene regulatory biology not easily amenable to other screening approaches.

Future prime editing sensor screening efforts could incorporate improved prime editors and pegRNAs, as well as higher-efficiency prime editing systems and strategies like PE3 and PE4 (ref. 10,21). These studies could also be performed in vivo, for example, by delivering compact libraries of mismatch repair-evasive pegRNAs into mice expressing prime editors66 or by codelivery of smaller prime editors, such as PE6a23. Taken together, we envision that our approach will expand our understanding of pathogenic gene variants and help match patients suffering from genetic diseases with effective therapies.

Methods

Experimental materials and methods

Plasmids and pegRNA cloning

All plasmids were generated using Gibson Assembly strategies67 using NEBuilder HiFi DNA Assembly Master Mix (NEB cat. no. E2621) following the manufacturer’s protocol. All new plasmids, along with detailed maps and sequences, will be made available through Addgene. The PEmax coding sequence in Lenti-EFS-PEmax-P2A-Puro was obtained from pCMV-PEmax (Addgene, cat. no. 174820)21. The lentiviral plasmid used to clone and express prime editing sensor libraries was assembled by transferring the U6-sgRNA-EFS-Blast-P2A-TurboRFP cassette from pUSEBR (ref. 14) into the higher titer pLV backbone68. Lenti-PEAR-mCherry—a modified all-in-one lentiviral version of the PEAR reporter43—was also cloned using Gibson Assembly and used to test the editing activity of A549-PEmax cells. The Lenti-UPEmS-tevo plasmid is a modified version of the UPEmS vector66 that contains the tevopreQ1 motif22. This plasmid was used to assemble pegRNAs via Golden Gate Assembly66 for follow-up pegRNA validation experiments. Human WT or mutant TP53 cDNAs were cloned into pCDH-EF1-MCS-IRES-RFP (System Biosciences, cat. no. CD531A-2) using primers EcoRI-TP53-Fwd (5′-CAGTCAGAATTCGCCACCATGGAGGAGCCGCAGTCAG-3′) and BamHI-TP53-Rev (5′-CTGACTGGATCCTCAGTCTGAGTCAGGCCCTTCTGTCTTGAAC-3′). Fragments encoding each cDNA were obtained from Twist Biosciences (Supplementary Table 1).

Virus production

Lentiviruses were produced by cotransfection of HEK293T cells with the relevant lentiviral transfer vector and packaging vectors psPAX2 (Addgene, cat. no. 12260) and pMD2.G (Addgene, cat. no. 12259) using Lipofectamine 2000 (Invitrogen, cat. no. 11668030). Viral supernatants were collected at 48- and 72-h posttransfection and stored at −80 °C.

Drug treatments

Nutlin-3 (Selleck Chemicals, cat. no. S1061) was dissolved in dimethylsulfoxide at a stock concentration of 10 mM and used at a final concentration of 10 μM. PK7088 (Aobious cat. no. AOB4255) was diluted to a final concentration of 200 µM from a stock concentration of 10 mM. COTI-2 (MedChemExpress cat. no. HY-19896) was dissolved in dimethylsulfoxide to a stock concentration of 10 mM and added to a final concentration of 1 µM.

Flow cytometric analyses

Fluorescence-based measurements for the validation of prime editing activity with PEAR and for competition assays were performed using the BD FACSCelesta Cell Analyzer in tube or plate reader format, with BD FACSDiva v.9.0 software used for data collection. Downstream analysis was performed using FlowJo v.10.9.0 to identify single cells and quantify fluorescence.

Generation of A549-PEmax cell lines

To generate A549 cells stably expressing PEmax, we transduced a 15-cm plate with 2.5 million cells with freshly harvested EFS-PEmax-P2A-puromycin lentivirus, and selected cells with 10 µg ml−1 of puromycin at 72 h posttransduction. To assess the prime editing efficiency of these cells, we transduced 250K A549-PEmax cells in triplicate in six-well plates with Lenti-PEAR-mCherry—a modified, all-in-one lentiviral PEAR construct where green fluorescent protein is turned on in the event of successful prime editing. Based on the PEAR reporter activity, we noticed that prime editing activity was not sufficiently high in these cells. We then retransduced these cells with successive rounds of freshly harvested EFS-PEmax-P2A-Puromycin lentivirus. Repeating the PEAR reporter assay revealed a substantial increase in prime editing activity (Extended Data Fig. 2a). This A549-PEmax ‘v2’ cell line was used throughout the present study.

Cloning of p53-sensor library

The oligonucleotide library was ordered from Twist Biosciences. The lyophilized library was resuspended in 100 µl of TE buffer (pH 8.0) and diluted to create 1 ng µl−1 stocks. We performed n = 32 PCR reactions with NEBNext High-Fidelity 2× PCR Master Mix (cat. no. M0541S) to amplify the library with the following primers at a low cycle count: forward 5′-CATAGCGTACACGTCTCACACCG, reverse 5′-GTGCCGTTGACGACCGGATCTAGAATTC. These PCR reactions were pooled and purified using the Qiagen PCR purification kit following the manufacturer’s protocols, with 10 µl of 3 M Na acetate pH 5.2 added for every five volumes of PB used per one volume of PCR reaction. The library was digested with Esp3I (NEB) and EcoRI-HF (NEB), pooled and purified. Subsequently, n = 16 ligations were performed using 300 ng of digested and dephosphorylated Trono-BR backbone and 3 ng of digested insert with high concentration T4 DNA Ligase (NEB, cat. no. M0202M). The ligation reactions were precipitated using QuantaBio 5PRIME Phase Lock Gel tubes before being resuspended in 3 µl of EB Buffer per four precipitated reactions. These precipitated ligation reactions were electroporated into Lucigen Endura ElectroCompetent cells (cat. no. 60242-2) before being plated on LB-carbenicillin plates and incubated at 37 °C for 16 h. Library representation was assessed at this step via serial dilution plating, displaying a representation on the order of 400×. We also picked 30 random colonies from these serial dilution plates to assess the fidelity of library cloning and to test a random set of pegRNAs. We scraped the plates and collected the bacteria in 250 ml of LB-ampicillin per four plates, before incubating for 2 h at 37 °C, collecting the bacteria by centrifugation, and proceeding to perform a Qiagen Maxiprep, following the manufacturer’s protocol. Lentivirus was generated via the aforementioned protocol, and viral titer was determined through serial dilutions of virus, transductions in 12-well plates with one million A549-PEmax cells, and measurement of the RFP-positive cell fraction at 72-h posttransduction. For extended protocol information, see Supplementary Protocol 1.

Screening protocol

For each replicate, 110 million A549-PEmax cells were combined with an appropriate amount of p53-sensor virus to achieve a multiplicity of infection < 1. To this mix, puromycin was added to a final concentration of 10 µg ml−1 and polybrene transfection reagent (Sigma-Aldrich, cat. no. TR-1003) was added to a final concentration of 8 µg ml−1 in F-12K (Gibco, cat. no. 21127030) medium supplemented with 10% FBS and 1× Penicillin-Streptomycin (ThermoFisher). This mix was plated into nine 12-well plates per replicate. At 24-h posttransduction, each 12-well plate was expanded to a 15-cm plate, with medium supplemented with 10 µg ml−1 puromycin and 10 µg ml−1 blasticidin S. These puromycin and blasticidin S concentrations were maintained throughout the screen. At 72-h posttransduction, each 15-cm plate was expanded to two 15-cm plates. At 96-h posttransduction, each replicate was replated at ≥1,000× representation (≥29 million cells) for the untreated arm and the Nutlin-3 treatment arm. For the Nutlin-3 treatment arm, Nutlin-3 was added to a final concentration of 10 µM. At this timepoint, a cell pellet was taken for gDNA extraction. All cell pellets included ≥29 million cells (1,000× representation) and were stored at −80˚C. Subsequently, every 3 days, the cells were split, and replated at 1,000× representation. At each timepoint, a cell pellet was taken if there were a sufficient number of cells to allow for 1,000× representation. This process was repeated until the screen was terminated at D34 posttransduction.

Genomic DNA extraction

Genomic DNA from the D4, D16, D25 and D34 timepoints of the screen was extracted using the Qiagen Genomic Tip/500G following the manufacturer’s protocol. Genomic DNA was resuspended in 200 µl of TE Buffer, pH 8.0. Concentrations were measured using a NanoDrop 2000 (ThermoFisher) and were normalized to 1 µg µl1. For the competition assays (Fig. 5), genomic DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen), following the manufacturer’s protocol.

NGS sample preparation

We performed n = 30 PCR1 reactions per sample using Q5 High-Fidelity 2× Master Mix (NEB, cat. no. M0429S) with 10 µg of genomic DNA to maintain ≥1,000× representation. Up to four PCR reactions were pooled and purified using the Qiagen PCR purification kit following the manufacturer’s protocols. These reactions were then gel purified (Qiagen Gel Extraction Kit), pooled and measured using the NanoDrop 2000 (ThermoFisher). We performed n = 4 PCR2 reactions per sample using 10 ng of PCR1 as a template in each reaction. We PCR-purified and then gel purified these samples and eluted in 30 µl of EB Buffer. These samples were then submitted for sequencing. The PCR1 and PCR2 strategies and the deconvolution protocol are described in Supplementary Protocol 2. All primers are listed in Supplementary Table 1. A similar PCR1/2 strategy was used for preparation of endogenous TP53 amplicons for NGS with the Singular G4 sequencing system (Figs. 4 and 5). These protocols are described thoroughly in Supplementary Protocol 2 and the associated primers can be found in Supplementary Table 1.

Next-generation sequencing

We performed Amplicon-EZ sequencing (Azenta) for analysis of the correlation between sensor and endogenous editing (Fig. 1d). For the NGS of the p53-sensor library, we used the NovaSeq S1 200 sequencing system (NovaSeq 6000) with a custom sequencing primer set to amplify the protospacer, 3′ extension, sensor sequence and sample barcode in separate reads. All other NGS data were generated using the Singular G4 Sequencer (2 × 150 paired-end) with stock primers. All sequencing primers are listed in Supplementary Table 1. The custom sequencing approach for the NovaSeq 6000 is diagrammed in Supplementary Protocol 2.

Golden Gate assembly of UPEmS pegRNAs for follow-up validation

For follow-up validation of pegRNAs, individual pegRNAs were cloned via Golden Gate assembly into the Lenti-UPEmS-tevo backbone (generated by the present study). Golden Gate assembly was performed with annealed spacer oligonucleotides, annealed and phosphorylated scaffold oligonucleotides, and annealed 3′ extension oligonucleotides using NEB BsmBI Golden Gate enzyme mix, before being transformed, mini-prepped (Qiagen) and validated via whole-plasmid sequencing (Primordium). For full protocol details, see Supplementary Protocol 3. The full list of oligonucleotides used for cloning can be found in Supplementary Table 1.

Competition assays

To generate variant p53 lines, we seeded 100,000 A549-PEmax cells in six-well plates and added UPEmS lentivirus corresponding to each variant. To achieve saturation editing, we waited 7–10 days, expanding the cells to a 10-cm plate when they reached confluence, and took a cell pellet for gDNA extraction to assess editing. At this point, we mixed 250,000 variant (RFP+) cells with 750,000 untransduced A549-PEmax cells, and plated 50,000 cells in triplicate in six-well plates. For drug-treated conditions (Nutlin-3, COTI-2, PK7088), the compound was added to the appropriate concentration. Remaining cells were used for flow analysis and to generate a t = 0 cell pellet. At D7 and D14, we assessed the RFP+ fraction of the cells via flow cytometry. The flow gating strategy is displayed in Supplementary Fig. 1. For these analyses, we applied a stringent threshold of ≥500 quantifiable events (that is, single cells) because we found that samples with ≤500 quantifiable events, which were typically observed in cells treated with Nutlin-3 that underwent cellular senescence and/or apoptosis, were insufficient to accurately calculate the RFP+ cell fraction. In these cases, we assumed that the RFP+ cell fraction was unchanged from the previous timepoint, akin to a standard 3T3/proliferation assay.

Analytic/computational methods

Selection of TP53 variants and prime editing sensor library generation with PEGG

To select a cohort of TP53 variants for generating a prime editing sensor library, we used the MSK-IMPACT database20. We chose all SNVs observed in patients, as well as a collection of observed and random indels to increase the diversity of edits (Extended Data Fig. 1b). In addition, PEGG automatically generated 95 neutral/silent variants that tiled the TP53 locus to act as internal controls in the screen.

PEGG generated a maximum of 30 ranked pegRNA designs per variant with RTT lengths of 10, 15, 20, 25 and 30 nucleotides, and PBS lengths of 10, 13 and 15 nucleotides coupled to ‘NGG’ protospacers. A ‘G’ was appended to the start of each 20-nucleotide protospacer to improve U6 promoter-mediated transcription. After PEGG generated these pegRNA designs, we further filtered the library to exclude pegRNAs containing polyT termination sequences (≥4 consecutive Ts), EcoRI and Esp3I sites, and protospacers with an MIT specificity score less than 50. In addition, each pegRNA oligo included a matched, 60 nt sensor locus that was generated automatically by PEGG and used to link each pegRNA to its editing outcome.

For full details of generating a prime editing sensor library using PEGG, visit https://pegg.readthedocs.io/en/latest/.

Analysis of the p53-sensor screen

The p53-sensor sequencing results were demultiplexed into separate fastq files based on the sample barcode. Next, using a custom analysis script, we filtered reads with an average Phred quality score below 30, and identified pegRNAs based on the protospacer and 3′ extension sequences. Sequences with no matching protospacer or 3′ extension were discarded, and sequences with mismatched protospacer and 3′ extension sequences were discarded and classified as recombination events. Sequences with matching protospacer and 3′ extension sequences were used to generate pegRNA counts tables that were subsequently used for MAGeCK analysis (v.0.5.9) of pegRNA enrichment/depletion.

To classify editing outcomes at the sensor locus, we first determined whether recombination had occurred to decouple the pegRNA from its matched target sequence. To do so, we used the first and last five nucleotides of the sensor sequence as a barcode to detect recombination. Sensor reads with the first and last five nucleotides of the read matching the appropriate pegRNA were classified as correct sensor reads, while those with the first and last five nucleotides matching other pegRNAs were classified as recombination events and discarded. We noted that recombination between the pegRNA and sensor was observed at a higher rate when the protospacer was in the same orientation as the sensor, prompting us to update PEGG to automatically place the sensor sequence in the reverse orientation to reduce recombination in future PE sensor libraries (Extended Data Fig. 2e,f). Reads with the first and last five nucleotides with no match were classified as potential indels and retained. For each sample, the sensor reads that were not recombined were demultiplexed into separate fastq files for each pegRNA. We then used Crispresso2 (ref. 69) to classify editing outcomes, excluding the first and last five nucleotides of the sensor read from the quantification window. To determine the background subtracted correct editing percentage, we subtracted the correct editing percentage observed in the plasmid library from the correct editing percentage observed at a given timepoint, although we note that for plasmids with at least ten sensor reads, the median correct editing percentage was 0%, the average correct editing percentage was <0.1% and the maximum observed correct editing percentage was 8.7%.

For analysis of enrichment/depletion of pegRNAs, we used the MAGeCK algorithm47, with the D4 sample designated as the control timepoint. We then filtered to exclude pegRNAs with a control count mean <10 reads to reduce spuriously enriching pegRNA hits. For direct comparison with the cDNA libraries, the LFC values produced by MAGeCK were transformed into Z-Scores using the standard Z-score formula including all pegRNAs under consideration.

Processing and analysis of TP53 endogenous amplicon NGS sequencing

The sequencing files were automatically demultiplexed into separate fastq files based on the sample barcode. Next, we trimmed the sequences from 150 nucleotides to 100 nucleotides, to allow the sequences to be joined. The sequences were joined using the fastq-join (v.1.3.1) algorithm with the default parameters enabled70.

For analysis of genomic DNA amplified from the screen (Fig. 4), we then used custom analysis scripts to generate counts tables for all of the unique sequences, merge matching samples from different flow cells and determine the HGVSp and HGVSc of each sequence. To determine the LFC of each variant at these endogenous loci, we first filtered to exclude undesired variants (that is, those not targeted by pegRNAs in the library) and created counts tables for D4, D34 (untreated) and D34 (Nutlin-treated) for each of the three amplicons. For each amplicon, we used MAGeCK to normalize read counts between samples and determine the LFC of each variant. We then filtered endogenous variants with a control count of fewer than ten reads to reduce spuriously enriching variants. These MAGeCK tables from the different amplicons were concatenated to perform downstream analysis, comparing the endogenous variants with the pegRNA–sensor sequencing results. In addition, we performed sequencing of the WT A549-PEmax cell line, which confirmed the WT status of the regions amplified.

For analysis of genomic DNA amplified from the individually transduced A549-PEmax-pegRNA cells generated for competition assay testing (Fig. 5), we used Crispresso2 to classify editing outcomes, excluding the first and last five nucleotides of the sensor read from the quantification window69.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.