Maximizing the therapeutic potential and improving the efficacy and safety of engineered cells often require the expression of large genes or of complex gene circuits1. For example, chimeric antigen receptor (CAR) T cells leverage a single engineered receptor to rewire their cytotoxicity towards cancers2,3. Efforts to improve their functionality, reduce toxicity or move beyond blood cancer into solid tumours, ageing, autoimmunity and viral clearance have leveraged combinations of tools that increase the ability of T cells to sense, process and respond to the disease. Biomolecular tools such as next-generation receptors4,5,6, clustered regularly interspaced short palindromic repeats (CRISPR) activation (CRISPRa) and CRISPR interference (CRISPRi)7,8,9,10, and logic gates11,12,13 could enhance this therapeutic modality, but their long-term expression often poses a major challenge in primary cells.

Transgene silencing is a major hindrance to using these tools for sophisticated cell engineering and therapy14,15. Semi-random integration of genetic payloads into primary cells using lentiviruses or retroviruses often leads to a reduction (or even complete loss) of transgene expression over time16. There are numerous mechanisms enabling a cell to silence a transgene, including active epigenetic silencing of the gene or transcript through recognition of viral cis elements17,18,19. During tool development, silencing is often overlooked as the tools are optimized using transfection, in cell lines that have a more static phenotype, or at timescales that are much shorter than needed for cell therapy.

To overcome this hurdle, we hypothesized that placing a transgene directly upstream and in-frame of an endogenous essential gene would ensure long-term stability of difficult-to-express or even toxic payloads (Fig. 1). This puts the tool under the endogenous control of the essential gene’s transcriptional and epigenetic regulation, ensuring its long-term expression across any phenotype and avoiding cis elements like promoters that target the transgene for silencing. To test this, we searched for a homology-directed repair (HDR) strategy that was efficient, had low toxicity and could knock in large payloads in primary human cells. Although methods combining CRISPR–Cas9 and adeno-associated virus (AAV)20,21,22 or polymerase chain reaction (PCR)-based DNA23,24 have been used, these methods either do not allow for large knock-ins (AAV packaging limit <4.7 kb) or are too toxic to work in primary cells robustly. In this Article, instead, we engineered CRISPR for long-fragment integration via pseudovirus (CLIP), a comprehensive strategy to stabilize the expression of easily silenced transgenes.

Fig. 1: Schematic design of CLIP.

To unlock the potential of engineered cell therapies, an ideal engineering strategy enables efficient delivery of large genetic payloads using a low cytotoxic method such as pseudoviruses. However, current strategies that have these features (such as lentiviruses) randomly integrate into the genome, leading to silencing of the payload. CLIP enables HDR-mediated knock-in of large gene cassettes into essential genes, generating stable payload expression that benefits the therapeutic potential of the engineered cell.



CLIP enables high-efficiency knock-in into an essential locus

To create CLIP, we leverage integrase-deficient lentiviruses (IDLV) to deliver an HDR donor template as a viral RNA genome to cells25,26. IDLV consists of lentiviral components but with a D64V mutation in the integrase, maintaining this enzyme’s ability to package and nuclear localize viral genomes but dramatically reducing its ability to randomly integrate into the genome. A CLIP pseudoviral genome encodes a genetic payload flanked by two homology arms (HAs) that are homologous to the gene targeted for integration (Fig. 2a). After transduction, the delivered viral RNA genome is reverse transcribed, converting it into a double-stranded DNA donor for HDR. CRISPR ribonucleoprotein (RNP) is subsequently electroporated into cells to create a targeted double-stranded break in the host genome, enabling HDR. Importantly, the CRISPR RNP also processes the pseudoviral HDR template at both the 5′ and 3′ ends, as CLIP donors are flanked by two protospacer adjacent motifs and a protospacer (collectively, cut sites) matching the genomic target. This removes the long terminal repeat (LTR) sequences and other viral cis elements.
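
The donor layout described above can be illustrated with a minimal sketch. It assumes placeholder strings rather than real sequences, and the names `ClipDonor` and `liberate_template` are hypothetical and not part of the published method. It models only the order of elements and the removal of the viral backbone; as a simplification, each cut site is treated as removed whole, whereas Cas9 cleaves inside the protospacer and leaves a few bases on the template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipDonor:
    """Order of elements in a CLIP pseudoviral genome (placeholder strings)."""
    upstream_backbone: str    # viral cis elements such as the 5' LTR
    cut_site: str             # protospacer + PAM matching the genomic target
    left_ha: str              # 5' homology arm
    payload: str              # for example GFP-P2A
    right_ha: str             # 3' homology arm
    downstream_backbone: str  # viral cis elements such as the 3' LTR

    def genome(self) -> str:
        # Backbone, cut site, HA, payload, HA, cut site, backbone.
        return "".join([
            self.upstream_backbone, self.cut_site, self.left_ha,
            self.payload, self.right_ha, self.cut_site,
            self.downstream_backbone,
        ])


def liberate_template(donor: ClipDonor) -> str:
    """Return the fragment between the two flanking cut sites."""
    genome = donor.genome()
    first = genome.find(donor.cut_site)
    last = genome.rfind(donor.cut_site)
    if first == -1 or first == last:
        raise ValueError("donor must contain two flanking cut sites")
    return genome[first + len(donor.cut_site):last]


# Hypothetical placeholder tokens standing in for real sequences.
example = ClipDonor("LTR5-", "[CUT]", "HA_L-", "GFP_P2A-", "HA_R", "-LTR3")
template = liberate_template(example)
```

In this toy model the liberated template keeps only the homology arms and payload, which mirrors the statement that processing removes the LTR sequences and other viral cis elements.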

Fig. 2: Engineering CLIP to enhance HDR in K562 cells.

a, The CLIP methodology uses IDLVs to deliver a CLIP donor (top). The donor is reverse transcribed for 24 h. Cas9 RNP is then electroporated into cells, simultaneously cutting the cell’s genome and liberating the donor from the viral backbone (middle), enabling HDR (bottom). b, Design of the CLIP donor to integrate GFP-P2A directly upstream of the N-terminal methionine of ACTB. c, CLIP strategy compared with IDLV (same as CLIP but without inclusion of cut sites) with and without RNP 3, 5 and 7 days after electroporation. The high expression peak is indicative of a targeted knock-in while the low expression peak is indicative of transient episomal expression of the reverse-transcribed viral genome. d, Knock-in efficiency at day 7 of CLIP is nearly double that of IDLV; both are shown with and without Cas9 RNP (**P = 0.0058, two-tailed Student’s t-test). e, CLIP donors were titred to change the episomal expression of CLIP donors before RNP electroporation, which predicted future knock-in efficiencies at day 7 (log-linear fit, R² = 0.977; error bars are standard error of the mean). Figures are two biological replicates each with two technical replicates.


We first used the endogenous actin-β (ACTB) genomic site as a knock-in strategy to test CLIP27. ACTB is a highly expressed endogenous gene and an essential component of the cellular cytoskeleton. We transduced human myelogenous leukaemia K562 cells using a CLIP donor that contains a green fluorescent protein (GFP)-P2A payload, HAs that place the transgene directly upstream of the ACTB start codon, and cut sites recognized by Cas9 complexed with sgACTB (Fig. 2b). Twenty-four hours later, Cas9 and chemically synthesized sgACTB were complexed and electroporated into cells. Cells were followed over a time course via flow cytometry at 3, 5 and 7 days post-electroporation and compared with cells delivered with an IDLV donor without the cut sites or cells that received the donors but without the Cas9 RNP (Fig. 2c and Extended Data Fig. 1).

Both the CLIP and IDLV donors with Cas9 RNP created high GFP expression that was stable over time, indicating a targeted ACTB knock-in. Inclusion of cut sites in the CLIP donor dramatically increased the percentage of K562 cells that integrated GFP into the ACTB locus, doubling the knock-in efficiency when compared with the IDLV donor (Fig. 2d and Supplementary Fig. 1). We also observed a second, lower GFP expression population with the IDLV donor plus Cas9 RNP and with either donor without Cas9 RNP. This is probably due to the known ability of the pseudovirus to generate episomal DNA and create transient expression of delivered payloads that dilutes over time26. In comparison, CLIP donors with Cas9 RNP dramatically reduced episomal expression. Since episomal expression relies on a fully intact viral genome, this observation indicates that cut sites enable rapid processing of donor DNA.

Previous HDR methods demonstrated that the addition of Cas9 binding sites on the HDR donor increases knock-in efficiency by facilitating its transport into the nucleus24. To determine whether Cas9-mediated shuttling of the HDR donor was responsible for the increased efficiency of CLIP, we generated a CLIP donor with protospacers 14 base pairs long, allowing Cas9 to bind but not cut the donor (Supplementary Fig. 2)28. We found no increase in knock-in efficiency with these truncated cut sites when compared with an IDLV donor, suggesting that processing and linearization of the donor are responsible for the increased efficiency29.

On-target integration of the donor via non-homologous end joining (NHEJ) or residual integration capacity of IDLV could account for some of the GFP+ population30,31. To determine if either CLIP- or IDLV-based donors created these unintended events, we sorted the GFP+ K562 population from both donors and performed genomic PCR to determine if any non-HDR-mediated integration occurred (Supplementary Fig. 3). Neither on-target NHEJ nor residual lentiviral integration was apparent using either donor.

Given that the CLIP donor expresses GFP before Cas9 RNP introduction, we asked whether the degree of this expression is indicative of the donor concentration and thus predictive of the knock-in efficiency. To do this, we serially diluted the CLIP pseudovirus and quantified GFP expression in the K562 cells before electroporation of Cas9 RNP. After 7 days, knock-in efficiency was quantified and correlated to the degree of episomal expression before electroporation (Fig. 2e). There was a high degree of log-linear correlation (R² = 0.977) between these two measures, indicating that CLIP efficiency is dependent on the pseudovirus titre and that knock-in efficiency can be predicted a priori, before delivery of Cas9 RNP.
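
As a rough illustration of how such a log-linear relationship could be fitted and used for prediction, the sketch below regresses knock-in efficiency on the base-10 logarithm of episomal GFP expression by ordinary least squares. This is an assumption about the form of the fit, since the text does not specify which axis is log-transformed, and the data values and function names are hypothetical placeholders, not measurements from this study.

```python
import math


def fit_log_linear(expression, efficiency):
    """Least-squares fit of efficiency = intercept + slope * log10(expression).

    Returns (slope, intercept, r_squared).
    """
    log_x = [math.log10(v) for v in expression]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(efficiency) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    syy = sum((y - mean_y) ** 2 for y in efficiency)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, efficiency))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


def predict_efficiency(slope, intercept, expression):
    """Predict knock-in efficiency (%) from pre-electroporation expression."""
    return intercept + slope * math.log10(expression)


# Hypothetical titration: episomal GFP (arbitrary units) vs knock-in (%).
hypothetical_gfp = [50.0, 200.0, 800.0, 3200.0]
hypothetical_knock_in = [5.0, 12.0, 18.0, 26.0]
slope, intercept, r_squared = fit_log_linear(hypothetical_gfp, hypothetical_knock_in)
```

With a fit of this kind in hand, a pre-electroporation GFP reading could be passed to `predict_efficiency` to estimate the day-7 knock-in efficiency before committing Cas9 RNP.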

A single CLIP donor delivers two payloads at different loci

Given CLIP’s large packaging limit (~9 kb) and ability to process the HDR template, we reasoned that CLIP could generate multiple knock-ins simultaneously at distinct locations using a single donor. To test this, we cloned a pseudoviral genome that contained a GFP-P2A payload surrounded by HAs to ACTB followed by an mCherry-P2A payload surrounded by HAs to RAB11A (Fig. 3a)23. The RAB11A template has flanking cut sites that could be cut by Cas9-sgRAB11A, while the ACTB template has cut sites for Cas9-sgACTB.

Fig. 3: CLIP enables the simultaneous knock-ins of multiple payloads.

a, A CLIP donor to knock mCherry into the RAB11A locus and GFP into the ACTB locus off a single pseudoviral genome. b, Targeted knock-in of GFP and mCherry from the CLIP double donor was dependent on the presence of its corresponding sgRNA. c, Knock-in efficiencies of GFP into the ACTB locus or mCherry into the RAB11A locus. Numbers indicate mean efficiency of each population; error bars are standard error of the mean (n = 4). Figures are two biological replicates each with two technical replicates.


We delivered the designed CLIP donor to K562 cells under multiple conditions: no RNP, RNP containing either sgACTB or sgRAB11A, or RNP containing both guide RNAs (Fig. 3b and Supplementary Fig. 4). CLIP with one RNP generated knock-ins only at its corresponding locus, while RNPs containing both single-guide RNAs (sgRNAs) enabled both knock-ins in single cells (Fig. 3c). The percentage of double knock-in cells was approximately the product of the knock-in efficiencies of the two single-RNP controls.
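
The multiplicative relationship is what would be expected if the two knock-ins occur independently in each cell. A short worked example, using hypothetical efficiencies rather than values from Fig. 3c and a hypothetical function name, makes the arithmetic explicit.

```python
def expected_double_knock_in(percent_a: float, percent_b: float) -> float:
    """Expected double knock-in (%) if the two single knock-ins are independent."""
    return (percent_a / 100.0) * (percent_b / 100.0) * 100.0


# Hypothetical single-RNP efficiencies (not data from this study).
hypothetical_actb = 40.0     # % GFP+ with sgACTB alone
hypothetical_rab11a = 30.0   # % mCherry+ with sgRAB11A alone
expected = expected_double_knock_in(hypothetical_actb, hypothetical_rab11a)
# Under independence, roughly 12% of cells would carry both knock-ins.
```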

CLIP is a robust and low-toxic knock-in method in primary T cells

We hypothesized CLIP can be adopted as a non-toxic knock-in method in human primary T cells for cell therapy as repurposed lentivirus efficiently delivers genetic payloads to this cell type and avoids direct delivery of naked DNA. To test this, we performed CLIP in CD3+ T cells (Fig. 4a and Methods). We used the a priori predictive power of CLIP to define a multiplicity of infection (MOI) of 1,000 as the titre that gives the highest GFP expression within what is practical to manufacture (Fig. 4b). Using the ACTB HDR template, we observed a high GFP expression population indicative of targeted knock-in into this essential gene (Fig. 4c and Supplementary Fig. 5a). Importantly, the knock-in efficiency using CLIP was consistent across three different donors averaging between 16% and 23% (Fig. 4d and Supplementary Fig. 6).

Fig. 4: CLIP is a non-toxic method for targeted essential gene knock-ins in human primary T cells.

a, Timeline for performing CLIP in primary T cells. b, Median episomal GFP expression from CLIP donors before introduction of RNPs at different MOI indicates that an MOI of 1,000 will optimize knock-in efficiency within what is manufacturable (bar indicates mean). c,d, Representative histogram (c) and knock-in efficiencies (d) of GFP into ACTB locus in primary T cells 7 days after electroporation across three donors (each with a technical replicate). e,f, Representative histogram (e) and knock-in efficiencies (f) of GFP into IL2RG locus in primary T cells 7 days after electroporation across three donors (each with a technical replicate). g, Viability of plasmid-based and PCR-based DNA donor knock-ins compared with CLIP at day 4 demonstrated high toxicity to naked DNA. Grey indicates viability (mean ± standard error of the mean) of wild-type cells across three donors (each with a technical replicate). h, Representative images of primary T cells after undergoing CLIP, plasmid-based or PCR-based knock-in strategies demonstrate low viability of naked DNA-based methods compared with CLIP after 24 h (top) and only PCR-based and CLIP show any GFP+ cells at day 4 (bottom) (scale bar, 300 µm).


Deviating from optimized protocols reduced the efficiency of previous knock-in methods23. To test how sensitive CLIP is to protocol change, we altered transduction timing and the number of primary T cells electroporated to see its effect on knock-in efficiency. Adding CLIP pseudovirus 2 days before electroporation instead of 1 day or using fewer cells in the electroporation reaction did not change HDR efficiency, indicating CLIP is robust to changes in methodology (Extended Data Fig. 2).

We next sought to adapt CLIP to target another essential gene and chose IL2RG as its expression is essential to primary T cells but lower than ACTB. IL2RG encodes a vital cytokine receptor subunit for signalling by many interleukins including the necessary proliferation signal interleukin (IL)-2. We designed an IL2RG-specific CLIP donor by placing payloads and HAs directly upstream of the N-terminal methionine of the endogenous gene32. Performing CLIP with an IL2RG HDR template created expression indicative of GFP under the regulation of this locus (Fig. 4e and Supplementary Fig. 5b). The knock-in efficiency was also robust and consistent across multiple donors, with an average efficiency between 18% and 30% (Fig. 4f).

CLIP’s payload capacity (~9 kb) enables delivery of large payloads compared with alternative HDR methods using AAV (<4.7 kb). While methods that use plasmid DNA or linearized double-stranded DNA (for example, PCR products) could theoretically allow for large payload knock-in, they also show substantial cytotoxicity in primary T cells33. We compared CLIP with non-viral methods that have been optimized to reduce DNA-derived cytotoxicity in primary T cells using the ACTB HDR template23. Four days post-electroporation, CLIP edited cells had a viability close to that of wild-type cells experiencing the same culture conditions, while plasmid and PCR DNA edited cells demonstrated substantial cell death (Fig. 4g). This naked DNA cytotoxicity could not be fully rescued by changing the basal medium conditions from RPMI 1640 to X-VIVO 15 while CLIP remained non-toxic, further demonstrating that CLIP is robust to changes in methodology (Extended Data Fig. 3a). This low toxicity is a feature of the pseudovirus as IDLV donor (without cut sites) knock-ins into primary T cells also demonstrated high viability (Extended Data Fig. 3b). This DNA mediated cytotoxicity was apparent as early as 24 h post-electroporation (Fig. 4h). Optimized PCR-based methods generated a small portion of cells with the desired knock-in while plasmid-based methods did not.

Functional and stable expression of large CRISPR payloads

We next tested whether CLIP could be used to stabilize expression of genes that are larger and more complex than fluorescent proteins. To do this we first created CLIP methods to introduce Cas13d from Ruminococcus flavefaciens into the ACTB or IL2RG essential genes34. Cas13d is an RNA-guided, RNA targeting nuclease with a coding gene size ~2.9 kb, which could be used to generate stable therapeutic transcriptional programmes in primary T cells through targeted perturbation of specific transcripts.

To do this, we generated CLIP donors that encoded mCherry-P2A-Cas13d-P2A (~4.7 kb from cut site to cut site) and integrated the payload into both the ACTB and IL2RG loci in K562 cells (Fig. 5a). While ACTB created Cas13d expression, only a small amount of expression was observed in the IL2RG locus (Fig. 5b and Extended Data Fig. 4a). This led to discrepancies in the apparent knock-in efficiencies between the two loci as well as the magnitude of expression in Cas13d+ cells (Extended Data Fig. 4b). However, genetic analysis demonstrated that the payload was able to integrate for both knock-ins, indicating that IL2RG expression is probably too low for this protein to be measured with flow cytometry (Extended Data Fig. 4c). This was consistent with the previous data (Supplementary Fig. 3), where primary T cells expressed GFP nearly 2.5 orders of magnitude higher when knocking into the ACTB locus as compared with the IL2RG locus (Extended Data Fig. 5a).

Fig. 5: CLIP enables knock-in of large CRISPR–Cas payloads for the stable expression in K562 and primary T cells.

a, Design of CLIP donors that place mCherry-P2A-RfxCas13d-P2A directly upstream of the N-terminal methionine of ACTB or IL2RG. b, CLIP-mediated knock-in of mCherry-P2A-Cas13d-P2A into two locations in K562s demonstrated that only the highly expressed ACTB locus creates differentiable expression of larger transgenes. c, Surface CD46 protein in K562 cells expressing Cas13d from the ACTB locus alone or in the presence of a non-targeting guide (crNT) or a guide targeting CD46 RNA (crCD46) demonstrates Cas13d is fully functional. d, Stability of mCherry-P2A-Cas13d in primary T cells compared between CLIP in the ACTB locus and randomly integrating lentiviruses driven by an EF1α or SFFV promoter. Knock-in or transduction efficiency was normalized to the efficiency at day 3 (error bars represent standard error of the mean). e, Design of CLIP donor that places hyperdCas12a-miniVPR (CRISPRa) directly upstream of the N-terminal methionine of ACTB. f, CRISPRa knock-in efficiency in the ACTB locus of K562 cells 2 days post-electroporation. g, CRISPRa+ cells were sorted on day 3 and assayed weekly for a month to demonstrate that CLIP stabilizes hard to express genetic payloads. h, Payloads stabilized by CLIP are still functional even after 31 days as demonstrated by surface CD2 upregulation in the presence of a guide targeting CD2 (crCD2) but not with a non-targeting guide (crNT) or the CRISPRa protein alone. Figures are two biological replicates each with two technical replicates.


K562 cells with Cas13d expressed from the ACTB locus were sorted and Cas13d-cognate CRISPR RNA (crRNA) targeting CD46 (crCD46) or a non-targeting guide (crNT) were introduced via lentivirus35. Consistent with previous reports on Cas13d-mediated transcript knockdown, we observed a nearly 50% reduction of the surface CD46 protein level 3 days after crRNA transduction, which demonstrates that CLIP-stabilized proteins are functionally active in edited cells (Fig. 5c).

We further compared CLIP-mediated Cas13d knock-in into the ACTB locus of primary T cells with traditional lentiviral integration for large transgene expression. For this comparison, we tested two lentiviral vectors that encoded Cas13d expression under two different promoters, an SFFV promoter that is of viral origin and an EF1α promoter that is of human origin36. Edited primary T cells with Cas13d knocked into the ACTB locus showed stable Cas13d expression over time, although the knock-in was less efficient than lentiviral transduction (Fig. 5d and Extended Data Fig. 5b). In contrast, the number of cells expressing Cas13d using lentivirus showed drastic silencing, with about 25% of EF1α-driven Cas13d+ cells and about 50% of SFFV-driven Cas13d+ cells experiencing silencing in 15 days. This effect will probably be exacerbated in engineered cell therapies that are not experiencing the optimal growth conditions used here.
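
The stability comparison relies on normalizing each time course to its own day-3 efficiency, so that methods with different starting efficiencies can be compared on one scale. A minimal sketch of that normalization follows; the function names and the time-course values are hypothetical and chosen only to illustrate a 25% loss.

```python
def normalize_to_day3(timecourse: dict) -> dict:
    """Divide each day's % positive cells by the day-3 value."""
    baseline = timecourse[3]
    if baseline <= 0:
        raise ValueError("day-3 efficiency must be positive")
    return {day: value / baseline for day, value in timecourse.items()}


def fraction_silenced(timecourse: dict, final_day: int) -> float:
    """Fraction of day-3 positive cells no longer positive at final_day."""
    return 1.0 - normalize_to_day3(timecourse)[final_day]


# Hypothetical % payload-positive cells by day (not data from this study).
hypothetical_lentivirus = {3: 40.0, 7: 36.0, 15: 30.0}
normalized = normalize_to_day3(hypothetical_lentivirus)
lost = fraction_silenced(hypothetical_lentivirus, 15)
```

In this toy time course the normalized value falls from 1.0 at day 3 to 0.75 at day 15, which corresponds to a quarter of the initially positive cells being silenced.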

We next created CLIP methods to introduce a highly efficient CRISPRa molecule (hyperdCas12a-miniVPR) into the ACTB locus, which could be used for long-term, targeted gene upregulation of therapeutic programmes in T cells37. The CLIP donor encodes hyperdCas12a-miniVPR-T2A-GFP-P2A flanked by HAs to ACTB and sgACTB cut sites (Fig. 5e). The donor is ~6.4 kb from cut site to cut site, making it too large to fit into AAV-based donors. Additionally, we generated lentiviral control constructs that express hyperdCas12a-miniVPR under the control of an SFFV or EF1α promoter. While the control constructs expressed the CRISPRa protein when transfected into HEK293T cells, no expression was observed after 2 days when delivered to K562 cells as lentivirus, indicating strong silencing (Extended Data Fig. 6).

In contrast, hyperdCas12a-miniVPR was efficiently expressed when introduced into the ACTB locus of K562 cells 2 days after electroporation (Fig. 5f). At day 3 post-electroporation, hyperdCas12a-miniVPR+ cells were sorted and measured every week until day 31 (Fig. 5g). CRISPRa protein was stably expressed for the entire month, demonstrating CLIP’s ability to stabilize hard-to-express and easily silenced transgenes. To ensure the protein was still functional even at this later timepoint, we introduced Cas12a-cognate crRNA targeting CD2 (crCD2) or a non-targeting crRNA (crNT) into sorted cells and measured its ability to upregulate CD2 at day 31 via flow cytometry (Fig. 5h). Cells carrying both hyperdCas12a-miniVPR and crCD2 had a 100-fold increase in CD2 expression, indicating that the stabilized protein was still functional even after 31 days.

Long-term expression of multiple antigens in primary cells

We next tested whether CLIP could be used to stably express antigens in primary cells to study antigen presentation. The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic triggered a major need to understand how different human leukocyte antigen (HLA) types present viral antigens to elucidate the interactions between the virus, infected cells and immune cells across different genetic backgrounds. We asked whether CLIP could stably and strongly express multiple antigens in human primary T cells to generate complexes of SARS-CoV-2 peptides with major histocompatibility complex class I (MHC-I).

To examine whether CLIP enables generation of primary T cells stably expressing SARS-CoV-2 proteins, we created a polypeptide cassette containing the S1 fragment of the Spike protein (~2 kb), a conserved region of the RNA-dependent RNA polymerase (RdRP, ~1.9 kb)38 and GFP (~0.7 kb) (Fig. 6a). The total length of the cassette is ~5.8 kb, making it too large for AAV-based methods. We compared stable integration of this cassette into the ACTB locus using CLIP with both EF1α-driven and SFFV-driven lentiviruses and followed them for 15 days (Fig. 6b). Interestingly, 3 days after transduction, we found that SFFV-driven payload expression was almost non-functional due to rapid silencing while the EF1α-driven lentivirus generated a small tail of low-expression S1-RdRP+ cells. However, CLIP generated a population with more than three times higher expression at day 3 post-electroporation (Fig. 6c). Importantly, CLIP created a distinct and highly expressed population that remained stable over 15 days in terms of percentage of cells expressing the cassette and the level of expression, whereas the small EF1α-driven S1-RdRP+ population was further silenced over time (about 55% loss) (Fig. 6d and Extended Data Fig. 7a).
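
The size argument made for this cassette and for the CRISPRa donor can be expressed as a simple capacity check. The sketch below uses only the approximate figures quoted in the text (AAV limit <4.7 kb, CLIP limit ~9 kb, donors of ~6.4 kb and ~5.8 kb); the function and variable names are hypothetical. Treating the approximate CLIP limit as a hard threshold is a simplification, and the check ignores how each vector's own elements count towards its limit.

```python
AAV_LIMIT_KB = 4.7   # from the text: AAV packaging limit <4.7 kb
CLIP_LIMIT_KB = 9.0  # from the text: CLIP packaging limit ~9 kb (approximate)


def fits(size_kb: float, limit_kb: float) -> bool:
    """True if a donor of size_kb is below the packaging limit."""
    return size_kb < limit_kb


# Approximate donor sizes quoted in the text.
donors_kb = {
    "hyperdCas12a-miniVPR donor": 6.4,
    "S1-RdRP-GFP cassette": 5.8,
}

capacity_report = {
    name: {"AAV": fits(size, AAV_LIMIT_KB), "CLIP": fits(size, CLIP_LIMIT_KB)}
    for name, size in donors_kb.items()
}
```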

Fig. 6: CLIP enables proof-of-principle engineered cell vaccines with knock-in of multiple SARS-CoV-2 derived genes for antigen display.

a, Design of CLIP donor that places the S1 domain of SARS-CoV-2 Spike protein, a fragment of SARS-CoV-2 RdRP that is conserved across coronaviruses, and GFP, all separated by 2A tags, directly upstream of N-terminal methionine of ACTB. b, CLIP mediated stable expression of the transgene in primary T cells over a 15 day time course, whereas lentivirus-mediated random integration with EF1α created a small population of low-expression cells that silenced over time. SFFV-driven expression did not generate any transduced cells at day 3 and beyond. c, Even before silencing (day 3), EF1α-driven expression had lower magnitude than CLIP (**P = 0.0057, two-tailed Student’s t-test). d, Stability of S1-RdRP+ for both CLIP- and EF1α-driven lentivirus expression. Knock-in or transduction efficiency was normalized to the efficiency at day 3 (error bars represent standard error of the mean). e, Jurkats expressing CLIP-mediated integration of the SARS-CoV-2 protein cassette were subjected to immunopeptidomics. Two peptides were statistically found compared with wild type and predicted for HLA-A*03 affinity in NetMHC 4.0 (cut-off of 0.05 for strong binder). Figures are two biological replicates each with two technical replicates.


To determine whether CLIP-delivered S1-RdRP could generate MHC-I displayed immunopeptides, we generated and sorted a Jurkat T-cell line that expressed the S1-RdRP polypeptide gene integrated into the ACTB locus (Extended Data Fig. 7b). Immunopeptidomics was performed via mass spectrometry from immunoprecipitated peptide MHC-I. Notably, this assay identified two peptides in the RdRP protein that were also predicted to be strong binders of HLA-A*03, the HLA haplotype of Jurkat T cells (Fig. 6e).


Mammalian synthetic biology requires the efficient and facile integration of large gene payloads for stable expression in cells. We developed and characterized CLIP as an efficient, easy-to-implement and versatile pseudoviral method to stabilize large and difficult-to-express transgenes in immortalized and primary human cells. We engineered multiple enhancements into clinically useful lentivirus systems to build CLIP, including the use of an IDLV strain and the incorporation into the pseudoviral genome of flanking cut sites recognized by the same CRISPR RNP, which dramatically increases knock-in efficiency. It is well known that the IDLV genome, after reverse transcription, undergoes HDR at the viral LTRs to create circular episomal DNA26. Processing of the HDR template via cleavage of the cut sites by CRISPR RNP not only removes unneeded viral backbone elements but also linearizes the DNA, both of which improve knock-in efficiency. Additionally, while previous reports have shown that IDLV can have residual viral integration activity31 or HDR donors can experience on-target NHEJ30, we were unable to detect either of these unintended events using either CLIP or IDLV donors in the knock-in positive fraction (GFP+ population).

In comparison with other described HDR technologies, CLIP has several advantages. For example, whereas AAV-based HDR donors have been described in multiple cell types, the size of the AAV payload is limiting and often not amenable to large payload knock-in or double knock-ins off a single donor. While non-viral methods, which in theory can deliver larger payloads than AAV, have undergone many optimizations to improve knock-in efficiency and reduce toxicity, in our hands they still caused extremely low viability in primary T cells, consistent with previous reports15. Compared with these methods, CLIP balances the need to knock in large or multiple payloads while limiting cytotoxicity, as we showed nearly no loss of viability and robust efficiencies in multiple loci.

The predictability of CLIP knock-in efficiency before introduction of Cas9 RNP could become a useful aspect for manufacturing engineered cell therapies for quality control. Since the vast expense of the therapy as well as some of the clinical outcome is dependent on the quantity of the engineered cells that are manufactured, a priori knowledge of the integration efficiency could be leveraged to save valuable time and resources.

In CLIP, we further chose essential genes that are highly or moderately expressed as the integration loci for long-term stable transgene expression. For example, we showed that ACTB-integrated payloads are consistently highly expressed across multiple tested cell lines or primary cells, whereas IL2RG-integrated payloads usually exhibit low expression. Low expression is not necessarily a drawback, as the expression of toxic payloads sometimes needs to be reduced. To increase the magnitude of expression, other essential gene loci such as GAPDH, cytoskeletal proteins or ribosomal elements may be used. It is also conceivable that, by identifying and choosing cell-type-specific loci, we could use CLIP to achieve highly specific payload expression that increases the safety of engineered cells.

Delivery of Cas13d into primary T cells demonstrated that any payload introduced using traditional lentivirus can undergo fluctuations in percentage of cells expressing the payload. Integration of the protein into an essential locus stabilized both aspects albeit with lower initial expression and lower percentage than lentivirus under our culture conditions. However, CLIP can be further optimized to improve transduction efficiency using a myriad of strategies employed by other HDR methodologies such as introducing NHEJ inhibitors or HDR enhancers. As both are being heavily optimized, we believe CLIP can be easily combined with these advances to provide a solution for the long-term gene expression in primary T-cell manufacture that is necessary for future cell therapies.

Finally, as a proof-of-concept demonstration, we used CLIP to express very large (~6 kb) transgene cassettes. Aggressive silencing of CRISPRa proteins inhibits their widespread adoption into cell therapies. The stable expression of hyperdCas12a-miniVPR in cell therapies enables the complex control of a transcriptional programme that increases the therapeutic potential of the cell. Additionally, many of the proteins expressed by SARS-CoV-2 are difficult to express and even toxic, hindering their dissection in primary cell types. Stable expression of viral proteins in easy-to-access blood cells like T cells would allow us to study viral peptide-MHC-I presentation in a variety of HLA alleles, creating valuable information for vaccine design and epidemiology. For our cassette encoding two SARS-CoV-2 genes, CLIP maintains a substantial number of primary T cells expressing the payloads at high levels over a long period. We show that CLIP-engineered cells can generate MHC-I displayed peptides from the introduced transgenes, thus offering a useful tool to study the immunopathology of viruses that have difficult-to-express genes. While we were only able to identify peptides from the RdRP protein, we hypothesize that the S1 protein was exported from the cells too quickly to be captured efficiently by the proteasome and therefore was not measurable by immunopeptidomic analysis.

As cell-based vaccines become a promising approach towards cancer, CLIP could be applied to antigen-presenting cells or other cell types to generate cell-based vaccines or adjuvant vector cells against viruses such as SARS-CoV-2. Furthermore, CLIP can stabilize the expression of difficult-to-express proteins, such as Cas13 or dCas12a, that will be essential to generate positive therapeutic outcomes. We envision CLIP as a broadly applicable and clinically relevant method for cell manufacturing for cell therapies.


Plasmid cloning

Standard molecular cloning techniques were used to assemble constructs in this paper. ACTB HAs (87425) and RAB11A HAs (112013) were obtained from Addgene. The IL2RG HAs and turboGFP were received as a generous gift from Dr Matthew Porteus. SARS-CoV-2 S1 was synthesized using gBlocks (Integrated DNA Technologies), and the SARS-CoV-2 RdRP fragment as well as RfxCas13d were amplified from previously published clones38. Cut sites were generated through PCR amplification of HDR templates using extended oligos. HAs and payload sequences can be found in Supplementary Note 1.

Cell line culture

HEK293T cells (Clontech) were cultured in DMEM + GlutaMAX (Thermo Fisher) supplemented with 10% fetal bovine serum (FBS; Alstem) and 100 U ml−1 of penicillin and streptomycin (Life Technologies). K562 and Jurkat cells (ATCC) were cultured in RPMI 1640 + HEPES + l-glutamine (Thermo Fisher) supplemented with 10% FBS (Alstem) and 100 U ml−1 of penicillin and streptomycin (Life Technologies). Cells were maintained at 37 °C and 5% CO2 and passaged using standard cell culture techniques. HEK293T cells were subcultured every 2–3 days at a concentration of 1 × 10⁵ cells ml−1. K562 cells and Jurkat cells were maintained between 1 × 10⁵ and 1 × 10⁶ cells ml−1. Cells were not tested for mycoplasma contamination.

Primary T-cell isolation and culture

Buffy coats from de-identified human donors were received from the Stanford Blood Center. Peripheral blood mononuclear cells were isolated using Ficoll centrifugation with SepMate tubes (STEMCELL, per manufacturer’s instructions). Untouched bulk T cells were isolated by EasySep immunomagnetic negative separation (STEMCELL, per manufacturer’s instructions). T cells were cryopreserved for future use.

T cells were thawed and cultured in RPMI 1640 + HEPES + l-glutamine (Thermo Fisher) supplemented with 10% FBS (Alstem), 100 U ml−1 of penicillin and streptomycin (Life Technologies), 200 U ml−1 of IL-2 (STEMCELL), 5 ng ml−1 of IL-7 (STEMCELL), 5 ng ml−1 of IL-15 (STEMCELL) and CD3/CD28 magnetic Dynabeads (Thermo Fisher) at a 1:1 bead-to-cell ratio. Beads were removed by magnetic separation at 48 h, and IL-2 was adjusted to 500 U ml−1. Primary T cells were maintained at around 1 × 10⁶ cells ml−1 and 500 U ml−1 of IL-2 by addition of fresh medium every 2–3 days. For determining the effects of basal medium conditions on viability, X-VIVO 15 (Lonza) supplemented with 5% FBS (Alstem), 50 µM 2-mercaptoethanol and 10 µM N-acetyl-l-cysteine was used with the same cytokine and bead conditions as described above.
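The routine density adjustment described above reduces to a simple dilution calculation. The following is a minimal sketch rather than part of the original protocol: the function name `medium_to_add_ml` and the example count are hypothetical, and only the target density of 1 × 10⁶ cells per ml is taken from the text.

```python
def medium_to_add_ml(count_per_ml, culture_ml, target_per_ml=1e6):
    """Volume of fresh medium (ml) needed to dilute a culture to the target density.

    count_per_ml  : measured cell density (cells per ml)
    culture_ml    : current culture volume (ml)
    target_per_ml : desired density; 1e6 cells per ml as stated in the text
    """
    if count_per_ml <= target_per_ml:
        # Already at or below the target: no dilution needed.
        return 0.0
    # Total cells are conserved, so final volume = culture_ml * count / target.
    return culture_ml * (count_per_ml / target_per_ml - 1.0)


# Hypothetical example: 1 ml of culture counted at 2.5e6 cells per ml.
extra = medium_to_add_ml(2.5e6, 1.0)
print(f"add {extra:.2f} ml of fresh medium")
```

Because IL-2 is held at 500 U ml−1, the added medium would presumably carry IL-2 at that concentration so that the cytokine level is unchanged by the dilution.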

Generation of lentivirus and IDLV

A D64V mutation in the lentiviral integrase was cloned into the pCMV-R8.91 vector to create pCMV-R8.91-IND64V. To create virus, 3 × 10⁵ or 4.5 × 10⁶ HEK293T cells were seeded into a well of a six-well plate or a 15 cm dish, respectively. Twenty-four hours later, 1.51 µg or 30.2 µg of pHR plasmid, 1.32 µg or 26.4 µg of pCMV-R8.91 or pCMV-R8.91-IND64V, and 165 ng or 3.3 µg of pMD2.G were mixed into 250 µl or 5 ml of Opti-MEM, followed by addition of 7.5 µl or 150 µl of TransIT-LT1 reagent (Mirus Bio). Mixtures were incubated for 15–30 min and added to the seeded six-well plate or 15 cm dish, respectively. Seventy-two hours after transfection, supernatant was collected and filtered through a 0.45 µm filter, and 5X lentivirus precipitation solution (Alstem) was added. Viral mix was incubated at 4 °C for 24 h, and lentivirus was isolated per the manufacturer’s instructions. Lentivirus was resuspended in appropriate media and titred using a qPCR lentiviral titre kit (ABM).
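The two transfection formats above differ by a constant factor: every reagent amount for the 15 cm dish is 20 times the six-well amount. The sketch below illustrates that arithmetic only; the dictionary keys, the helper `scale_mix` and the constant `SCALE_15CM` are hypothetical names introduced here, and the numbers are those given in the text.

```python
# Six-well amounts as stated in the text (DNA in micrograms, liquids in microlitres).
SIX_WELL = {
    "transfer plasmid pHR (ug)": 1.51,
    "packaging plasmid pCMV-R8.91 or -IND64V (ug)": 1.32,
    "envelope plasmid (ug)": 0.165,
    "Opti-MEM (ul)": 250.0,
    "TransIT-LT1 (ul)": 7.5,
}

# Assumption for illustration: the 15 cm amounts in the text are 20x the six-well amounts.
SCALE_15CM = 20.0


def scale_mix(base, factor):
    """Return a new dict with every reagent amount multiplied by `factor`."""
    return {name: round(amount * factor, 3) for name, amount in base.items()}


dish_15cm = scale_mix(SIX_WELL, SCALE_15CM)

# Reagent-to-DNA ratio, which is the same in both formats.
total_dna_ug = sum(v for k, v in SIX_WELL.items() if k.endswith("(ug)"))
reagent_per_ug = SIX_WELL["TransIT-LT1 (ul)"] / total_dna_ug

for name, amount in dish_15cm.items():
    print(f"{name}: {amount}")
print(f"TransIT-LT1 per ug DNA: {reagent_per_ug:.2f} ul")
```

Note that the seeded cell numbers (3 × 10⁵ versus 4.5 × 10⁶) scale by 15 rather than 20, so the sketch deliberately leaves cell seeding out of the scaled quantities.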

Cas9 RNP generation

Cas9 RNP was generated by mixing 40 µM purified Cas9 HIFI (Integrated DNA Technologies) and 80 µM chemically modified sgRNA (Synthego) in PBS without Mg2+ or Ca2+ (Thermo Fisher). RNP was allowed to complex at room temperature for 10 min. All sgRNA sequences can be found in Supplementary Note 2.
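As a worked example of the stoichiometry, the sketch below converts the stated concentrations into molar amounts per electroporation. It assumes that 40 µM and 80 µM are the concentrations of Cas9 and sgRNA in the complexed mixture (the text does not say whether these are stock or final values) and uses the 1 µl and 2.5 µl RNP volumes given in the electroporation protocols; the function name `pmol_in_volume` is hypothetical.

```python
def pmol_in_volume(conc_um, volume_ul):
    """Amount in pmol, using 1 uM = 1 pmol per microlitre."""
    return conc_um * volume_ul


CAS9_UM = 40.0    # from the text; assumed to be the concentration in the RNP mix
SGRNA_UM = 80.0   # from the text; assumed to be the concentration in the RNP mix

# Molar ratio of sgRNA to Cas9 in the mixture.
molar_ratio = SGRNA_UM / CAS9_UM

# RNP volumes used per electroporation in the protocols of this section.
for label, volume_ul in (("cell lines", 1.0), ("primary T cells", 2.5)):
    cas9 = pmol_in_volume(CAS9_UM, volume_ul)
    sgrna = pmol_in_volume(SGRNA_UM, volume_ul)
    print(f"{label}: {cas9:.0f} pmol Cas9, {sgrna:.0f} pmol sgRNA")
print(f"sgRNA:Cas9 molar ratio = {molar_ratio:.0f}:1")
```

Under this assumption, the sgRNA is in twofold molar excess over Cas9 regardless of the volume delivered.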

CLIP electroporation

For Jurkat cells or K562 cells, 1 × 10⁵ cells were seeded into a well of a 96-well plate and CLIP donor (IDLV) was added to the well for a total of 200 µl. Twenty-four hours later, cells were pelleted and resuspended in 20 µl of Cell Line Nucleofector Solution SF or SE (Lonza) for K562s or Jurkats, respectively. Then, 1 µl of Cas9 RNP was added immediately after complexation and the cell mixture was moved into a Nucleocuvette strip (Lonza). Cells were electroporated in a Nucleofector X Unit (Lonza) using code CA-122 or CK-116 for K562s or Jurkats, respectively, and handled post-electroporation per the manufacturer’s instructions.

For primary T cells, 1 × 10⁶ cells were seeded into a well of a 48-well plate. Twenty-four hours after thaw, CLIP donor (IDLV) was added to the well for a total of 1 ml unless otherwise noted. Twenty-four hours later, cells were pelleted and resuspended in 20 µl of Primary Cell Nucleofector Solution P3 (Lonza). Then, 2.5 µl of Cas9 RNP was added immediately after complexation and the cell mixture was moved into a Nucleocuvette strip (Lonza). Cells were electroporated in a Nucleofector X Unit using code EH-115. Immediately after electroporation, 80 µl of pre-warmed medium was added directly to the Nucleocuvette well. Cells were rested for 10 min at 37 °C and 5% CO2 and then resuspended in a final volume of 1 ml of medium + 500 U ml−1 of IL-2.

PCR and plasmid electroporation

Plasmid-based HDR donors were generated using the pHR vector used in IDLV to generate CLIP donors. Plasmids were isolated from bacterial hosts using a Plasmid Midi Kit (Qiagen) with endotoxin removal. PCR-based HDR donors were generated from the pHR vector used in generating CLIP donors using primers CTGGGACTCAAGGCGCTAACT and CGATGGGGTACTTCAGGGTGAG and Kapa HIFI polymerase (Takara Biosystems). PCR products were isolated using SPRI (1.0×) purification. Both products were adjusted to 2 µg µl−1.

Two microlitres of PCR or plasmid HDR donor was first aliquoted into a well of a V-bottom plate. Next, 2.5 µl of Cas9 RNP was added to the well. After a 30 s incubation at room temperature, 1 × 10⁶ primary T cells were added; these cells had been grown for 48 h as described above, separated from the CD3/CD28 beads, pelleted and resuspended in 20 µl of Primary Cell Nucleofector Solution P3 (Lonza). The mixture was moved into a Nucleocuvette strip (Lonza). Cells were electroporated in a Nucleofector X Unit using code EH-115. Immediately after electroporation, 80 µl of pre-warmed medium was added directly to the Nucleocuvette well. Cells were rested for 10 min at 37 °C and 5% CO2 and then resuspended in a final volume of 1 ml of medium supplemented with 500 U ml−1 of IL-2 (STEMCELL).

Flow cytometry and cell sorting

Cells were mixed via pipetting to create a single-cell suspension. For flow cytometry, cells were analysed for fluorescence using a CytoFLEX S flow cytometer (Beckman Coulter). A total of 1 × 10⁴ live single cells were collected for each sample and analysed using FlowJo. Median fluorescent intensities or percentages of gated cells were compared using a two-sided t-test in GraphPad Prism 9.3.1. For sorting, cells were sorted using a SH800S cell sorter (Sony) using fluorescent markers to gate. For Cas13d knockdown experiments, mCherry (Cas13d) K562s were sorted. For CRISPRa upregulation experiments, GFP (hyperdCas12a-miniVPR) K562s were sorted. For immunopeptidomics, GFP (S1-RdRP)-positive Jurkats were sorted.

Genomic DNA PCR

Genomic DNA from K562 cells was isolated using a DNeasy Blood and Tissue Kit (Qiagen). For determining unintended NHEJ prevalence, primers were designed such that a forward primer annealed to GFP and a reverse primer annealed to ACTB exon 1. For determining unintended lentiviral integration, primers were designed such that a forward primer bound to the lentiviral backbone (the central polypurine tract) and the reverse to GFP. For Cas13d experiments, primers were designed such that a forward primer annealed to the integrated Cas13d and the reverse annealed to either IL2RG or ACTB, downstream of the HA. For unintended NHEJ or viral integration, reactions were run at 40 cycles to increase sensitivity. For unintended NHEJ, extension was run at 5 min to facilitate amplification of longer sequences. Primers can be found in Supplementary Note 2.

Quantification and imaging of viability

Primary T cells that underwent targeted knock-in via CLIP or using plasmid- or PCR-based donor templates were cultured for 4 days post-nucleofection in media supplemented with 500 U ml−1 of IL-2 (STEMCELL). Cells were counted every 2–3 days and adjusted to 10⁶ cells ml−1. At 4 days, cells were imaged in both the phase and FITC channel using an EVOS M5000 Cell Imaging System (Thermo Fisher). Cells were then stained for viability using LIVE/DEAD Fixable Red Dead Cell Stain Kit (Thermo Fisher) following the manufacturer’s protocols and assayed via flow cytometry.

Quantification of Cas13d knockdown and hyperdCas12a-miniVPR upregulation efficiency

K562s underwent CLIP to integrate Cas13d or hyperdCas12a-miniVPR into the ACTB locus as described above. Cas13d+ or hyperdCas12a-miniVPR+ cells were then sorted and lentivirus containing a puromycin selection marker and a U6-driven Cas13d-cognate crCD46 or crNT or a Cas12a-cognate crCD2 or crNT was introduced. Double-positive cells were selected using 1 µg ml−1 of puromycin. Cells were washed twice, stained using an APC-conjugated CD46 antibody (BioLegend, clone MEM-258) or an APC-conjugated CD2 antibody (BioLegend, clone TS1/8), then washed twice. CD46 or CD2 expression was quantified using the median APC intensity of live, single cells as measured by flow cytometry. crRNA sequences can be found in Supplementary Note 2.


Immunopeptidomics

Sorted, GFP+ (S1-RdRP+) Jurkats and control, unedited Jurkats were expanded to 500 × 10⁶ cells, snap frozen in liquid nitrogen and submitted to Cayman Chemical for MHC-I immunopeptidome profiling. Frozen pellets were lysed, human MHC class I was immunoprecipitated using immunoaffinity resin and peptides were eluted. Capture of MHC class I and peptide removal were validated using enzyme-linked immunosorbent assay. Peptides were concentrated and desalted using solid-phase extraction with a Waters µHLB C18 plate. Peptides were loaded directly and eluted using 80/20 acetonitrile/water (0.1% trifluoroacetic acid). Eluted peptides were lyophilized and reconstituted in 0.1% trifluoroacetic acid.

Peptides were analysed by nano liquid chromatography with tandem mass spectrometry using a Waters NanoAcquity system interfaced to a ThermoFisher Fusion Lumos mass spectrometer. Peptides were loaded on a trapping column and eluted over a 75 µm analytical column at 350 nl min−1; both columns were packed with Luna C18 resin (Phenomenex). A 2 h gradient was employed. The mass spectrometer was operated using a custom data-dependent method, with mass spectrometry performed in the Orbitrap at 60,000 full width at half-maximum resolution and sequential tandem mass spectrometry performed using high-resolution collision-induced dissociation and EThcD in the Orbitrap at 15,000 full width at half-maximum resolution. All mass spectrometry data were acquired from m/z 300–800. A 3 s cycle time was employed for all steps.

Raw files were searched using a local copy of PEAKS with the following parameters: Enzyme, None; Database, Swissprot Human + Custom Sequence; Fixed modification, None; Variable modifications, Oxidation (M) and Acetyl (Protein N-terminus); Mass values, Monoisotopic; Peptide Mass Tolerance, 10 ppm; Fragment Mass Tolerance, 0.02 Da; Max Missed Cleavages, N/A; PSM FDR, 1%; Chimeric peptide, TRUE. Peptides were further analysed for PTMs (PEAKS PTM) and mutations (SPIDER). The analysis generated a total of 5,255 peptides detected at the 1% peptide-to-spectrum match false discovery rate.
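To illustrate what a 1% peptide-to-spectrum match (PSM) false discovery rate cutoff means in practice, the sketch below applies a generic target-decoy estimate to a scored list of PSMs. This is an assumption-laden illustration and not the procedure implemented in PEAKS, whose internal FDR estimation is not described in the text; the function name `score_cutoff_at_fdr`, the input layout and the toy scores are all hypothetical.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms    : iterable of (score, is_decoy) pairs; higher score = better match
    max_fdr : FDR threshold (0.01 corresponds to the 1% PSM FDR in the text)

    FDR at a cutoff is estimated as (#decoys) / (#targets) among PSMs scoring
    at or above that cutoff. Ties in score are not treated specially.
    Returns None if no cutoff satisfies the threshold.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted set while the estimate stays in bounds.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy data: four target PSMs and one decoy PSM (hypothetical scores).
toy = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
print(score_cutoff_at_fdr(toy, max_fdr=0.25))
print(score_cutoff_at_fdr(toy, max_fdr=0.10))
```

With real data, the PSMs scoring at or above the returned cutoff would form the accepted set, analogous to the 5,255 peptides reported above.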

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.