Abstract
CRISPR–Cas9 paired with adeno-associated virus serotype 6 (AAV6) is among the most efficient tools for producing targeted gene knockins. Here, we report that this system can lead to frequent concatemeric insertions of the viral vector genome at the target site that are difficult to detect. Such errors can cause adverse and unreliable phenotypes that are antithetical to the goal of precision genome engineering. The concatemeric knockins occurred regardless of locus, vector concentration, cell line or cell type, including human pluripotent and hematopoietic stem cells. Although these highly abundant errors were found in more than half of the edited cells, they could not be readily detected by common analytical methods. We describe strategies to detect and thoroughly characterize the concatemeric viral vector insertions, and we highlight analytical pitfalls that mask their prevalence. We then describe strategies to prevent the concatemeric inserts by cutting the vector genome after transduction. This approach is compatible with established gene editing pipelines, enabling robust genetic knockins that are safer, more reliable and more reproducible.
Data availability
All data generated or analyzed during this study are included in the published article and its supplementary information files. Regarding digital PCR data, raw individual cluster information is included in Supplementary Table 3 in accordance with the 2020 guidelines regarding the minimum information necessary for publication of quantitative digital PCR experiments.
Acknowledgements
We thank K. C. Chan, M. Rivera, F. Zhao and S. Homma for laboratory and administrative support. We thank S. Tsuji for scientific advice. This work was supported by grants from the National Institutes of Health (NIH) (R01DK121851, H.N.; R21OD030009, H.N.; R21OD030529, H.N. and R.M.; and R01HL064274, M.A.K.). F.P.S. received mentorship and financial support from Stanford’s SPARK Translational Research Program; the Stanford Clinical and Translational Science Award to Spectrum (UL1TR003142); the National Science Foundation Graduate Research Fellowship; and the Pat Tillman Fellowship. D.K. was supported by the Japan Society for the Promotion of Science (JP21J01690) and the Osamu Hayaishi Memorial Scholarship for Study Abroad. A.C.F. is supported by the Stanford Graduate Fellowship, the National Science Foundation Graduate Research Fellowship Program and the Stanford Lieberman Fellowship. Toshiya Nishimura was supported by the Japan Society for the Promotion of Science (JP18K14602 and JP18J00499). A.C.W. was supported by the NIH (K99HL150218), the Leukemia & Lymphoma Society (3385-19) and the Edward P. Evans Foundation. J.B. was supported by the International Postdoc Grant from the Swedish Research Council (2017-00344) and by the Assar Gabrielsson Foundation.
Author information
Contributions
F.P.S. and D.K. conceived the research, performed most experiments and analyzed data. Y.N. performed flow cytometry and HSPC transplant. M.H., J.Z. and I.H. performed ddPCR. K.P. performed the Southern blot. A.C.F. and Toshinobu Nishimura cloned plasmids. C.T.C., J.B., Toshiya Nishimura and A.C.W. analyzed data. H.N., R.M., M.A.K. and F.P.S. acquired funding and supervised the experiments. H.N. and R.M. contributed equally. F.P.S. wrote the manuscript, and all authors edited and approved the final manuscript.
Ethics declarations
Competing interests
R.M. is on the advisory boards of Kodikaz Therapeutic Solutions, Orbital Therapeutics, Pheast Therapeutics and 858 Therapeutics. R.M. is a co-founder of and equity holder in Pheast Therapeutics, MyeloGene and Orbital Therapeutics. H.N. is a co-founder of and shareholder in Megakaryon, Century Therapeutics and Celaid Therapeutics. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Biotechnology thanks Casey Maguire and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Extended data
Extended Data Fig. 1 Comparison of PCR, ddPCR, and Southern blot for genotyping subcloned colonies after AAV6/Cas9-mediated knockin.
a, Schematic showing location of primers in 3-primer in-out PCR. Primers span left-homology arm. b, Schematic showing location of primers in 3-primer in-out PCR. Primers span right-homology arm. c, Schematic showing location of primers in 2-primer PCR genotyping. Primers span entire editing site. d, Theoretical gel electrophoresis after genotyping as shown in panel a-c. WT = wildtype; Mono = monoallelic knockin; Bi = biallelic knockin. e, Gel electrophoresis from PCR genotyping. 36 subclones expanded after knocking in CRE into CD14 locus in PT-iPSCs. Ladder size shown in kb. Expected band sizes shown in panel a-c. Sample number indicated at top and bottom of gels: L = ladder; N = no-template-control. Interpreted genotype indicated by circle at top of each gel: white = WT; gray = monoallelic; black = biallelic; red = cannot be determined. Top, middle, and bottom row of gel indicate PCR-L, PCR-R, and PCR-F genotyping strategies, respectively. f, Schematic showing ddPCR allele counting strategy for genotyping. Four ddPCR targets are multiplexed in a single well for each sample. Primers indicated by arrows, and probes indicated by small rectangles flanked by primers. Ref-1 and Ref-2 indicate trans references used to determine overall cell number. RE indicates restriction enzymes. HEX and FAM indicate probe color. High and Low indicate concentration of probe (used for ddPCR amplitude multiplexing), resulting in high or low clusters shown in panel g. g, Representative two-dimensional ddPCR plot of reaction shown in panel f. h, References from ddPCR genotyping the 36 samples shown in panel e. Average of the two references set to 2 copies/cell. Sample number shown at bottom. i, ddPCR genotyping results. Normalized to references shown in panel h. The reaction was run +/− restriction enzyme (RE) as shown in panel f. Interpreted genotype indicated by circle at bottom of graph: white = WT; gray = monoallelic; black = biallelic; red = additional insertions.
j, Schematic indicating concatemeric knockin. Without restriction enzyme, concatemeric inserts will be linked. k, Schematic indicating ddPCR droplet partitioning. Without restriction enzyme, linked concatemeric inserts will partition in the same droplet and be counted only once. l, Schematic showing Southern blot genotyping strategy. RE1 and RE2 indicate two different restriction enzymes. Probe designed to bind right homology arm. m, Southern blot of 11 select subclones. Size references indicated on left in kb. Expected sizes shown in panel l. * indicates off-target insertion. Interpreted genotype indicated by circle at top of blot: white = WT; gray = monoallelic; black = biallelic; red = additional insertions.
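The allele-counting arithmetic described in panels h-k can be illustrated with a short sketch. This is not the authors' analysis code: the function name and the example concentrations are hypothetical, and the sketch assumes that restriction digestion separates every linked copy so that the +RE reaction counts each insert individually.

```python
def copies_per_cell(target_conc, ref1_conc, ref2_conc):
    """Normalize a ddPCR target concentration to copies per cell.

    The mean of the two reference assays is defined as 2 copies/cell,
    as stated in the legend. All inputs are concentrations (for example
    copies/uL) measured in the same well.
    """
    ref_mean = (ref1_conc + ref2_conc) / 2
    if ref_mean <= 0:
        raise ValueError("reference concentration must be positive")
    return 2 * target_conc / ref_mean


# Hypothetical concentrations (copies/uL) for one subclone.
# For simplicity the same reference values are used for both reactions;
# in practice each well would carry its own reference measurements.
ref1, ref2 = 1000.0, 1040.0
insert_without_re = 510.0   # linked copies share a droplet, counted once
insert_with_re = 1530.0     # digestion lets linked copies partition apart

cpc_without = copies_per_cell(insert_without_re, ref1, ref2)
cpc_with = copies_per_cell(insert_with_re, ref1, ref2)

# A clearly higher value after digestion suggests linked (concatemeric) copies.
print("copies/cell without RE:", cpc_without)
print("copies/cell with RE:   ", cpc_with)
print("extra copies revealed by digestion:", cpc_with - cpc_without)
```

In this made-up example the clone would look like a clean monoallelic knockin without digestion, while the digested reaction reveals additional linked copies.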
Extended Data Fig. 2 Comparison of PCR and ddPCR genotyping at AAVS1 locus.
a, Gel electrophoresis from PCR genotyping. 36 subclones expanded after knocking in Ubc-CRE into AAVS1 locus in PT-iPSCs. Ladder size shown in kb. Sample number indicated at top and bottom of gels; L = ladder. Interpreted genotype indicated by circle at top of each gel: white = WT; gray = monoallelic; black = biallelic; red = cannot be determined. Top, middle, and bottom row of gel indicate PCR-L, PCR-R, and PCR-F genotyping strategies, respectively. b, References from ddPCR. Average of the two references set to 2 copies/cell. Sample ID shown at bottom. c, ddPCR genotyping results. Normalized to references shown in panel b. The reaction was run +/− restriction enzyme (RE). Interpreted genotype indicated by circle at bottom of graph: white = WT; gray = monoallelic; black = biallelic; red = additional insertions or cannot be determined.
Extended Data Fig. 3 Characterization of concatemers.
a, Concatemer junction-spanning PCR of 11 select samples (CRE knockin into CD14 locus in PT-iPSC—same samples as used in Southern blot in Extended Data Fig. 1). Top: schematic showing location of primers. Size of expected amplicon indicated assumes viral ITRs are joined end-to-end. Bottom: gel electrophoresis of PCR products. Top of gel indicates genotype previously determined by ddPCR and Southern blot. Bottom of gel indicates sample number. L = ladder; N = no-template control. Ladder size indicated in kb. F and R indicate forward and reverse primers, respectively. b, Repeat of panel a with forward primer only. c, Repeat of panel a with reverse primer only. d, Schematic showing sequence of chimeric ITR mapped to regions of viral vector genome. Green triangle indicates AhdI restriction enzyme cutsite. Orange arrows and rectangle indicate primer and probe sites for ITR ddPCR or qPCR (used in panel g-k). e, References from ddPCR analysis of 36 subclones (same samples as in Extended Data Fig. 1). Average of the two references set to 2 copies/cell. Sample number shown at bottom. References are in cis: Ref-1 is upstream (5′) of LHA and Ref-2 is downstream (3′) of RHA. f, ddPCR results of counting LHA and RHA. Analyzed +/− restriction enzyme (RE) used to separate concatemeric regions. Normalized to references shown in panel e. g, qPCR measurement of AAV ITR when analyzed +/− AhdI RE. Y-axis is fluorescent intensity; x-axis is cycle number. Left graph shows DNA extracted from sample #10 (concatemeric monoallelic KI). Right graph shows DNA extracted from sample #14 (monoallelic KI by end-joining). Δ indicates change in cycle threshold when run with and without AhdI. h, qPCR measurement of control primer-probe set Cis Ref-1. Same samples and axis as panel g. i, ddPCR results of counting ITR insertions in the 36 subclones +/− RE (+RE is AhdI and MseI). Normalized to cis-references (not shown). j, Schematic showing ddPCR linkage analysis of ITR and cis-references. 
Left indicates concatemeric knockin with chimeric ITR inserted. Right indicates end-joining-mediated knockin. Legend at bottom. Dashed lines indicate amplicons are linked. Red X indicates linkage is lost when adding RE. k, Linkage heat map between cis-reference sites and ITRs measured by ddPCR (analyzed with addition of AhdI). Sample number shown at top. Amplicon sites and legend shown in panel j. Top row indicates the % of Ref-1 sites bound to an ITR; middle row indicates the % of Ref-2 sites bound to an ITR; bottom row indicates the % of Ref-1 sites bound to Ref-2 sites.
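One common way to turn two-target droplet counts into a linkage percentage, such as those shown in the heat map, is a Poisson model with three independent species: target A alone, target B alone, and physically linked A-B molecules. The sketch below is an assumption about how such a value could be computed, not the authors' pipeline or any vendor's implementation; the function names and droplet counts are hypothetical.

```python
import math


def linked_concentration(n_neg, n_a_only, n_b_only, n_both):
    """Estimate mean copies per droplet of linked A-B molecules.

    Assumes A alone, B alone and linked A-B each load droplets
    independently following a Poisson distribution. The four arguments
    are the droplet counts in the four clusters of a two-target plot.
    Returns (lambda_linked, lambda_a_total, lambda_b_total).
    """
    total = n_neg + n_a_only + n_b_only + n_both
    p_neg = n_neg / total                   # droplets with no template at all
    p_a_neg = (n_neg + n_b_only) / total    # droplets lacking any A signal
    p_b_neg = (n_neg + n_a_only) / total    # droplets lacking any B signal
    lam_a = -math.log(p_a_neg)              # A alone + linked
    lam_b = -math.log(p_b_neg)              # B alone + linked
    lam_any = -math.log(p_neg)              # A alone + B alone + linked
    # Linked molecules are counted in both lam_a and lam_b but once in lam_any.
    lam_linked = max(0.0, lam_a + lam_b - lam_any)
    return lam_linked, lam_a, lam_b


def percent_linked(n_neg, n_a_only, n_b_only, n_both):
    """Percent of A copies that are physically linked to B."""
    lam_linked, lam_a, _ = linked_concentration(n_neg, n_a_only, n_b_only, n_both)
    return 100 * lam_linked / lam_a


# Hypothetical droplet counts: A = cis reference site, B = ITR amplicon.
print(percent_linked(n_neg=14000, n_a_only=500, n_b_only=400, n_both=5100))
```

The estimate breaks down when almost no droplets are negative, because the logarithms then become unstable; that limit is a property of the model, not something stated in the legend.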
Extended Data Fig. 4 Characterization of dual-knockin subclones.
a, Schematic indicating strategy for genotyping double knockin subclones using ddPCR linkage analysis. Cis reference sites shown as small light-gray and dark-gray squares. Ubc-GFP amplicon site indicated by small green square. Ubc-mCherry amplicon site indicated by small red square. Dashed lines indicate linkage, which can be measured by ddPCR. b, Schematic indicating strategy for counting concatemers in double knockin subclones using ddPCR. Scissors indicate restriction enzymes. Red X indicates linkage lost, such that concatemeric inserts will be separated. c, Images of 72 subclones after dual knockin of Ubc-GFP and Ubc-mCherry into the TET2 locus in PT-iPSCs. Subclones were selected and expanded as shown in Fig. 2e. Left and right side show the same images, processed differently. The left images were processed using auto-contrast prior to stitching. The right images were processed with uniform settings. The upper 24 colonies were selected from the low gate during FACS and the lower 48 colonies were selected from the high gate as shown in Fig. 2f. White scale bar (lower right) indicates 400 microns. d, Correlation of GFP mean fluorescent intensity (MFI) and copies of Ubc-GFP per cell. MFI measured from uniformly processed images. Ubc-GFP copies/cell measured as shown in panel b. Dashed line indicates linear regression; equation and R² indicated on graph. e, Correlation of mCherry MFI and copies of Ubc-mCherry per cell (similar to panel d).
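The regression in panels d-e relates reporter copies per cell to mean fluorescent intensity. A minimal ordinary least-squares sketch is shown here for readers who want to reproduce that kind of fit; the function name and the data points are hypothetical and are not values from the study.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared). For a single predictor,
    R^2 equals the squared Pearson correlation coefficient.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical subclones: reporter copies per cell versus MFI (arbitrary units).
copies = [1.0, 2.0, 2.0, 4.0, 6.0]
mfi = [120.0, 230.0, 250.0, 470.0, 700.0]

slope, intercept, r_squared = linear_fit(copies, mfi)
print("slope:", slope, "intercept:", intercept, "R^2:", r_squared)
```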
Extended Data Fig. 5 Flow cytometry plots and editing rate in human PSCs +/−ITR removal.
a-g, Flow cytometry plots 5 days after double knockin of Ubc-GFP and Ubc-mCherry into human PSCs. Loci (TET2, AAVS1, HBB, RUNX1) and cell line (PT-iPSC, WT-iPSC, H9) shown at top. NC indicates no cut (ITR not removed). IC and DC indicate internal cut and distal cut methods for ITR removal, respectively. GFP and mCherry fluorescent intensity indicated on x- and y-axis, respectively. Gating indicated by quadrants; % cells in each quadrant shown on plot. Double positive cells were sorted into individual wells and used for later clonal analyses. h, Change in polydispersity in flow cytometry quadrants in panel a-g (n = 7, each panel in a-g is one measurement). Q1-Q4 indicate upper left, upper right, lower right, and lower left quadrant, respectively. Polydispersity measured as absolute average deviation and shown relative to NC. Error bars indicate 1 standard deviation; center indicates mean. * indicates P = 0.02 to 0.05. ** indicates P = 0.005. P values are from two-sided paired t-test. i, Legend for panel j-k. j-k, % cells positive for GFP and/or mCherry (j), and % double positive (GFP and mCherry, k), as indicated by panel a-g.
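Panel h reports polydispersity as an absolute average deviation expressed relative to the no-cut (NC) condition. The sketch below shows that calculation in its simplest form. The legend does not state which per-cell values were used, so the fluorescence intensities here are hypothetical, as are the function names.

```python
def average_absolute_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def relative_polydispersity(treated, control):
    """Polydispersity of a treated sample relative to its control.

    Values below 1 mean the treated population is less spread out than
    the control; the control itself scores exactly 1.
    """
    return average_absolute_deviation(treated) / average_absolute_deviation(control)


# Hypothetical per-cell intensities from one quadrant (arbitrary units).
no_cut = [100.0, 300.0, 500.0]        # NC: ITRs not removed
distal_cut = [200.0, 300.0, 400.0]    # DC: ITRs removed

print(relative_polydispersity(distal_cut, no_cut))
```

A tighter fluorescence distribution after ITR removal would show up as a ratio below 1, which is the direction one would expect if concatemers contribute to variable reporter copy number.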
Extended Data Fig. 6 Concatemer rate at various MOIs.
a, Schematic indicating ddPCR linkage analysis strategy for measuring KI frequency in bulk (non-subcloned) samples. Cis reference sites shown as small light-gray and dark-gray squares. Ubc-GFP amplicon site indicated by small green square. Dashed lines indicate linkage, which can be measured by ddPCR. ID1 and ID2 are small (<100 bp) unique DNA sequences added outside the homology arms in the viral vector genome as shown at the top. b, Schematic indicating ddPCR linkage analysis strategy for measuring concatemeric KI frequency in bulk (non-subcloned) samples. Similar depiction as panel a. ddPCR amplicons for ID1 and ID2 shown as dark purple squares. c, Flow cytometry plots 5 days after KI of Ubc-GFP into the HBB locus in H9 ESCs. Editing performed at various MOIs indicated by number on each plot. Right gate on each plot indicates the positive population, which was sorted and expanded for ddPCR analysis. Side-scatter-area on y-axis and GFP fluorescent intensity on x-axis. Gating indicated by trapezoid; % cells in each gate shown on plot. d, Flow chart indicating subpopulations analyzed for panel e-h and j-m. e, Editing rate (%GFP positive cells) at various MOIs as indicated by flow cytometry in panel c. f, Knockin rate (%HBB alleles with GFP knockin) at various MOIs as indicated by ddPCR. Black line indicates measurements performed by ddPCR linkage analysis as shown in panel a. Blue dashed line indicates measurements performed by ddPCR allele counting as shown in Extended Data Fig. 1f. Samples analyzed were GFP+ cells sorted and expanded at each MOI as shown in panel c. The same samples are used for ddPCR analysis in panel g-h. g, Concatemeric knockin rate (%GFP knockin alleles with concatemer) as indicated by ddPCR. h, Average size of concatemeric knockin (for example, 1 = single KI, 2 = a concatemeric knockin with a single repeat, etc.) as indicated by ddPCR. i, Flow cytometry plots 5 days after KI of Ubc-GFP into the IL2RG locus (X chromosome) in H1 (male) ESCs.
Similar to panel c. j-m, Similar to panel e-h. Samples analyzed are from panel i.
Extended Data Fig. 7 Comparison of HDR- and NHEJ-mediated knockin rate +/−ITR removal.
a-d, In-out PCR and gel electrophoresis of subclones after dual-knockin of Ubc-GFP and Ubc-mCherry into the TET2 locus in PT-iPSCs. Subclones were selected from NC, IC, and DC groups that contained biallelic knockins without concatemers (determined by ddPCR). Primer location and expected band size shown on schematic at top of each gel. Larger than expected bands, indicated by *, could result from NHEJ-mediated knockin. e, Schematic showing strategy for comparing HDR- and NHEJ-mediated knockin rate by flow cytometry. Mismatched homology arms (HA) measure NHEJ-mediated knockin. Matching homology arms measure HDR-mediated knockin rate. f, Results of editing with matched versus mismatched homology arms as measured by flow cytometry. Number at top of graph corresponds to schematic in panel e. Experiment was performed with the IC method of ITR removal (orange bars) or no removal (gray bars). GFP+ cells measured by flow cytometer more than 2 weeks after editing.
Extended Data Fig. 8 Flow cytometry plots and editing rate in human HSPCs +/−ITR removal.
a-d, Flow cytometry plots 5 days after double knockin of Ubc-GFP and Ubc-mCherry into human CD34+ HSPCs. Loci (TET2, HBB) and repeat number shown at top. NC indicates no cut (ITR not removed). DC indicates distal cut method for ITR removal. GFP and mCherry fluorescent intensity indicated on x- and y-axis, respectively. Gating indicated by quadrants; % cells in each quadrant shown on plot. Double positive cells were sorted into individual wells and used for later clonal analyses. e-g, Flow cytometry plots 5 days after knockin of Ubc-GFP into male human CD34+ HSPCs at the IL2RG locus (X chromosome). GFP fluorescent intensity indicated on x-axis; side-scatter-area on y-axis. Gating indicated by rectangles; % cells in each gate shown on plot. GFP positive cells were sorted into single wells and used for later clonal analyses. h-i, Flow cytometry plots 5 days after knockin of Ubc-GFP in human CD34+ HSPCs at the HBB locus. Gating indicated by rectangles; % cells in each gate shown on plot. GFP positive cells were sorted and transplanted into mice for later analyses. j, Table summarizing HSPC analyses. Legend for panel k-m. k, % cells positive for GFP and/or mCherry as indicated by panel a-i. l, % cells positive for GFP and mCherry as indicated by panel a-d. m, % CD45+ GFP+ human cells in mouse whole bone marrow 4 months after transplantation. Transplanted cells initially sorted from panel h-i and transplanted into 2 mice each (n = 8 total mice). Error bars indicate 1 standard deviation; center indicates mean.
Supplementary information
Supplementary File 1
Chimeric ITR nanopore sequencing alignment.
Supplementary Tables 1 to 4
Table 1: Primer and probe sequences. Table 2: ddPCR experimental parameters. Table 3: Raw ddPCR results. Table 4: NC/IC/DC subclone genotype summary.
About this article
Cite this article
Suchy, F.P., Karigane, D., Nakauchi, Y. et al. Genome engineering with Cas9 and AAV repair templates generates frequent concatemeric insertions of viral vectors. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-024-02171-w
DOI: https://doi.org/10.1038/s41587-024-02171-w