We introduced the 18-bp recognition sequence of the I-SceI homing endonuclease into human HT-1080 fibrosarcoma cells by using an integrating foamy retrovirus vector (CSnZPNO; Fig. 1) to create a chromosomal target site for formation of double-strand breaks (DSBs). CSnZPNO is a shuttle vector that allows G418 selection of transduced mammalian cells and propagation of rescued proviruses as bacterial plasmids. We expanded a G418-resistant clone containing a single provirus (HT-1080/CSnZPNO) and infected it with the murine leukemia virus (MLV) vector LSceISHD expressing I-SceI or with the control vector LXSHD lacking I-SceI (Fig. 1). Southern-blot analysis showed that DSBs formed at a fraction of I-SceI target sites in genomic DNA from cells transduced with LSceISHD (Fig. 2c).

Figure 1: Vectors used in this study.

Maps of the AAV vectors (AAV2-RH, AAV2-TOA and AAV2-RNO), foamy retrovirus target site vector (CSnZPNO), MLV retroviral vectors expressing I-SceI (LSceISHD and LSceISP) and control MLV vector (LXSHD) are shown with the following genetic elements: viral long terminal repeats (LTR); cytomegalovirus (CMV), phosphoglycerate kinase (PGK), Rous sarcoma virus (RSV), simian virus 40 (SV40) and bacterial Tn5 promoters; amp, tet, lacZ, hisD, neo, hygro and puro genes; polyadenylation signals (A); and the p15A and pMB1 replication origins. Arrows indicate transcription start sites. Hatched boxes indicate AAV ITRs. Relevant restriction enzyme sites and the I-SceI site in CSnZPNO are shown: E, EcoRV; S, SexAI; M, MfeI; B, BglII.

Figure 2: AAV vectors integrate at induced DSBs.

(a,b) Southern blotting of genomic DNAs from hygromycin-resistant HT-1080/CSnZPNO cell clones generated by in vivo exposure to I-SceI and transduction with AAV2-RH. DNAs were digested with SpeI and the same blot was probed with 5′ lacZ (a) or hygro (b) sequences. Asterisks mark lanes showing integration at the I-SceI target site. (c) Southern blotting of genomic DNAs from clonal HT-1080/CSnZPNO digested with BglII (lanes 1–4) and polyclonal MHF2/CSnZPNO fibroblasts digested with EcoRV (lanes 5–8) probed for 5′ lacZ sequences at the CSnZPNO I-SceI target site. Samples were incubated with I-SceI in vitro or exposed to I-SceI in vivo before DNA isolation as indicated. The positions of size standards (in kilobases) are shown. (d) A map of AAV vector provirus junctions in the CSnZPNO I-SceI target site is shown with relevant restriction enzyme sites. Left side junctions are indicated with filled symbols, right side junctions with open symbols. Circles and triangles indicate AAV2-TOA provirus junctions from fibroblasts and HT-1080 cells, respectively. Squares indicate AAV2-RH provirus junctions from HT-1080 cells. Nucleotide sequence surrounding the I-SceI recognition site (in bold with the cleavage site indicated as a staggered line) is shown below the map.

We infected HT-1080/CSnZPNO cells containing I-SceI-induced breaks at the CSnZPNO chromosomal target site with the AAV2-RH vector encoding hygromycin B kinase (Fig. 1). We analyzed 12 hygromycin-resistant cell clones by probing a Southern blot of genomic DNA for target site and AAV vector sequences (Fig. 2). We observed new fragments in several clones (Fig. 2a), indicative of earlier target site cleavage and repair. One of 12 clones contained an AAV vector provirus integrated at the I-SceI site, as shown by hybridization of both target site and AAV vector probes to the same SpeI fragment (Fig. 2a,b) and sequencing of integration junctions (Fig. 3a). We also analyzed integration by rescuing I-SceI target sites as bacterial plasmids from a polyclonal population of hygromycin-resistant HT-1080/CSnZPNO cells transduced by AAV2-RH. Eight of 190 rescued target sites contained the AAV vector (4.2%), as shown by hybridization to vector sequences. In each of these cases, integration occurred at or near the I-SceI site (Figs. 2d and 3a).

Figure 3: Vector-chromosome junctions formed under different conditions.

Vector sequences are aligned above I-SceI target sites (a–c) or chromosomal sequences (d,e), with junctions indicated by black joining lines. Known sequences recovered from human cells are in upper-case and colored letters: red indicates AAV vector sequence, blue indicates I-SceI target site sequence (a–c) or human genomic sequence (d,e), and green indicates areas of microhomology. Nucleotide insertions are indicated with lower-case, underlined letters. Continuations of vector or genomic sequence not present in rescued proviruses are in lower-case letters. Deletion and insertion sizes are noted below each provirus, assuming microhomologies were vector-derived. Junctions were from unselected normal human fibroblasts (c) or HT-1080 cells (a,b) containing cleaved I-SceI target sites transduced with AAV2-RH (a) or AAV2-TOA (b,c), and nucleotide positions were determined by counting from the I-SceI cleavage site (upper strand shown in Fig. 2). Chromosomal junctions (d,e) were from unselected normal human fibroblasts transduced with AAV2-RNO (d) or AAV2-TOA after γ-irradiation (e), and nucleotide positions were from the 1 April 2003 human genome sequence freeze. The nearest gene (as predicted by the National Center for Biotechnology Information Reference Sequence Project) and distance from the closest insertion site are indicated on the right. Provirus clone numbers are indicated on the left.

We used an AAV shuttle vector containing ampicillin and tetracycline resistance genes and a prokaryotic replication origin (AAV2-TOA; Fig. 1) to determine what percentage of integration events occur at an induced DSB. We infected HT-1080/CSnZPNO cells containing chromosome breaks at 5.5% of I-SceI target sites with AAV2-TOA, expanded them without selection and recovered circular fragments containing AAV vector proviruses from genomic DNA as bacterial plasmids that conferred resistance to ampicillin. We also selected with kanamycin to obtain plasmids containing the target site neo gene. This analysis showed that 7.4% of integrated vector proviruses were located at the I-SceI site, and 0.59% of I-SceI sites contained an AAV vector provirus (Table 1).

Table 1 Plasmid rescue of integrated AAV vector proviruses

To study integration in normal cells and at many chromosomal locations, we used a polyclonal primary human fibroblast population containing CSnZPNO target sites at over 10⁴ different locations that was transduced with the MLV vector LSceISP expressing I-SceI (Fig. 1). Southern blots showed that 4.2% of target sites were cleaved by I-SceI in vivo (Fig. 2c), and a fraction of sites were resistant to in vitro digestion, presumably representing earlier DSB repair events (Fig. 2c). We infected fibroblasts containing induced DSBs with AAV2-TOA and rescued integrated proviruses as bacterial plasmids. We found that 7.7% of proviruses were present at the I-SceI target site, and 0.34% of I-SceI sites contained an AAV vector provirus (Table 1). Vector-chromosome junctions were typically located within 10 bp of the induced DSB in both HT-1080 cells and normal fibroblasts (Figs. 2d and 3b,c).

DSBs created by the I-SceI endonuclease may be uniquely processed owing to the complementary DNA overhangs generated by this enzyme. Therefore, we generated DSBs in human fibroblasts by treating them with the type II topoisomerase inhibitor etoposide (3 μM) or by exposing them to γ-irradiation (250 rads), both of which produce multiple DSBs per cell6,7. Etoposide and γ-irradiation increased AAV vector transduction rates 2 and 12 times, respectively, as measured by formation of hygromycin-resistant colonies after infection with the AAV2-RH vector. Etoposide and γ-irradiation also increased integration frequencies 3 and 14 times, respectively, as measured by plasmid rescue of vector proviruses after infection with the AAV2-TOA shuttle vector (Fig. 4a). Although γ-irradiation may have generated other types of DNA damage in addition to DSBs, the effects of etoposide can be attributed directly to DSBs produced by inhibition of topoisomerase II religation7. Therefore, increasing the number of DSBs in a cell increases vector integration, indicating that DSBs are a rate-limiting substrate for the AAV vector integration reaction.

Figure 4: Integration frequencies and proviral structures associated with spontaneous and DSB-induced integration.

(a) Graph showing the changes in AAV vector transduction and integration frequencies in MHF2/CSnZPNO cells after treatment with etoposide (3 μM) or γ-irradiation (250 rads), relative to untreated cells (n = 3, with standard deviations). Light gray columns show the increase in AAV2-RH transduction as measured by counting hygromycin-resistant cell colonies. Black columns show the increase in AAV2-TOA integration as measured by the number of integrated proviruses rescued from infected cells as ampicillin-resistant bacterial colonies. Simultaneous rescue of integrated CSnZPNO foamy proviruses as kanamycin-resistant bacterial colonies was used to normalize results, and so values represent the ratio of ampicillin-resistant colonies to kanamycin-resistant colonies. Plasmids from ampicillin-resistant colonies were sequenced to identify vector-chromosome junctions and exclude episomal vector forms. (b) Plot of chromosomal insertion (y axis) and deletion (x axis) sizes for each sequenced provirus in Figure 3 (n = 42) and in an earlier HeLa cell study3 (n = 6). (c) Map of AAV vector ITR junction sites from the same proviruses as in b and others where only one junction could be sequenced, with each symbol representing a junction at the adjacent nucleotide. Flip and flop ITR configurations, AAV Rep binding site (RBS) and terminal resolution site (TRS) are indicated. Junctions that removed the flip or flop hairpins are shown in the flip orientation. Open circles represent proviruses integrated at I-SceI target sites. Filled symbols represent spontaneous (circles) or γ-irradiation-induced (triangles) proviruses found at random locations.
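The normalization described in the legend for panel a can be sketched as a short calculation. This is a minimal illustration rather than the analysis code used in the study: the colony counts are hypothetical placeholders, the function names are invented for this example, and pairing each treated replicate with one untreated replicate is an assumption.

```python
from statistics import mean, stdev


def normalized_integration(amp_colonies, kan_colonies):
    """Ratio of ampicillin-resistant colonies (rescued AAV2-TOA proviruses)
    to kanamycin-resistant colonies (rescued CSnZPNO proviruses)."""
    if kan_colonies <= 0:
        raise ValueError("kanamycin-resistant colony count must be positive")
    return amp_colonies / kan_colonies


def fold_changes(treated, untreated):
    """Fold change of the normalized ratio for paired replicates.

    Each argument is a list of (amp_colonies, kan_colonies) tuples.
    Pairing replicates one-to-one is an assumption of this sketch.
    """
    return [
        normalized_integration(*t) / normalized_integration(*u)
        for t, u in zip(treated, untreated)
    ]


# Hypothetical colony counts for n = 3 replicates (not data from the study).
untreated_counts = [(10, 1000), (12, 1000), (8, 800)]
treated_counts = [(30, 1000), (36, 1000), (24, 800)]

changes = fold_changes(treated_counts, untreated_counts)
print(f"mean fold change: {mean(changes):.2f}")
print(f"standard deviation: {stdev(changes):.2f}")
```

Dividing by the kanamycin-resistant count corrects for differences in DNA recovery, ligation and transformation efficiency between samples, because both plasmid types are rescued from the same genomic DNA preparation.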

We compared the vector-chromosome junction sequences of proviruses found at I-SceI cleavage sites (Fig. 3a–c) or recovered from fibroblasts treated with γ-irradiation (Fig. 3e) with those of proviruses formed by spontaneous integration in the absence of induced DSBs. We recovered eight spontaneous AAV vector proviruses from unselected fibroblasts, sequenced them and mapped them to chromosomal locations in the human genome (Fig. 3d). Four proviruses were in genes, with two lying only 171 kb apart (clones 27 and 30). Similarly, of 12 proviruses recovered from cells treated with γ-irradiation, 5 were in genes (Fig. 3e). All types of integration events were associated with vector and chromosomal deletions, insertions of additional nucleotides and microhomologies between the vector and chromosome. For direct comparison, we plotted data from 49 different AAV vector proviruses isolated from human cells. The sizes of chromosomal deletions were similar whether integration occurred spontaneously (median = 22 bp), at I-SceI sites (median = 11 bp) or after γ-irradiation (median = 23 bp), except that some spontaneous and γ-irradiation-induced integration events were associated with large (>85 kb) chromosomal deletions (Fig. 4b). The lack of large deletions at I-SceI site proviruses is expected, as this would have removed the target site neo gene and replication origin used for plasmid rescue. We also found small insertions in all three types of proviruses (Fig. 4b). Each type of AAV vector provirus contained similar terminal deletions, and in many instances, chromosome-vector junctions occurred at identical nucleotides in the vector inverted terminal repeats (ITRs) (Fig. 4c). Therefore, both the chromosomal changes at integration sites and the structures of proviral vector genomes were similar whether integration occurred spontaneously or at induced DSBs, suggesting that they occurred through a common mechanism.

Our results suggest that randomly integrated AAV vector proviruses found in untreated cells are located at sites of spontaneous DSB formation. Exposure of cells to DNA-damaging agents or DNA synthesis8 can lead to DSB formation and can increase stable transduction by AAV vectors9,10. The preference for integration at transcribed sequences11 could be explained by a greater likelihood of DSB formation during transcription12 or improved access to DSBs in transcribed regions13. Integration hot spots may represent sites where chromosomes frequently break. Such regions could be the origin of deletions, insertions or translocations responsible for human diseases, and as such, they warrant further study.

The data in Table 1 allow us to estimate the average number of DSBs present in a human cell. In normal human fibroblasts containing 0.042 I-SceI DSBs per cell, 7.7% of proviruses were at the I-SceI site, so there were 12 spontaneous integrants for every I-SceI site integrant. Assuming that all integration occurs at DSBs, and that spontaneous DSBs behave like I-SceI-induced DSBs, there were 0.5 spontaneous DSBs per cell (12 × 0.042). The same calculation in HT-1080 cells gives a value of 0.7 spontaneous DSBs per cell. These low values are consistent with the expected toxicity of DSBs and with earlier estimates based on staining for γ-H2AX6,14. A low level of DSBs is sufficient to account for all integration events, suggesting that capture at DSBs could be the primary pathway of AAV vector integration.
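The estimate above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it uses the percentages quoted in the text (the raw counts in Table 1 are not reproduced here), the function name is invented for this example, and it carries the same two assumptions stated in the text, namely that all integration occurs at DSBs and that spontaneous DSBs behave like I-SceI-induced DSBs.

```python
def spontaneous_dsbs_per_cell(induced_dsbs_per_cell, fraction_at_target_site):
    """Estimate spontaneous DSBs per cell from integration data.

    induced_dsbs_per_cell: I-SceI-induced DSBs per cell (fraction of target
        sites cleaved, given one target site per cell).
    fraction_at_target_site: fraction of rescued proviruses located at the
        I-SceI site.
    """
    if not 0 < fraction_at_target_site < 1:
        raise ValueError("fraction_at_target_site must be between 0 and 1")
    # Spontaneous integrants per I-SceI site integrant.
    ratio = (1 - fraction_at_target_site) / fraction_at_target_site
    return ratio * induced_dsbs_per_cell


# Normal fibroblasts: 4.2% of sites cleaved, 7.7% of proviruses at the site.
fibroblasts = spontaneous_dsbs_per_cell(0.042, 0.077)
# HT-1080 cells: 5.5% of sites cleaved, 7.4% of proviruses at the site.
ht1080 = spontaneous_dsbs_per_cell(0.055, 0.074)

print(f"fibroblasts: {fibroblasts:.1f} spontaneous DSBs per cell")
print(f"HT-1080:     {ht1080:.1f} spontaneous DSBs per cell")
```

Rounded to one decimal place, the two results are 0.5 and 0.7, matching the values given in the text.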

Based on the sequences of rescued proviruses, we propose a model for AAV vector integration in which chromosomal ends and vector termini are processed before ligation (Fig. 5a). At the chromosomal DSB, single-stranded regions are generated by 5′-3′ exonuclease activity, and nucleotides are sometimes added by DNA synthesis from the 3′ terminus. The AAV terminus requires an additional nicking step within the hairpins of the ITR to generate a DSB. After processing, the exposed single-stranded regions of the chromosome and vector termini align at microhomologies before the final ligation step. Frequently, the template used to generate the insertions found at integration sites was either the strand from the other DSB terminus or the same strand folded back on itself (Fig. 5b). In one case, a 159-bp insertion was derived from a different chromosome (Fig. 3a), perhaps by a similar DNA synthesis step. The consistent presence of insertions, deletions and microhomologies at AAV vector-chromosome junctions is markedly similar to the products of nonhomologous end joining15,16,17 used for both DSB repair and lymphocyte gene rearrangements, suggesting that the same proteins are involved.

Figure 5: Model for AAV vector integration.

(a) Chromosomal DSBs are processed by 5′ to 3′ exonuclease digestion or the addition of nucleotides by DNA polymerase (broken arrow) to yield a 3′ single-stranded DNA overhang. AAV vector genomes are first nicked to generate similar overhangs. The two substrates are aligned at areas of microhomology (green uppercase Ns), unpaired flaps are trimmed, single-stranded gaps are filled by DNA synthesis, and the DNA ends are ligated. Chromosomal sequences are in light and dark blue, and AAV vector sequences are in red. (b) Insertions generated during AAV vector integration are of three types (1, 2 or 3) depending on which DNA strand is used for a synthesis template. The substrate (light blue) and template (dark blue) ends of the cleaved I-SceI sequence (in upper-case letters) are shown with the terminal nucleotides at cleaved I-SceI sites in bold font. Alignments with insertions in red (arrow in 5′ to 3′ direction) generated by template-directed DNA synthesis are shown. Microhomologies that may have aligned templates before DNA synthesis are indicated as boxed nucleotides. Extensions of nucleotides not shown are indicated by dots. Clone numbers are as in Figure 3. Clone 29 shows a spontaneous AAV vector integration.

Our findings have implications for gene therapy, where AAV vectors are becoming more commonly used. First, one should not assume that AAV vectors cause the deletions, insertions and rearrangements previously noted at provirus integration sites3,11, as these can also occur when DSBs undergo normal cellular repair processes16,17,18,19. Second, the preexisting chromosomal breaks that serve as vector integration sites are already prone to mutation, and so exposure to AAV vectors may not increase spontaneous mutation rates. In support of this, we found that mutation rates at the X-linked gene hypoxanthine phosphoribosyl transferase (HPRT) were not statistically different in uninfected fibroblasts and those infected with an AAV vector (7 × 10⁻⁷ versus 9 × 10⁻⁷ mutations per cell per division, respectively). But AAV vector integration may still produce distinct types of mutations, as new genetic elements encoded by the vector could influence neighboring gene expression, including possible proto-oncogenes. Third, vector integrations observed in clinical and preclinical studies can now be interpreted as DSB capture events, providing insight into the amount and location of DNA damage in normal mammalian tissues.


Plasmids and DNA manipulations.

Plasmids pA2LAPSN9, pA2RHbSN20, pBR322, pBRpm21, pCMV-I-SceI-3xnls18, pCnZPNO20, pDG22, pLSceISHD20, pLXSHD23 and pSC101 have been described. We obtained the retroviral vector helper plasmid pCI-VSV-G from G. Nolan (Stanford University). We made the foamy retrovirus vector plasmid pCSnZPNO from pCnZPNO by annealing 5′ phosphorylated oligonucleotides (sequences available on request) and inserting them at the NotI site between the CMV promoter and translation start site of the nuclear-localized lacZ gene to create an I-SceI recognition site. We made the plasmid pLAPSP by replacing the neo gene in pLAPSN24 with a puromycin cassette. We made pLSceISP by inserting the I-SceI gene from pCMV-I-SceI-3xnls into pLAPSP. We constructed p101LacZ2.5 by replacing the backbone fragment of pA2LAPSN with the pSC101 origin of replication and chloramphenicol resistance gene. We constructed pA2TOA by digesting pBRpm with EcoRI, end-filling with Klenow polymerase, adding BglII linkers and ligating it to the BglII backbone fragment of p101LacZ2.5. pARNO is similar to pA2SNO25, with the SV40 promoter replaced with RSV promoter sequences. Plasmid sequences are available on request. We purified plasmids using a plasmid maxi kit (Qiagen). We isolated genomic DNAs and carried out Southern blotting by standard techniques. We quantified Southern-blot hybridization signals by Phosphorimager analysis (Molecular Dynamics).

Cell culture.

We grew cells at 37 °C in 5% CO₂ in Dulbecco's modified Eagle's medium containing 4 g of glucose per liter (Gibco/Invitrogen), 10% heat-inactivated fetal bovine serum, penicillin and streptomycin. We obtained primary, normal male human fibroblasts (MHF2) from the Coriell Institute for Medical Research. HT-1080 human fibrosarcoma cells26, Phoenix-GP cells27 and 293T27 cells have been described. We generated HT-1080 and MHF2 cells containing proviral target sites by transducing them with foamy virus vector CSnZPNO and selecting with G418 (0.3 mg of active compound per ml). We isolated clones of HT-1080 cells with cloning rings and found by Southern-blot analysis that the HT-1080/CSnZPNO line we used contained a single CSnZPNO provirus. We derived G418-resistant polyclonal MHF2 populations from >10⁴ independent transduction events (CSnZPNO), as determined by seeding dishes with dilutions of transduced cells and counting the number of G418-resistant colonies. We selected AAV2-RH-transduced HT-1080 cells with 150 μg ml⁻¹ of hygromycin B (Calbiochem), LXSHD- and LSceISHD-transduced HT-1080 cells with 5 mM L-histidinol dihydrochloride (Sigma) in histidine-free medium and LSceISP-transduced HT-1080 and MHF2 cells with 0.7 μg ml⁻¹ of puromycin dihydrochloride (Sigma).

Vector preparations.

We made concentrated, helper-free foamy virus vector preparations of CSnZPNO with plasmid pCSnZPNO as described28 and determined their titers by G418 selection of transduced HT-1080 or MHF2 cells. We generated MLV vectors LSceISP, LSceISHD and LXSHD by transient transfection of Phoenix-GP cells with pCI-VSV-G and vector plasmids pLSceISP, pLSceISHD and pLXSHD, respectively (1:1 ratio). We replaced the culture medium 16 and 48 h later, collected filtered (0.45 μm), conditioned medium after a 16-h exposure to cells and concentrated the medium 50–100 times by centrifugation29. We determined MLV vector titers by selection with puromycin or L-histidinol. Transduction with MLV vectors was done with polybrene (4 μg ml⁻¹ concentration; Sigma-Aldrich) added to the medium. We made serotype 2 AAV vectors AAV2-RH, AAV2-RNO and AAV2-TOA by transfecting 293T cells with helper plasmid pDG and vector plasmids pA2RHbSN, pARNO and pA2TOA, respectively; treating cell lysates with benzonase; purifying them by iodixanol step gradient; and subjecting them to heparin affinity column chromatography (HiTrap, Amersham Biosciences)30 and a HiTrap desalting column. AAV vector titers were based on the amount of full-length single-stranded vector genomes detected by alkaline Southern-blot analysis.

I-SceI cleavage and AAV vector infection.

We generated DSBs at I-SceI target site loci by seeding clonal HT-1080 or polyclonal MHF2 cells containing the target site provirus (CSnZPNO) at 5 × 10⁵ cells per 10-cm dish on day 1; infecting them with MLV vector LSceISP, LSceISHD or control vector LXSHD on day 2 (multiplicity of infection (MOI) of one transducing unit per cell); and changing the culture medium on day 3. On day 4, we distributed cells to three 10-cm dishes and selected for the MLV vector with puromycin or L-histidinol. On day 7 (puromycin selection) or day 10 (L-histidinol selection), we detached cells with trypsin and used them for genomic DNA isolation or transduction by AAV vectors. Transduction by AAV2-RH was done by seeding 5 × 10⁴ cells per well in 12-well dishes on day 1, infecting with AAV2-RH on day 2 (MOI of 1 × 10³ genome-containing particles per cell), transferring an equal number of cells to two separate 10-cm dishes on day 3 and selecting with L-histidinol and hygromycin B on day 4. On day 12, we isolated colonies with cloning rings or stained them with Coomassie Brilliant Blue G and enumerated them. Infection of cells with AAV2-RNO or AAV2-TOA (without selection) was done by seeding 1 × 10⁶ cells per 6-cm dish on day 1, infecting at an MOI of 3 × 10⁴ vector-containing particles per cell on day 2, transferring cells to a 15-cm dish on day 4 and culturing until confluent (usually day 11) for DNA isolation.
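As a worked example of the dosing arithmetic implied by these MOI values, the sketch below converts a cell number and MOI into the number of vector particles required and, given a stock titer, the volume to add. It is illustrative only: the stock titer is a hypothetical value, the function names are invented for this example, and basing the dose on the number of cells seeded on day 1 (rather than a count at the time of infection) is an assumption.

```python
def particles_required(cell_count, moi):
    """Number of vector particles needed for a given cell number and MOI."""
    if cell_count < 0 or moi < 0:
        raise ValueError("cell_count and moi must be non-negative")
    return cell_count * moi


def stock_volume_ul(particles, titer_per_ml):
    """Volume of vector stock (in microliters) containing the given particles."""
    if titer_per_ml <= 0:
        raise ValueError("titer_per_ml must be positive")
    return particles / titer_per_ml * 1000.0


# AAV2-RH: 5 x 10^4 cells per well at an MOI of 1 x 10^3 particles per cell.
rh_particles = particles_required(50_000, 1_000)

# AAV2-RNO or AAV2-TOA: 1 x 10^6 cells at an MOI of 3 x 10^4 particles per cell.
toa_particles = particles_required(1_000_000, 30_000)

# Hypothetical stock titer of 1 x 10^12 genome-containing particles per ml.
HYPOTHETICAL_TITER_PER_ML = 1e12
toa_volume = stock_volume_ul(toa_particles, HYPOTHETICAL_TITER_PER_ML)

print(f"AAV2-RH particles per well:  {rh_particles:.1e}")
print(f"AAV2-TOA particles per dish: {toa_particles:.1e}")
print(f"AAV2-TOA stock volume:       {toa_volume:.0f} microliters")
```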

AAV vector infection after γ-irradiation and etoposide treatments.

We grew MHF2/CSnZPNO cells to confluence in six-well dishes and serum-starved them as previously described27. We infected cultures with AAV vectors (MOI of 1 × 10³ genome-containing particles per cell) on day 1 and treated them with etoposide (3 μM final concentration; Sigma) or with 250 rads of γ-irradiation (GammaCell 40; AEC, Kanata) on day 2. On day 3, we replaced the medium (without etoposide). On day 4, we detached the cells with trypsin and plated them in either a 10-cm dish to determine cell viability (400 cells) or a 15-cm dish to allow for cell proliferation (remaining cells). On day 5, we began hygromycin B selection in the 15-cm dish infected with AAV2-RH. No selection was applied to cells infected with AAV2-TOA. On day 14, we stained colonies from the 10-cm dishes and those selected with hygromycin B with Coomassie Brilliant Blue G and prepared genomic DNA from nonselected cells infected with AAV2-TOA. Plating efficiencies were 93% and 70% of untreated cells for etoposide and γ-irradiation treatments, respectively.

Shuttle vector rescue in bacteria.

We rescued CSnZPNO, AAV2-RNO and AAV2-TOA shuttle vectors by digesting 20 μg of genomic DNA containing integrated proviruses with SexAI (CSnZPNO), MfeI (AAV2-TOA) or EcoRI (AAV2-RNO), extracting it with phenol and chloroform and precipitating it with ethanol. We resuspended DNA fragments in 355 μl of water, brought the volume to 400 μl with 40 μl of 10× ligation buffer and circularized the DNA by adding 5 μl of T4 DNA Ligase (400 U μl⁻¹; New England Biolabs). We incubated ligations at 15 °C overnight, extracted them with phenol and chloroform and precipitated them with ethanol. We resuspended the DNA pellets in 5 μl of water and used them to transform Escherichia coli strain DH10B by electroporation with 4 μg (1 μl) of DNA. We grew bacteria containing rescued plasmids on agar containing 50 μg ml⁻¹ of kanamycin, 50 μg ml⁻¹ of ampicillin or both antibiotics. We identified plasmids derived from circular AAV vector episomes (70%) by restriction digests and excluded them from the analysis.

HPRT mutagenesis assay.

To determine the mutation rate at the HPRT locus, we seeded 5 × 10⁵ hypoxanthine-aminopterin-thymidine (HAT)-selected MHF2 cells per 10-cm dish on day 1 in medium with HAT, infected them with AAV2-RNO at an MOI of 3 × 10⁴ vector-containing particles per cell in medium without HAT on day 2 (or no virus as a control) and split them equally into three 15-cm dishes on day 3, except for 300 cells that were plated in one 10-cm dish. On day 8, we detached the cells in the 15-cm dishes with trypsin, counted them and seeded them at 1 × 10⁵ cells per 15-cm dish in five dishes and plated 300 cells in one 10-cm dish. The dishes receiving the 300-cell aliquots were stained with Coomassie Brilliant Blue G 8 d after seeding to determine plating efficiencies throughout the experiment. We initiated 6-thioguanine selection (10 μg ml⁻¹ concentration) on day 9 and replaced the medium every 3 d. We stained resistant colonies with Coomassie Brilliant Blue G on day 25 and enumerated them. We calculated HPRT mutation rates by dividing the total number of 6-thioguanine-resistant colonies by the total number of cells seeded (corrected for plating efficiencies) and the total number of cell divisions that took place after adding the AAV vector and before 6-thioguanine selection.
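The mutation rate calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used in the study: the colony count, plating efficiency and cell counts are hypothetical, the function names are invented for this example, and estimating cell divisions as population doublings (log2 of the final over the initial cell number) is an assumption, since the text does not say how divisions were counted.

```python
import math


def population_doublings(initial_cells, final_cells):
    """Estimate cell divisions as population doublings (an assumption)."""
    if initial_cells <= 0 or final_cells < initial_cells:
        raise ValueError("need 0 < initial_cells <= final_cells")
    return math.log2(final_cells / initial_cells)


def hprt_mutation_rate(resistant_colonies, cells_seeded, plating_efficiency,
                       cell_divisions):
    """Mutations per cell per division.

    Resistant colonies are divided by the number of cells seeded (corrected
    for plating efficiency) and by the number of cell divisions between
    vector addition and 6-thioguanine selection.
    """
    if not 0 < plating_efficiency <= 1:
        raise ValueError("plating_efficiency must be in (0, 1]")
    if cells_seeded <= 0 or cell_divisions <= 0:
        raise ValueError("cells_seeded and cell_divisions must be positive")
    viable_cells = cells_seeded * plating_efficiency
    return resistant_colonies / viable_cells / cell_divisions


# Five 15-cm dishes at 1 x 10^5 cells each, as in the protocol.
cells_seeded = 5 * 100_000

# Hypothetical values chosen only to illustrate the arithmetic.
divisions = population_doublings(500_000, 8_000_000)  # 4 doublings
rate = hprt_mutation_rate(
    resistant_colonies=2,
    cells_seeded=cells_seeded,
    plating_efficiency=0.5,
    cell_divisions=divisions,
)
print(f"{rate:.1e} mutations per cell per division")
```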