The RAG complex can misrecognize and cleave pseudo-signals at lymphoid translocation breakpoints at LMO2, Ttg-1 and SIL6. We tested the RAG-misrecognition hypothesis at the Bcl-2 Mbr, although there is no apparent V(D)J heptamer/nonamer sequence within it (Supplementary Fig. 1). To test whether the Bcl-2 sequence serves as a recombination signal, a 300-base-pair (bp) Bcl-2 sequence (containing the 150-bp Mbr region) was positioned 250 bp upstream or downstream of a single 12- or 23-signal sequence on an episome that can replicate in human cells. We transfected the substrates into the human pre-B-cell line Reh, and recovered them 48 h later (Supplementary Fig. 2).

The boundaries of the recovered recombinant molecules were characterized. None of the breakpoint boundaries was at the 12- or 23-signals; rather, the recombinants were far to one side or the other of each signal and clearly not related to V(D)J recombination. However, a disproportionate number of recombination boundaries were in the Bcl-2 Mbr (Table 1). The recombination frequency (0.056%) was 160-fold lower in our constructs than in a substrate with a pair of optimal 12/23-signals (recombination frequency, 9%) in Reh cells. The pattern of the recombination breakpoints within the Bcl-2 Mbr was striking. About 77% (27 out of 35) of the breakpoints were within the Mbr, and another five of the 35 events were immediately outside (within five to seven nucleotides) of the Bcl-2 Mbr (Table 1). Interestingly, most of these 27 events were within peaks I, II or III of the Mbr (data not shown). The difference between the 77% in the Mbr versus the expected random level of 48% is highly significant (P < 0.006). T-nucleotides (templated nucleotides) were also present at the junctions of these recombinants, a signature feature of t(14;18) in patients (data not shown)7.
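
The two comparisons above (the 160-fold difference in recombination frequency, and 27 of 35 breakpoints in the Mbr against an expected random level of 48%) can be checked with a short calculation. The sketch below is illustrative only: it assumes a one-sided exact binomial test, which may not be the test used to obtain the reported P value, and the function names are ours.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def fold_difference(reference_pct: float, test_pct: float) -> float:
    """Return how many times larger reference_pct is than test_pct."""
    return reference_pct / test_pct


# Values taken from the text: 27 of 35 breakpoints in the Mbr,
# expected random level 48%.
p_value = binomial_upper_tail(27, 35, 0.48)

# Recombination frequencies from the text: 9% (optimal 12/23 pair)
# versus 0.056% (Bcl-2 Mbr constructs).
fold = fold_difference(9.0, 0.056)

print(f"one-sided binomial P = {p_value:.2g}")
print(f"fold difference = {fold:.0f}")
```

Under this assumed test, the excess of Mbr breakpoints remains well below the reported significance threshold, and the ratio of the two frequencies agrees with the stated fold difference.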

Table 1 Breakpoint frequencies in plasmids from cells expressing RAG proteins

Given that Reh cells have high endogenous levels of RAG proteins, we wondered whether the Bcl-2 Mbr breaks are caused by RAGs. To test the in vivo contribution of RAG proteins to Mbr recombination, we used the human cell line 293T, which does not carry out any detectable V(D)J recombination. We transfected this cell line with a plasmid (pXW5) bearing the Bcl-2 Mbr (paired with a single 23-signal) with or without RAG1 and RAG2 expression vectors or with mutant RAG1 and RAG2 expression vectors8,9. We observed that the recombination frequencies are consistently about three- to fourfold higher in the cells transfected with RAGs than in those without RAGs or with mutant RAGs (Table 1). The recombination frequency of pXW5 (0.004%) in the presence of RAGs is about 25-fold lower than that of a substrate with a pair of optimal 12/23-signals (recombination frequency of 0.1% in fibroblasts transfected with RAG vectors9). More importantly, the distribution of the breakpoints was markedly influenced by the presence of the RAG proteins. With RAGs, the breaks were predominantly within the Bcl-2 Mbr, with particular predilection for peaks I, II and III (76.5%, 26 out of 34 events) (Table 1).

We observed clustering of breaks at the Bcl-2 Mbr side of the deletion events, but the other boundary of the deletions (recombinants) described above was not at the 23-signal. In clinical cases of the Bcl-2 translocation, breaks at the immunoglobulin heavy chain occur during attempted rearrangement between D and J elements that use 12- and 23-signals (Supplementary Fig. 1). Therefore, we designed a substrate to check the recombination efficiency of the Bcl-2 Mbr when both the 12- and 23-signals are present rather than one alone (Fig. 1). This plasmid, pSCR45, was then transfected into 293T cells with and without human full-length RAG expression vectors, or with a mutant RAG1 vector10.

Figure 1: In vivo breakpoints on an extrachromosomal substrate cluster within the Bcl-2 Mbr and are dependent on the RAG complex.

The episome (pSCR45) was transfected into mammalian cells (293T cells), recovered 40 h later and analysed in bacteria as described in the Methods. a, Comparison of breakpoint frequencies in recombinant plasmid molecules. DA, the number of substrate molecules that replicated in the 293T cells. Total DAC, bacterial transformants that are ampicillin–chloramphenicol (double) resistant (also referred to as recombinants). The total DAC for each row is subdivided according to the left and right boundaries of recombination. For the left boundary, DACMbr is the subset of double-resistant recombinants that have a breakpoint within the Bcl-2 Mbr. DACNot Mbr is the subset of recombinants with breakpoints anywhere within the 326-bp region upstream or downstream of the Mbr. For the right boundary, DAC23 are recombinants that use the 23-signal sequence for recombination. DACNot 23 are recombinants that do not use the 23-signal for recombination. (DACMbr/DA) × 100 is the recombination frequency for all events that use the Mbr. (DAC23/DA) × 100 is the recombination frequency for all events that use the 23-signal. The frequency of recombinants obtained is indicated as a percentage (% Events). The row labelled 293 indicates recombinants obtained from 293T cells in the absence of RAGs; 293 + RAGs indicates that cells were also transfected with full-length RAG expression vectors; and 293 + Mut RAGs indicates that the cells were transfected with mutant RAG1 and full-length RAG2 vectors. The asterisk indicates that this single event may represent a random break that just happens to be within the 13-bp region adjacent to the heptamer of the 23-signal8. b, Regions of recombination relative to the sequence of pSCR45. The Mbr region is shown in dark grey. The light grey regions are outside of the Mbr but are the flanking regions that naturally are adjacent to the Mbr in the human chromosome. The 12-signal is indicated by the open triangle, and the 23-signal by the filled triangle. The transcriptional promoter (short arrow upstream of the Mbr), the transcription terminator (Stop) and the chloramphenicol gene (Cat) are indicated.

In common with the observations described above for one-signal substrates (Table 1), double-signal substrates have a breakpoint distribution that is strikingly dependent on the Mbr (24 out of 29 events) (Fig. 1a). The preference for breaks in the Mbr occurs despite twice as much DNA surrounding the area where breaks could occur (150 bp in the Mbr compared with 326 bp both upstream and downstream). In more than half of the events (15 out of 29), the second break is at the 23-signal, such that the coding end formerly attached to that signal is now joined to the Bcl-2 Mbr break. Such a precise use of the 23-signal only in the simultaneous presence of a 12-signal clearly indicates the involvement of the RAG complex. The presence of both a 12- and a 23-signal permits a recapitulation of the D to J joining process, whereas a single signal does not (Table 1; see also Supplementary Fig. 2). This indicates that the Bcl-2 Mbr recombination with the coding end of the 23-signal is likely to be an interruption of the normal 12/23-recombination. In the absence of RAGs or with mutant RAGs, there is little or no cleavage at the 23-signal (Fig. 1a).
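
The event shares quoted above, and the recombination frequency defined in the Fig. 1 legend as (DAC/DA) × 100, reduce to simple ratios. The following is a minimal sketch; the DA value is a hypothetical placeholder chosen only to show the formula, not a measured count, and the function names are ours.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def recombination_frequency(dac_subset: int, da: int) -> float:
    """(DAC_subset / DA) x 100, as defined in the Fig. 1 legend."""
    return percent(dac_subset, da)


# Counts from the text: 29 recombinants in total, 24 with a boundary
# in the Mbr and 15 with the second break at the 23-signal.
mbr_share = percent(24, 29)
signal_share = percent(15, 29)

# HYPOTHETICAL number of replicated substrate molecules (DA),
# used only to illustrate the formula.
example_da = 100_000
example_frequency = recombination_frequency(24, example_da)

print(f"Mbr share of events: {mbr_share:.1f}%")
print(f"23-signal share of events: {signal_share:.1f}%")
print(f"illustrative frequency: {example_frequency:.3f}%")
```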

Given that the Bcl-2 Mbr recombined with a coding end adjacent to a 23-signal, the involvement of the RAG proteins is clear. But if the RAG complex recombines the Mbr in vivo, is it possible that it might cleave the Mbr in a purified system consisting only of the RAG complex and the Mbr DNA? To test this, we incubated the Bcl-2-Mbr-bearing plasmids with the purified RAG complex under standard physiological divalent cation (Mg2+) buffer conditions for RAG cleavage reactions. Although the core and full-length RAG complexes do not nick the top strand of the Mbr, they consistently nick the bottom strand to an extent that is similar to their nicking of a standard 23-signal (Supplementary Fig. 3). Active-site mutant RAGs and controls (which do not have RAGs) showed no nicking (see Supplementary text).

These in vitro nicking results by RAGs are consistent with the in vivo substrate recombination. However, what feature of the Mbr does the RAG complex recognize if the Mbr does not function as a recombination signal?

In the absence of any heptamer/nonamer function by the Bcl-2 Mbr, we wondered whether there is a structural basis for the fragility at this 150-bp region. To test for single-strandedness of the Bcl-2 Mbr, we used chemical-probing methods on genomic DNA11,12. To study the single-strandedness of chromosomal DNA at the single-molecule level, bisulphite-treated DNA is polymerase chain reaction (PCR)-amplified, cloned and sequenced. Bisulphite converts unpaired cytosines to uracil, and these become thymine after PCR amplification, thus allowing the detection of single-stranded regions11.
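
The readout of this assay can be sketched in code: an unpaired cytosine appears as a thymine in the sequenced clone, so each clone is compared with the untreated reference. This is a simplified illustration, not the analysis pipeline used here. It assumes the clone is already aligned to the reference without gaps, and it scores only the strand given as the reference (conversions on the opposite strand would appear as G to A in the same coordinates). The sequences and function names are invented for the example.

```python
def converted_cytosines(reference: str, read: str) -> list[int]:
    """Return 0-based positions where a reference C is read as T."""
    if len(reference) != len(read):
        raise ValueError("reference and read must be the same length")
    ref, seq = reference.upper(), read.upper()
    return [i for i, (r, s) in enumerate(zip(ref, seq)) if r == "C" and s == "T"]


def conversion_frequency(reference: str, read: str) -> float:
    """Return the percentage of reference cytosines that were converted."""
    total_c = reference.upper().count("C")
    if total_c == 0:
        return 0.0
    return 100.0 * len(converted_cytosines(reference, read)) / total_c


# Invented toy sequences: four cytosines in the reference,
# two of which read as T after bisulphite treatment and PCR.
toy_reference = "ACCGTCAC"
toy_read = "ATCGTTAC"

positions = converted_cytosines(toy_reference, toy_read)
frequency = conversion_frequency(toy_reference, toy_read)
print(positions, frequency)
```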

The chromosomal DNA extracted from Reh cells was used for bisulphite treatment. A PCR fragment of 528 bp containing the Bcl-2 Mbr amplified from treated DNA was cloned, sequenced and analysed. Most cytosine conversions occur in peaks I and III of the Mbr region (Fig. 2). If we display the results for individual strands, we find that 28% of human chromosomal alleles have long single-stranded regions at peaks I and III of the Bcl-2 Mbr. (Overall 15.5% of the cytosines are converted; Fig. 3.) There is also a tendency for consecutive cytosines to be converted (seven or more cytosine residues distributed over 14–49 bp) (Figs 2 and 3). These regions of single-strandedness suggest the existence of a non-B-form single-stranded DNA conformation at the Mbr in this human pre-B-cell line.
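
The criterion of consecutive converted cytosines can be made explicit. The sketch below scans the cytosines of one molecule in order and reports each maximal stretch in which at least seven consecutive cytosines are converted, together with the length of DNA it spans. It is an illustration under our own assumptions about how a run is scored; the positions are invented and the function name is ours.

```python
def converted_runs(
    cytosine_states: list[tuple[int, bool]], min_run: int = 7
) -> list[tuple[int, int, int]]:
    """Find maximal runs of consecutive converted cytosines.

    cytosine_states lists (position, converted) for every cytosine on
    one molecule, in order along the strand. Returns one
    (first_position, last_position, n_converted) tuple per run that
    contains at least min_run converted cytosines.
    """
    runs: list[tuple[int, int, int]] = []
    current: list[int] = []
    for position, converted in cytosine_states:
        if converted:
            current.append(position)
            continue
        # An unconverted cytosine ends the current run.
        if len(current) >= min_run:
            runs.append((current[0], current[-1], len(current)))
        current = []
    if len(current) >= min_run:
        runs.append((current[0], current[-1], len(current)))
    return runs


# Invented example: seven converted cytosines, one resistant cytosine,
# then a short stretch of three converted cytosines.
toy_states = [(p, True) for p in (10, 12, 15, 18, 20, 23, 26)]
toy_states += [(30, False)]
toy_states += [(p, True) for p in (33, 35, 38)]

toy_runs = converted_runs(toy_states)
for first, last, count in toy_runs:
    print(f"{count} converted cytosines spanning {last - first + 1} bp")
```

In this toy molecule only the first stretch qualifies; the three-cytosine stretch falls below the threshold and is ignored.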

Figure 2: Bisulphite reactivity at the Bcl-2 Mbr on chromosomal DNA.

A 528-bp fragment containing the Mbr amplified from chromosomal DNA after bisulphite treatment is shown. The 150-bp Bcl-2 Mbr region is expanded to show the complete sequence. The three breakpoint peaks are indicated by three short horizontal lines between the top and bottom strands (see Supplementary Fig. 1). Bisulphite sensitivity is shown for the 250 bp downstream of the Mbr and 125 bp upstream of the Mbr. Vertical incremental bars (vertical dashes) above the line indicate the sensitivity for the top strand, and bars below the line indicate the sensitivity for the bottom strand; each vertical bar represents a cytosine conversion on one molecule. The total numbers of molecules sequenced from the top and bottom strands are indicated at the right margin.

Figure 3: The bimodal nature (B-form DNA versus non-B-form) of the structural configuration of chromosomal DNA at the Mbr.

The upper panel shows bisulphite sensitivity of molecules from the top strand, whereas the lower panel shows the corresponding bottom strand data. In both panels, each row of circles represents one DNA molecule. Each filled circle represents an instance of a bisulphite-converted cytosine (single-stranded), whereas open circles represent cytosines resistant to bisulphite (that is, double-stranded DNA). Note that only the C residues are depicted, but a scale to relate the length along the DNA is shown at the bottom. The designations I, II or III correspond to the three translocation frequency peaks within the Mbr. The population of molecules includes a mixture of those that are B-form and those that have substantial focal and recurring regions of single-strandedness (non-B-form).

Controls at nine unrelated genomic sites showed that the Bcl-2 Mbr is distinctive for the lengths of DNA that are single-stranded (see Supplementary text and Supplementary Fig. 5). We confirmed the single-stranded character of the Bcl-2 Mbr in chromatin by diffusing two other types of chemical probes (KMnO4 and OsO4) into viable cells. Results from these experiments also indicated that there is a single-stranded character at the Mbr (see Supplementary text and Supplementary Fig. 6).

To further investigate the requirements of non-B-DNA structure formation at the Bcl-2 Mbr, plasmid pXW5, bearing the Bcl-2 fragment, was propagated as a replicating human minichromosome in Reh cells, harvested 42 h later and subjected to the bisulphite modification assay. The results are indistinguishable from those observed above for the chromosomal DNA (see Supplementary Fig. 7). This indicates that, on an episome, the 300-bp region containing the Bcl-2 Mbr is sufficient to assume the single-stranded conformation that is observed in the human chromosome, regardless of the sequence of the neighbouring regions. Again both B-form and non-B-form conformations exist among the population of molecules and in a similar proportion to that observed in the chromosome (Figs 2 and 3).

We wondered whether plasmid DNA bearing the Bcl-2 Mbr and harvested from bacteria would be able to form the non-B-DNA structure. pXW5 DNA was extracted using a non-denaturing method and subjected to the bisulphite modification assay. The results obtained after sequencing a 930-bp fragment show a high bisulphite sensitivity in the Bcl-2 Mbr (Fig. 4a). The single-stranded regions are restricted to peaks I and III (Fig. 4a, c, d), as is also the case in the human chromosomal and minichromosomal studies of the same DNA region. The conversion frequency at the Bcl-2 Mbr is 22.5% when the plasmid DNA is supercoiled, and this is 3.9-fold higher than the background. The Mbr in linearized plasmids is still hyper-reactive with bisulphite relative to the surrounding DNA, although the overall reactivity is reduced by 1.8-fold (Fig. 4b).

Figure 4: The bisulphite sensitivity of the Bcl-2 Mbr on plasmid DNA.

The pXW5 plasmid DNA was purified from Escherichia coli by a non-denaturing method (see Supplementary Methods). Following bisulphite modification, a 930-bp DNA fragment containing the Bcl-2 Mbr was PCR-amplified and sequenced. a, b, Supercoiled plasmid DNA (a) and linear plasmid DNA (b) (linearized by BglII digestion) are shown. c, d, Distribution of bisulphite reactivity on molecules with seven or more C to T conversions in a string on the top (c) or bottom (d) strands of the Bcl-2 Mbr on supercoiled pXW5. (See Supplementary Fig. 7 legend for more details.)

PCR fragments of shorter length are able to reproduce the non-B conformation, as judged by bisulphite reactivity, gel mobility shift and P1-sensitivity analysis (see Supplementary text).

We have reproduced key aspects of the t(14;18) translocation on extrachromosomal DNA substrates transfected into human lymphoid cells. On substrates bearing the Bcl-2 Mbr and only one of either the 12- or 23-signal, no in vivo recombinants using the 12- or 23-signal were detected, even though breaks at the Mbr were quite focused. However, when we included a pair of 12- and 23-signals, Mbr recombination specific to the 23-signal was readily detectable. The simplest explanation for this is that one or both of the coding ends are released during a 12/23 paired V(D)J recombination event13 and simultaneously a break at the Bcl-2 Mbr is generated by the RAGs. Subsequently, the break at the Mbr is joined to a coding end from the failed V(D)J recombination reaction in ways that are diagrammatically indistinguishable from the t(14;18) translocation (Supplementary Fig. 1a). The Mbr break does not seem to involve direct pairing with any 12- or 23-signal; otherwise, we would have seen recombination on substrates that carry the Mbr and only one signal.

We have noted that the single-strandedness at both peaks I and III varies between individual molecules (Figs 3 and 4). For peak II, there is no marked degree of single-strandedness, and yet it, like peaks I and III, is subject to translocation. We propose that the single strands at peaks I and III interact such that the region between them (peak II) is a site of marked DNA bending. Although the correlation between the single-stranded regions and the translocations is quite good at peaks I and III, there remains the possibility that in vivo proteins bound to these single-stranded and adjacent double-stranded regions may modify further the conformations, thereby explaining the lack of single-strandedness at peak II.

Why would the RAG complex cleave the non-B-form DNA structure at the Bcl-2 Mbr? There are two known structures that the RAG complex has been demonstrated to cleave in a sequence-independent manner, and both have stable single-stranded character, in common with the structure here. First, the RAG complex (and other transposase enzymes) cleaves single-stranded 3′ overhangs in Mg2+-containing buffers14. This may reflect some common structural features between 3′ overhangs and the presumed single-strandedness thought to exist at the border of the heptamer, where RAGs normally cleave15. Second, the RAG complex opens DNA hairpins in Mn2+-containing buffers16. This RAG-mediated hairpin opening is inefficient in Mg2+ buffers, and it may not be physiologically relevant to hairpin opening17, but it illustrates that RAG proteins have some low level of activity on such non-B configurations. Could the non-B-form structure at the Bcl-2 Mbr somehow be a target for nicking? This appears to be the case, given the efficient nicking at or very near the three peaks of the Bcl-2 Mbr. We note that we cannot rule out the possibility that any of a number of other structure-specific nucleases contribute to cleavage at the Mbr; however, the low level of focused cleavage in cells that do not express RAGs would argue that other nucleases can only play a minor role, if any at all.

In this study, we have clearly demonstrated that non-B-form DNA structural alterations can occur in the human genome and that these alterations can demarcate the precise boundaries of sites of recurrent chromosomal breakage. We have also provided evidence that the RAG complex is responsible for the t(14;18) translocation in vivo and in vitro and that its role here is different from ones invoked by previously described mechanisms.


Plasmid construction and V(D)J recombination assay

The plasmid constructs were made by modifying the SV40-based plasmid, pGG51 (ref. 8). See Supplementary Methods.

Transfection of the 293T cells with pXW5 or pSCR45 along with full-length RAG1/2 (Fig. 1), mutant RAG1/full-length RAG2 (Fig. 1 and Table 1) or core RAG1/2 (Table 1) was done using the calcium-phosphate method as described earlier9. The coordinates of the core RAGs are the same as those used for protein production below. Plasmid DNA was alkaline harvested and analysed as described in the Supplementary Methods.

Bisulphite modification assay

The bisulphite modification assay was used as described previously11 (see Supplementary Methods).

Ligation-mediated PCR

For details of the ligation-mediated PCR see Supplementary Methods.

In vitro RAG nicking assay

Core murine glutathione S-transferase (GST)–RAG1 (amino acids 330–1040) and GST–RAG2 (amino acids 1–383), or MBP murine core RAG1 and RAG2, or core RAG1 and full-length MBP RAG2 proteins were overexpressed in the human 293T cells and purified as previously described10,18. See Supplementary Methods.