Molecular characterization of TCF3::PBX1 chromosomal breakpoints in acute lymphoblastic leukemia and their use for measurable residual disease assessment

The translocation t(1;19)(q23;p13) with the resulting chimeric TCF3::PBX1 gene is the third most prevalent recurrent chromosomal translocation in acute lymphoblastic leukemia and accounts for 3–5% of cases. The molecular background of this translocation has been incompletely studied, especially in adult cases. We characterized the chromosomal breakpoints of 49 patients with TCF3::PBX1 and the corresponding reciprocal PBX1::TCF3 breakpoints in 15 cases at the molecular level, thus providing an extensive molecular overview of this translocation in a well-defined study patient population. Breakpoints were found to be remarkably clustered not only in TCF3 but also in PBX1. No association with DNA repeats or putative cryptic recombination signal sequence sites was observed. A simplified detection method for breakpoint identification was developed and the feasibility of patient-specific chromosomal break sites as molecular markers for detecting measurable residual disease (MRD) was explored. A highly sensitive generic real-time PCR for MRD assessment using these breakpoint sequences was established that could serve as a useful alternative to the classical method utilizing rearranged immune gene loci. This study provides the first extensive molecular data set on the chromosomal breakpoints of the t(1;19)/TCF3::PBX1 aberration in adult ALL. Based on the obtained data a generic MRD method was developed that has several theoretical advantages, including an on average higher sensitivity and a greater stability of the molecular marker in the course of disease.

been overcome by modern therapy regimens and TCF3::PBX1 currently defines a group of patients with a good clinical outcome in childhood ALL 1,6–9, although these patients appear to have an increased risk for CNS involvement at diagnosis 10. The prognostic impact of TCF3::PBX1 in ALL in older patients (age > 15 years) is less well defined, and the relatively few molecular-based data published in this area are controversial 11–15. TCF3::PBX1-positive ALL has been found to express the receptor tyrosine kinase ROR1, which may serve as a therapeutic target in the future 16,17. Some promising therapeutic in vitro effects have been observed with the SRC inhibitor dasatinib 18 and the phosphatidylinositide 3-kinase delta (p110δ) inhibitor idelalisib 19. However, no established targeted therapy currently exists for TCF3::PBX1-positive patients, and the assessment of measurable residual disease (MRD) remains the most important tool in therapy stratification and prognostication.
Data on the molecular details of the t(1;19)(q23;p13) translocation in adult ALL are scant. The following work analyzed 49 TCF3::PBX1-positive, clinically well-defined adult cases, identified the chromosomal break sites, and characterized the molecular background of this translocation, thus providing a detailed and extensive molecular overview of this translocation. A method for the easy identification of the breakpoint sites is presented, and the potential utilization of these chromosomal breakpoints for detecting measurable residual disease is demonstrated.

Results
Rationale for use and development of a long range-inverse PCR method. One type of chimeric RNA transcript was predominantly found in TCF3::PBX1-positive patients, showing a fusion of TCF3 exon 16 (reference sequence NG_029953.2) and PBX1 exon 3 (reference sequence NG_028246.2) 26. Other transcripts have been described, but they seem to be very rare 27. Chromosomal breaks can thus be assumed to occur in the intron 3′ of TCF3 exon 16 ("intron 16") and in the intron 5′ of PBX1 exon 3 ("intron 2"). The TCF3 reference sequence includes a 3289 bp intron 16 (ncl 1615822-1619110, NC_000019.10, GRCh38.p13 primary assembly). This intron is present in all 41 TCF3 variants listed in the NCBI gene database (updated on 1-Aug-2020). The location of the breakpoint site on chromosome 1 is less clear (8 PBX1 transcript variants with either a 229,182 bp intron 2 (ncl 164563312-164792493) or a 166,397 bp intron 2 (ncl 164626097-164792493)). Since the breakpoint region on chromosome 19 appeared to be relatively localized, a long range-inverse PCR (LRI PCR) approach was chosen for the analysis. Commercially available restriction enzymes were screened for those with restriction sites flanking the putative breakpoint region on chromosome 19. Three enzymes were suitable because they had palindromic cutting sites without degenerate nucleotides, produced sticky ends and were frequent cutters: SphI, BamHI and TaqI (Fig. 1A). BamHI had one cutting site 148 bp 5′ of the TCF3 intron end; thus, breakpoints near the intron end could not be detected using this enzyme. The three enzymes provided dense coverage of PBX1 intron 2 with restriction sites (Table S1 and Figure S1). Various PCR primers and PCR conditions were tested for the development of the inverse PCR method. The efficacy of inverse PCR could be optimized because the three enzymes produced a detectable "control" PCR product when testing normal DNA. The final PCR primer locations are depicted in Fig. 1A.
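The enzyme screening step described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: it only assumes the well-known recognition sequences of SphI (GCATGC), BamHI (GGATCC) and TaqI (TCGA), and it searches a made-up toy sequence rather than the TCF3 or PBX1 reference sequences. The function names and the toy "breakpoint region" coordinates are hypothetical.

```python
import re

# Recognition sequences of the three enzymes (palindromic, no degenerate bases).
ENZYMES = {"SphI": "GCATGC", "BamHI": "GGATCC", "TaqI": "TCGA"}

def find_sites(seq, site):
    """Return the 0-based start position of every occurrence of `site` in `seq`."""
    # A lookahead is used so that overlapping occurrences are reported as well.
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]

def flanking_sites(seq, site, region_start, region_end):
    """Nearest site ending before and nearest site starting at or after
    the half-open region [region_start, region_end)."""
    positions = find_sites(seq, site)
    upstream = [p for p in positions if p + len(site) <= region_start]
    downstream = [p for p in positions if p >= region_end]
    return (upstream[-1] if upstream else None,
            downstream[0] if downstream else None)

# Toy sequence (hypothetical): a run of T stands in for the breakpoint region.
toy = ("AA" + "GGATCC" + "TT" + "TCGA" + "AA" + "GCATGC"
       + "T" * 10 + "GGATCC" + "AA" + "TCGA" + "GCATGC")
region = (22, 32)  # half-open interval covering the run of T

for name, site in ENZYMES.items():
    print(name, find_sites(toy, site), flanking_sites(toy, site, *region))
```

An enzyme is a candidate for inverse PCR of a region only if both values returned by `flanking_sites` are present, that is, if the region is bracketed by two cutting sites.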
To complement this analysis, all samples were also investigated for intragenic IKZF1 deletions by PCR. These deletions are found in approximately 20% of B-cell precursor ALL cases and are known to be caused by aberrant VDJ recombinase activity. None of the 49 samples showed an intragenic IKZF1 deletion. This does not exclude a possible role of RAG-mediated secondary aberrations in TCF3::PBX1-rearranged ALL, as illustrated by the example of ETV6::RUNX1-positive pediatric ALL 28,29. Chromosomal translocations are occasionally associated with DNA secondary structures, such as inverted repeats with hairpin loops 30, and thus, the hotspot region of TCF3 was analyzed with RNAfold. The main break site was located in an open loop that was flanked by regions with relatively strong base pair binding (Fig. S8). The analysis of the 15 cases in which reciprocal PBX1::TCF3 breakpoints were characterized showed mostly no microhomologies at the break sites, with frequent insertion of nontemplate nucleotides, suggesting a nonhomologous end-joining (NHEJ) repair mechanism 31. One sample (4297) showed an insertion from the FGF6 gene on chromosome 12, a gene not previously implicated in the pathogenesis of ALL (Fig. S3).
Development and optimization of a real-time qPCR method. The clustering of chromosomal breaks in a narrow region in TCF3 intron 16 suggested a quantitative PCR method with a common forward primer, a common dual-labeled hybridization probe 5′ of the breakpoint cluster region and a patient-specific reverse primer 3′ of the breakpoint. Several forward primers and dual-labeled probes were first tested on patient samples and control DNA to exclude spurious amplifications. Finally, one combination of a common forward primer and a common dual-labeled probe was selected and tested on 15 randomly chosen patient samples (Table 2, Fig. 3). In all 15 cases, it was possible to design a reverse PCR primer that yielded data with good sensitivity and specificity. The testing of further samples was not possible because of a shortage of sample material. This generic real-time qPCR was designed to quantify breakpoints in the TCF3 hotspot region (~80%). Breakpoints outside this region (and likewise the PBX1::TCF3 breakpoints) could theoretically also be used as MRD targets, but in these cases, no generic recipe can be given, and individual patient-specific qPCRs would have to be constructed.
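The geometry of the patient-specific reverse primer can be sketched in a few lines of Python. This is an illustration only, not the primer design procedure of this study: a reverse primer anneals to the forward strand 3′ of the breakpoint, so its sequence is the reverse complement of a window taken from the partner gene. The function names, the toy junction sequence and the window length are hypothetical, and real primer design additionally has to consider melting temperature, secondary structure and specificity.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def reverse_primer_candidate(junction, breakpoint, offset=0, length=20):
    """Propose a reverse primer from a forward-strand junction sequence.

    junction   -- sequence spanning the fusion, written 5'->3'
    breakpoint -- 0-based index of the first base 3' of the break site
    offset     -- distance of the primer window from the breakpoint
    length     -- primer length in bases
    """
    start = breakpoint + offset
    window = junction[start:start + length]
    if len(window) < length:
        raise ValueError("not enough sequence 3' of the breakpoint")
    return reverse_complement(window)

# Toy junction (hypothetical): 10 bases 5' of the break, 9 bases 3' of it.
toy_junction = "ACGTACGTAC" + "GATTACAGG"
print(reverse_primer_candidate(toy_junction, breakpoint=10, length=6))
```

Because the common forward primer and probe sit 5′ of the breakpoint cluster, only this one oligonucleotide has to be redesigned per patient.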

Development and optimization of two multiplex long-range PCRs. Since breakpoint identification by long range-inverse PCR is an elaborate procedure and since the breakpoints showed clustering in certain regions, efforts were made to simplify the detection procedure. Two multiplex long-range PCRs with a series of PCR oligonucleotides covering the entire breakpoint regions were developed and optimized; these allowed the detection of breakpoints in the two breakpoint cluster regions of PBX1. Examples of these multiplex long-range PCRs are shown in Fig. 1C.

Discussion
The translocation t(1;19)(q23;p13) has been described as mostly unbalanced 15,32–34. This is in accordance with the observation made in this study that in only 14 cases (33%) a reciprocal break site could be characterized.

Clustering of breakpoints. Since the first description of the translocation t(1;19) as a recurrent aberration in ALL in 1984, research has largely focused on cytogenetic aspects of this aberration, and few investigations have been carried out in adult ALL 27,34,35. Wiemels et al. 36 first systematically investigated the translocation at a molecular level and described 24 cases from various pediatric ALL studies. The median age of the patients was 6.8 years, with only one patient being an adult > 18 years of age. A similar clustering of breakpoints was observed, and the authors speculated that aberrant VDJ recombinase activity might be involved. They identified a reciprocal breakpoint in 5 (21%) cases 36.
In this study, no association of t(1;19) chromosomal breaks with repetitive DNA elements was found. While the location of the break cluster in TCF3 intron 16 close to a MER20 element could be coincidental, there was no similar association of the breaks mapping to PBX1 intron 2. Similarly, no direct association with cryptic RSS was observed. None of the 49 patient samples showed an intragenic IKZF1 deletion, an aberration that is caused by illegitimate VDJ recombination and is present in approximately 20% of BCR::ABL1-negative B precursor ALLs. Recently, Liu et al. analyzed the TCF3 "fragile zone" and suggested that the initial TCF3 breakage may arise at a CpG site. They found a statistically significant proximity of the activation-induced cytidine deaminase (AID) hotspot motifs WRC and WGCW (W = A or T, R = A or G) to the TCF3 breakpoints, suggesting AID involvement in the break process 37. This is consistent with the fact that TCF3::PBX1 is predominantly detected in pre-B ALL, which is immunophenotypically the most "mature" entity in B precursor ALL, indicating a relatively late stage of B-cell development.
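To make the motif notation concrete, the following Python sketch locates the AID hotspot motifs WRC and WGCW (W = A or T, R = A or G) in a sequence and reports the distance from a breakpoint to the nearest motif. It is a simplified illustration under stated assumptions: it scans one strand only, uses a made-up toy sequence and a hypothetical breakpoint position, and does not reproduce the statistical proximity test of Liu et al.

```python
import re

# IUPAC codes needed for the two motifs, expressed as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "R": "[AG]"}

def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'WRC' into a regular expression."""
    return "".join(IUPAC[base] for base in motif)

def motif_positions(seq, motif):
    """0-based start positions of all (possibly overlapping) motif matches."""
    pattern = re.compile(f"(?={motif_to_regex(motif)})")
    return [m.start() for m in pattern.finditer(seq.upper())]

def distance_to_nearest(positions, breakpoint):
    """Distance in bases from the breakpoint to the closest motif start."""
    return min((abs(p - breakpoint) for p in positions), default=None)

toy = "GGGTACGGGAGCTGGG"   # hypothetical sequence
breakpoint = 7             # hypothetical break position

for motif in ("WRC", "WGCW"):
    hits = motif_positions(toy, motif)
    print(motif, hits, distance_to_nearest(hits, breakpoint))
```

A proximity analysis would compare such distances for observed breakpoints against distances expected for randomly placed positions in the same region.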

Real-time qPCR for measurable residual disease detection. Measurable residual disease in ALL is usually assessed by the use of clonally rearranged immunoglobulin (IG) and/or T-cell receptor (TCR) loci for the construction of real-time quantitative PCRs (qPCRs) 38. The main advantage of this approach is its universal applicability. Theoretically, it can be applied in any malignant disease of lymphatic origin. However, this method also has some disadvantages. In a significant minority of cases, it is not possible to identify clonal rearrangements, and with the introduction of next-generation sequencing techniques it has become apparent that IG/TCR rearrangements are often in fact polyclonal at diagnosis 39. IG/TCR-based MRD monitoring is thus often based on only one of several clones, and such an analysis may miss the decisive clone. IG/TCR-based qPCRs frequently show a suboptimal sensitivity (below 10⁻⁴) because of the difficulty of constructing a specific PCR against a highly homologous background. In addition, the IG/TCR rearrangements are potentially unstable, and further rearrangements can occur without loss of the malignant cell phenotype, leading to false negative results.
In those cases where chromosomal translocations lead to the expression of a chimeric mRNA transcript, MRD monitoring can also be performed by the relative quantification of this transcript 40. However, this approach has been widely discarded in ALL (with the exception of BCR::ABL1) because it only allows a quantification relative to a "housekeeping gene" that is assumed to be stably expressed. "Dormant" tumor stem cells with low expression of the oncogene may escape detection by RT-PCR. This is illustrated by the observation that in BCR::ABL1-positive ALL, only a limited correlation between BCR::ABL1-mRNA-based and IG/TCR-based MRD levels is found 41. Additionally, RNA is relatively unstable and significantly more difficult to handle than DNA.
An alternative approach is targeting the breakpoint sites of chromosomal translocations to detect and monitor MRD by constructing patient-specific qPCR assays. These are stable molecular markers that cannot be lost in the course of disease because they are linked to molecular drivers of the disease. This approach has been exploited in various entities, such as ALL with t(12;21)/ETV6::RUNX1 42,43, ALL with 11q23/KMT2A aberrations 44,45, ALL or CML with t(9;22)/BCR::ABL1 41,46,47 and other hematopoietic malignancies 48–50. In most of these cases, it could be shown that the break site-specific PCRs were at least as reliable as the IG/TCR-based methods and yielded a superior sensitivity. The main disadvantages of this approach are the technical difficulties posed by the individual characterization of break sites, which may in some cases be dispersed over hundreds of kilobases of genomic DNA. This precludes the approach for routine clinical studies, with the exception of KMT2A-rearranged ALL, where relatively standardized techniques for break site identification are in use 51. With the increasing availability of next-generation sequencing techniques and their technical advances (e.g., nanopore sequencing or mate-pair sequencing) and better knowledge of the molecular background, these difficulties are likely to be overcome in the future, and MRD detection methods based on chromosomal breakpoints will become increasingly important 52,53.

Conclusions
The present work characterizes the t(1;19) chromosomal breakpoints of a large number of adult ALL patients from a well-defined study population and is the largest and the first major investigation on this topic in adult ALL. The results provide a representative and relatively unbiased overview of the molecular details of this aberration. Based on the experimental results, a simplified method for the rapid identification of chromosomal breakpoints is proposed and the usefulness of these chromosomal breakpoint data for measurable residual disease detection is demonstrated. While the theoretical advantages of such an MRD approach appear obvious, clinical studies are necessary to validate the TCF3::PBX1 breakpoint fusion as an MRD marker in a clinical context. Further testing and comparisons will have to be performed to fully establish TCF3::PBX1 breakpoints as valuable MRD targets.

Methods
Patient samples and ethics statement. Patient samples were collected from residual diagnostic material obtained between 2001 and 2021 in the context of the German Multicenter ALL Therapy Studies (clinicaltrials.gov identifiers: 00199056 and 00198991). Patients gave written informed consent to scientific investigations on study inclusion and the studies were approved by local and central ethics committees, among them an ethics board of the Goethe University, Frankfurt/Main, Germany and the ethics board of the Charité Universitätsmedizin, Berlin, Germany. Our study complied with the principles set forth in the World Medical Association Declaration of Helsinki.
DNA isolation. DNA was isolated from archived or fresh samples using either the Gentra PureGene method (QIAGEN, Hilden, Germany), the AllPrep DNA/RNA Kit (QIAGEN) or, in a few cases, the DNA preparation from TRIzol (ThermoFisher Scientific, Darmstadt, Germany) with subsequent DNA purification.

Table 2. Real-time quantitative PCR parameters. The table shows the slopes of the standard curves, the correlation coefficients (R²) of the standard curves and the difference in Ct values between successive dilutions in the standard curve.
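The three parameters named in this table caption can be derived from a dilution series as in the following Python sketch. It is a generic illustration, not the analysis pipeline of this study: the Ct values are synthetic (an idealized assay in which the template doubles every cycle), the function name is hypothetical, and the amplification efficiency formula E = 10^(-1/slope) - 1 is the commonly used textbook relation, which the source does not state explicitly.

```python
import math

def standard_curve(log10_dilutions, ct_values):
    """Least-squares fit of Ct against log10(dilution) for a qPCR standard curve."""
    n = len(ct_values)
    mean_x = sum(log10_dilutions) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_dilutions)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_dilutions, ct_values))
    slope = sxy / sxx
    return {
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r2": sxy ** 2 / (sxx * syy),          # squared correlation coefficient
        "efficiency": 10 ** (-1 / slope) - 1,   # 1.0 corresponds to 100%
        # Ct difference between each dilution and the next one in the series.
        "delta_ct": [b - a for a, b in zip(ct_values, ct_values[1:])],
    }

# Synthetic tenfold dilution series (10^-1 to 10^-5) of an idealized assay:
# each tenfold dilution delays the signal by log2(10), about 3.32 cycles.
log10_dilutions = [-1, -2, -3, -4, -5]
ct_values = [25 + math.log2(10) * i for i in range(5)]

result = standard_curve(log10_dilutions, ct_values)
print(result)
```

For real data, slopes that deviate clearly from about -3.3 or Ct differences that are uneven between successive tenfold dilutions indicate reduced or inconsistent amplification efficiency.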