Myeloid neoplasms and acute leukemias encompass many clinically independent entities, some with unique genetic features and optimized treatment approaches. The accurate assignment of patients to stratified risk groups can be a difficult, expensive and slow process, requiring intensive laboratory studies including morphology, cytochemistry, immunophenotyping, cytogenetics and molecular diagnostics. One of the most important laboratory features used to categorize leukemia cases according to the World Health Organization (WHO) 2008 classification is the presence of specific balanced chromosomal translocations that in many cases determine therapy.1 For example, acute lymphoblastic leukemia (ALL) with t(1;19) responds poorly to conventional antimetabolite-based treatment but has cure rates approaching 80% when treated with more intensive therapies. In acute myeloid leukemia (AML), the genetic alterations inv(16) and t(8;21) are associated with a favorable prognosis, good response to chemotherapy and relatively long complete remission. In some cases, the presence of a single alteration dictates the use of a specific drug such as all-trans-retinoic acid for t(15;17) positive acute promyelocytic leukemia or an ABL1 tyrosine kinase inhibitor for t(9;22) positive chronic myelogenous leukemia (CML).

The advent of targeted therapy underscores the need for rapid and accurate genetic diagnostic methods. Major disease-defining, diagnosis-relevant and therapy-determining data may be provided by molecular methods.1, 2, 3, 4 Fluorescence in situ hybridization (FISH) is widely used clinically for detecting chromosome translocations at the DNA level.3 However, it can be labor-intensive, is not amenable to multiplexing and can lack sufficient resolution to detect or differentiate certain genetic alterations. Reverse transcription-PCR (RT-PCR) has been shown to be a sensitive tool for the detection of chimeric transcripts resulting from recurrent chromosome translocations.5, 6, 7 RT-PCR analysis has numerous advantages over conventional cytogenetics, including shorter turn-around time, no requirement for dividing cells, detection of translocations that may be missed by conventional cytogenetics (cryptic translocations) and identification of specific markers for subsequent monitoring of minimal residual disease with sensitive molecular methods. As there are many distinct genetic alterations characteristic of various leukemia subtypes, it is difficult or impractical to carry out a series of single RT-PCR detection assays, especially when the total amount of RNA is limiting.

Multiplex testing is a practical approach for detection of the most relevant targets of interest in a single reaction with minimal sample input and shorter turn-around time. For routine clinical use, these methods must be compatible with the equipment and workflow of the molecular diagnostic laboratory. They also must be evaluated for clinical accuracy relative to standard cytogenetic methods and must maintain the appropriate sensitivity and specificity relative to other well-characterized molecular methods. Several studies have established the feasibility, analytical performance and clinical utility of multiplex RT-PCR combined with various downstream detection methods, from gel-based techniques to capillary electrophoresis, bead arrays or microarrays.4, 5, 8, 9, 10, 11, 12, 13, 14, 15, 16 These studies, however, focused on a subset of leukemia markers, types and patient populations,4, 5, 8, 9, 10, 11, 15, 16 did not use a truly single-reaction multiplexed assay format4, 5, 8, 9, 13, 14 or were performed at a single institution.4, 5, 8, 9, 10, 11, 12, 13, 14, 15, 16 Perhaps because of the lack of systematic and comprehensive validation, standardized multiplex RT-PCR methods are not yet used routinely in clinical laboratories. Here, we report the first broad multicenter evaluation of a qualitative multiplex molecular assay (MMA) for 12 fusion transcripts associated with CML, ALL or AML and its performance relative to cytogenetics and independent molecular methods. The objective of our retrospective study was to assess a representative set of clinical specimens at four independent sites to determine whether multiplex RT-PCR is a reliable method for the rapid and accurate detection of diagnostically and prognostically relevant RNA markers in the clinical laboratory.

Materials and methods

Clinical specimens

Peripheral blood or bone marrow specimens were collected, processed and archived as part of standard clinical care using site-specific procedures. Metaphase cytogenetic and interphase FISH analyses were performed at sites 1 (Johns Hopkins University), 2 (Birmingham Women’s NHS Foundation Trust) or 3 (Children’s Hospital Colorado) using independent laboratory-validated methods. Results were reviewed and interpreted by certified cytogeneticists in their respective institutions. Total RNA was purified using site-specific laboratory-validated methods. The methods used at sites 1 and 2 were both based on the RNeasy Mini Kit (Qiagen, Hilden, Germany); the method used at site 3 was based on the TRIzol reagent (Life Technologies, Carlsbad, CA, USA). Residual RNA samples from sites 1, 2 and 3 were evaluated with the MMA in their respective institutions. Fifty independent samples from site 1 were also tested at site 4 (Asuragen, Austin, TX, USA). Purified total RNA was quantified using a NanoDrop ND1000 (NanoDrop Technologies, Waltham, MA, USA) at sites 1 and 4. All human specimens used in this study were residual purified RNA samples de-identified and evaluated according to protocols approved by their respective institutions.

Molecular methods

Multiplex reactions for 12 fusion transcripts and an endogenous control transcript (GAPDH) were performed in 96-well plates using the Signature LTx v2.0 Kit (CE-marked in vitro diagnostic device at site 3, for research use only at sites 1, 2 and 4) according to the instructions for use (Asuragen Inc.). Briefly, total RNA (constant volume of 5 or 3 μl at sites 2 and 3, respectively; constant input of 400 ng at sites 1 and 4) was heat-denatured at 70 °C for 10 min and reverse transcribed into cDNA in a 20 μl reaction for 45 min at 42 °C. The resulting cDNA (5 μl) was amplified by multiplex PCR with 2.5 units of AmpliTaq Gold (Applied Biosystems, Carlsbad, CA, USA) in a 20 μl reaction (45 cycles consisting of 94 °C for 30 s, 55 °C for 30 s, 72 °C for 30 s). The PCR products (5 μl) were then hybridized to a mixture of 12 probes conjugated to 12 unique carboxylated xMAP beads (Luminex Corp., Austin, TX, USA) in a 50 μl reaction for 30 min at 52 °C. Following addition of 25 μl of reporter solution, the probe-bound PCR products were detected by flow cytometry on a Luminex 200 system (Luminex Corp.). A minimum of 50 beads for each fusion transcript (bead ID no. 14, 16, 18, 30, 31, 34, 55, 63, 65, 70, and 73) and the endogenous control GAPDH (bead ID no. 22) were analyzed to determine the median fluorescent intensity (MFI) signal for each probe. The positive, negative and no RNA controls provided with the Kit were included in every run. A GeneAmp PCR System 9700 (Applied Biosystems) was used for the RT-PCR and hybridization steps at sites 1, 3 and 4. A DNA Engine Tetrad 2 (Bio-Rad, Hercules, CA, USA) was used at site 2. The protocols for the modified prototype panels are described in Ye et al.17

The reference molecular methods used at site 2 were individual qualitative RT-PCR or quantitative real-time RT-PCR tests based on the European BIOMED-1 Concerted Action7 and Europe Against Cancer18 protocols. The reference methods used at site 3 were individual qualitative tests based on end-point RT-PCR and agarose gel analysis as previously described.19, 20, 21, 22, 23, 24, 25 For bi-directional sequencing, PCR amplifications were carried out with pairs of primers flanking the breakpoint or site of interest at site 4 and the resulting PCR products were sequenced using the BigDye terminator method (ACGT Inc., Wheeling, IL, USA). Primer design and exonic sequences for MLL–AFF1 fusion transcripts were according to van Dongen et al.7

Analytical experiments

Total RNA from cultured cell lines was isolated with the RNeasy Mini Kit according to manufacturer’s instructions for cultured cells (Qiagen). When indicated, total RNA was diluted in purified HL60 total RNA (Applied Biosystems) keeping the concentration of total RNA constant. Synthetic RNAs corresponding to specific fusion transcripts, splicing variants or mutations were prepared using standard in vitro transcription methods and diluted in HL60 total RNA keeping the concentration of total RNA constant. All plasmids and cell line RNA used at site 4 were characterized by bi-directional sequencing to confirm the identity of the fusion transcript or mutation and the presence of the binding sites for the primers and probes used in the molecular assays. Analytical sensitivity experiments at site 3 (Figure 2d) were performed using Clonal Control RNA IVS-0003, -0011, -0032 and -0035 as described by the manufacturer (Invivoscribe, San Diego, CA, USA). The limit of blank (LOB) study was performed at site 4 by testing in triplicate the no RNA control provided with the Signature LTx v2.0 Kit and 400 ng of HL60 total RNA in 12 independent runs with three operators, three thermal cyclers and three Luminex systems over multiple days (36 replicates). Normal range was evaluated at site 4 with total RNA purified from whole blood specimens from asymptomatic control donors using two different procedures: collection in EDTA vials and purification with the QIAamp RNA Blood Mini Kit (Qiagen, n=4) or collection and purification with the PAXgene Blood RNA System (PreAnalytiX, Hombrechtikon, Switzerland, n=4). The eight purified RNA samples were tested at two input volumes, 1 or 6 μl, representing a mass input range from 75 to 1400 ng per RT reaction across the 16 replicates.

Data analysis

For all experiments, a sample was called positive for a given fusion transcript if the corresponding bead-conjugated probe generated an MFI signal greater than a cutoff value set at 350 MFI. Percent agreements in Tables 4 and 5 were calculated by comparing the qualitative positive or negative call obtained with the MMA against the presence or absence of the corresponding seven translocations as determined by metaphase and molecular cytogenetics. Agreement relative to independent molecular tests was calculated using the qualitative positive or negative call obtained with both methods, including the specific type of fusion transcript detected when both methods were able to distinguish between the variants. The single sample that failed at site 2 (ID no. 105 with no GAPDH signal) was not included in the calculation of percent agreements. Quantitative analyses of MFI signals at the probe level were performed in Excel (Microsoft Corp., Redmond, WA, USA). LOB values, representing 95% of the observed values, were calculated according to current clinical guidelines assuming a one-sided normal-like distribution of negative signals.26 For Figure 3, MFI signals were transformed to log base 10 values after removal of the null values (no MFI detected for 171 probes or 0.5% of all probe signals generated in the study). The positive category included all the single-probe signals detected above 350 MFI in true-positive samples at each site. The negative category included the signals generated by the 11 probes in all true-negative samples at each site plus the signals below 350 MFI generated by the 10 non-positive probes in all true-positive samples at each site.
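The calling rule described above can be sketched in a few lines of Python. This is only an illustration of the stated 350 MFI cutoff, not the kit software: the function name, the probe labels used as dictionary keys and the validity threshold applied to the GAPDH control (`control_min_mfi`) are assumptions introduced here, since the text only states that a sample with no GAPDH signal was treated as a failure.

```python
from typing import Dict, List, Optional

CUTOFF_MFI = 350.0  # qualitative cutoff stated in the text


def call_sample(
    probe_mfi: Dict[str, float],
    control_mfi: float,
    control_min_mfi: float = CUTOFF_MFI,  # assumption: the text gives no control threshold
) -> Optional[List[str]]:
    """Return the probes called positive, or None when the sample is invalid.

    A probe is positive when its MFI is strictly greater than the cutoff.
    A sample is treated as failed when the endogenous control signal does
    not exceed control_min_mfi (hypothetical rule for this sketch).
    """
    if control_mfi <= control_min_mfi:
        return None
    return sorted(name for name, mfi in probe_mfi.items() if mfi > CUTOFF_MFI)


# Illustrative values only; the control MFI is hypothetical.
signals = {"BCR-ABL1 e14a2": 768.0, "PML-RARA bcr1": 75.0}
print(call_sample(signals, control_mfi=4500.0))
print(call_sample(signals, control_mfi=0.0))
```

A sample returning an empty list is valid and negative for every probe, which keeps a failed run (None) distinct from a true negative result.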

Results

Study design rationale and sample set

The MMA used in this study was developed and optimized for the qualitative detection of 12 fusion transcripts resulting from seven chromosomal abnormalities commonly found in CML, ALL or AML (Table 1). Specificity for individual fusion transcripts was incorporated by design at both the PCR amplification and the detection steps (see Materials and Methods) and was analytically validated using synthetic in vitro transcripts and total RNA from leukemic cell lines (Figure 1). To evaluate the performance of the MMA in the clinical setting, a set of 281 archived RNA samples was selected, blinded and retrospectively tested at four independent sites (Table 2). The sample set was designed to be representative of the breadth of clinical presentations and sample types usually found in molecular diagnostic labs, to assess all the targets included in the panel, and to challenge both the clinical sensitivity and specificity of the assay.

Table 1 MMA panel design
Figure 1

Representative MMA output. Each row represents the results from a single multiplex PCR amplification hybridized onto 12 target-specific probes in a single reaction. The resulting MFI signals generated by each probe-bound PCR product are shown for three control samples, a total RNA sample isolated from a translocation-negative cell line (HL60), 12 different synthetic fusion transcripts prepared by in vitro transcription and spiked in a background of HL60 RNA (400 ng input) and eight total RNA samples purified from translocation-positive cell lines (400 ng input). Target-specific positive signals above the qualitative cutoff (350 MFI) are highlighted. The three controls are designed to assess the validity of the multiplex amplification, hybridization and detection steps in every batch/run.

Table 2 Study design

A total of 146 specimens were positive by cytogenetics for one of the translocations relevant to the MMA and another 135 specimens had a normal karyotype or various chromosomal abnormalities not included in the MMA panel (Tables 2 and 3). The sample set represented a variety of myeloid neoplasms and acute leukemias with a majority of AML (44%), ALL (30%) and CML (12%), and a few cases of myelodysplastic syndrome (n=8), acute leukemia of mixed lineage (n=6), or non-CML myeloproliferative neoplasms (n=3). The ‘other’ category in Table 3 included one poorly differentiated hematological malignancy, one blastic plasmacytoid dendritic cell neoplasm, one Felty syndrome and one treated T-cell prolymphocytic leukemia with no residual disease, all negative for relevant translocations by cytogenetics. Overall, each of the seven chromosomal abnormalities in the MMA panel was represented by at least eight independent specimens (Table 3). Additional information for all specimens in the study, including FISH, karyotype and molecular data, is presented in Supplementary Table 1.

Table 3 Sample set

Performance relative to cytogenetics

To measure assay performance, the qualitative positive/negative call obtained with the MMA for specific fusion transcripts was scored against the qualitative positive/negative call obtained for the corresponding seven chromosomal abnormalities by cytogenetics. Percent agreements between methods were very high at each site (97.1–100%) and for all fusion transcripts (99.3–100%) (Table 4). A single sample failure was identified at site 2 by the absence of endogenous control signal, corresponding to a failure rate of 0.36% overall (1.4% at site 2). A total of four discrepancies were observed. At site 1, two follow-up samples from CML or ALL cases in cytogenetic remission (ID no. 55 and 69) were found positive for BCR–ABL1 e14a2 or e1a2 with a relatively low MMA signal (768 and 946 MFI, respectively). At site 3, samples ID no. 146 and 184 from two t(4;11) positive ALL cases at presentation had an MLL–AFF1 probe signal below the qualitative cutoff and were therefore reported as negative with the MMA. The resulting positive, negative and overall percent agreements for the 280 samples combined and the corresponding 95% confidence intervals are reported in Table 5.
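As a worked illustration of how positive, negative and overall percent agreement follow from a two-by-two comparison, the sketch below uses counts pieced together from figures quoted in the text (145 cytogenetics-positive and 135 cytogenetics-negative evaluable samples, two apparent false positives and two apparent false negatives). The function name is hypothetical, and the derived counts are an assumption of this sketch rather than a transcription of Table 5.

```python
from typing import Tuple


def percent_agreement(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float, float]:
    """Return (positive, negative, overall) percent agreement vs the comparator."""
    positive = 100.0 * tp / (tp + fn)
    negative = 100.0 * tn / (tn + fp)
    overall = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return positive, negative, overall


# Counts derived from the text (assumption): 145 comparator-positive samples
# with 2 false negatives, 135 comparator-negative samples with 2 false positives.
ppa, npa, opa = percent_agreement(tp=143, fn=2, tn=133, fp=2)
print(f"positive {ppa:.1f}%, negative {npa:.1f}%, overall {opa:.1f}%")
```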

Table 4 Summary of MMA results by site and fusion transcript
Table 5 Overall MMA performance

At the MMA probe level, a total of 3372 individual results were generated during the study (281 samples × 12 multiplexed assays) and 3080 data points were used for direct comparison against cytogenetics (280 samples × 11 fusion-transcript-specific probes). The overall agreement at the individual probe level was 99.8% (3074/3080; 95% confidence interval: 99.6–99.9%). In addition to the four discrepant signals described above, there were two low false-positive signals observed at site 3 for the PML–RARA probes specific for bcr1 or bcr3 (398 or 444 MFI for samples ID no. 209 and 204, respectively; Supplementary Table 1). Both signals were generated in t(15;17) positive samples with a very strong true-positive signal, >7000 MFI, on the other PML–RARA probe. This cross-hybridization was likely caused by a large excess of PCR product resulting from a high input of RNA at site 3 (see discussion). This phenomenon was not observed in 19 independent PML–RARA-positive samples tested at other sites where the background signal on these probes ranged from 0 to 166 MFI (75 MFI on average, 4485 MFI for the positive PML–RARA signals).
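The probe-level agreement and its confidence interval can be approximated with a short calculation. The source does not state which interval method was used; the Wilson score interval below is an assumption chosen for this sketch, and it appears to give bounds close to those quoted above. The function name is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> Tuple[float, float]:
    """Two-sided Wilson score interval for a binomial proportion.

    z defaults to the standard normal quantile for a 95% interval.
    """
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half_width, center + half_width


# 3074 concordant probe results out of 3080 (280 samples x 11 probes).
low, high = wilson_interval(3074, 3080)
print(f"agreement {100 * 3074 / 3080:.1f}% (95% CI {100 * low:.1f}-{100 * high:.1f}%)")
```

The Wilson interval behaves well when the proportion is close to 100%, where the simpler normal approximation can produce bounds above 1.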

Discrepancy analyses

Among the 18 cases at follow-up included in the study (Table 2), 16 generated the expected results, including 7 in cytogenetic relapse and positive for 1 of the translocations represented in the MMA panel (Supplementary Table 1). The two apparent false-positive samples ID no. 55 and 69 were in cytogenetic remission following treatment at the time of analysis but were initially positive for t(9;22) at diagnosis, suggesting that low levels of residual BCR–ABL1 transcripts may have been correctly detected by the MMA. To assess the analytical sensitivity of the MMA, RNA input range studies were performed at site 4 with the leukemic cell lines generating the highest (RUNX1–RUNX1T1) and lowest (BCR–ABL1 e1a2 and ETV6–RUNX1) positive MMA signals (see Figure 1). Typical sigmoidal response curves were observed with a plateau at 100–1000 ng input depending on the cell line and no hook effect, that is, no signal inhibition at high input of target RNA (Figure 2a). BCR–ABL1 and ETV6–RUNX1 were reproducibly detected in 10 ng of total RNA and RUNX1–RUNX1T1 in as little as 1 ng. Serial dilution experiments showed robust detection of the different targets at 0.1–1% dilution with a signal at least two- to three-fold above the cutoff and the expected input-dependent response (Figures 2b and c). BCR–ABL1 was detected at 1% dilution with 400 ng total RNA input and at 5–10% with 100 ng input. Independent experiments at site 3 using commercially available controls confirmed that the MMA can reproducibly detect all three BCR–ABL1 fusion transcripts at a level equivalent to 1% of t(9;22) positive cells (Figure 2d).

Figure 2

Evaluation of analytical sensitivity. (a) Total RNA purified from the indicated translocation-positive cell lines was tested with the MMA at 1000, 100, 10 or 1 ng per RT reaction. (b) The same total RNA samples were tested either undiluted (100%) or diluted at 10%, 1% or 0.1% in a background of total RNA isolated from the translocation-negative cell line HL60 at a final input of 400 ng per RT reaction. (c) Same experiment as in (b) at 100%, 10% or 1% dilution and 100 ng input. (d) BCR–ABL1 sensitivity controls tested in duplicate with the MMA at 600 ng input. The graphs show the average MFI signals generated by the target-specific probes (black bars) and by the GAPDH endogenous control probe (white bars) relative to the 350 MFI cutoff value (dashed lines). The complete data set is presented in Supplementary Table 2.

Assay accuracy at the level of individual fusion transcript variants was also compared against independent transcript-specific, clinically validated, laboratory-developed tests (LDT). Among the 153 samples with historical LDT data evaluated at sites 2 and 3, 58 samples with different BCR–ABL1, CBFB–MYH11 or PML–RARA variants were all correctly sub-classified by the MMA (Table 4 and Supplementary Table 1). Overall, there was 100% negative agreement (52/52) and 98% positive agreement (99/101) (Table 5). The two false-negative samples were the two discrepant MLL–AFF1 samples already identified in the cytogenetics comparator study, ID no. 146 and 184. As the qualitative LDT used at site 3 reports the presence of MLL–AFF1 fusion transcripts but does not specify which variant is detected, we further investigated those samples. RT-PCR amplification with individual primer pairs showed that both transcripts encompassed the AFF1 exon e5 and the MLL exons e10 and e11 (Supplementary Figure 1). Bi-directional sequencing established that sample ID no. 184 contained the variant e11e4 while sample ID no. 146 had an atypical long variant containing MLL exon e12 (e12e5). Thus, neither sample contained one of the MLL–AFF1 variants targeted by the MMA (e9e5 and e10e4), and the MMA correctly classified those samples as negative.

Quantitative analysis

To assess the performance of the MMA relative to the qualitative positive/negative cutoff at 350 MFI, we performed a quantitative analysis of the signal distributions at each site and for each type of signal: positive, negative and endogenous control. The distributions and median values for the 3360 probe signals generated in the study were different at each site (Figure 3). The positive distributions were tight at sites 1 and 4 with 80% (37/46) of the positive probe signals between 3000 and 6000 MFI. At sites 2 and 3 the distributions were wider and there was a significant difference in signal intensity. The median signal at site 3 was about twofold higher than at site 2 with 80% (32/40) of the positive signals at site 3 above 3500 MFI and only 40% at site 2 (24/59). The endogenous control GAPDH signals followed a similar pattern. Probe signals were all above 3000 MFI at sites 1 and 4 with 80% (101/127) of the signals in the 4000–6000 MFI range but varied widely from 643 to 7995 MFI at sites 2 or 3 with only 37% (56/153) in the 4000–6000 MFI range. The negative-probe signals, generated by both negative and positive samples, were low at all sites with 100% of the 2073 negative signals at sites 1, 2 and 4 below 250 MFI and 99.4% (857/862) below 250 MFI at site 3 (Figure 3).

Figure 3

Quantitative analysis of signal output. The box plot shows the distribution in log space of the positive (Pos), negative (Neg) and endogenous control (GAPDH) signals generated at each of the four sites for the 280 samples tested with the MMA. The boxes represent the 25th, 50th (median) and 75th percentiles of the signal distributions for each category. The tails of the distributions are indicated by whiskers corresponding to 1.5 times the interquartile range (IQR=75th percentile value minus the 25th percentile value). The median MFI values for each signal distribution and the qualitative 350 MFI cutoff value (dashed line) are also shown.
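The box-plot quantities described in this legend can be reproduced along the following lines. This is a minimal sketch, assuming the quartile interpolation used by Python's `statistics.quantiles` with the inclusive method; the original analysis was performed in Excel and may have used a different percentile convention. The function name and the input values are hypothetical.

```python
import math
import statistics
from typing import Dict, Iterable


def log_box_stats(mfi_values: Iterable[float]) -> Dict[str, float]:
    """Quartiles and 1.5 x IQR whisker limits of log10-transformed MFI signals.

    Null (zero) signals are removed before the log transform, as described
    in the data analysis section. Requires at least two non-null signals.
    """
    logs = sorted(math.log10(v) for v in mfi_values if v > 0)
    q1, median, q3 = statistics.quantiles(logs, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low_limit": q1 - 1.5 * iqr,
        "whisker_high_limit": q3 + 1.5 * iqr,
    }


# Hypothetical signals, including one null value that is dropped.
stats = log_box_stats([0, 10, 100, 1000, 10000, 100000])
print(stats)
print("cutoff in log space:", math.log10(350))
```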

To further establish the range of negative signals in well-characterized, true-negative samples, an LOB study was conducted by repetitively testing three types of fusion transcript-negative samples at site 4. The distributions for the 968 negative-probe signals generated across 88 replicates were similar for a no RNA control, a no fusion transcript cell line RNA, and RNA samples from asymptomatic control donors extracted with two different methods and tested at two inputs (Figure 4a). Analysis at the probe level also showed similar performance for individual probes (Figure 4b). Mean signals (80–107 MFI), standard deviations (35–45 MFI) and maximum signals (173–218 MFI) were all in the same range. The calculated LOB values (mean plus 1.645 times the s.d.) for this data set were at least twofold lower than the cutoff value for all probes but BCR–ABL1 e1a2 (2 × LOB=294–353 MFI; Figure 4b). These data confirmed that the cutoff at 350 MFI was appropriate to safely and accurately classify both negative and positive samples independently of the pre-analytical and analytical variations between sites.
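The LOB formula quoted above (mean plus 1.645 times the s.d.) is simple to express in code. The sketch below is illustrative only: the function names are hypothetical, the use of the sample standard deviation is an assumption, and the numeric inputs are made-up values chosen to fall within the ranges reported in the text rather than data from the study.

```python
import statistics
from typing import Sequence

Z_95_ONE_SIDED = 1.645  # multiplier stated in the text


def lob_from_summary(mean_mfi: float, sd_mfi: float) -> float:
    """Limit of blank from a mean and standard deviation of blank signals."""
    return mean_mfi + Z_95_ONE_SIDED * sd_mfi


def limit_of_blank(blank_mfi: Sequence[float]) -> float:
    """Limit of blank computed from replicate blank signals.

    Uses the sample standard deviation (assumption of this sketch).
    """
    return lob_from_summary(statistics.mean(blank_mfi), statistics.stdev(blank_mfi))


# Hypothetical summary values within the reported ranges (not study data).
lob = lob_from_summary(mean_mfi=100.0, sd_mfi=40.0)
print(f"LOB = {lob:.1f} MFI, 2 x LOB = {2 * lob:.1f} MFI, below 350 cutoff: {2 * lob < 350}")
```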

Figure 4

Limit of blank study. (a) Summary of the MFI probe signals obtained from repeat testing with the MMA of a no RNA control sample, a total RNA sample purified from the translocation-negative HL60 cell line, and eight total RNA samples purified from asymptomatic control donors’ white blood cells (WBC control RNA). Minimum (MIN), maximum (MAX), median, mean and s.d. (STDEV) values for the 11 fusion-transcript-specific probes combined are shown for each sample type and overall. (b) Results by probe type for all sample types combined. The graph shows the mean, maximum (MAX) and twice the limit of blank (2 × LOB) values for each of the 11 fusion-transcript-specific probes relative to the qualitative 350 MFI cutoff value (dashed line). The error bars represent the s.d. of each probe-specific distribution. The complete data set is presented in Supplementary Table 2.

Panel expansion

Many genetic alterations are relevant to the accurate classification of leukemia1 and they sometimes can generate multiple transcript variants challenging to detect with sequence-specific molecular methods. To determine whether the validated MMA technology could be applied to expanded panels targeting distinct leukemia entities, we built and tested two additional prototype panels. One assay was focused on the MMA targets resulting from t(9;22), t(12;21), t(4;11) and t(1;19) commonly found in ALL (Figure 5a). Evaluation with individual synthetic in vitro transcripts at site 4 showed that the modified assay could also co-detect the rare ETV6–RUNX1 e5e3 and TCF3–PBX1 e13e2i27 variants and four additional MLL–AFF1 variants without increasing the number of bead-conjugated probes. Further evaluation at site 3 on a subset of 31 samples previously tested with the MMA resulted in 100% agreement with cytogenetics and independent molecular methods (data not shown) including efficient detection of the two samples ID no. 146 and 184 initially classified as false negative for MLL–AFF1 with the MMA (Figure 5a).

Figure 5

Panel expansion. Representative example of results (MFI) with two prototype assays detecting 23 different targets prepared by in vitro transcription and spiked in a background of translocation- and mutation-negative HL60 RNA (400 ng input). (a) Specific detection of 13 fusion transcripts commonly found in CML and ALL. Results for the two samples false negative for MLL–AFF1 with the MMA at site 3 (study ID no. 146 and no. 184) are also shown. (b) Specific detection of three NPM1 mutant transcripts and seven fusion transcripts commonly found in AML. For this assay, the NPM1 wild-type sequence (NPM1 WT) is used as an endogenous control. Target-specific positive signals above the qualitative cutoff (350 MFI) are highlighted.

The other assay focused on markers associated with AML with the addition of three beads to specifically detect the CBFB–MYH11 type E and PML–RARA bcr2 variants and the three most common NPM1 mutant transcripts (Figure 5b). With this assay a positive signal only on the bead specific for RARA exon 3 indicates the presence of PML–RARA bcr2, a variant with inclusion of a variable number of RARA intron 2-derived nucleotides. Extensive validation studies also confirmed the performance of the MMA technology for the reliable and sensitive detection of these additional transcripts and 34 independent variants overall.17 We concluded that the MMA is an accurate molecular method for the rapid multiplex detection of diagnostically and prognostically relevant markers, including for submicroscopic abnormalities, such as ETV6–RUNX1 or NPM1 mutations, that cannot be detected by cytogenetic methods.

Discussion

Our retrospective blinded study was designed to assess as many sources of variability as possible commonly found in the molecular diagnostic setting, including pre-analytical, analytical and clinical parameters associated with different specimen collection and testing sites. At sites 1 and 4 the sample set represented a broad array of myeloid neoplasms and acute leukemias positive or negative by cytogenetics. It included 75 specimens (63%) with normal karyotype or various chromosomal abnormalities not included in the MMA panel to challenge the assay specificity. At sites 2 and 3 the sample set was enriched for relevant translocation-positive specimens, including pediatric cases, to challenge the assay sensitivity with multiple independent specimens for every target included in the MMA panel. Other parameters, such as cytogenetic methods and interpretation, specimen collection method, RNA extraction method and MMA operator or instrument variability, were confounded with each site. Overall, 281 samples were evaluated with a relatively balanced design enabling assessment of both clinical sensitivity and specificity (145 positive and 135 negative among the 280 evaluable samples).

The MMA accuracy relative to cytogenetics and independent validated molecular LDT was high, 98.6% and 98.7%, respectively, with 95% confidence intervals ranging from 95.4 to 99.6%. Investigation of the two apparent false-positive and two apparent false-negative cases showed that these discrepant results were in fact in agreement with the design and analytical specificity and sensitivity of the MMA, resulting in a virtual accuracy of 100% (95% confidence interval: 98.6–100%). All the samples in our retrospective study were well characterized by clinically accepted reference methods such as karyotyping, FISH and independent molecular tests when available. We therefore did not evaluate cases where translocations involving regions with similar banding patterns can result in cryptic sites detected by molecular methods but missed by karyotyping. However, in a single-institution prospective study of 330 diagnostic adult acute leukemia cases, King et al.11 recently reported a 12% (7/57) false-negative rate for karyotyping relative to a molecular LDT based on the same technology as the MMA. This high rate was within the range reported in independent studies for various clinical populations (0.5–15%)6, 13 and highlighted the complementary role of molecular and cytogenetic methods in the clinical management of leukemia.

Conventional G-banding analysis has the intrinsic ability to detect any structural or numerical aberration, including novel, uncharacterized abnormalities.3 Multiplex RT-PCR methods such as the MMA will not detect specific chromosomal features such as numerical aberrations or deletions and therefore cannot be a substitute for karyotypic analyses. However, when poor quality cultures, lack of analyzable metaphases or complex, masked and cryptic variants affect cytogenetic results, molecular methods can provide entity-defining, therapy-determining and prognosis-relevant information.1, 3 For example, detection of BCR–ABL1 e13a2 or e14a2 in CML is generally associated with a favorable outcome with tyrosine kinase inhibitor treatment. Identification of the rare e1a2 variant would considerably affect the patient’s risk profile and might mandate early allogeneic stem cell transplantation.27 In AML, where the favorable prognosis associated with t(8;21), inv(16), or t(15;17) is well documented, detection of RUNX1–RUNX1T1, CBFB–MYH11 or PML–RARA variants, as well as NPM1 mutations in certain cases, would result in a significant change from an intermediate risk group for normal karyotype to a favorable risk group.1, 16, 28 Similarly, risk-based classification of ALL would be impacted by the detection of specific fusion transcripts with BCR–ABL1 or MLL–AFF1 being associated with unfavorable prognosis and TCF3–PBX1 or ETV6–RUNX1 with an intermediate to very good prognosis.1, 2 Not only are many of these genetic alterations characterized by frequent cytogenetically cryptic sites but in certain cases, such as t(12;21) ETV6–RUNX1 or NPM1 mutations (four-nucleotide long insertions), they cannot be detected by conventional banding techniques because of their submicroscopic nature.

Most of the archived RNA samples evaluated in this study were from cases at presentation (n=243). Among the 18 cases at follow-up, 9 with no residual disease, normal cytogenetics, or with chromosomal abnormalities not included in the MMA panel and 7 in cytogenetic relapse for t(9;22), t(4;11) or t(8;21) were correctly classified with the MMA (Supplementary Table 1). The MMA also detected BCR–ABL1 in two t(9;22) positive CML or ALL cases in cytogenetic remission following treatment. Sample no. 69 was from an ALL patient who had received bone marrow transplantation 5 weeks before the low-positive MMA signal. Although he was clinically and cytogenetically free of disease at that time, he relapsed and died of leukemia 6 months later. Independent from this study, we are aware of another patient with AML with t(8;21) who had successfully completed induction therapy, was cytogenetically and flow cytometrically normal, and had a low positive MMA value for RUNX1–RUNX1T1 4 months before relapsing. These observations are consistent with a demonstrated MMA analytical sensitivity equivalent to 1% blasts or less (Figure 2), enabling detection of molecular relapse earlier than cytogenetics.

Although the output of the MMA is a qualitative positive or negative call, quantitative analyses shed further light on the performance and robustness of the assay. The distributions and median probe signals varied between sites but were consistent with the pre-analytical and protocol differences. The positive signal distributions were narrow at sites 1 and 4, where a constant amount of RNA was used (400 ng), and broad at sites 2 and 3, where a constant volume of RNA was used regardless of the concentration (the recommended protocol is 400–1000 ng input in a 1–6 μl volume). Review of historical total RNA concentration data at site 2 revealed a range of 30–150 ng/μl. With a 5-μl input per RT reaction, the corresponding input range was therefore only 150–750 ng per RT, with about 40% of the samples expected to be below the minimum recommended input of 400 ng. At site 3, the range of concentration was 80–1000 ng/μl with a constant 3-μl input, corresponding to 240–3000 ng per RT. Thus, only about 5% of the samples were below the minimum input and more than 70% of the samples were expected to be in the 1000–3000 ng input range. This high input likely contributed to the uncharacteristic cross-hybridization signals between the PML–RARA probes observed only at site 3 in two out of 26 t(15;17)-positive samples. It also resulted in very high true-positive signals at site 3, where 30% (12/40) of the positive samples generated signals between 7006 and 9502 MFI, a range of signal intensity observed only at this site.
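The input ranges quoted above follow directly from multiplying the RNA concentration by the fixed pipetted volume. The following minimal Python sketch reproduces that arithmetic for sites 2 and 3 and compares the result with the recommended 400–1000 ng window. The function and constant names are hypothetical and introduced only for illustration. The fractions of samples below or above the window are not recomputed, because they depend on the per-sample concentration distribution, which is not given here.

```python
# Recommended RT input window stated in the text (ng of total RNA).
MIN_INPUT_NG = 400
MAX_INPUT_NG = 1000


def input_range_ng(conc_low_ng_per_ul, conc_high_ng_per_ul, volume_ul):
    """Return the (low, high) RNA mass in ng delivered by a fixed volume."""
    return conc_low_ng_per_ul * volume_ul, conc_high_ng_per_ul * volume_ul


def classify_input(mass_ng):
    """Label an RNA mass relative to the recommended 400-1000 ng window."""
    if mass_ng < MIN_INPUT_NG:
        return "below"
    if mass_ng > MAX_INPUT_NG:
        return "above"
    return "within"


# Site 2: 30-150 ng/ul at a constant 5 ul per RT reaction.
site2 = input_range_ng(30, 150, 5)
# Site 3: 80-1000 ng/ul at a constant 3 ul per RT reaction.
site3 = input_range_ng(80, 1000, 3)

for name, (low, high) in (("site 2", site2), ("site 3", site3)):
    print(f"{name}: {low}-{high} ng "
          f"(low end {classify_input(low)}, high end {classify_input(high)})")
```

Under these assumptions, the site 2 range never exceeds the recommended maximum but starts well below the minimum, whereas the site 3 range extends to three times the recommended maximum.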

Overall, the median positive signal was 4020 MFI and 95% of the true-positive signals were at least twofold above 350 MFI. In contrast, the negative-probe signals were well below 350 MFI, with maxima at 197, 208 and 245 MFI at sites 1, 2 and 4, respectively. Only five out of 2935 negative signals (0.17%) were above 250 MFI, all at site 3, where the highest RNA input was used. Furthermore, analytical studies showed that 95% of the probe signals generated in the absence of target analyte are expected to be at least twofold below 350 MFI (LOB=147–177 MFI for 968 probe signals; Figure 4). Supplementary calculations of the area under the curve for a one-sided normal distribution indicate that 99.999999% of the probe signals would be below 343 MFI (mean plus 5.612 times the s.d.), the equivalent of 1 potential false-positive signal for every 100 million reactions performed with true-negative samples (data not shown). This very favorable signal-to-noise ratio enabled positive detection of fusion transcripts in 1–10 ng of RNA input, an analytical sensitivity of at least 1% when 400 ng or more of RNA is used, and a very simple and safe qualitative data interpretation relative to a single fixed cutoff at 350 MFI.
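The one-sided tail figure can be checked with the standard normal survival function. The sketch below, in Python with only the standard library, evaluates the upper-tail probability at 5.612 s.d. above the mean and illustrates a fixed-cutoff qualitative call. The function names are hypothetical, and treating a signal exactly at 350 MFI as positive is an assumption made here for illustration, since the text does not state how a signal equal to the cutoff is handled.

```python
import math

CUTOFF_MFI = 350  # single fixed cutoff stated in the text


def upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def qualitative_call(signal_mfi, cutoff=CUTOFF_MFI):
    """Return 'positive' or 'negative'; equality is treated as positive (assumption)."""
    return "positive" if signal_mfi >= cutoff else "negative"


# Probability that a true-negative probe signal exceeds mean + 5.612 s.d.,
# assuming the negative signals are normally distributed.
p_exceed = upper_tail(5.612)
print(f"upper-tail probability: {p_exceed:.3e}")
print(f"about 1 in {1 / p_exceed:,.0f} reactions")

# Fraction of negative signals above 250 MFI reported in the text.
print(f"{100 * 5 / 2935:.2f}% of negative signals above 250 MFI")

# Example calls using values quoted in the text.
print(qualitative_call(4020))  # median positive signal
print(qualitative_call(245))   # highest negative signal at site 4
```

The normality assumption matters at this extreme: the result describes the tail of a fitted normal model rather than an observed false-positive rate.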

Successful evaluation of the MMA at four independent sites also demonstrated that the method is compatible with the clinical molecular laboratory. Although formal workflow or time-motion comparative analyses were not performed, the potential gain in operational efficiency is evident. Simultaneous analysis by multiplex RT-PCR is a practical and time-saving solution relative to cytogenetics or repeated series of individual RT-PCR assays. The entire procedure can be completed in 6 h, up to 93 samples can be tested for 12 different analytes in each run, and a same-day turn-around time is achievable for urgent cases.11 However, one limitation of the dual amplification/detection specificity technology is the inability to detect alternative splicing isoforms and rare or novel variants unless specific primers and/or probes are added to the design. This was illustrated at site 3, where two MLL–AFF1 fusion transcripts were missed by the MMA. One was e11e4, a relatively rare variant, >5%, in non-infant t(4;11) ALL7, 18 and the other was e12e5, a variant that to our knowledge has not previously been described in the literature. The only similar case identified by searching sequence databases was an e12e4 variant detected in an infant pre-B ALL (GenBank accession number JN169752.1).

Another potential drawback of the MMA is the inclusion of targets associated with leukemia of both myeloid and lymphoblastic lineages in a single reaction, which can result in unnecessary testing when the lineage of the neoplastic cells has already been established by morphological, cytochemical and/or immunophenotypic analyses. Both limitations can be overcome as shown by the successful development and evaluation of additional disease-focused panels with increased content. The alternative splicing variants ETV6–RUNX1 e5e3 and TCF3–PBX1 e13e2i27 are relevant in 5–10% of the t(12;21) or t(1;19)-positive cases29, 30 and the multiple MLL–AFF1 variants have a different prevalence in specific ALL populations. For example, the e11e4 variant is infrequent in pediatric and adult ALL but represents >50% of the t(4;11) infant cases.7, 18 In AML, both the CBFB–MYH11 type E and PML–RARA bcr2 variants represent about 5% of the inv(16) and t(15;17)-positive populations31, 32 and small insertions in NPM1 exon 12 at 5q35 are the most frequently observed genetic abnormalities in AML, found in about 50% of the cytogenetically normal de novo AML cases.28 Independent studies also validated that the 23 variants described here and a total of 34 variants resulting from 18 different genetic alteration sites, including multiple MLL (11q23) and RARA (17q21) rearrangement partners relevant in acute leukemia, are detected with high sensitivity, specificity and precision using the MMA technology.17

In summary, our data demonstrate that multiplex molecular tests are a reliable and sensitive technology with a broad clinical utility to complement the morphological, immunophenotypic and cytogenetic diagnosis of leukemia. In the future, validation of expanded and disease-focused panels could provide additional flexibility and clinical sensitivity to reliably identify genetic abnormalities such as submicroscopic alterations or cryptic translocations that are paramount for optimal patient management.