Cost-effective multiplexing before capture allows screening of 25 000 clinically relevant SNPs in childhood acute lymphoblastic leukemia


Genetic variants, including single-nucleotide polymorphisms (SNPs), are key determinants of interindividual differences in treatment efficacy and toxicity in childhood acute lymphoblastic leukemia (ALL). Although up to 13 chemotherapeutic agents are used in the treatment of this cancer, it remains a model disease for exploring the impact of genetic variation owing to its well-characterized cytogenetics and drug response pathways and the precise monitoring of minimal residual disease. Here, we have selected clinically relevant genes and SNPs through literature screening, and on the basis of associations with key pathways, protein–protein interactions or downstream partners that have a role in drug disposition and treatment efficacy in childhood ALL. This allows exploration of pathways, where one of several genetic variants may lead to similar clinical phenotypes through related molecular mechanisms. We have designed a cost-effective, high-throughput capture assay of 25 000 clinically relevant SNPs, and demonstrated that multiple samples can be tagged and pooled before genome capture in targeted enrichment with a sufficient sequencing depth for genotyping. This multiplexed, targeted sequencing method allows exploration of the impact of pharmacogenetics on efficacy and toxicity in childhood ALL treatment, which will be of importance for personalized chemotherapy.


The cure rate for childhood acute lymphoblastic leukemia (ALL) after first-line therapy approaches 80–85%.1 To further improve prognosis, extensive personalization of therapy based on targeted genetic analyses will be required.2 The overall goal is to avoid unnecessary adverse and life-threatening toxicities, unacceptable late effects due to overtreatment and risk of relapse due to undertreatment.3, 4 Several studies have indicated that for patients with the most favorable genetic variants, event-free survival may be more than 90%.5, 6 High-throughput technologies in combination with imputation allow genome-wide mapping of genetic variants that subsequently can be associated with treatment failures or specific toxicities.7 In addition to the genes and single-nucleotide polymorphisms (SNPs) already known to be involved in drug disposition or specific toxicities (for example, thrombosis, osteopenia or immune function), such genome-wide association studies (GWAS) are likely to reveal important variations in genes not previously linked to the biological issue in question. However, extensive research including clinical trials will subsequently be needed before this new information can be implemented in childhood ALL treatment protocols. Furthermore, the impact of the genetic variations can often only be fully understood within the frame of a specific treatment protocol.2 Finally, current commercially available GWAS solutions are not easily applied clinically to hypothesis-driven genetic investigations: the commercial platforms are designed to explore random variations across the genome and rarely cover all variations of interest for a specific study, whereas custom-made approaches are too costly for implementation in clinical settings.
We here describe a novel multiplexing method enabling us to screen childhood ALL patients for 25 000 clinically relevant SNPs simultaneously, targeted by custom-designed baits. Furthermore, eight childhood ALL samples are pooled together before capture enrichment, making this a very cost-effective platform, allowing future targeted genetic mapping of large cohorts of patients. The choice of pooling 8 patient samples was based on results from a pilot study, where we pooled and sequenced 4, 6 and 12 test samples, respectively, labeled with different barcodes.

Materials and methods


A total of 48 samples from Danish childhood ALL patients (aged between 1 and 15 years at the time of diagnosis) diagnosed with B-cell precursor or T-lineage ALL and enrolled in the Nordic Society for Pediatric Hematology and Oncology (NOPHO) ALL-2000 protocol were included. For the pilot study, four human DNA and two HapMap DNA test samples were used in different combinations (Supplementary Table 1). The study was approved by The Danish Data Protection Agency (2007-41-1289) and The Committee on Biomedical Research Ethics (H-D-2007-0100, KF 01 265848).

Library preparation, pooling, target enrichment and sequencing

DNA shearing and library preparations were performed according to the SureSelect Target Enrichment System protocol version 1.2 April 2009 (Agilent Technologies, Santa Clara, CA, USA) with minor modifications. Briefly, 3 μg of genomic DNA was sheared by Covaris S2 System (Covaris Inc., Woburn, MA, USA) using 10% duty cycle, intensity of 5, cycles per burst of 200 for 6 cycles of 60 s, followed by purification of the DNA fragments on QIAquick PCR purification spin columns (Qiagen, Hilden, Germany). After each reaction, a purification step was performed. Then end repair was performed (by applying T4 DNA polymerase, T4 polynucleotide kinase and Klenow fragment enzyme) and 3′ end A-overhangs were produced (by applying Klenow 3′–5′ exo minus). Custom-made adapters containing unique barcodes of four bases each were prepared. The complementary oligos (Supplementary Table 2; DNA technology A/S, Risskov, Denmark) were dissolved in DNase-free water to a final concentration of 300 μM. Complementary oligonucleotide pairs were mixed in ratio 1:1 in 1 × annealing buffer (10 × buffer contained 100 mM Tris-HCl, pH 8.1; 0.5 M NaCl). The barcoded adapter mix was heated to 90 °C for 2 min, then cooled down to 30 °C at a rate of 2 °C per minute and diluted to a working concentration of 15 μM. After ligation of the adapters to the DNA fragments, the fragments were size selected in the range of 150–250 bp by 4% agarose gel electrophoresis and excised. The DNA libraries were amplified applying Phusion High-Fidelity PCR Master Mix (Finnzymes, Espoo, Finland) with a denaturation time of 30 s at 98 °C, followed by 14 cycles of denaturation at 98 °C for 10 s, annealing at 65 °C for 30 s and extension at 72 °C for 30 s. Final extension was performed at 72 °C for 5 min.
DNA quantity and quality were checked on a NanoDrop ND-1000 UV-VIS Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and an Agilent 2100 Bioanalyzer using the Bioanalyzer High Sensitivity DNA kit (Agilent Technologies), respectively. The DNA libraries were mixed in groups of 8 (or 4, 6 and 12 in the pilot study) in equimolar ratios to yield a final concentration of 147 ng/μl of each pooled library. The pooled libraries were hybridized with our custom-designed SureSelect Oligo Capture Library (Agilent Technologies; the SureSelect Human X Chromosome Demo Kit (Agilent Technologies) was used for the pilot study) for 24 h according to the manufacturer’s instructions. After incubation, the selected hybrids were purified using magnetic beads and desalted with a Qiagen MinElute PCR purification column. Post-hybridization amplification PCR with standard primers from the SureSelect Target Enrichment System kit and Herculase II Fusion DNA Polymerase (Stratagene, Agilent Technologies) was performed with a denaturation time of 30 s at 98 °C, followed by 18 cycles of denaturation at 98 °C for 10 s, annealing at 57 °C for 30 s and extension at 72 °C for 30 s. The final extension was performed at 72 °C for 7 min. After purification, DNA quantity and quality were checked. A 75-nucleotide (nt) single-end run on the Illumina GAIIx Genome Analyzer (Illumina Inc., San Diego, CA, USA) was performed following the manufacturer’s recommendations.

SNP selection and bait design

SNPs were selected to cover all known and putative clinically relevant variations with regard to childhood ALL treatment (Figure 1). First, a list of clinically relevant genes and SNPs was curated, and their known and suspected influence on metabolism, transport or drug–target interactions for the 13 most administered chemotherapeutic drugs, together with their clinical consequences in childhood ALL, was evaluated.2 To extend the list of genes/proteins connected to these drugs, drug–protein associations were gathered from different sources such as DrugBank (version 2008)8 and PharmGKB (version 2008).9 The resulting protein drug targets (from binding data), metabolizing enzymes and drug transporters were integrated into the previous list.

Figure 1

SNP selection and bait design. The list of curated genes and drug targets was expanded with the systems biology approach, and known SNPs in these genes were selected with potential functional significance. The baits targeting selected SNPs were designed to minimize the extent of cross-hybridization and self-folding of the baits, as well as extreme levels of GC content.

The list of clinically relevant genes and SNPs was further expanded by including their known first-order protein–protein interaction partners using a high-confidence human interactome and other genes participating in the same pathways.10 This approach allows for investigating complex effects, where several SNPs (potentially in different genes) could exert the same effect through a common mechanism, for example, affecting the same pathway. A list of 969 genes was generated and all genomic variations associated with those genes were extracted using the Ensembl API version 57 (Ensembl, EBI and WTSI, Hinxton, UK) based on the Single Nucleotide Polymorphism Database (dbSNP) 130. SNPs with potential functional impact on the genes of interest were chosen by selecting SNPs resulting in amino-acid changes or frameshifts, SNPs in annotated regulatory regions, variations affecting a stop codon or a splice site, as well as variations within non-coding genes and within mature microRNAs. Furthermore, the list was expanded by including SNPs that could potentially disrupt predicted microRNA target sites of the listed genes, as found in the Patrocles database.11

Baits for the SureSelect Target Enrichment System were designed for all identified SNPs. Each variation was targeted by two baits with a 50% overlap, where the variation was positioned exactly in the middle of the overlap region (Supplementary Figure 1). The physical and cross-hybridization properties of the baits were explored using the oligonucleotide design software OligoWiz12 and sequence-matching tools BLAST13 and SeqMap.14 The extent of baits prone to cross-hybridization, self-folding, extreme levels of GC content or baits targeting highly variable regions, which could decrease specificity or efficiency of the baits, was limited to a minimum. However, some potentially problematic baits were still included in the design because of the clinical importance of their target region (Supplementary Figure 2). To exploit the whole capacity of the method and to probe for key deletions, additional baits were designed tiling all the exons of ETV6, MTAP and OPRM1, and the entire genomic regions of CDKN2A, GSTM1 and GSTT1, allowing full sequencing of these genes of particular importance in childhood ALL. The baits tiling the genomic regions of the drug-metabolizing genes GSTM1 and GSTT1 were used to detect the deletion state of those genes. To estimate copy number, a depth ratio was calculated from the number of reads in the targeted genomic region normalized by size of the region and total number of reads for the sample. The final design included baits targeting 25 602 clinically relevant SNPs, as well as 1200 baits targeting the exons or the genomic regions of the above-mentioned genes, with specific impact on the treatment outcome in childhood ALL (Figure 1).
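The two-bait layout described above can be illustrated with a short sketch. It is not the design code used in this study: the function name `baits_for_snp` is hypothetical, coordinates are treated as 0-based half-open intervals, and the default bait length of 120 nt is an assumption that is not stated in the text.

```python
def baits_for_snp(snp_pos: int, bait_len: int = 120):
    """Return two half-open bait intervals (start, end) that overlap by 50%,
    with the SNP placed in the middle of the overlap region.

    bait_len = 120 is an assumed value, not taken from the paper.
    """
    if bait_len <= 0 or bait_len % 4 != 0:
        raise ValueError("bait_len must be a positive multiple of 4")
    quarter = bait_len // 4
    # The overlap spans the last half of bait 1 and the first half of bait 2,
    # so its midpoint lies three quarters into bait 1 and one quarter into bait 2.
    first = (snp_pos - 3 * quarter, snp_pos + quarter)
    second = (snp_pos - quarter, snp_pos + 3 * quarter)
    return first, second


# Illustrative coordinate only
first, second = baits_for_snp(1000)
overlap = first[1] - second[0]
print(first, second, overlap)
```

With an even overlap length the SNP cannot sit at a single exact midpoint; in this sketch it is the first base of the second half of the overlap.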


In the pilot study, 12 different barcodes (Supplementary Table 2) were tested, each ending in a thymidine (T), which is necessary for ligation to chromosomal DNA fragments with a 3′ adenosine (A) overhang. In combination with 75 nt sequencing reads, these four-base barcodes yield high-quality, unambiguously mapped reads, while using only about 5% of the read length for sample identification.
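As an illustration of how four-base barcodes can be used to separate pooled reads, the following minimal sketch assigns reads to samples and trims the barcode. It assumes that the barcode occupies the first four bases of each read and requires an exact match; both are assumptions, and the function name, the sample labels and the barcode ACGT are hypothetical (only CGCT appears in this paper).

```python
from collections import defaultdict

BARCODE_LEN = 4  # four-base barcodes, as described in the text


def demultiplex(reads, barcode_to_sample):
    """Split reads by their leading barcode and strip it from the sequence.

    Assumes the barcode is the first BARCODE_LEN bases of the read (an
    assumption of this sketch). Reads with an unknown barcode are returned
    separately rather than being guessed at.
    """
    by_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        barcode, insert = read[:BARCODE_LEN], read[BARCODE_LEN:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)
        else:
            by_sample[sample].append(insert)
    return dict(by_sample), unassigned


# Hypothetical barcodes and sample names; both barcodes end in T
barcodes = {"CGCT": "sample_A", "ACGT": "sample_B"}
reads = ["CGCTAAAA", "ACGTCCCC", "GGGGTTTT"]
by_sample, unassigned = demultiplex(reads, barcodes)
print(by_sample, unassigned)
print(f"barcode fraction of a 75 nt read: {BARCODE_LEN / 75:.1%}")
```

A production demultiplexer would usually also tolerate sequencing errors in the barcode, which this sketch deliberately does not attempt.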

Data analysis

The high-quality reads obtained from sequencing were aligned to the human reference genome (NCBI build 37, GRCh37) using the Burrows–Wheeler Alignment Tool.15 The alignment was refined by means of quality score recalibration and realignment around indels using the Genome Analysis Toolkit package.16 SNP calling was performed with the SAMtools package17 using default settings. The threshold set for SNP calling was a minimum sequencing depth of 10x; however, 4x was also accepted as a threshold for high-priority SNPs when 10x depth was not available. The data were further analyzed with the help of the SAMtools and BEDtools18 packages and custom-written Perl scripts.
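The depth rule for SNP calling can be written down as a small decision function. This is a sketch of the rule as stated above, not the Perl code used in the study; the function name and the tier labels are hypothetical.

```python
STANDARD_MIN_DEPTH = 10  # default threshold stated in the text
RELAXED_MIN_DEPTH = 4    # accepted only for high-priority SNPs


def call_tier(depth: int, high_priority: bool = False) -> str:
    """Classify a SNP position by sequencing depth.

    Returns "standard" at >= 10x, "relaxed" for high-priority SNPs at
    4x to 9x, and "no_call" otherwise. The labels are illustrative.
    """
    if depth >= STANDARD_MIN_DEPTH:
        return "standard"
    if high_priority and depth >= RELAXED_MIN_DEPTH:
        return "relaxed"
    return "no_call"


for depth, priority in [(12, False), (6, False), (6, True), (3, True)]:
    print(depth, priority, call_tier(depth, priority))
```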


Pilot study

The applied baits from the SureSelect Human X Chromosome Demo Kit were designed to capture 85% of the human X chromosome exons, tiling exons with a 50% overlap between consecutive baits. In an attempt to reflect the application of this method to custom-genotyping purposes, the performance was assessed on a set of target positions defined as base pairs positioned exactly in the middle of the 50% overlap region of two consecutive baits (Supplementary Figure 1). The average sequence depth achieved with this approach is higher and more uniform in the immediate surrounding of the target position when compared with placing the target position in the middle of a single bait.

The results of the pilot study showed that average sequence depth decreases with an increasing number of pooled samples; however, the distribution of reads is well balanced and sufficient for genotyping even with 12 pooled samples (Supplementary Figure 3). Each sample had between 55 and 65% of its sequencing reads mapped to the targeted regions and between 65 and 75% of the reads mapped to the targeted regions plus/minus 100 bp (Supplementary Figure 4), indicating that sample competition during hybridization was not an issue. An average depth of at least 10x was achieved for 88% of the target positions, with a standard deviation of 10%, when pooling 12 samples (Supplementary Figure 5). Increased pooling affected the number of detected target positions. However, depending on the desired depth of sequence reads and required target size (number of SNPs for genotyping studies), the amount of pooling can be adjusted appropriately. Genotyping using depths from 4x to 12x has recently been tried,19, 20 and consensus for SNP calling at 10x is relatively high when mapping against a known reference genome such as the human genome.

SNP calls from the targeted sequencing approach were compared with genotype calls from the in-house Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA, USA) and public HapMap data. The comparison was restricted to probes on the array matching SNP calls within the regions targeted by the sequencing kit. The accuracy of the multiplexed targeted sequencing approach was assessed on the 12 pooled samples based on concordance between sequencing and array-based genotype calls. SNP calls with coverage depths of 4x, 10x and 20x were compared with the SNP calls obtained from the array, and the percent concordance for each sample was calculated as the percentage of calls that were the same at each depth (Supplementary Figure 6). As expected, the number of compared SNPs decreased with increasing depth required; however, it did not affect the concordance rates significantly. In this study, calling SNPs at 4x depth appeared acceptable when 10x depth was not available. All samples achieved concordance well above 95% at 10x coverage depth (the same holds for 4x coverage, with the exception of the CGCT-labeled sample). Examining regions that had 20x or better sequence depth, 9 out of the 12 samples were 100% concordant; however, the number of SNPs decreased by 38% because of the minimum sequencing depth requirement (20x) as compared with 10x coverage depth.
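The concordance calculation described above can be sketched as follows. The function names, the dictionary-based input format and the example genotypes are hypothetical; the sketch only assumes that a genotype is an unordered pair of alleles, so that AG and GA count as the same call.

```python
def normalize(genotype: str) -> str:
    """Order the alleles so that 'GA' and 'AG' compare as equal."""
    return "".join(sorted(genotype.upper()))


def percent_concordance(seq_calls, array_calls, depths, min_depth):
    """Compare sequencing and array genotypes at SNPs with depth >= min_depth.

    seq_calls, array_calls: dict of SNP id -> genotype string (e.g. "AG")
    depths: dict of SNP id -> sequencing depth at that SNP
    Returns (number of compared SNPs, percent concordant or None if none).
    """
    compared = [
        snp for snp in seq_calls
        if snp in array_calls and depths.get(snp, 0) >= min_depth
    ]
    if not compared:
        return 0, None
    matches = sum(
        normalize(seq_calls[snp]) == normalize(array_calls[snp])
        for snp in compared
    )
    return len(compared), 100.0 * matches / len(compared)


# Made-up toy data for illustration only
seq = {"snp1": "AG", "snp2": "CC", "snp3": "TT", "snp4": "GA"}
arr = {"snp1": "GA", "snp2": "CT", "snp3": "TT"}
depth = {"snp1": 25, "snp2": 12, "snp3": 5, "snp4": 30}
for threshold in (4, 10, 20):
    print(threshold, percent_concordance(seq, arr, depth, threshold))
```

As in the comparison reported here, raising the depth threshold shrinks the set of compared SNPs, which is why the count is returned alongside the percentage.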

Childhood ALL samples

Sequencing of the 48 samples generated a total of 39 Gb of data in FASTQ format, of which 32 Gb passed the default Illumina quality filter. Of those, 23.2 Gb were uniquely mapped to the reference genome (Table 1). On average, 53% (s.d. 9%) of the high-quality sequencing reads were mapped to the target regions, which shows that the distribution of reads was relatively balanced. In addition, on average, 94% of the targeted SNPs were covered by at least one sequencing read for each sample, and 73% of those SNPs achieved at least 10x sequencing depth. The average sequence coverage for the covered variations from the list of targeted SNPs was 23x across all samples. The average depth of coverage for the targeted exons of the ETV6, MTAP and OPRM1 genes was 32x, whereas the average coverage for the genomic region of the CDKN2A gene was 31x. Average depth of coverage for the GST genes could not be estimated, as it varies with the deletion state of the genes.

Table 1 Summary of the sequencing of the childhood ALL samples

Bait design

The performance of capture baits depends on their physical and cross-hybridization properties (Supplementary Figure 2). Baits used in the design were examined with the OligoWiz program for their probability of self-folding and cross-hybridization, for the presence of low-complexity regions and for their GC content. All the tested parameters seem to influence bait performance; therefore, these properties should be taken into consideration during bait design. If required, higher depths for regions of particular interest can be obtained by targeting them with higher numbers of overlapping baits. In this study, for a list of high-priority SNPs with known significant impact on the treatment outcome in childhood ALL,2 four different baits (instead of two) were designed for each SNP. The average coverage obtained for those was 35x, as compared with the average coverage of 23x for the whole list of SNPs. Special care must be taken when targeting regions of the human mitochondrial genome, as these DNA fragments are overrepresented in a genomic DNA sample. Mitochondrial regions will attract significantly more reads, and might therefore dominate the sequencing results. In our study, 66 baits targeting regions on the mitochondrial genome were included, achieving an average depth of 4350x, and the corresponding reads constituted 15% of all the mapped sequencing reads.

Genotype validation

Genotype data from the 48 childhood ALL samples on seven SNPs and two gene deletions were used for validation of the sequencing results (Table 2). Patients were previously genotyped for CYP3A5*3 6986A>G (rs776746), RFC1 80G>A (rs1051266), TPMT*3B 460G>A (rs1800460), TPMT*3C 719A>G (rs1142345), MTHFR 677C>T (rs1801133) and MTHFR 1298A>C (rs1801131) by allelic discrimination5, 6 (and unpublished data). GSTP1 313A>G (rs1695) and the GSTM1 and GSTT1 deletions were genotyped by multiplex PCR, which simultaneously detects GSTT1 and GSTM1 gene copy number and GSTP1 313A>G.21 Based on the calculated coverage depth ratio for the genomic regions of GSTM1 and GSTT1, it was possible to distinguish three distinct clusters corresponding to homozygous deletions, heterozygous deletions and the wild type (Figure 2). Concordance for the seven SNPs was between 91 and 100%, and 100% for the two gene deletions (Table 2).
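The depth ratio used for the GSTM1 and GSTT1 deletion calls is defined in Materials and methods as the number of reads in the targeted region normalized by the size of the region and by the total number of reads for the sample. A minimal sketch of that normalization follows; the function name is hypothetical, no scaling constant is applied (the text does not specify one) and the read counts in the example are made up for illustration.

```python
def depth_ratio(region_reads: int, region_size_bp: int, total_reads: int) -> float:
    """Reads in a region, normalized by region size and by the total number
    of reads for the sample. The absolute scale is arbitrary; only the
    comparison between samples is meaningful."""
    if region_size_bp <= 0 or total_reads <= 0:
        raise ValueError("region size and total read count must be positive")
    return region_reads / region_size_bp / total_reads


# Made-up counts: three samples with equal sequencing yield
two_copies = depth_ratio(2000, 10_000, 1_000_000)
one_copy = depth_ratio(1000, 10_000, 1_000_000)
no_copies = depth_ratio(0, 10_000, 1_000_000)
print(one_copy / two_copies, no_copies)
```

Under this normalization a heterozygous deletion is expected to give roughly half the ratio of the wild type and a homozygous deletion a ratio near zero, which is consistent with the three clusters reported above; the cluster boundaries themselves are not specified in the text and are not modelled here.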

Table 2 Genotype concordance for the 48 childhood ALL samples analyzed by single PCR and by multiplexed sequencing

Figure 2

Copy number detection in sequence data. The state of gene deletion is assessed based on the distribution into three distinct clusters corresponding to the wild-type, heterozygous and homozygous deletions. Depth ratio was calculated from the number of reads in the targeted genomic region normalized by the size of the region and the total number of reads for the sample. Results were validated for 42 of the 48 samples using multiplex PCR and showed 100% concordance. (a) Deletion state for the GSTM1 gene. (b) Deletion state for the GSTT1 gene.


In recent years, clinically important genetic variations have been thoroughly investigated in childhood ALL.2 Even though minimal residual disease monitoring22 and extensive toxicity scoring have been established in some treatment protocols,1 very few groups as yet include genetic variations in their treatment strategies.6 This primarily reflects the lack of extensive targeted analyses of genetic variants and the costs associated with such analyses. Multiplexing before capture, target enrichment and sequencing allows screening of 25 000 custom-selected SNPs simultaneously and could therefore be a solution. As shown in the pilot study, pooling of 4–12 samples in a single lane of an Illumina GAIIx with 75 nt single-end reads is sufficient to generate 10–20x sequence depth over more than 80% of the target region per sample (Supplementary Figure 3). If a 10–20% drop in coverage (or target size) is allowed, pooling 6 or 12 samples can easily be achieved. Pooling of up to 12 samples showed relatively balanced results, but larger dispersion at higher depth. To reduce this variation, we hypothesized that when pooling 8–10 samples, an average sequence depth of 10x for 80% of the intended regions could be expected. The amount of pooling was therefore adjusted to eight samples at a time because of the large size of the desired target and the need for high sequence depth. Next-generation sequencing technology is rapidly evolving; hence, in future studies, it will be possible to either pool more samples or obtain higher coverage. It was observed in different sequencing runs that not all barcodes performed equally well. For example, we found that barcode CGCT had a lower performance compared with the rest (Supplementary Table 1, Supplementary Figures 5 and 6). This could be inherent to the barcode itself, but could also be influenced by experimental variation (for example, small differences in the amount and quality of the different adapters and variation in library preparation). Additional studies exploring this and the individual barcodes could provide information to further improve the study design.23

We here demonstrate that it is possible to reliably genotype childhood ALL patients for a large number of SNPs simultaneously using multiplexed target enrichment, followed by sequencing. Sequencing data were obtained for 94% of the targeted SNPs for each sample, and for 73% of those, the achieved coverage depth was sufficient for high-confidence genotype calling (at least 10x). In addition, this methodology is easy to adapt in a sequencing lab and has a low entry level (80–100 samples), thus allowing redesign of content during the course of large sample projects. In a single design, exon tiling can be combined with SNP or somatic-mutation detection, and we show here that copy number can be reliably inferred through sequencing. Whole-exome sequencing is gaining popularity among targeted sequencing efforts as a way to improve sequence depth and reduce cost compared with whole-genome sequencing, as exons span only 1.2% of the whole human genome. This is a valuable approach and it has recently been shown that some multiplexing (3–5 samples) can be accomplished.24 However, when sequencing the exons only, many regions are missed that may have important biological functions such as transcriptional or translational regulation of the protein-coding sequences. Many studies indicate non-coding SNPs to be clinically relevant; therefore, it is crucial for pharmacogenetic studies to also investigate regions outside of the protein-coding parts of the genome. We demonstrate that a more targeted hypothesis-driven panel can be constructed and assayed reliably, and at much lower cost than exome sequencing. Pooling of eight patient samples before capture reduces the costs of the capture library, hybridization reagents and, not least, sequencing by a factor of eight. Furthermore, the patients are genotyped for 25 000 targeted SNPs simultaneously, reducing the cost per SNP per patient even further compared with conventional methods.

The presented method for performing high-throughput, low-cost, customized genotyping will allow wider study of the clinical impact of genomic variations. Immediate applications include validation of GWAS findings and assaying somatic mutations or a panel of SNPs, such as in drug toxicology studies. Furthermore, the flexibility of the bait design enables researchers to adapt the SNP content as new knowledge emerges. Future applications of this method in upcoming childhood ALL studies will move us closer to pharmacogenetic-based personalization of therapy in childhood ALL, and possibly in other cancers. Such studies are currently ongoing in the NOPHO Study Group.


  1. Schmiegelow K, Forestier E, Hellebostad M, Heyman M, Kristinsson J, Soderhall S et al. Long-term results of NOPHO ALL-92 and ALL-2000 studies of childhood acute lymphoblastic leukemia. Leukemia 2010; 24: 345–354.
  2. Davidsen ML, Dalhoff K, Schmiegelow K. Pharmacogenetics influence treatment efficacy in childhood acute lymphoblastic leukemia. J Pediatr Hematol Oncol 2008; 30: 831–849.
  3. Schmiegelow K, Al-Modhwahi I, Andersen MK, Behrendtz M, Forestier E, Hasle H et al. Methotrexate/6-mercaptopurine maintenance therapy influences the risk of a second malignant neoplasm after childhood acute lymphoblastic leukemia: results from the NOPHO ALL-92 study. Blood 2009; 113: 6077–6084.
  4. Lund B, Åsberg A, Heyman M, Kanerva J, Harila-Saari A, Hasle H et al. Risk factors for treatment related mortality in childhood acute lymphoblastic leukaemia. Pediatr Blood Cancer 2011; 56: 551–559.
  5. Gregers J, Christensen IJ, Dalhoff K, Lausen B, Schroeder H, Rosthoej S et al. The association of reduced folate carrier 80G>A polymorphism to outcome in childhood acute lymphoblastic leukemia interacts with chromosome 21 copy number. Blood 2010; 115: 4671–4677.
  6. Schmiegelow K, Forestier E, Kristinsson J, Soderhall S, Vettenranta K, Weinshilboum R et al. Thiopurine methyltransferase activity is related to the risk of relapse of childhood acute lymphoblastic leukemia: results from the NOPHO ALL-92 study. Leukemia 2009; 23: 557–564.
  7. Relling MV, Yang W, Das S, Cook EH, Rosner GL, Neel M et al. Pharmacogenetic risk factors for osteonecrosis of the hip among children with leukemia. J Clin Oncol 2004; 22: 3930–3936.
  8. Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 2006; 34 (Database issue): D668–D672.
  9. Hewett M, Oliver DE, Rubin DL, Easton KL, Stuart JM, Altman RB et al. PharmGKB: the Pharmacogenetics Knowledge Base. Nucleic Acids Res 2002; 30: 163–165.
  10. Lage K, Karlberg EO, Storling ZM, Olason PI, Pedersen AG, Rigina O et al. A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol 2007; 25: 309–316.
  11. Hiard S, Charlier C, Coppieters W, Georges M, Baurain D. Patrocles: a database of polymorphic miRNA-mediated gene regulation in vertebrates. Nucleic Acids Res 2010; 38 (Database issue): D640–D651.
  12. Wernersson R, Juncker AS, Nielsen HB. Probe selection for DNA microarrays using OligoWiz. Nat Protoc 2007; 2: 2677–2691.
  13. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990; 215: 403–410.
  14. Jiang H, Wong WH. SeqMap: mapping massive amount of oligonucleotides to the genome. Bioinformatics 2008; 24: 2395–2396.
  15. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010; 26: 589–595.
  16. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010; 20: 1297–1303.
  17. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009; 25: 2078–2079.
  18. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 2010; 26: 841–842.
  19. Li Y, Vinckenbosch N, Tian G, Huerta-Sanchez E, Jiang T, Jiang H et al. Resequencing of 200 human exomes identifies an excess of low-frequency non-synonymous coding variants. Nat Genet 2010; 42: 969–972.
  20. Durbin RM, Abecasis GR, Altshuler DL, Auton A, Brooks LD, Durbin RM et al. A map of human genome variation from population-scale sequencing. Nature 2010; 467: 1061–1073.
  21. Buchard A, Sanchez JJ, Dalhoff K, Morling N. Multiplex PCR detection of GSTM1, GSTT1, and GSTP1 gene variants: simultaneously detecting GSTM1 and GSTT1 gene copy number and the allelic status of the GSTP1 Ile105Val genetic variant. J Mol Diagn 2007; 9: 612–617.
  22. Campana D. Progress of minimal residual disease studies in childhood acute leukemia. Curr Hematol Malig Rep 2010; 5: 169–176.
  23. Craig DW, Pearson JV, Szelinger S, Sekar A, Redman M, Corneveaux JJ et al. Identification of genetic variants using bar-coded multiplexed sequencing. Nat Methods 2008; 5: 887–893.
  24. Nijman IJ, Mokry M, van BR, Toonen P, de BE, Cuppen E. Mutation discovery by targeted genomic enrichment of multiplexed barcoded samples. Nat Methods 2010; 7: 913–915.


We are grateful to the patients who participated in the study and their referring physicians. We thank Kirsten Kørup Rasmussen for very helpful technical assistance and Jannie Gregers for providing us with previously generated SNP data. We acknowledge The Technical University of Denmark Multi-Assay Core for providing technology consultation and laboratory resources. AW, MDD and LB analyzed, interpreted data and wrote the manuscript. AW performed the sequence analysis. MDD, LB, HL, KS and RG designed the experimental research project setup. LB, MDD, LRH and BFN performed the experimental work and the Affymetrix 6.0 SNP Arrays. RG performed data analysis supervision. MB and NT performed the Illumina sequencing. KA, LG, TSP and NW performed parts of the data analysis. JN provided cell lines. LG, HL, RG, SB and KS provided critical input to the project and manuscript. This study was supported by grants from The Danish Cancer Society (Grant numbers R2-A56-09-S2 and R20-A1156-10-S2), The Danish Childhood Cancer Foundation, The Otto Christensen Foundation, The Villum Kann Rasmussen Foundation, The Ministry of Health (Grant number 2006-12103-250), The Novo Nordisk Foundation, The Danish Research Council for Health and Disease (Grant numbers 271-06-0278, 271-08-0684), The University Hospital Rigshospitalet, Denmark, The Lundbeck Foundation, The research program of the UNIK: Food, Fitness and Pharma for Health and Disease, The Danish Ministry of Science, Technology and Innovation and The Wilhelm Johannsen Centre for Functional Genome Research that is established by the Danish National Research Foundation.

Author information

Correspondence to R Gupta.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplementary Information accompanies the paper on the Leukemia website

About this article

Cite this article

Wesolowska, A., Dalgaard, M., Borst, L. et al. Cost-effective multiplexing before capture allows screening of 25 000 clinically relevant SNPs in childhood acute lymphoblastic leukemia. Leukemia 25, 1001–1006 (2011).


  • multiplexed genotyping
  • next-generation sequencing
  • target-enrichment
  • clinically relevant SNPs
  • childhood acute lymphoblastic leukemia
