Comprehensive diagnostics of acute myeloid leukemia by whole transcriptome RNA sequencing

Acute myeloid leukemia (AML) is caused by genetic aberrations that also govern the prognosis of patients and guide risk-adapted and targeted therapy. Genetic aberrations in AML are structurally diverse and currently detected by different diagnostic assays. This study sought to establish whole transcriptome RNA sequencing as single, comprehensive, and flexible platform for AML diagnostics. We developed HAMLET (Human AML Expedited Transcriptomics) as bioinformatics pipeline for simultaneous detection of fusion genes, small variants, tandem duplications, and gene expression with all information assembled in an annotated, user-friendly output file. Whole transcriptome RNA sequencing was performed on 100 AML cases and HAMLET results were validated by reference assays and targeted resequencing. The data showed that HAMLET accurately detected all fusion genes and overexpression of EVI1 irrespective of 3q26 aberrations. In addition, small variants in 13 genes that are often mutated in AML were called with 99.2% sensitivity and 100% specificity, and tandem duplications in FLT3 and KMT2A were detected by a novel algorithm based on soft-clipped reads with 100% sensitivity and 97.1% specificity. In conclusion, HAMLET has the potential to provide accurate comprehensive diagnostic information relevant for AML classification, risk assessment and targeted therapy on a single technology platform.


Introduction
Acute myeloid leukemia (AML) is caused by functionally complementary genetic mutations that cause uncontrolled proliferation and maturational arrest of myeloid precursor cells [1,2]. Overt AML carry on average 13 nonsynonymous mutations [3,4]. Acquisition of AML-initiating mutations in hematopoietic stem cells can precede AML diagnosis by decades [5][6][7]. The 2016 revision of the WHO classification of hematologic malignancies [8] distinguishes nine AML subtypes of clinical and prognostic importance. Genetic aberrations permit precise classification with risk assessment in 50-55% of AML cases and may guide targeted therapy. A proposed genomic classification based on an extended panel of recurrently mutated genes permits classification of 80-85% of AML cases [9].
Accurate classification of AML, however, requires different techniques to capture structurally diverse genetic abnormalities such as larger insertions and deletions (indels), fusion genes, and structural chromosomal aberrations in addition to small variants. Moreover, expression levels of structurally normal genes can have decisive prognostic impact [10][11][12]. Therefore, AML diagnosis and risk assessment remains complex, expensive, and frequently incomplete with current technological platforms. To address this unmet medical need, we implemented whole transcriptome messenger RNA sequencing (mRNAseq) without concomitant sequencing of germ-line DNA as a single platform to acquire all genetic information relevant for current and future classification and risk categorization. In this diagnostic paradigm, an accredited mRNAseq protocol acquires comprehensive data on a diagnostic sample. For single-run data analysis, we developed an integrated bioinformatic pipeline designated (Human AML Expedited Transcriptomics (HAMLET); Fig. 1). HAMLET is flexibly adaptable and tailored to interrogate sequence variants and expression levels of a selected panel of genes as well as translocations from mRNAseq data.
In this report, we apply HAMLET (version 1.0) for AML classification according to the WHO 2016 [8] and genomic classifications [9] and for risk assessment according to the European Leukemia Net [1].

AML samples
One hundred cryopreserved AML samples obtained with written informed consent were selected from the Hematology Biobank of Leiden University Medical Center (LUMC) with approval by the institutional review board (no. B 18.047). All samples had been genotyped for NPM1 mutations and FLT3 internal tandem duplications (FLT3-ITD) by the accredited LUMC molecular diagnostic laboratory.

Whole messenger transcriptome sequencing
At least 1 µg of total RNA was isolated from mononuclear cells per sample without prior enrichment for leukemic blasts (RNAqueous kit, Thermo Fisher Scientific, Bleiswijk, Fig. 1 The bioinformatics pipeline HAMLET. HAMLET was developed as a bioinformatics pipeline to call all relevant information for diagnosis and prognosis of AML from raw mRNAseq data. HAMLET integrates four modules using algorithms to detect fusion genes (STARfusion ∩ FusionCatcher), small variants (VARSCAN), large tandem duplications (ReSCU), and gene expression. HAMLET permits immediate AML classification into subtypes according to WHO 2016 or genomic classifications, provides additional prognostic information according to the European Leukemia Net, and identifies molecular targets for therapy. The Netherlands), followed by treatment with DNAse I and silica membrane purification (RNeasy kit, Qiagen, Hilden, Germany). RNA was quantified using the Qubit RNA HS assay kit (Thermo Fisher Scientific) and diluted to 100 ng/µl. The quality of total RNA was confirmed by Fragment Analyzer (Advanced Analytical Technologies, Heidelberg, Germany). Samples with RNA quality numbers ≥6 were selected for RNA library preparation in an ISO/IEC 17025-accredited protocol (TruSeq RNA library preparation kit v2, Illumina, San Diego, CA). After enrichment of mRNA by oligo dT magnetic beads and fragmentation, cDNA synthesis was performed, followed by adapter ligation and PCR amplification. Paired-end sequencing with a read length of 126 bp was performed on an Illumina HiSeq 2500v4 sequencer to at least 12.5 Gbp per sample according to the manufacturer's protocol, yielding~50 million read pairs. Image analysis, base calling, and quality check was performed with Illumina data analysis pipeline RTA v1.18.64 and Bcl2fastq v1.8.4. RNAseq reads were provided in compressed Sanger FASTQ format.

Quality parameters and alignment to human reference genome
Each input sample file was analyzed by FastQC (version 0.10.1). Known overrepresented sequences detected by FastQC were used for adapter clipping using cutadapt (version 1.2) setting the minimum accepted read length to 20 bp. Low-quality bases were then trimmed using sickle (version 1.33). Each read pair was synchronized to maintain the pair ordering using a custom Python script and were merged into a single file. Read pairs were aligned to human reference genome GRCh38 using GSNAP aligner (version 2014-12-23), setting it for novel splice sites discovery (--novel splicing 1) and unique alignments only (--npaths 1 and --quiet-if-excessive). Output SAM files were compressed, position-sorted, and indexed using Picard suite (version 1.141).

Gene expression
To measure expression levels of individual exons, the total number of single read bases aligned to that exon was calculated. This number was normalized to exon length and to the total number of aligned bases for each sample. Similarly, to calculate the expression level of a gene, bases aligned to any exon of the gene were counted and normalized to the sum of gene exon lengths and to the total number of aligned bases. Expression levels were scaled to the number of single read bases aligned to 1 kbp of an exon or gene, respectively, per 10 9 of total aligned bases.

Gene signature for CEBPA double mutants
Gene expression data were also employed to inspect previously published signatures that distinguish between CEBPA double mutant carriers from CEBPA single mutant and wild-type carriers [13,14]. For this purpose, normalized count data, either per gene or per exon, was filtered on minimal expression with the requirement that at least 75% of the samples should have a minimal expression of 1.
Remaining expression values were voom-transformed using limma [15] for R and Z-scaled. Normalized expression levels of genes or exons indicative of CEBPA double mutation status were then plotted in a heatmap using the ggplots2 R package (https://ggplot2.tidyverse.org).

Small variants
BAM files were processed using samtools mpileup (version 1.3.1) and VarScan (version 2.4.2) for variant calling. We disabled the VarScan strand filter (--strand-filter 0), set the minimum base frequency for variant calling to 10% of the total depth at any given position (--min-var-freq 0.1), and relaxed the p-value threshold to 0.05 (--p value 0.05). The resulting VCF files were then filtered with the BEDtools suite (version 2.21.0) for variants in 13 genes. The genes were selected based on a mutation frequency of >5% in cosmic AML samples and relevance for classification and prognostication: NPM1 (Ensembl ID: Since each of the 13 selected genes is mutated in >5% of AML, a panel of 100 samples was selected in order to detect genetic aberrations in each of the respective genes in at least two different samples. Gene positions are defined according to Ensembl ID throughout this manuscript. Filtered VCF files were annotated using Variant Effect Predictor (VEP version 77) and included annotation of allele frequencies from the 1000 Genomes Project [16] and the Genome of The Netherlands (GoNL) [17]. Variants were filtered based on an allele frequency of <5% in any of the subpopulations of the 1000 Genomes Project [16] and the GoNL [17] and annotated for allele frequencies and predicted effects. Based on somatic mutations in AML as reported in COSMIC, mutation hotspot regions were defined for NPM1, FLT3, DNMT3A, ASXL1, KIT, IDH1, IDH2, and NRAS. Variants are displayed using a custom R script that highlights its VEP-predicted impact on the transcript. Initial screening revealed calling of an identical TET2 ENST00000380013.6:c.5844dup in every sample that could not be validated by RT-PCR and Sanger sequencing. Therefore, this specific variant was excluded from the analysis as sequencing artefact.

Detection of tandem duplications in FLT3 and KMT2A
To detect internal tandem duplications in FLT3 (FLT3-ITD) and partial tandem duplications in KMT2A (KMT2A-PTD), FASTQ reads were aligned to the FLT3 primary gene transcript (ENST00000241453) or KMT2A primary gene transcript (ENST00000534358) using BWA MEM (version 0.7.13), setting the penalty of 5′ and 3′ clipping to 2 and 2 (-L 2,2), respectively. Samtools (version 1.3.1) was used to discard unaligned reads. Alignment files were then compressed, sorted, and indexed using the Picard suite (version 1.141). Soft-clipped (SC) fragments of reads partially aligning to FLT3 exon 14-15 (1787-2024 in ENST00000241453 and 1705-1942 in coding sequence) or KMT2A exon 2-13 (456-4719 in ENST00000534358 and 433-4696 in coding sequence) were extracted by Fidus. SC shorter than six bases (the ceiling of log 4 (region size)) were discarded. Each SC is classified by its location at the start (sSC) or end of the aligned region (eSC). Per SC class, local alignment back to a portion of the target region was performed to find its anchor position, defined as the unique terminal position of the local alignment. Alignments with a score of <50% of the possible maximum were discarded. For a given soft clip sequence, partial ITD or PTD candidates are defined as the region enclosed by its position and its anchor position. Final ITD or PTD candidates are defined as pairs of partial ITD or PTD candidates consisting of a sSC and eSC that reciprocate each other within a maximum fuzziness value, i.e., when the anchor region of a sSC contains the soft clip position of an eSC or vice versa. Calling of definite ITD or PTD candidates identified as the region between reciprocal pairs of SC events was performed by the newly developed Reciprocal Soft Clip Utilization (ReSCU) algorithm.

The integrated HAMLET pipeline
All bioinformatic RNAseq analyses were integrated into the HAMLET pipeline with a pdf file as primary output that displays the analyzed, filtered, and annotated results of fusion genes, small variants, FLT3-ITD, KMT2A-PTD, and EVI1 expression in an easily interpretable graphic format. Annotations include small variants with protein consequences as determined by ENSEMBL Variant Effect Predictor, i.e., coding single nucleotide variants (SNV) and small indels. In addition, for all selected variants with protein consequences, a short summary is provided with information on variant type, protein consequence, read depth, location in hotspots, annotation in COSMIC, annotation as SNP with minor allele frequencies in different human subpopulations, and effect on protein function as predicted by PolyPhen and Sift scores.

Validation of HAMLET output
HAMLET results were interpreted by investigators in a blinded fashion, i.e., without prior knowledge of routine interphase cytogenetics, FISH analysis, and molecular diagnostics of NPM1 and FLT3.
Fusion transcripts were correlated to routine metaphase cytogenetics and FISH assays performed by the accredited diagnostic LUMC Laboratory of Clinical Genetics. In addition, fusion transcripts were extensively validated by PCR with primers flanking the breakpoints.
Sequence variants were validated by PCR with custom primers on cDNA generated from the same RNA source as for RNAseq followed by Sanger sequencing or targeted NGS. PCR amplification was performed using the PWO SuperYield DNA Polymerase kit (Roche, Mannheim, Germany) or Phusion Flash PCR kit (Thermo Fisher Scientific). PCR products were purified using Wizard SV Gel and PCR Clean-Up System kit (Promega Corporation). For targeted NGS on cDNA, libraries were prepared from PCR products by ligating barcoded TruSeq adapters (Illumina) using the KAPA Library Preparation kit (KAPA Biosystems) [18]. After pooling the libraries, sequencing was performed on the MiSeq sequencer (Illumina) using v3 sequencing reagents according to the manufacturer's protocol.
Expression of EVI1 was validated by quantitative RT-PCR performed in the accredited diagnostic laboratory of Erasmus Medical Center [11]. EVI1 expression relative to the PBGD housekeeping gene was calculated using the delta-delta CT method. Levels >0.1 relative to the EVI1overexpressing cell line SKOV3 were defined as positive.

Data deposition
Raw data files are available under controlled access in the European Genome-phenome Archive under accession numbers EGAC00001000956 (DAC), EGAS00001003096 (study), and EGAD00001004187 (dataset).

Results
Accredited mRNAseq on 100 AML samples (96 diagnostic pre-treatment samples, three longitudinal samples of relapses, one longitudinal sample of a presumed therapy-related AML; Tables 1 and SI) yielded 48-85 million read pairs (97-170 million reads) per case with a median insert size (distance between the 5′ termini of the paired reads) of 149-177 bases (Table SII). The average read length after quality control was 123-124 bases.

Detection of small variants
The VARSCAN component of HAMLET identified 246 small variants in the selected 13 AML genes with 100% specificity (Tables 2, SVII, and SVIII). The sensitivity of HAMLET was assessed for FLT3-ITD and hotspot regions of NPM1 and DNMT3A in all 100 cases and for the remaining gene regions a total of 418 targeted resequencing reactions on samples that were called as wild type by HAMLET (Table 2). In accordance with the well-known limitation of regular variant callers to detect longer insertions [9], VARSCAN identified only seven rather short FLT3-ITD of the 34 FLT3-ITD-carrying AML. Furthermore, HAMLET failed to call a 23 bp ASXL1 deletion (c.1900_1922del; case 3-030) and a 30 bp C-terminal CEBPA insertion (c.936_937ins30; case 3-003). In both cases, HAMLET correctly identified a canonical C-terminal insertion in ASXL1 (3-030) and a N-terminal frameshift mutation in CEBPA (3-003). The overall type (coding SNVs and in-frame or frameshift indels) and distribution of the small variants as detected by HAMLET corresponded well with their distribution as reported in COSMIC (Fig. S3).
Variants called by HAMLET that are listed in COS-MIC and located within defined genetic hotspots (Table SIX) are readily interpretable as pathogenic class A variants (n = 114; 46.3%) (Table SVII). Sixty variants (24.4%) were reported in COSMIC but not located in hotspots. Most of these class B variants were detected in genes where mutations are known to occur throughout the entire coding sequence (CEBPA, TET2, RUNX1, WT1, and TP53). The remaining variants were unreported (class C variants: n = 52; 21.1%) or not listed in COSMIC but described as SNP (class D variants, n = 20; 8.1%). Class D variants probably represent true genetic polymorphisms with low allele frequency. Class C variants can be interpreted based on their predicted pathogenic effects. According with these principles for interpretation, more than 90% of class A variants, app. 50% of class B and C variants, and <1% of class D variants have been reported as recurrent mutations in myeloid malignancies by Jaiswal et al. [25] (Table SVII).
In the TCGA AML cohort [3], small variants that are present in genomic DNA have been shown to be absent in the corresponding RNA sequence analysis. In particular nonsense variants and frameshift variants with a premature stop codon may be poorly expressed as a result of nonsensemediated RNA decay [26,27]. We therefore validated HAMLET results for small variants in RUNX1, TET2, TP53, DNMT3A, KIT1, and IDH1 by targeted NGS on genomic DNA isolated from 56 AML. These cases were selected for genetic features associated with an increased likelihood for mutations in TP53, RUNX1, and TET2 according to Papaemmanuil et al. [9]. In these 56 AML cases, HAMLET detected 67 variants including 18 nonsense and frameshift variants that are potential targets for nonsense-mediated RNA decay. Of these 67 variants, one TET2 variant (c.4317dup) in three AML cases was missed by targeted NGS (Table SX). This A insertion occurs in a stretch of 6 other A nucleotides, which is apparently difficult to sequence on the Ion GeneStudio S5 System. Reads for this variant were present in the raw data with allele frequencies of 2-3%. There were also two variants detected by targeted NGS that were missed by HAMLET. These variants included a RUNX1 missense variant which has been detected by targeted NGS with a low allele frequency of 10% and a polymorphism in IDH1. In conclusion, our data showed that HAMLET accurately called variants in TET2, TP53, RUNX1, KIT, IDH1, and DNMT3A including all variants encoded by transcripts that are potential targets for nonsense-mediated decay. We also determined the sequencing depth that is required to detect small variants by HAMLET-RNAseq and ran the pipeline on 20, 30, 40, and 47.5 million read pairs for each of the 100 AML. Of the 246 variants called by HAMLET, 53 variants were nonsense mutations or frameshift mutations with a premature stop. The sensitivity of HAMLET to detect these variants decreased from 100, 96, 85 to 70%, while the decrease for the remaining variants was from 100, 98, 96 to 81%. These data show that a high sequencing depth of~50 million read pairs is essential to ensure detection of all variant types including potential targets for nonsense-mediated RNA decay (Table SXI and Fig. S4).
Bi-allelic CEBPA mutations confer a favorable prognosis to patients with AML with a normal karyotype [1,8] and typically combine an N-terminal frameshift mutation within the first 357 bp of the coding sequence on one allele with a C-terminal in-frame insertion or deletion between c.834-1074 that disrupts the DNA-binding basic zipper (bZIP) domain on the other allele [28]. The distance between these events frequently precludes formal proof of their location on opposite chromosomes by commonly achievable read lengths on the Illumina platform. However, bi-allelic CEBPA mutations are associated with a specific gene expression profile [13,14]. mRNAseq-based gene expression profiling showed the characteristic bi-allelic CEBPA mutation-associated signature for all four AML samples carrying two CEBPA mutations (2-009, 2-039, 2-045, and 3-003), including case 3-003 whose C-terminal mutation was missed by HAMLET (Fig. 3). Of eleven AML with single CEBPA variants, one case (2-025) with the characteristic N-terminal frameshift mutation clustered with the double CEBPA mutants. Combined analyses of variant calling and gene expression profiling can therefore resolve uncertainties of variant calling alone. Since the gene signature for CEBPA double mutants is descriptive and not quantitative, it has not been implemented for clinical use in HAMLET.

Detection of tandem duplications in FLT3 and KMT2A
Reliable detection of the common tandem duplications in exon 14-15 of FLT3 is of special importance in AML diagnostics since presence of the FLT3-ITD is an indication for therapy with a tyrosine kinase inhibitor [29]. In addition to the categorical detection of a FLT3-ITD, its position, size, and mutant-to-wild-type allelic ratio have important prognostic impact in AML [30][31][32]. To improve the insufficient detection of FLT3-ITD by VARSCAN, we developed ReSCU as novel algorithm based on calling of partially aligning, SC reads that occur at reciprocal gene positions (Fig. S5).
ReSCU identified 36 cases with a FLT3-ITD, including all seven cases with a small ITD as detected by VARSCAN and all 34 cases identified by accredited diagnostics (Fig. 4a; Tables SXII and SXIII). Targeted resequencing by NGS additionally confirmed presence and length of the FLT3-ITD at positions identified by the soft clipping approach in all but two of these cases (Table SXIV). In the two discrepant cases that were called by ReSCU but not by standard diagnostics, the SC reads represented <1% of total coverage at their respective positions. Careful review of the raw data of the results of the accredited test revealed a weak FLT3-ITD signal below the detection threshold in both cases, suggesting higher sensitivity for HAMLET (Fig. S6). The allelic ratio of mutant-to-wild-type FLT3 fragments Validation of NPM1 mutations and FLT3-ITD was performed by PCR on genomic DNA and electrophoretic fragment size analysis. Validation of other variants was performed by RT-PCR on cDNA followed by Sanger sequencing or NGS using MiSeq. Gene hotspots are defined in Table SVI. mut variant detected, wt wild type.
correlated well with percentages of SC reads as called by ReSCU (Fig. 4b).

Clonal evolution
HAMLET revealed clonal evolution with acquisition and loss of genetic aberrations in all four pairs of samples taken at diagnosis and subsequent relapse (Table SXVIII). Persistence of an IDH2 mutation as the presumed founding event established a common origin of a de novo AML and subsequent presumed therapy-related AML despite markedly divergent genotypes (cases 3-021, 3-038). Therefore, the diagnosis of an independent, therapy-related AML was corrected to relapse of the original AML, although both samples would be classified as different AML subtypes.  CEBPA  CTNNA1  DLC1  LRRC28  MARVELD1   MEIS3  NDFIP1  RAB13  RAB34  SFXN3  SHD  SUCLG2  SVOPL  TNS3 UGT2B28 UMOLD1 Fig. 3 Detection of double CEBPA mutants. Expression was analyzed by RNAseq for 16 genes included in the 19-probe signature associated with bi-allelic CEBPA mutants [14]. Up-and downregulated expression levels are shown in green and red, respectively. All 100 AML cases are indicated on the Y-axis. All three double CEBPA mutants shared the same gene signature (2-009, 2-039, and 2-045). Two single CEBPA mutant cases (2-025, 3-003) clustered with the double mutants. Sanger sequencing revealed that one case (3-003) had a 30 bp C-terminal insertion that was missed by HAMLET, while the other case was a true single N-terminal mutant without C-terminal mutation (2-025). Similar results were obtained for the proposed 55probe signature (data not shown) [13].

Classification and prognostication of AML
Based on HAMLET and metaphase karyotyping results, 99 cases could be reclassified according to WHO 2016 and genomic classifications (Fig. 6) [8,9]. For this genetic classification and risk assessment, all variants were selected that have been reported as recurrent mutations in hematological malignancies by Jaiswal et al. [25]. HAMLET information facilitated risk reassessment in six cases, most notably assignment to adverse risk through NUP98-NSD1 fusions in AML without class-defining lesions, or through EVI1 overexpression without inv(3)(q21q26).

Discussion
This study demonstrates the feasibility and accuracy of mRNAseq for the classification and prognostication of AML. Our strategy is built on the novel diagnostic paradigm to capture comprehensive genetic information on a single platform and interrogation of the primary data with respect to currently relevant information. Results of the individual case analysis are reported automatically in a user friendly, easily interpretable format. Both elements of this strategy can improve independently with advances in sample processing and sequencing technology. Relevant new insight into AML subtypes, prognostic factors, and actionable targets can be readily incorporated into the HAMLET algorithm. By linking mRNAseq output to public databases, reliable data interpretation is possible without concomitant germ-line DNA sequencing. Advances in sequencing technology can be implemented to improve diagnostic sensitivity with regard to minor AML subclones without major redevelopment of the data analysis. HAMLET accurately detects fusion genes with diagnostic relevance for AML, including KMT2A fusions not originating from a t(9;11) but from various different partner genes [38]. Accurate detection of the oncogenic fusion genes ETV6-LYN [19,20], NUP98-NSD1 [21][22][23], and PIM3-SCO2 [24] facilitates their recognition as genetically defined entities for AML classification. The remarkable robustness in detecting fusion genes should render HAM-LET especially valuable to recognize various fusions of tyrosine kinase genes in Philadelphia chromosome-like acute lymphoblastic leukemia [39].

AML sample
Our current experience with RNAseq-HAMLET indicates its potential to replace all genetic tests for classification and risk assessment of AML except for metaphase cytogenetics, which is still required to detect complex karyotypes and aneuploidies. Since RNAseq has successfully been used to detect large chromosomal aberrations in acute lymphoblastic leukemia [46], the aim is to implement this in the next HAMLET version. Although multiplexed PCR and targeted NGS on large gene panels is currently widely implemented in routine diagnostics, long duplications and GC-rich genes remain difficult to sequence by targeted NGS and are therefore often missed (KMT2A-PTD) or analyzed separately (FLT3-ITD and CEBPA). As demonstrated by downsampling the number of RNA reads (Tables SIV, SXI, and SXIII), a sequencing depth of~50 million read pairs is essential to accurately detect all structurally diverse genetic aberrations in AML. A high sequencing depth is particularly needed to detect nonsense and frameshift variants with a premature stop that may be poorly expressed due to nonsense-mediated RNA decay [3,26,27]. This is also the case for genetic aberrations detected by reads that partially align to the GRCh38 genome such as junction reads and spanning fragments for detection of fusion transcripts, as well as SC reads for detection of long tandem duplications in FLT3 and KMT2A. While measurement of FLT3-ITD allelic ratios should be performed on DNA according to the ELN recommendations [1], a recent study in pediatric AML suggests measuring FLT3-ITD preferentially on RNA, since ITD expression is more strongly associated with poor overall survival than its presence per se [40]. However, RNA-based assays should be sufficiently sensitive to detect FLT3-ITDs in AML subclones in order to start treatment with FLT3 inhibitors [29].
HAMLET results at a sequencing depth of 50 million read pairs can be obtained at a competitive price compared with the conventional diagnostic workflow within 5 days from sample receipt to finished report. The pipeline requires 4 h for sample receipt and isolation of mononuclear cells, 1 h for isolation of total RNA, 24 h for mRNA isolation, and cDNA library preparation, 44 h for RNA sequencing, 24 h for RNAseq analysis and 15 min for data interpretation (excluded time for sample shipment, data transfer and other logistics that are dependent on geographic location, computational networks and negotiations).
In conclusion, HAMLET appears to be a reliable approach to obtain comprehensive genetic information with prognostic and therapeutic relevance for AML in a single assay. RNAseq-based diagnostics also have potential for improved risk assessment and prediction of sensitivity to targeted agents on an individual basis [4,47]. Based on results of the present study, we are currently implementing HAMLET as routine diagnostic procedure at Leiden University Medical Center to improve risk-classification and personalized treatment of AML.

Code availability
The latest version of HAMLET is available under the MIT licence from https://git.lumc.nl/hem/hamlet. The version of HAMLET used in this publication is available in a separate branch, i.e., https://git.lumc.nl/hem/hamlet/tree/publication.