Deep learning detects acute myeloid leukemia and predicts NPM1 mutation status from bone marrow smears

Eckardt, Jan-Niklas; Middeke, Jan Moritz; Riechert, Sebastian; Schmittmann, Tim; Sulaiman, Anas Shekh; Kramer, Michael; Sockel, Katja; Kroschinsky, Frank; Schuler, Ulrich; Schetelig, Johannes; Röllig, Christoph; Thiede, Christian; Wendt, Karsten; Bornhäuser, Martin

doi:10.1038/s41375-021-01408-w


Article
Open access
Published: 08 September 2021

ACUTE MYELOID LEUKEMIA


Leukemia volume 36, pages 111–118 (2022)



Abstract

The evaluation of bone marrow morphology by experienced hematopathologists is essential in the diagnosis of acute myeloid leukemia (AML); however, it suffers from a lack of standardization and from inter-observer variability. Deep learning (DL) can process medical image data and provides data-driven class predictions. Here, we apply a multi-step DL approach to automatically segment cells from bone marrow images, distinguish between AML samples and healthy controls with an area under the receiver operating characteristic curve (AUROC) of 0.9699, and predict the mutation status of Nucleophosmin 1 (NPM1), one of the most common mutations in AML, with an AUROC of 0.92 using only image data from bone marrow smears. Utilizing occlusion sensitivity maps, we observed previously unreported morphologic cell features, such as a pattern of condensed chromatin and perinuclear lightening zones in myeloblasts of NPM1-mutated AML and prominent nucleoli in wild-type NPM1 AML, that enabled the DL model to provide accurate class predictions.


Introduction

A fundamental component in the diagnostic workflow of acute myeloid leukemia (AML) is cytomorphology [1]. The assessment of myeloblast counts and their morphology is essential for correct diagnosis, response assessment, and relapse detection. Cytomorphology may, in some cases, also lead to the suspicion of possible underlying genetics [2], e.g., in acute promyelocytic leukemia with t(15;17) and PML-RARα [3, 4] and AML with t(8;21), inv(16), or t(16;16) [5]. One of the most commonly mutated genes in AML is Nucleophosmin 1 (NPM1). It plays a critical role in disease initiation and is utilized for molecular risk stratification in the recent European LeukemiaNet 2017 (ELN2017) recommendations [1]. Mutated NPM1 can be found in a third of all adult AML cases and in up to 50–60% of AML cases with a normal karyotype [6, 7], and it is considered a distinct disease entity in the current WHO classification [8]. So far, different morphological subtypes of AML according to the FAB classification [3] have been associated with different frequencies of NPM1 mutations [9]. However, the interpretation of cytomorphologic image data is subjective, time-consuming, and suffers from intra- and inter-observer variability [10, 11]. Artificial neural nets (ANN) have demonstrated excellent capabilities in the processing of large quantities of image data [12]. Deep learning (DL) models are large-scale ANN consisting of a multitude of interconnected parallel processing units called artificial neurons [13, 14]. Convolutional neural nets (CNN) in particular achieve outstanding results in image recognition [15]. These capabilities can be used for computer vision purposes in the diagnosis of acute leukemias [16, 17, 18].

In our study, we present a scalable CNN-based model that accurately distinguishes between AML cases and healthy subjects from digitized bone marrow images. Further, our model accurately predicts NPM1 mutation status from bone marrow cytomorphology and unveils distinct morphologic features for the prediction of NPM1 mutation status.

Methods

Data set and molecular analysis

We identified 1251 patients who had been newly diagnosed with and treated for AML in the previously reported multicentric trials (AML96 [19], AML2003 [20], AMLCG1999 [21], AML60+ [22], AMLCG2008 [23], and SORAML [24]) or from the multicentric patient registry of the German Study Alliance Leukemia (SAL, NCT03188874) via retrospective chart review. Eligibility criteria for the AML cohort were newly diagnosed AML according to WHO criteria [8], age ≥18 years, and available biomaterial at initial diagnosis. A control cohort comprised 236 bone marrow samples from healthy bone marrow donors who underwent bone marrow donation at our center. Figure 1 shows the set-up of the study cohort and the split of augmented image data between training and test set (4:1). All mentioned studies were previously approved by the Institutional Review Board of the Technical University Dresden. All participants gave their written informed consent according to the Declaration of Helsinki. The preparation of squash slides was performed from anticoagulated bone marrow by experienced laboratory technicians within 2 h after the sample was taken, as recommended by WHO guidelines [25]. Sample staining was performed with the May–Grünwald–Giemsa method [26]. Screening for NPM1 mutations was performed as described previously [27], and a 5% variant allele frequency (VAF) mutation cut-off was used. High-resolution pictures of representative regions of the bone marrow smears (BMS) were taken using the Nikon Eclipse E600 microscope (50-fold magnification) with the Nikon DSFi2 mounted camera and Nikon Imaging Software Elements D4 for image processing. Corresponding regions of interest were manually selected by hematologists and measured 0.1775 × 0.1325 mm. For selection of archived BMS, image acquisition, and upload of images to the database, 10 min of manual labor were needed per case. Per case, a median of 168.5 cells were captured (interquartile range: 124–217). Samples were randomly assigned to either a training or a validation set with a split of 4:1.
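The random 4:1 assignment described above can be illustrated with a minimal sketch. It assumes that assignment happens per case, so that all images derived from one case land in the same set; the text does not spell out this detail. The function name `split_cases`, the seed, and the case identifiers are hypothetical and only serve the illustration.

```python
import random


def split_cases(case_ids, train_fraction=0.8, seed=0):
    """Randomly assign unique case IDs to a training and a validation set."""
    ids = sorted(set(case_ids))   # de-duplicate; sorting makes the shuffle reproducible
    rng = random.Random(seed)     # local generator, does not touch global random state
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers: 1251 AML cases plus 236 healthy donors = 1487 cases.
cases = [f"case_{i:04d}" for i in range(1487)]
train_ids, val_ids = split_cases(cases)
```

Splitting by case rather than by image is a design choice that avoids augmented copies of the same smear appearing on both sides of the split.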

DL model

A multi-step machine learning workflow with individual DL models for different tasks was set up as shown in Fig. 2.

Step 1: BMS were digitized and uploaded to an online segmentation and labeling platform that we developed for the purpose of this work. A human-in-the-loop cell segmentation approach was performed by hematologists with a Faster Region-based Convolutional Neural Net (FRCNN) [28]. First, initial segmentation was done with the VGG Image Annotator tool [29]. Then, the FRCNN was trained with the segmented images and created new segmentation proposals for unsegmented images, which were manually corrected by hematologists. The loop was closed by the refinement of segmentation proposals and repeated network training. A quarter of cases were segmented using this human-in-the-loop approach, while for the remainder of cases the FRCNN worked autonomously without the need for human intervention for re-segmentation of cells. This way, segmentation quality improved substantially over iterations, eliminating the need for manual segmentation of the remaining unsegmented images. Additionally, hyperparameter optimization was performed automatically using the Optuna framework [30] with a predefined hyperparameter space.

Step 2: Feature extraction was performed manually by hematologists. In all, 8500 individual cells were labeled according to lineage, cell type, and characteristics like Auer rods. Features like cell size, eccentricity, and color range were automatically derived by the computer vision algorithms.

Step 3: For distinction between AML and healthy control samples based on segmented images, we trained a multitude of DL models for binary predictions of cell types and characteristics that expressed results as ratios (e.g., ratio of myeloblasts among all cells, or features such as presence or absence of Auer rods). The aggregated results given by these individual models were used as input for an Ensemble Neural Net (ENN) for final classification decisions. Model architecture for the distinction between AML and healthy control samples was based on the Xception CNN [31] utilizing transfer learning. The Xception architecture was modified to receive BMS images (2560 × 1920 pixels) as input at the top level. Fully connected output layers for the 2048-dimensional output vectors of the core Xception architecture were established via hyperparameter optimization. Hyperparameters differed between individual models.

Step 4: For NPM1 status prediction, transfer learning with a ResNet50 [32] pretrained on ImageNet was utilized on BMS images. The ResNet50 architecture was modified to use larger input sizes (2000 × 1500 pixels), and the output layer was reshaped to a fully connected layer with two neurons to accommodate the binary classification problem before retraining. Hyperparameter optimization was performed for learning rate, learning rate gamma, momentum, and weight decay. Occlusion sensitivity maps were used to derive information from classification decisions for NPM1 status prediction.

DL models were implemented in Python version 3.7.9 with Keras version 2.3.0, TensorFlow version 2.1.2, and PyTorch version 1.5.0. Computations were performed using a high-performance computing system.
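As an illustration of how per-cell predictions in Step 3 might be expressed as ratios before being passed to an ensemble model, the following sketch aggregates cell-level labels into per-image ratios. This is not the authors' implementation: the function `ratio_features`, the label strings, and the example cells are assumptions made for illustration only.

```python
from collections import Counter


def ratio_features(cell_labels, feature_names):
    """Turn per-cell label sets for one image into per-image ratios.

    cell_labels: list with one set of predicted labels per segmented cell.
    feature_names: labels for which a ratio should be reported.
    """
    total = len(cell_labels)
    if total == 0:
        return {name: 0.0 for name in feature_names}
    counts = Counter()
    for labels in cell_labels:
        counts.update(set(labels))  # each cell contributes at most once per label
    return {name: counts[name] / total for name in feature_names}


# Hypothetical predictions for four segmented cells of one image.
cells = [
    {"myeloblast", "auer_rod"},
    {"myeloblast"},
    {"lymphocyte"},
    {"erythroblast"},
]
features = ratio_features(cells, ["myeloblast", "auer_rod"])
```

A vector of such ratios, one entry per binary model, is the kind of aggregated input an ensemble classifier could consume.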

**Fig. 2: Schematic workflow of the multi-step deep learning platform.**

Model performance and statistical analysis

For performance analysis of the classification models, we used precision–recall curves and receiver operating characteristics (ROC) with the area under the curve (AUC). Precision is the fraction of true positives among all positive predictions of the DL model, while recall is the fraction of true positives among all actually positive cases. The final models were evaluated on the validation set that was withheld from model training. To compare NPM1 VAF, the Mann–Whitney U test was used. Computational and statistical analysis was performed in Python (version 3.7.9) and R (version 4.0.3).
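The metrics named above can be made concrete with a small pure-Python sketch. It computes precision and recall from binary predictions, and it computes the AUROC through its known equivalence to the Mann–Whitney U statistic divided by the number of positive-negative pairs. The toy labels and scores are invented for illustration, and the sketch does not compute a p-value for the U test.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def mann_whitney_u(xs, ys):
    """U statistic of xs versus ys: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


def auroc(y_true, scores):
    """AUROC as U / (n_pos * n_neg), using scores of positives versus negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return mann_whitney_u(pos, neg) / (len(pos) * len(neg))


# Invented toy data: three positives and three negatives.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # illustrative 0.5 threshold
precision, recall = precision_recall(y_true, y_pred)
area = auroc(y_true, scores)
```

The quadratic pair loop keeps the relationship between U and AUROC visible; for large data sets a rank-based formulation would be the more practical choice.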

Results

DL accurately distinguishes between BMS of AML patients and healthy bone marrow donor samples

We retrospectively identified 1251 AML patients, 386 of whom harbored mutated NPM1 according to molecular analyses. Detailed information on patient characteristics and controls is provided in Table 1. A total of 94,162 individual cells were manually segmented to iteratively train the FRCNN. Subsequently, automatic cell segmentation with the FRCNN achieved a mean average precision and a mean average recall of 0.97 each at an intersection over union ratio of 0.5. Inaccuracies were mostly due to overlapping cells. We then applied a CNN-based binary classification model on the previously segmented images to distinguish between cell types and characteristics, and the aggregated results were used by an ENN to distinguish between AML and healthy donor samples. We found this multistage approach to substantially increase accuracy over simple whole image classification with only one CNN. Single cell-based disease status prediction was tested, but did not yield satisfactory results (AUROC 0.53). To adjust for the moderate sample size and to balance the data set, simple image augmentation techniques like linear transformations or adjustment of color channels and brightness range were applied to BMS images. Thereby, we reached an augmented sample size of 5204 AML and 5428 non-AML (healthy donor) BMS images. To prevent overfitting, we used a pooling dropout of 0.25 as suggested by automated hyperparameter optimization. The AML classification model achieved an average AUC for the precision–recall curve of 0.9691 (95% CI: 0.9669–0.9713; Fig. 3A) and an average AUROC of 0.9699 (95% CI: 0.9677–0.9721; Fig. 3B) with a corresponding micro-average accuracy of 0.91. Table 2A shows the distribution of correctly and incorrectly identified AML and control samples in the validation set.
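The intersection over union (IoU) criterion used for the segmentation metrics above can be sketched as follows. The sketch assumes axis-aligned bounding boxes given as `(x_min, y_min, x_max, y_max)`; the box coordinates are invented, and the 0.5 threshold mirrors the ratio quoted in the text.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(predicted, reference, threshold=0.5):
    """A prediction counts as a hit when its IoU reaches the threshold."""
    return iou(predicted, reference) >= threshold


# Invented example: one reference cell and two candidate detections.
reference = (0, 0, 10, 10)
shifted_far = (5, 0, 15, 10)    # IoU = 50 / 150
shifted_near = (2, 0, 12, 10)   # IoU = 80 / 120
```

Overlapping cells, which the text names as the main error source, are exactly the case where one detection can partially cover two reference boxes without reaching the threshold for either.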

Table 1 Patient characteristics.


**Fig. 3: Performance measures of the AML classification model and *NPM1* prediction.**

Table 2 Classification accuracy in the validation set.


DL accurately predicts NPM1 mutation status and unveils morphologic features

Further, we investigated whether DL could predict the mutational status of NPM1 from bone marrow morphology. BMS image classification at a 50-fold magnification was performed with a ResNet50 architecture using transfer learning. Mirroring and random cropping plus resizing were used to increase sample size and to balance the data. A weight decay of 0.0003 was utilized to prevent overfitting, and the data was divided into training and test set with a split of 4:1. The model achieved an accuracy of 0.86 for NPM1 prediction and an AUROC of 0.92 (95% CI: 0.8768–0.9631; Fig. 3C). Table 2B shows the distribution of correctly and incorrectly identified NPM1-mutated and NPM1 wild-type samples in the validation set. Classification on single cells compared to whole image classification did not improve accuracy. To identify key morphological features that led the DL model to the prediction of the respective mutation status, we used occlusion sensitivity maps (Fig. 4). This method iteratively blocks pixels of an image from being evaluated by the DL model for classification, which may reduce its predictive capabilities. Thereby, image areas that are essential for high accuracy can be detected, as they greatly reduce model performance when being blocked (Fig. 4 ii, iii). By analyzing the heatmaps, we observed that the prediction of mutated NPM1 was associated with a pattern of condensed chromatin and perinuclear lightening zones in myeloblasts (Fig. 4a, orange arrows indicate examples). The prediction of NPM1 wild type was driven by prominent nucleoli (Fig. 4b, yellow arrows indicate examples), which could only very rarely be observed in samples with mutated NPM1 and in that context led to misclassification (false negatives). We further analyzed patient samples from the validation set, grouped by NPM1 mutation status and by true or false predictions given by the DL model, with regard to clinical and molecular data. 
NPM1-mutated AML samples that were correctly identified by the model (true positives) had a significantly higher median VAF than NPM1-mutated AML samples that were misclassified (false negatives) (true positives: 0.41 [95% CI: 0.39–0.42] vs. false negatives: 0.31 [95% CI: 0.1–0.42], p = 0.018, Fig. 5). Further, the rate of patients with therapy-associated AML (tAML) was significantly higher among false negatives compared to true positives (false negatives: 27.3% vs. true positives: 4.1%, p = 0.02), and false negatives had a significantly lower median white blood cell count (WBC) (false negatives: 13.16 GPt/l [IQR: 2.25–28.57] vs. true positives: 37.48 GPt/l [IQR: 17.84–84.95], p = 0.007) and a trend for lower blast counts in peripheral blood (false negatives: 25.5% [IQR: 4.75–38.5] vs. true positives: 52.5% [IQR: 16–75.25], p = 0.062). No significant differences for age, sex, ELN2017 risk category, absence or presence of a complex karyotype, bone marrow blast count, Hb, and platelet count were detected. For patients with wild-type NPM1 AML, there was a trend for lower median WBC for true negatives (true negatives: 3.54 GPt/l [IQR: 1.52–19.82] vs. false positives: 11.3 GPt/l [IQR: 3.95–36.849], p = 0.095), but age, sex, ELN2017 risk category, AML type (de novo, secondary, or therapy-associated), absence or presence of a complex karyotype, bone marrow or peripheral blast count, Hb, and platelet count showed no differences. As another internal sanity check, we applied the pretrained classifier to the healthy bone marrow donor image data set and found that 214/236 (91%) of cases were correctly identified as NPM1 wild type, while only 22/236 (9%) were labeled as NPM1 mutated (false positives). However, we want to point out that the NPM1 classifier was never trained on healthy controls. 
Considering its AML-specific training, the very low false-positive rate on newly presented and differently structured image data of healthy controls underlines the distinct morphology picked up by the classifier for correct predictions in NPM1-mutated AML. Interestingly, when we reviewed patient chart data and molecular results in the validation set, we found one sample that was incorrectly labeled as mutated NPM1, but was in fact wild-type NPM1, and the corresponding BMS image was correctly identified as such by the DL model.
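The occlusion sensitivity procedure described above (block a region, re-run the classifier, record how much the score drops) can be sketched in a few lines. This is a simplified illustration rather than the study's code: the image is a small nested list, `toy_predict` is a stand-in for a trained network, and the patch size and fill value are arbitrary assumptions.

```python
def occlusion_map(image, predict, patch=2, fill=0.0):
    """Score drop per occluded patch; larger values mark more important regions.

    image: 2D list of pixel intensities.
    predict: callable returning a class score for an image.
    """
    height, width = len(image), len(image[0])
    baseline = predict(image)
    heatmap = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            occluded = [r[:] for r in image]  # copy so the original stays intact
            for y in range(top, min(top + patch, height)):
                for x in range(left, min(left + patch, width)):
                    occluded[y][x] = fill
            row.append(baseline - predict(occluded))
        heatmap.append(row)
    return heatmap


def toy_predict(img):
    """Stand-in classifier: its score depends only on the top-left 2x2 corner."""
    return (img[0][0] + img[0][1] + img[1][0] + img[1][1]) / 4.0


image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(image, toy_predict)
```

In this toy case only the top-left patch changes the score, so it is the only region with a non-zero entry in the heatmap, which mirrors how essential image areas stand out in Fig. 4.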

**Fig. 4: Application of occlusion sensitivity maps to detect features derived by deep learning for the prediction of *NPM1* mutation status in AML.**

**Fig. 5: Variant allele frequency of *NPM1* true positives and *NPM1* false negatives.**

Discussion

We here present a machine learning approach for cell segmentation and image classification which provides a fast, scalable, and highly accurate method to identify AML samples from bone marrow cytomorphology. Our FRCNN achieved a cell segmentation accuracy of 0.97 from BMS. The binary classification model showed an AUC of 0.97 for both the ROC and the precision–recall curve and a micro-average accuracy of 0.91 distinguishing between AML and healthy bone marrow donor samples. Our model can potentially be applied in initial diagnosis when a case of suspected AML is evaluated upon first contact in a hematologic center. It operates autonomously once BMS images are uploaded and detects AML with high accuracy. The model could operate synchronously with lab technicians to flag cases that are highly suspicious of AML for fast evaluation by experts while results from other diagnostic procedures like flow cytometry, cytogenetics, and molecular genetics are still pending. However, a human-in-the-loop approach is still needed as we manually selected representative regions of the BMS for evaluation by the DL model. Also, it is to be noted that bone marrow donors in our cohort were substantially younger than AML patients. Increased age is associated with observable changes in the bone marrow microenvironment such as cellularity, proliferative activity, and apoptosis [33] and such systematic differences could introduce bias to a CNN classifier which needs to be taken seriously not only in our use case but also considering other applications of more subtle changes in bone marrow morphology. Further evaluation of the model using more diverse multicenter data is warranted. Another limitation is the necessity for manual selection of BMS areas representative for disease classification by human judgment. 
Since this is a potential source of bias, future work will focus on implementing whole slide imaging and an automatization of region-of-interest selection given recent advances such as DL-based automated focusing [34]. Further automatization of BMS development can be achieved using automated BMS staining devices [35].

Furthermore, we used DL to predict the mutation status of NPM1 from cytomorphology alone. Our DL model achieved a high accuracy of 0.86 in predicting NPM1 mutation status. AML with mutated NPM1 has previously been associated with cup-like blast morphology [36, 37]. When analyzing the features that the model used for NPM1 classification with occlusion sensitivity maps, we found previously unreported features like a pattern of condensed chromatin accompanied by perinuclear lightening zones for NPM1-mutated blasts. We observed prominent nucleoli in myeloblasts as a feature the DL model derived to predict wild-type NPM1 AML, while these could only rarely be observed in NPM1-mutated AML samples and then led to misclassification by the model. Further, we found a significantly higher VAF in NPM1 true positives, while the group of false negatives included a significantly higher rate of tAML. Wild-type NPM1 serves as a critical structural protein of the nucleolus, but mutations lead to a delocalization to the cytoplasm [38]. This process is partially triggered by insertions causing a frameshift of the C-terminal end of NPM1 and the formation of nuclear export signals [39, 40]. Weakened anchoring and predominant export signals subsequently result in increased nuclear export of NPM1 [38]. Arguably, a cytomorphologic correlate of this process may be the presence of prominent nucleoli in wild-type NPM1 AML and the absence thereof in mutated NPM1 AML, both of which were detected as highly predictive features by our DL model.

Our study shows that DL can derive morphologic features from cytomorphology that predict mutation status. Future work will focus on other clinically important mutations and their morphologic imprint that DL may be able to pick up. In line with our findings, a recent study showed that DL can associate the morphology of myelodysplastic syndromes (MDS) with distinct genetic imprints [41]. However, in order to be integrated into clinical practice, machine-learning models need to be accurate and generalizable. As their development is complex, collaboration between physicians and software engineers is needed in an iterative approach to increase model performance. Since the majority of recently proposed machine-learning models, including ours, are built on retrospective data, future studies will have to implement such models in a prospective setting to confirm their diagnostic value [42]. Due to the heterogeneity of cell morphology as well as the close proximity of cells, disease classification from bone marrow is much more complex than from peripheral blood. Our use case to delineate AML from healthy bone marrow serves as groundwork for more complicated applications of CNNs in bone marrow morphology. AML is defined by bone marrow blast count [1], and CNNs can use a ratio of blasts to accurately detect AML. However, more complex use cases such as reactive bone marrow changes, benign disorders such as vitamin B₁₂ deficiency, or hematologic neoplasms such as MDS are associated with subtle changes of cell morphology [43, 44]. In this scenario, CNNs have to be trained to accurately detect and assign such morphologic changes to the respective disorders. Mori et al. [45] recently used CNN-guided detection of decreased granules, one of the most common dysplastic changes in MDS, and reported high accuracy for their classifier based on the ResNet-152. 
Accordingly, integrated analysis of more complex morphologies can potentially be achieved by feature engineering using a knowledge bank of expert-annotated cells with sufficiently sized training sets per morphologic feature (conceivably in the four- to five-digit number range). Since many hematologic neoplasms are rare disease entities, the development of such a large database requires extensive cooperation and data sharing between institutions and countries, ideally to provide an open-source bone marrow database on which independent ML models can be trained, analogous to existing cancer databases such as The Cancer Genome Atlas [46]. Nevertheless, samples need to be properly anonymized to ensure patient data safety. If maintained and funded properly, such a database may vastly accelerate the development of clinically relevant computer vision tools for hematologic diagnostics. Standardization of data acquisition and accessible documentation of methodologies should be implemented to limit bias inherent to local methodologies of digitizing BMS and reporting patient data. Further, an integration of different diagnostic modalities such as cytomorphology of both bone marrow and peripheral blood, flow cytometry, and genetic and clinical data seems warranted to build ML models that may aid in clinical decision making, since evaluating only one modality at a time is insufficient for accurate diagnosis. Ensemble learning could be used to integrate the outputs of different ML models for different diagnostic modalities and provide a comprehensive and interpretable output to the clinician. Future work will focus on the extension of our ML pipeline to other use cases as well as different diagnostic modalities. As our study was limited to our center only, future studies will focus on transferability.

In conclusion, we here present a DL approach for the fast and accurate detection of AML from bone marrow cytomorphology. Our DL model accurately predicts NPM1 mutation status and derived so far unreported morphologic features that indicate absence or presence of NPM1 mutations from myeloblast morphology. This approach can be implemented to aid in clinical decision making, accelerate diagnosis, and may serve as a proof-of-concept for further studies of genetic imprints on disease morphology using DL.

Data availability

De-identified original BMS image data that supported the findings of this study are publicly available under https://www.kaggle.com/sebastianriechert/bone-marrow-slides-for-leukemia-prediction.

Code availability

Python code for the DL models developed and implemented for the purpose of this study is publicly available under https://github.com/TimSchmittmann/DL_detection_of_AML_from_BMS and https://github.com/SebastianRiechert/autofrcnn and https://github.com/SebastianRiechert/npm1-training.

References

Döhner H, Estey E, Grimwade D, Amadori S, Appelbaum FR, Büchner T. et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood. 2017;129:424–47.
Bain BJ, Béné MC. Morphological and immunophenotypic clues to the WHO categories of acute myeloid leukaemia. Acta Hematol. 2019;141:232–44.
Bennett JM, Catovsky D, Daniel MT, Flandrin G, Galton DA, Gralnick HR. et al. Proposals for the classification of the acute leukaemias. French-American-British (FAB) co-operative group. Br J Haematol. 1976;33:451–8.
de Thé H, Chomienne C, Lanotte M, Degos L, Dejean A. The t(15;17) translocation of acute promyelocytic leukaemia fuses the retinoic acid receptor alpha gene to a novel transcribed locus. Nature. 1990;347:558–61.
Nishii K, Usui E, Katayama N, Lorenzo VF, Nakase K, Kobayashi T. et al. Characteristics of t(8;21) acute myeloid leukemia (AML) with additional chromosomal abnormality: concomitant trisomy 4 may constitute a distinctive subtype of t(8;21) AML. Leukemia. 2003;17:731–7.
Falini B, Martelli MP, Bolli N, Sportoletti P, Liso A, Tiacci E. et al. Acute myeloid leukemia with mutated nucleophosmin (NPM1): is it a distinct entity?. Blood. 2011;117:1109–20.
Grimwade D, Ivey A, Huntly BJP. Molecular landscape of acute myeloid leukemia in younger adults and its clinical relevance. Blood. 2016;127:29–41.
Arber DA, Orazi A, Hasserjian R, Thiele J, Borowitz MJ, Le Beau MM. et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016;127:2391–405.
Rose D, Haferlach T, Schnittger S, Perglerová K, Kern W, Haferlach C. Specific patterns of molecular mutations determine the morphologic differentiation stages in acute myeloid leukemia (AML). Blood. 2014;124:2388–2388.
Dasariraju S, Huo M, McCalla S. Detection and classification of immature leukocytes for diagnosis of acute myeloid leukemia using random forest algorithm. Bioengineering (Basel). 2020;7:120.
Fuentes-Arderiu X, Dot-Bach D. Measurement uncertainty in manual differential leukocyte counting. Clin Chem Lab Med. 2009;47:112–5.
Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017;60:84–90.
Jain AK, Jianchang M, Mohiuddin KM. Artificial neural networks: a tutorial. Computer. 1996;29:31–44.
Basheer IA, Hajmeer M. Artificial neural networks: fundamentals, computing, design, and application. J Microbiol Methods. 2000;43:3–31.
Guo Y, Liu Y, Oerlemans A, Lao S, Wu S, Lew MS. Deep learning for visual understanding: a review. Neurocomputing. 2016;187:27–48.
Rodellar J, Alférez S, Acevedo A, Molina A, Merino A. Image processing and machine learning in the morphological analysis of blood cells. Int J Lab Hematol. 2018;40(S1):46–53.
Matek C, Schwarz S, Spiekermann K, Marr C. Human-level recognition of blast cells in acute myeloid leukaemia with convolutional neural networks. Nat Mach Intell. 2019;1:538–44.
Ahmed N, Yigit A, Isik Z, Alpkocak A. Identification of leukemia subtypes from microscopic images using convolutional neural network. Diagnostics (Basel). 2019;9. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6787617/.
Röllig C, Thiede C, Gramatzki M, Aulitzky W, Bodenstein H, Bornhäuser M. et al. A novel prognostic model in elderly patients with acute myeloid leukemia: results of 909 patients entered into the prospective AML96 trial. Blood. 2010;116:971–8.
Schaich M, Parmentier S, Kramer M, Illmer T, Stölzel F, Röllig C. et al. High-dose cytarabine consolidation with or without additional amsacrine and mitoxantrone in acute myeloid leukemia: results of the prospective randomized AML2003 trial. J Clin Oncol. 2013;31:2094–102.
Buchner T, Berdel WE, Haferlach C, Schnittger S, Haferlach T, Serve H. et al. Long-term results in patients with acute myeloid leukemia (AML): the influence of high-dose AraC, G-CSF priming, autologous transplantation, prolonged maintenance, age, history, cytogenetics, and mutation status. Data of the AMLCG 1999 Trial. Blood. 2009;114:485–485.
Röllig C, Kramer M, Gabrecht M, Hänel M, Herbst R, Kaiser U. et al. Intermediate-dose cytarabine plus mitoxantrone versus standard-dose cytarabine plus daunorubicin for acute myeloid leukemia in elderly patients. Ann Oncol. 2018;29:973–8.
Braess J, Amler S, Kreuzer K-A, Spiekermann K, Lindemann HW, Lengfelder E. et al. Sequential high-dose cytarabine and mitoxantrone (S-HAM) versus standard double induction in acute myeloid leukemia—a phase 3 study. Leukemia. 2018;32:2558–71.
Röllig C, Serve H, Hüttmann A, Noppeney R, Müller-Tidow C, Krug U. et al. Addition of sorafenib versus placebo to standard therapy in patients aged 60 years or younger with newly diagnosed acute myeloid leukaemia (SORAML): a multicentre, phase 2, randomised controlled trial. Lancet Oncol. 2015;16:1691–9.
Swerdlow SH, Campo E, Pileri SA, Harris NL, Stein H, Siebert R. et al. The 2016 revision of the World Health Organization classification of lymphoid neoplasms. Blood. 2016;127:2375–90.
Bain BJ, Clark DM, Wilkins BS. Bone marrow pathology. Wiley; 2019. p. 736.
Thiede C, Koch S, Creutzig E, Steudel C, Illmer T, Schaich M. et al. Prevalence and prognostic impact of NPM1 mutations in 1485 adult patients with acute myeloid leukemia (AML). Blood. 2006;107:4011–20.
Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell. 2017;39:1137–49.
Dutta A, Zisserman A. The VIA annotation software for images, audio and video. In: Proceedings of the 27th ACM International Conference on Multimedia. Association for Computing Machinery (MM’19); 2019. p. 2276–9. Available from: https://doi.org/10.1145/3343031.3350535.
Akiba T, Sano S, Yanase T, Ohta T, Koyama M. Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Association for Computing Machinery (KDD’19); 2019. p. 2623–31. Available from: https://doi.org/10.1145/3292500.3330701.
Chollet F. Xception: Deep learning with depthwise separable convolutions. arXiv:1610.02357 [cs] [Preprint]. 2017 [cited 2021 Jan 12]. Available from: http://arxiv.org/abs/1610.02357.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. arXiv:1512.03385 [cs] [Preprint]. 2015 [cited 2021 Apr 15]. Available from: http://arxiv.org/abs/1512.03385.
Ogawa T, Kitagawa M, Hirokawa K. Age-related changes of human bone marrow: a histometric estimation of proliferative cells, apoptotic cells, T cells, B cells and macrophages. Mech Ageing Dev. 2000;117:57–68.
Rai Dastidar T, Ethirajan R. Whole slide imaging system using deep learning-based automated focusing. Biomed Opt Express. 2019;11:480–91.
Gemen EFA, de Wit NCJ, van Gerven MPB, de Jongh-Leuvenink J. The Sysmex SP1000i for automated bone marrow slide smear staining. Lab Med. 2009;40:719–23.
Kroschinsky FP, Schäkel U, Fischer R, Mohr B, Oelschlaegel U, Repp R. et al. Cup-like acute myeloid leukemia: new disease or artificial phenomenon? Haematologica. 2008;93:283–6.
Park BG, Chi H-S, Jang S, Park C-J, Kim D-Y, Lee J-H. et al. Association of cup-like nuclei in blasts with FLT3 and NPM1 mutations in acute myeloid leukemia. Ann Hematol. 2013;92:451–7.
Falini B, Brunetti L, Sportoletti P, Martelli MP. NPM1-mutated acute myeloid leukemia: from bench to bedside. Blood. 2020;136:1707–21.
Falini B, Bolli N, Shan J, Martelli MP, Liso A, Pucciarini A. et al. Both carboxy-terminus NES motif and mutated tryptophan(s) are crucial for aberrant nuclear export of nucleophosmin leukemic mutants in NPMc+ AML. Blood. 2006;107:4514–23.
Falini B, Bolli N, Liso A, Martelli MP, Mannucci R, Pileri S. et al. Altered nucleophosmin transport in acute myeloid leukaemia with mutated NPM1: molecular basis and clinical implications. Leukemia. 2009;23:1731–43.
Nagata Y, Zhao R, Awada H, Kerr CM, Mirzaev I, Kongkiatkamon S. et al. Machine learning demonstrates that somatic mutations imprint invariant morphologic features in myelodysplastic syndromes. Blood. 2020;136:2249–62.
Eckardt J-N, Bornhäuser M, Wendt K, Middeke JM. Application of machine learning in the management of acute myeloid leukemia: current practice and future prospects. Blood Adv. 2020;4:6077–85.
Cazzola M. Myelodysplastic syndromes. N Engl J Med. 2020;383:1358–74.
Bain BJ. Diagnosis from the blood smear. N Engl J Med. 2005;353:498–507.
Mori J, Kaji S, Kawai H, Kida S, Tsubokura M, Fukatsu M, et al. Assessment of dysplasia in bone marrow smear with convolutional neural network. Sci Rep. 2020;10:14734.
Ley TJ, Miller C, Ding L, Raphael BJ, Mungall AJ, Robertson A, et al.; Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med. 2013;368:2059–74.

Acknowledgements

We thank all contributing physicians, laboratories, and nurses associated with the German Study Alliance Leukemia and especially participating patients for their valuable contributions. The authors are grateful to the Centre for Information Services and High-Performance Computing TU Dresden for providing its facilities for high throughput calculations.

Funding

This study was funded in part by a MeDDrive Grant number 60499 Machine Learning for advanced integrated diagnostics in hematological malignancies to Dr. Middeke from the Technical University Dresden. Open Access funding enabled and organized by Projekt DEAL.

Author information

These authors contributed equally: Jan-Niklas Eckardt, Jan Moritz Middeke.
These authors contributed equally: Karsten Wendt, Martin Bornhäuser.

Authors and Affiliations

Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
Jan-Niklas Eckardt, Jan Moritz Middeke, Anas Shekh Sulaiman, Michael Kramer, Katja Sockel, Frank Kroschinsky, Ulrich Schuler, Johannes Schetelig, Christoph Röllig, Christian Thiede & Martin Bornhäuser
Institute of Circuits and Systems, Technical University Dresden, Dresden, Germany
Sebastian Riechert, Tim Schmittmann & Karsten Wendt
German Consortium for Translational Cancer Research DKFZ, Heidelberg, Germany
Martin Bornhäuser
National Center for Tumor Diseases (NCT), Dresden, Germany
Martin Bornhäuser

Authors

Jan-Niklas Eckardt
Jan Moritz Middeke
Sebastian Riechert
Tim Schmittmann
Anas Shekh Sulaiman
Michael Kramer
Katja Sockel
Frank Kroschinsky
Ulrich Schuler
Johannes Schetelig
Christoph Röllig
Christian Thiede
Karsten Wendt
Martin Bornhäuser

Contributions

J.-N.E., J.M.M., K.W., and M.B. designed the study. J.-N.E., J.M.M., A.S.S., K.S., F.K., U.S., J.S., C.R., C.T., and M.B. provided and analyzed patient samples. S.R., T.S., and K.W. developed and implemented the image processing and classification pipeline with neural networks. All authors analyzed and interpreted the data. J.-N.E. wrote the draft. All authors critically revised and edited the manuscript. All authors approved the final version of the manuscript.

Corresponding author

Correspondence to Jan-Niklas Eckardt.

Ethics declarations

Competing interests

C.T. is CEO and co-owner of AgenDix GmbH, a company performing molecular diagnostics. This study was funded in part by a MeDDrive Grant number 60499 Machine Learning for advanced integrated diagnostics in haematological malignancies from the Technical University Dresden to Dr. Middeke. The remaining authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Eckardt, JN., Middeke, J.M., Riechert, S. et al. Deep learning detects acute myeloid leukemia and predicts NPM1 mutation status from bone marrow smears. Leukemia 36, 111–118 (2022). https://doi.org/10.1038/s41375-021-01408-w

Received: 30 April 2021
Revised: 12 August 2021
Accepted: 27 August 2021
Published: 08 September 2021
Issue Date: January 2022
DOI: https://doi.org/10.1038/s41375-021-01408-w
