Main

The molecular characterization of myelodysplastic syndromes has dramatically improved stratification of this heterogeneous group of disorders into clinically relevant subgroups. Although identification of chromosomal abnormalities has demonstrated utility in the diagnosis and prognostic stratification of myelodysplastic syndromes, the significance of somatic mutations in the diagnosis of these disorders is not as well established.1, 2, 3, 4, 5 It has been estimated that approximately 80% of myelodysplastic syndrome cases have at least one mutation in one of approximately 40 genes.6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 However, it is essential to note that numerous studies have demonstrated that many of the same mutations associated with myelodysplastic syndromes are found in 5–10% of individuals over 70 years of age and without evidence of overt neoplasia or any clinical evidence of hematologic abnormality, a condition termed clonal hematopoiesis of indeterminate potential.19, 20, 21 Individuals with clonal hematopoiesis of indeterminate potential appear to have an increased risk of developing a hematologic malignancy, a risk that increases with increasing variant allele frequency. Clonal hematopoiesis of indeterminate potential thus appears to represent a benign precursor condition that increases the risk of subsequent malignancy, similar to the relationship between monoclonal gammopathy of undetermined significance and plasma cell myeloma and small monoclonal B-cell lymphocytosis and chronic lymphocytic leukemia. However, the absolute risk of progressing to malignancy remains low and, while many malignancies that develop in individuals with clonal hematopoiesis of indeterminate potential are myelodysplastic syndrome or acute myeloid leukemia, other non-myeloid malignancies develop, which limits the ability to predict the clinical behavior of individuals with this condition. Thus, identification of these mutations in individuals without cytopenias or other clinical evidence of hematologic malignancy is of unclear significance at the present time.

Although there is now substantial evidence for clonal hematopoiesis involving mutations in driver genes of hematologic malignancy in asymptomatic individuals, few studies have evaluated whether mutations in these genes are also found in patients with pancytopenia but without peripheral blood or bone marrow evidence of myelodysplasia. Previous work has identified a category of patients with idiopathic cytopenia of undetermined significance, defined as unexplained cytopenias that do not meet criteria for a diagnosis of a myelodysplastic syndrome.22 Kwok et al23 recently examined the mutation status of 22 genes frequently mutated in myeloid malignancy and found that approximately 35% of patients with idiopathic cytopenia of undetermined significance harbored at least one somatic mutation in one of these genes. These data indicate that a subset of cases of idiopathic cytopenia of undetermined significance is associated with expansion of a hematopoietic clone that has mutations in myelodysplastic syndrome-associated genes23 and provide the basis for a proposed new diagnostic category of clonal idiopathic cytopenias of undetermined significance that may lie on a spectrum between clonal hematopoiesis of indeterminate potential and myelodysplastic syndromes. However, the clinical behavior of idiopathic cytopenias of undetermined significance and clonal idiopathic cytopenias of undetermined significance is not yet known. In particular, it is unclear whether the associated cytopenias are stable or whether these conditions carry a higher risk of progressing to ‘true’ myelodysplastic syndrome or another myeloid neoplasm. It is also not yet clear whether these mutations have a pathogenic role with respect to the patient’s cytopenias or are simply passenger variants that are associated with true driver mutations.

To add to the growing body of work regarding myelodysplastic syndrome-associated mutations in different populations, we evaluated the frequency of variants in 20 myelodysplastic syndrome-associated genes in 53 patients with pancytopenia who lacked evidence of malignancy on bone marrow aspirate and biopsy. We also provide some data that begin to address the clinical behavior of patients with pancytopenia and somatic mutations in myelodysplastic syndrome-associated genes.

Materials and methods

Case Selection

Bone marrow aspirates obtained from December 2007 to December 2012 were selected from the Stanford Department of Pathology, with the specific aim of collecting cases from patients with pancytopenia not meeting World Health Organization (WHO) 2008 criteria for a diagnosis of a myeloid neoplasm and with normal cytogenetics. The clinical features of most of the cases have been previously reported.24 The electronic pathology records were searched for bone marrow aspirates obtained in the above time frame for the term ‘pancytopenia.’ The final pathology report and the clinical chart of each case were reviewed by one of the authors (SFP). Fifty-three cases meeting these criteria were identified and are referred to as the idiopathic pancytopenia group. One case in the idiopathic pancytopenia group had a deletion of chromosome Y, which is considered to be an age-related event and not related to neoplasia, and no mutations were identified in this case.25 The remaining 52 idiopathic pancytopenia cases had normal cytogenetics. The 53 idiopathic pancytopenia cases were subcategorized into one of five groups based on the final pathology report and clinical notes (Table 1). In addition, 38 cases from patients presenting with pancytopenia and meeting WHO 2008 criteria for a diagnosis of myelodysplastic syndrome (n=20), mixed myelodysplastic syndrome/myeloproliferative neoplasm (n=1), or acute myeloid leukemia (n=17) were selected as a comparison group. A summary of all of the cases evaluated in the study is provided in Table 1.

Table 1 Cohort Characteristics

Targeted Exome Sequencing

Targeted exome sequencing was performed as described previously.26 Briefly, genomic DNA was extracted from archived unstained bone marrow aspirate slides using the Qiagen DNA blood and tissue kit (Qiagen, Valencia, CA, USA) according to the manufacturer’s instructions. A Haloplex (Agilent Technologies, Santa Clara, CA, USA) target enrichment panel of 20 genes and 49 selected exons in ASXL1, CBL, CEBPA, CSF3R, DNMT3A, EZH2, FLT3, FLT3-ITD, IDH1, IDH2, JAK2, MPL, NPM1, NRAS, RUNX1, U2AF1, SETBP1, SF3B1, SRSF2, TET2, and TP53 was designed using SureDesign (Agilent Technologies). Haloplex target enrichment DNA libraries were prepared according to the manufacturer’s instructions (Agilent Technologies). Samples were sequenced on a MiSeq sequencer using the MiSeq Reagent Kit v2 with 2 × 150 cycle paired-end reads (Illumina, San Diego, CA, USA).

Data were analyzed with Agilent SureCall 3.0 software, using FASTQ data downloaded from Illumina’s BaseSpace. The mean and median read depths of variant alleles were 1895 and 1019, respectively. The mean variant allele frequency of all of the variant alleles from all 91 cases was 0.31. Boxplot figures of the read depths and variant allele frequencies for all genes and cases are shown in Supplementary Figures 1 and 2. The distribution of the different types of mutations (eg, transitions, transversions, insertions, and deletions) of the variants selected for further analysis is shown in Supplementary Figure 4.

Somatic Variant Annotation

After sequencing, we used several criteria to identify somatic variants. Somatic variants were called if they (i) were predicted to cause a change in amino-acid sequence (ie, substitutions, deletions, non-sense, and frame-shift mutations) and (ii) had a variant allele fraction that did not fall within the range of 45–55%, in order to exclude variants that might be heterozygous germline variants. In addition, variants were excluded if the allele frequency in the Exome Aggregation Consortium (ExAC) database was greater than 1%, in order to exclude potential single-nucleotide polymorphisms. We also searched cBioPortal (http://www.cbioportal.org/) and the Catalogue of Somatic Mutations in Cancer (COSMIC, http://cancer.sanger.ac.uk/cosmic) to see whether variants had been identified as somatic variants in other malignancies. All variants that passed the above inclusion and exclusion criteria are shown in Supplementary Table 1 together with patient clinical characteristics.
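The inclusion and exclusion criteria above can be sketched as a simple filter. This is only an illustration under stated assumptions, not the pipeline used in the study (which relied on Agilent SureCall): the `Variant` record, its field names, and the consequence labels are hypothetical, and treating the 45–55% boundaries as inclusive is an assumption, as the text does not specify it. The example variants are invented for illustration and are not taken from the study data.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for variants predicted to change the amino-acid sequence.
PROTEIN_ALTERING = {"missense", "nonsense", "frameshift", "inframe_deletion"}


@dataclass
class Variant:
    gene: str
    consequence: str          # hypothetical annotation label
    vaf: float                # variant allele fraction, between 0 and 1
    exac_af: Optional[float]  # ExAC population allele frequency; None if not listed


def is_candidate_somatic(v: Variant) -> bool:
    """Apply the three filters described in the text to one variant."""
    # (i) must be predicted to change the amino-acid sequence
    if v.consequence not in PROTEIN_ALTERING:
        return False
    # (ii) exclude VAF of 45-55% (possible heterozygous germline variant);
    #      inclusive bounds are an assumption
    if 0.45 <= v.vaf <= 0.55:
        return False
    # exclude potential polymorphisms: ExAC allele frequency greater than 1%
    if v.exac_af is not None and v.exac_af > 0.01:
        return False
    return True


# Invented examples, for illustration only.
examples = [
    Variant("U2AF1", "missense", 0.31, None),     # kept
    Variant("TET2", "missense", 0.50, None),      # dropped: VAF in 45-55% window
    Variant("ASXL1", "frameshift", 0.20, 0.02),   # dropped: ExAC frequency > 1%
    Variant("DNMT3A", "synonymous", 0.12, None),  # dropped: no amino-acid change
]
kept = [v.gene for v in examples if is_candidate_somatic(v)]
print(kept)  # ['U2AF1']
```

The database look-ups in cBioPortal and COSMIC described above are a manual annotation step and are not modeled in this sketch.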

Statistics

All statistical analyses were performed using R or Excel software.

Results

Cohort Description

The 53 patients with idiopathic pancytopenia included 28 cases (53%) with no identifiable associated reason for their pancytopenia (idiopathic cytopenia of undetermined significance group), 13 (25%) with aplastic anemia, 4 (8%) with pancytopenia attributable to liver disease, 4 (8%) with pancytopenia associated with autoimmune disease, and 4 (8%) with pancytopenia suspected to be due to drug effect. The association of these patients’ pancytopenia with particular conditions was made based on the lack of evidence of hematologic malignancy after bone marrow examination and review of the clinical notes. We grouped the cases of myelodysplastic syndrome, myelodysplastic syndrome/myeloproliferative neoplasm, and acute myeloid leukemia into a malignant pancytopenia group. The idiopathic pancytopenia group had a significantly lower average age (46 vs 66 years, P<0.0001, Wilcoxon rank-sum test) and significantly fewer mutations per case (idiopathic pancytopenia=0.81, range 0–5, vs malignant=1.18, range 0–5, P=0.041). The malignant group was more likely than the idiopathic pancytopenia group to have at least one mutation in any gene (68 vs 38%, P=0.012, chi-square test, Figure 1a). There was no difference between the groups in the frequency of cases with two or more mutations. The distribution of the number of mutations in the two groups is shown in Figure 1b. Within age decades, among patients between 51 and 60 years of age, the malignant cases were more likely to have at least one gene mutation than the idiopathic pancytopenia cases (P<0.05, Fisher’s exact test, Figure 1c).

Figure 1

Frequency of mutations in 20 genes in 91 cases of pancytopenia. (a) Percent of cases with at least one mutation (P=0.012 chi-square test). (b) Distribution of the frequency of cases with 0, 1, 2, 3, 4, or 5 mutations per case. (c) Percent of cases with at least one mutation in different age decades. The asterisk indicates that P<0.05, Fisher’s exact test. IP, idiopathic pancytopenia cases; M, malignant cases.

For the idiopathic pancytopenia group, we had a mean follow-up time of 739 days (median=444 days, range 0–7.7 years). For the idiopathic pancytopenia cases with at least one mutation (n=20), the mean follow-up was 797 days (median=397 days, range 0–2799 days (7.7 years)). Eight of the individuals with idiopathic pancytopenia died in the follow-up period available. Seven of the patients who died had co-morbidities with high mortality rates, including malignancy (1 glioblastoma multiforme, 1 metastatic hepatocellular carcinoma), cirrhosis (4 patients), and necrotizing pancreatitis (1 patient). One patient died in a motor vehicle accident. Of the eight idiopathic pancytopenia cases who died during the follow-up period, 5 cases had no mutation identified, 2 cases had 1 mutation, and 1 case had 3 mutations. Of the 53 individuals in the idiopathic pancytopenia group, none progressed to a myelodysplastic syndrome or an acute myeloid leukemia in the follow-up time available (Table 1).

Somatic Mutations in U2AF1 are More Frequently Found in Acute Myeloid Leukemia and Myelodysplastic Syndromes than in Idiopathic Pancytopenias

In all, 11 of the 20 genes evaluated were more frequently mutated in malignant cases than in idiopathic pancytopenia cases, 7 were more frequently mutated in idiopathic pancytopenia cases, and no mutations were identified in 2 genes (IDH1 and NRAS; Figure 2). The difference in the frequency of somatic mutations between the two groups was statistically significant only for U2AF1, which was mutated in 5 of the 38 malignant cases (13%) and in none of the idiopathic pancytopenia cases (P-value=0.0108, Fisher’s exact test, Figure 2). A U2AF1 mutation was found in 2 cases of refractory cytopenia with multilineage dysplasia, 1 case of refractory anemia with excess blasts-2, 1 case of mixed myelodysplastic syndrome/myeloproliferative neoplasm with 17% circulating blasts, and 1 case of acute myeloid leukemia. Four of the U2AF1 mutations identified in our study result in Q157P substitutions and the other mutation results in a Q157R substitution. Our findings are consistent with several reports that identify heterozygous Q157 substitutions in U2AF1 (also known as U2AF35) in cases of myelodysplastic syndrome and acute myeloid leukemia.27, 28 Although the frequency of mutations in the other genes was not different between the groups at the P<0.05 significance level, TP53, SETBP1, and RUNX1 were more frequently mutated in malignant cases (Figure 2).
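As a worked check of the U2AF1 comparison, the two-sided Fisher’s exact test can be computed directly from the counts given above (5 of 38 malignant cases vs 0 of 53 idiopathic pancytopenia cases). The study reports using R or Excel; the Python function below is only an illustrative sketch, and it assumes the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups; the first column counts mutated cases and the
    second column counts unmutated cases.
    """
    row1, row2 = a + b, c + d
    col1 = a + c                       # total mutated cases
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x mutated cases in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over tables at least as extreme (no more probable) as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# U2AF1: malignant 5 mutated / 33 unmutated; idiopathic pancytopenia 0 / 53.
p_value = fisher_exact_two_sided(5, 33, 0, 53)
print(f"{p_value:.4f}")  # 0.0108
```

With these counts, the observed table is the only one at or below its own probability, so the two-sided P-value equals the single hypergeometric term C(38,5)/C(91,5), which matches the reported value.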

Figure 2

Frequency of mutations in 20 genes evaluated in idiopathic pancytopenia vs malignant cases. The asterisk indicates P-value=0.0108, Fisher’s exact test.

Discussion

Our findings are remarkably similar to those of previous work, which found a significant number of somatic mutations in myelodysplastic syndrome-associated genes in patients with idiopathic cytopenias. In addition, in our cohort, individuals with idiopathic pancytopenia did not progress to overt myelodysplastic syndrome or acute myeloid leukemia within the clinical follow-up available for our study, even when a mutation was detected. In aggregate, the findings of our work and those of others raise additional questions regarding the utility of testing for myelodysplastic syndrome genes in patients with cytopenias who do not meet conventional diagnostic criteria for myelodysplastic syndrome.

It is interesting to compare our findings with those of Kwok et al, who also evaluated the frequency of somatic mutations in myelodysplastic syndrome-associated genes in cases of idiopathic cytopenia of undetermined significance. The genes studied by Kwok et al included the same 20 genes evaluated in our study except that they did not evaluate CEBPA and they additionally examined ETV6, PHF6, and ZRSR2. Somatic mutations in the four genes that differ between our study and the prior study represent a small fraction of the variants found in either study. In our idiopathic pancytopenia group, which includes patients with idiopathic cytopenia of undetermined significance as well as patients with co-morbidities that are often associated with cytopenias, we find evidence of clonal hematopoiesis in 38% of cases. It is likely that this is a significant underestimate of the true frequency of clonal hematopoiesis in these patients, as we only looked at selected exons in 20 genes. In the study by Kwok et al, it is not entirely clear whether the idiopathic cytopenia of undetermined significance cases selected included cases with the co-morbidities that we include in our idiopathic pancytopenia group. It will be interesting to see whether other investigators find evidence of mutations in myelodysplastic syndrome-associated genes in patients with cytopenias that are attributed to patient co-morbidities.

Although our study cohort differs from that of Kwok et al in several ways, our findings largely corroborate the prior ones. In both studies, the frequency of cases with at least one somatic variant in a myelodysplastic syndrome-associated gene is similar (38% in our idiopathic pancytopenia group and 35% overall in their idiopathic cytopenia of undetermined significance group). TET2, DNMT3A, and ASXL1 were the three most frequently mutated genes in both our idiopathic pancytopenia group and their idiopathic cytopenia of undetermined significance group. Importantly, the frequency of mutations in these genes and most of the others evaluated did not significantly differ between the idiopathic pancytopenia/idiopathic cytopenia of undetermined significance groups and the myelodysplastic syndrome/acute myeloid leukemia cohorts, suggesting that variants in these genes are not useful in distinguishing myelodysplastic syndrome from several other causes of cytopenia. In both studies, U2AF1 variants are not typically found in cases of idiopathic cytopenia of undetermined significance. We found no somatic mutations in U2AF1 in idiopathic pancytopenia cases, and Kwok et al found a U2AF1 mutation in only one of 369 cases of idiopathic cytopenia of undetermined significance. These findings suggest that variants in this gene are relatively specific for bona fide myelodysplastic syndrome/acute myeloid leukemia vs idiopathic cytopenia of undetermined significance.

One difference between the studies is that Kwok et al conclude that SF3B1 is somewhat specific for dysplasia. In our study, the frequency of cases in which SF3B1 is mutated is similar for the idiopathic pancytopenia group (6%) and the myelodysplastic syndrome/acute myeloid leukemia group (5%). Of the five cases with an SF3B1 mutation in our cohort, two were diagnosed as malignant (one hypoplastic myelodysplastic syndrome and one therapy-related acute myeloid leukemia) and three were categorized as idiopathic cytopenia of undetermined significance. The only SF3B1-mutated case with ring sideroblasts (<15%) was from a 76-year-old woman with rheumatoid arthritis, neutropenia, and macrocytosis who additionally had mutations in RUNX1 and TET2. On the basis of the bone marrow evaluation and clinical notes, this patient’s cytopenias were favored to represent drug effect. In this patient, there was no evidence of progression to a myeloid neoplasm in the 6 months of follow-up available.

Another finding that differs between our study and that of Kwok et al is that the prior study found a statistically significantly higher frequency of cases with more than one mutation per case, while in our study this metric was not different (frequency of cases with more than one mutation: idiopathic pancytopenia 26% (14 of 53), malignant 29% (11 of 38)). This difference may arise because our cases of idiopathic pancytopenia were not divided into those with some vs no dysplasia, as was done in the work by Kwok et al, or it may simply be due to the lower power of our study.

It is important to emphasize that in our study, the majority of mutations in the myelodysplastic syndrome-associated genes we evaluated are, with few exceptions (eg, missense mutations of U2AF1 Q157), not recurrent. Thus, it is still not clear whether these somatic variants are simply bystander/passenger mutations that serve as markers of clonal hematopoiesis or whether they have a pathogenic role in driving clonal proliferation. Interestingly, in our study there is some evidence of linear correlation between the number of base pairs evaluated and the number of somatic coding sequence-altering variants found per gene (Supplementary Figure 3). Although our study is not intended to determine whether certain variants may be drivers or passengers of the clonal process, we think that this correlation perhaps suggests that many of these variants are at most weak drivers or even passengers. Alternatively, perhaps some of these variants confer a competitive advantage to clones compared with other marrow progenitors only in certain contexts in which normal hematopoiesis is impaired (eg, liver disease and autoimmune disease), but at the moment, these proposals remain purely speculative. Addressing these questions will be important in order to understand the fundamental biology of these clonal proliferations and will have interesting implications regarding the diagnostic utility of identifying these variants in routine clinical practice.

Through several large-scale studies, it is now well established that variants in myeloid neoplasm-associated genes are found in a significant number of patients without any clinical or laboratory evidence of hematologic malignancy.19, 20, 21 The available data also show that individuals with these mutations have only a slightly higher risk of developing neoplasia, and most will never develop a myeloid neoplasm. Adding to the uncertain diagnostic importance of finding variants in myelodysplastic syndrome-associated genes, several groups have shown that some patients with aplastic anemia have mutations in ASXL1 and DNMT3A, which are implicated as drivers of abnormal myeloid proliferations and are frequently mutated in patients with clonal hematopoiesis of indeterminate potential and idiopathic cytopenias of undetermined significance.29, 30 Thus, it is clear that mutations in many genes must be interpreted in the context of the clinical findings to avoid over-diagnosis of pre-malignant or malignant conditions.

The limitations of our study include the relatively small sample size, the relative heterogeneity of both the idiopathic pancytopenia and malignant groups, and the relatively short clinical follow-up. Regarding the heterogeneity of the idiopathic pancytopenia group, we note that attributing a patient’s pancytopenia to a particular co-morbidity (eg, autoimmune disease, liver disease, or drug effect) reasonably requires exclusion of myelodysplastic syndrome or other hematologic malignancy. Our study is too small to determine whether the non-idiopathic cytopenias of undetermined significance cases in the idiopathic pancytopenia group have a different frequency or spectrum of mutations in myelodysplastic syndrome-associated genes. However, we find it remarkable that 32% (8 of 25) of the non-idiopathic cytopenias of undetermined significance cases, in which cytopenia was favored to be due to a co-morbidity, have at least one mutation in a myelodysplastic syndrome-associated gene. This non-idiopathic cytopenias of undetermined significance subset of the idiopathic pancytopenia group includes a substantial number of cases of aplastic anemia, which likely lowers this estimate as these patients tend to be younger and had the lowest frequency of cases with at least one mutation (Table 1). In addition, the inclusion of acute myeloid leukemia cases might be expected to artificially increase the frequency of cases with mutations in our malignant group; however, the frequency of mutations in the cases of acute myeloid leukemia included in our study was actually lower than in the cases of myelodysplastic syndrome. While our malignant cohort does not represent a ‘perfect’ comparison group, the frequency of mutations in that group represents an estimate of the frequency of cases with mutations in myelodysplastic syndrome-associated genes that does not differ much from that reported by Kwok et al (68% in our malignant group and 71% in myelodysplastic syndrome cases evaluated by Kwok et al).

Another limitation of our study is that it lacks matched normal tissue as a control to definitively identify somatic variants; however, our approach is similar to prior studies of clonal hematopoiesis of indeterminate potential and clonal idiopathic cytopenias of undetermined significance, in which variant allele frequency and database searches are used to call somatic variants.20, 23 We also did not re-sequence cases or otherwise confirm the variants we identified, so our results may include variants that were introduced during sequencing. This would be expected to increase the number of variants identified. However, our findings are remarkably similar to those of the prior study, and thus we think that the contribution of artifactual variants is likely minimal. In addition, the rate of false positives due to sequencing errors would likely be the same for both the idiopathic pancytopenia group and the malignant group, and thus would not likely influence our comparisons between these groups. Our study, like many studies in this field of research, also does not determine whether the variants identified are present in bone marrow stromal cells or in the hematopoietic progenitors. Further work is required to determine which cell lineage within the marrow compartment contains the variants identified.

The identification of sequence variants associated with myeloid neoplasms is proceeding at a rapid pace and further knowledge of these variants will continue to improve diagnostic categorization of these entities. Clinical application of powerful new sequencing technologies will no doubt improve the diagnostic classification of myelodysplastic syndromes and their precursor lesions. However, as knowledge of the clinical behavior of myelodysplastic syndrome precursor lesions and the impact of myelodysplastic syndrome-associated genes on prognosis is limited, further characterization of these entities is needed. Our work suggests that the results of targeted gene sequencing in the setting of pancytopenia must be interpreted with caution when attempting to provide support for myeloid neoplasia.