Main

Acute myeloid leukaemia (AML) is a heterogeneous disease in which a complex interplay of genetic and/or cytogenetic alterations in haematopoietic progenitors, together with aberrant expression of cytokines and small molecules (analytes) in the tumour microenvironment, contributes to pathogenesis. Circulating analytes are a heterogeneous group of soluble small polypeptides or glycoproteins that mediate crosstalk between normal and neoplastic cells. Their roles in the pathophysiology of different stages of carcinogenesis are increasingly recognised (Hanahan and Weinberg, 2000). Although deregulated expression of circulating analytes in AML is well documented (Van Etten, 2007; Seruga et al, 2008; Tsimberidou et al, 2008; Kornblau et al, 2010; Brenner et al, 2017), considerable inter- and intra-assay variability in baseline plasma analyte levels has been reported in AML (Kornblau et al, 2010; Kupsa et al, 2014). In addition, plasma baselines for some analytes that are biomarkers for other cancers have not previously been reported in AML. Comprehensive profiling of plasma analytes might give greater insight into baseline variability and co-expression signatures, potentially leading to the identification of novel diagnostic biomarkers and therapeutic targets in AML.

In this study, we aim to validate the baselines of multiple analytes and to report baselines of novel analytes in AML. We also aim to examine differential expression and categorise co-expression patterns of analytes, followed by exploratory analysis and validation using The Cancer Genome Atlas (TCGA) data (Cancer Genome Atlas Research Network, 2013).

Materials and methods

Peripheral blood samples were collected and fresh plasma was separated from 38 individuals: 19 AML cases and 19 healthy controls (Table 1 and Supplementary Table S1). Cytogenetic studies were performed as a routine diagnostic test for all AML cases. The diagnostic leftover samples were used for further study. Based on the Malaysian guideline on the use of human biological specimens for research, no patient consent was required for this study. The study was approved by the Medical Ethics Committee (NMRR-16-1384-31900 S1 R0). Two sets of analytes were selected for this study (Figure 1A): (1) 22 reported analytes (baselines have been previously reported in AML) and (2) 10 novel analytes (baselines have not been previously reported in AML). Reported analytes are AFP, CA 15-3, Leptin, IL-6, sFasL, CEA, CA 125, IL-8, HGF, sFas, TNFα, Prolactin, SCF, OPN, FGF2, bHCG, TGFα, VEGF, Galectin-3, MPO, IGFBP3, and Ferritin. Novel analytes are tPSA, CA 19-9, MIF, TRAIL, CYFRA 21-1, HE4, Cathepsin D, FAPa, MIA, and SHBG. Using a multiplex array technique, all 32 analytes (Supplementary Table S2) were simultaneously detected and quantified in the 38 plasma samples (details of the methods are given in the Supplementary File).

Table 1 Demographics of 19 AML cases and 19 healthy controls
Figure 1

Deregulated cytokines and small molecules in AML patients. (A) Distribution of the analytes. The total analytes (n=32) are divided into two sets: reported analytes (baselines reported previously in AML) and novel analytes (baselines have not been reported previously in AML). (B) Beeswarm plots of the 16 significantly deregulated (P<0.005, Mann–Whitney U-test) analytes in AML cases (red or blue colour) compared with healthy controls (green colour). Plots for the upregulated (red colour) and downregulated (blue colour) analytes are ordered based on the calculated P-value. Novel analytes are denoted by an asterisk (*). The y-axis shows plasma level in log2 scale. Nonsignificant analytes are shown in Supplementary Figure S1. (C) Volcano plot showing the relationship between median fold change (x-axis) and P-value of the Mann–Whitney U-test (y-axis). Positive values on the x-axis show upregulation, and negative values show downregulation. Green circles denote significantly deregulated (P<0.005) analytes. (D) Patient-specific aberrant expression of analytes. Columns denote the 19 studied AML patients and the 32 interrogated analytes are represented by rows. Horizontal and vertical sidebars summarise the distributions per analyte and patient, respectively. Red colour denotes upregulation, blue downregulation, and green no significant deregulation (within normal/healthy control range). Grey boxes denote missing values.

Results

The plasma expression levels of the 32 selected analytes in healthy controls and AML cases were curated from the available peer-reviewed literature (Supplementary Tables S3 and S4). Compared with previously published data, 17 of the 22 reported analytes show a similar expression range in our study. The expression ranges of three analytes (FGF2, MPO, and sFas) are higher, whereas those of two analytes (SCF and sFasL) are lower in our study (Supplementary Table S4). All the novel analytes were within the detectable range in AML plasma. In total, 16 analytes are significantly deregulated in AML compared with healthy controls (Mann–Whitney U-test, P<0.005, Figure 1B), and 5 of them (Cathepsin D, MIF, TRAIL, SHBG, and FAPa) are novel analytes. Of these 16, 13 are upregulated and 3 are downregulated. The remaining 16 analytes are nonsignificant (Supplementary Figure S1). Although some of the nonsignificant analytes (such as TGFα, HE4, Leptin, and tPSA) show large median fold changes, they do not reach significance, probably because of inconsistent expression across patients. Fold changes and corresponding P-values are shown as a volcano plot in Figure 1C. We next investigated the consistency of expression levels across the AML cohort (Figure 1D). The first 9 analytes in Figure 1D are above the normal expression range in >80% of AML patients. Based on consistent expression, fold change, and P-value, we propose seven analytes (Cathepsin D, Ferritin, MIF, Galectin-3, HGF, MPO, and IL-8) as a candidate multiplex panel for AML diagnosis.
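The per-analyte comparison described above can be illustrated with a minimal sketch. It is not the analysis code used in this study: it assumes a two-sided Mann–Whitney U-test computed with the normal approximation (tie-corrected, no continuity correction) and takes the fold change as the log2 ratio of group medians. The function names and the toy values are hypothetical. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred, particularly for exact P-values in small samples.

```python
import math
from statistics import median


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test via the normal approximation.

    Returns (U statistic of x, two-sided P-value). Tied values receive
    average ranks and the variance is tie-corrected. This is an
    approximation, not an exact test.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        t = j - i                     # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2))


def log2_median_fold_change(cases, controls):
    """log2 of (median of cases / median of controls); > 0 means upregulated."""
    return math.log2(median(cases) / median(controls))


if __name__ == "__main__":
    # Toy plasma levels in arbitrary units, invented for illustration only.
    aml = [10.0, 11.0, 12.0, 13.0, 14.0]
    control = [1.0, 2.0, 3.0, 4.0, 5.0]
    u, p = mann_whitney_u(aml, control)
    print(u, p, log2_median_fold_change(aml, control))
```

Plotting the log2 median fold change against the negative log10 of the P-value for every analyte gives one possible layout for a volcano plot of the kind shown in Figure 1C.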

To determine co-expression patterns, we calculated pairwise Pearson’s correlation coefficients between analytes in AML and observed five distinct groups (Analytes 1–5, Figure 2A). Although the analytes in the second group are correlated only with each other, the biology underlying the observed co-expression groups is yet to be explored. We applied principal component analysis (PCA) to the 32 analytes and tracked the co-expressed groups; the 5 groups are plotted separately (Figure 2B). We then checked the distribution of each group’s aggregated median across the samples and found that Analyte 3 could significantly differentiate AML from control (Supplementary Figure S2A). PCA was also applied to the 38 samples (Figure 2C); AML and control are clearly separated, and two groups of AML are visible along PC2. Using the ‘Bimodality Index’ (Wang et al, 2009), 18 analytes were identified as informative for sample clustering (BI >1.4, Supplementary Table S5), and unsupervised hierarchical clustering also revealed 2 distinct groups of AML patients (AML1 and AML2; Figure 2D). These data support previous reports (Bruserud et al, 2007; Kornblau et al, 2009, 2010; Brenner et al, 2017) that demonstrated clustering of AML patients based on circulating analyte expression. Both MPO and HGF are differentially expressed between AML1 and AML2 (Supplementary Figure S2B and C). Interestingly, 5 of the 6 patients with favourable karyotypes cluster in AML2. CA 125 and TGFα are deregulated (up and down, respectively) in the favourable group compared with other karyotypes (nonsignificant, Supplementary Figure S2D and E). TGFα also exhibits a similar pattern in TCGA (Cancer Genome Atlas Research Network, 2013) (P<0.005, Supplementary Figure S2F), suggesting prognostic potential.
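As a rough illustration of the Bimodality Index used above to select informative analytes, the sketch below fits a two-component normal mixture with a shared standard deviation and computes BI = sqrt(π(1−π)) · |μ1 − μ2| / σ, following the definition of Wang et al (2009). The fitting procedure (a plain EM loop started from a split of the sorted values at their midpoint), the function name, and the toy values are assumptions made for this example and are not taken from the study's methods.

```python
import math


def bimodality_index(values, max_iter=500, tol=1e-9):
    """Return (BI, pi, mu1, mu2, sigma) for a two-component normal mixture
    with equal variance, fitted by a simple EM loop (illustrative only)."""
    xs = sorted(values)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 values")
    half = n // 2
    # Deterministic start: lower and upper halves of the sorted values.
    mu1 = sum(xs[:half]) / half
    mu2 = sum(xs[half:]) / (n - half)
    pi1 = half / n
    sigma = math.sqrt(
        sum((x - (mu1 if i < half else mu2)) ** 2 for i, x in enumerate(xs)) / n
    )
    sigma = max(sigma, 1e-12)  # guard against a zero spread
    for _ in range(max_iter):
        # E-step: responsibility of component 1 for each value.
        resp = []
        for x in xs:
            d1 = pi1 * math.exp(-((x - mu1) ** 2) / (2 * sigma ** 2))
            d2 = (1 - pi1) * math.exp(-((x - mu2) ** 2) / (2 * sigma ** 2))
            total = d1 + d2
            if total == 0.0:  # both densities underflowed: use the nearer mean
                resp.append(1.0 if abs(x - mu1) <= abs(x - mu2) else 0.0)
            else:
                resp.append(d1 / total)
        # M-step: update weights, means and the shared standard deviation.
        w1 = sum(resp)
        w2 = n - w1
        if w1 < 1e-12 or w2 < 1e-12:
            break  # one component vanished; keep the last estimates
        new_mu1 = sum(r * x for r, x in zip(resp, xs)) / w1
        new_mu2 = sum((1 - r) * x for r, x in zip(resp, xs)) / w2
        new_sigma = math.sqrt(
            sum(r * (x - new_mu1) ** 2 + (1 - r) * (x - new_mu2) ** 2
                for r, x in zip(resp, xs)) / n
        )
        new_sigma = max(new_sigma, 1e-12)
        shift = abs(new_mu1 - mu1) + abs(new_mu2 - mu2) + abs(new_sigma - sigma)
        mu1, mu2, sigma, pi1 = new_mu1, new_mu2, new_sigma, w1 / n
        if shift < tol:
            break
    delta = abs(mu1 - mu2) / sigma  # standardised distance between modes
    bi = math.sqrt(pi1 * (1 - pi1)) * delta
    return bi, pi1, mu1, mu2, sigma


if __name__ == "__main__":
    # Toy log2 levels with two well-separated modes (invented values).
    levels = [0.0, 0.1, -0.1, 0.05, -0.05, 5.0, 5.1, 4.9, 5.05, 4.95]
    bi, pi1, mu1, mu2, sigma = bimodality_index(levels)
    print(bi > 1.4, round(pi1, 3))
```

An analyte whose index exceeds the chosen threshold (BI >1.4 in this study) would be kept for clustering. A production analysis would more likely rely on an established mixture-model implementation than on this hand-written loop.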

Figure 2

Cytokine and small molecule expression-based clustering. (A) Heatmap of pairwise Pearson’s correlation coefficients between analytes. Five distinct groups of analytes (Analytes 1–5) are identified. Members of Analyte 2 are correlated only with each other, whereas Analyte 4 shows the opposite trend. (B) Principal component analysis (PCA) plot of the first 2 PCs of the 32 analytes. The five groups and their colours are as defined in (A). (C) The PCA plot of the first 2 PCs of the 38 samples (AML=19, control=19). Components were calculated from the expression profiles of all 32 analytes. The AML and control groups are clearly separated. Two broad groups of AML could be inferred based on the PC2 coordinates. (D) Heatmap displaying the expression levels of 18 analytes that show bimodal distribution patterns (Bimodality Index >1.4). Unsupervised hierarchical clustering performed on patients (rows) is displayed as a dendrogram and suggests two subgroups of AML (AML1 and AML2). Five of the six AML patients with favourable karyotypes belong to the AML2 group.

MIF, a novel analyte known as a proinflammatory cytokine, is 30-fold upregulated in AML plasma compared with control (Figure 1B). Elevated MIF levels have also been reported in other cancer types, including breast, prostate, and gastric cancer, where MIF prevents apoptosis and promotes tumour cell survival by directly activating the phosphoinositide-3-kinase (PI3K)/Akt pathway (Lue et al, 2007). The same pathway is constitutively activated in 50% of AML blasts (Park et al, 2010). Another novel analyte, TRAIL, a potent inducer of apoptosis (Henry and Martin, 2017), is 2.75-fold downregulated in AML plasma. A negative correlation between MIF and TRAIL is observed in AML in both our study and TCGA (Supplementary Figure S3A and B, respectively), but not in controls (Supplementary Figure S3C and D). We speculate that MIF promotes survival of AML blasts through an anti-apoptotic mechanism involving the PI3K/Akt pathway.

Next, we explored the genes corresponding to the analytes in TCGA AML data (Cancer Genome Atlas Research Network, 2013). We extracted the promoter methylation status of the genes corresponding to the 32 targeted analytes from 194 AML cases and 30 healthy controls and identified 3 clusters of analytes based on promoter CpG site methylation (Supplementary Figure S4A). Interestingly, the promoters of the FASLG and LGALS3 genes are highly methylated in the TCGA data set, whereas the corresponding analytes are abnormally highly expressed in our cohort, suggesting that feedback loops in the cells may repress these genes when overexpressed. Both MPO and HGF show distinct promoter methylation patterns between cases and controls; MPO is a known prognostic factor for AML (Matsuo et al, 2003). We examined mRNA expression using mRNAseq and found three clusters, but these were not fully consistent with the methylation clusters (Supplementary Figure S4B). In addition, MPO, HGF, and LGALS3 showed significant differential expression (log2 fold change 7.26, 5.60, and 3.50, respectively) compared with healthy controls in TCGA mRNAseq data.

Discussion

Altered baselines of circulating analytes and their receptors have been found in numerous studies of patients with various types of cancer, including AML, at both primary and metastatic stages, compared with healthy people (Seruga et al, 2008). Analytes secreted in the bone marrow microenvironment form a complex functional network and play important roles in modulating cell survival, proliferation, differentiation, and the immune response (Welner et al, 2015). Comprehensive profiling of secreted proteins is important to understand their function during tumour development and progression and to identify common and actionable extrinsic pathways, independent of mutation status in AML, that might lead to targeted therapies with clinical efficacy (Carey et al, 2017).

In this study, we detected and quantified 32 different analytes in plasma using the highly sensitive Bioplex multiplex technology and documented the variability in plasma baselines between different AML studies. This is the first published report of baselines for about one-third (10 of 32) of these analytes in AML.

We curated baselines from the available peer-reviewed literature for both AML cases and healthy controls. Our baselines were more consistent with studies that used the Bioplex method, which has been reported to have better sensitivity (Kornblau et al, 2010), than with studies that used other methodologies. We observed that baseline expression ranges are higher for three analytes (FGF2, MPO, and sFas) and lower for two analytes (SCF and sFasL) compared with previously published reports.

In our study, 16 analytes are significantly deregulated (13 higher and 3 lower), whereas the other 16 are nonsignificant (Mann–Whitney U-test, P<0.005). We used a stringent P-value cutoff (0.005 rather than the conventional 0.05) to reduce false-positive errors. We also performed a power analysis and found that almost 80% of the analytes in the significant group have a power of approximately 1 (that is, close to a 100% probability of obtaining a significant P-value despite the small sample size).
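The power statement above can be made concrete with a small Monte Carlo sketch. The text does not describe how power was calculated, so this is only one possible approach. It assumes that log2 plasma levels are normally distributed with a common standard deviation in both groups, uses 19 samples per group and a threshold of 0.005 as in this study, and approximates the Mann–Whitney P-value with the normal approximation. All function names and parameter values are hypothetical.

```python
import math
import random


def u_test_p(x, y):
    """Two-sided Mann-Whitney P-value (normal approximation, no tie
    correction; adequate for continuous simulated data)."""
    n1, n2 = len(x), len(y)
    # U counts the (case, control) pairs in which the case value is larger.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return math.erfc(abs(z) / math.sqrt(2))


def simulated_power(log2_shift, sd=1.0, n=19, alpha=0.005, n_sim=2000, seed=0):
    """Fraction of simulated studies in which the test gives P < alpha.

    log2_shift is the assumed true difference in mean log2 level between
    cases and controls; sd is the assumed within-group standard deviation.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        cases = [rng.gauss(log2_shift, sd) for _ in range(n)]
        controls = [rng.gauss(0.0, sd) for _ in range(n)]
        if u_test_p(cases, controls) < alpha:
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    # A large shift relative to the spread gives power close to 1 even with
    # 19 samples per group; no shift gives a rate near the threshold itself.
    print(simulated_power(5.0), simulated_power(0.0))
```

Under these assumptions, power depends mainly on the ratio of the shift to the within-group spread, which is consistent with strongly and consistently deregulated analytes reaching high power in a small cohort.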

We report for the first time plasma baselines for 10 novel analytes in AML, of which 5 (Cathepsin D, MIF, TRAIL, SHBG, and FAPa) were significantly deregulated with high median fold changes.

We propose a seven-analyte multiplex panel (Cathepsin D, Ferritin, MIF, Galectin-3, HGF, MPO, and IL-8) for diagnosis of AML; these analytes are almost always upregulated, regardless of the heterogeneity of the disease. Further validation is required for this panel before clinical use.

Interestingly, we found a negative correlation between MIF and TRAIL in AML that was also supported by TCGA data. We speculate that MIF promotes survival of leukaemic blasts through attenuation of apoptosis via the PI3K/Akt pathway. Based on existing literature (O'Reilly et al, 2016), MIF could be a possible therapeutic target in AML.

In addition, we observed that circulating analytes are co-expressed and form five distinct groups in PCA. These co-expressed groups were differentially expressed across the AML cohort. We also clustered AML patients based on the expression profiles of 18 informative analytes and noticed distinct groups. Previously, we reported that systematic aggregation of methylation and mutation profiles improves AML subclassification (Islam et al, 2017). Our current observations lead us to speculate that circulating analyte expression signatures have the potential to be used for subclassification of AML, complementing cytogenetic, genetic, and epigenetic information.

One of the major limitations of this study is the small sample size, although we showed that the sample size does not affect some of our major findings, such as significant and consistent deregulation. In addition, some studies have reported notable baseline differences between plasma, serum, and bone marrow aspirate (Stroncek et al, 2005).

In conclusion, our study demonstrates that circulating analyte expression in AML differs significantly from normal. We validated the baselines of multiple reported analytes and reported baseline levels of novel analytes. We also identified differential analyte expression and categorised the co-expression patterns in AML. In future studies, different sample sources (serum and marrow aspirate), a larger sample size, and more elaborate analyte panels could be used to extend our current findings.