Introduction

The recent release of the draft map of the human ‘tissue proteome’1 has rekindled the interest of many research groups in parallel mapping and annotation of the human ‘fluid proteome’, primarily because a small but noticeable fraction of proteins is present exclusively in bodily fluids such as blood plasma/serum, urine, CSF, etc.2. In a diseased state, these proteins may easily escape detection if tissue is employed as the sole source for proteome investigation. Although efforts toward human plasma proteome profiling were initiated back in 2002 by the Human Proteome Organization (HUPO)3, the dynamic abundance of circulating proteins and the intrinsic person-to-person inconsistencies in expression patterns and/or release profiles under different physiological and patho-physiological conditions have made plasma proteome mapping, and the discovery of ‘universally-acceptable’ circulating biomarkers, a rather challenging task. These limitations necessitate building individual, disease-specific proteome maps with enrollment of samples from diverse population groups to ensure better identification of diagnostic, predictive, prognostic and/or therapy-associated biomarkers.

Chronic myeloid leukemia (CML) is a myeloproliferative neoplasm that results from a reciprocal translocation between chromosomes 9 and 22, t(9;22)(q34;q11) [Philadelphia chromosome], generating BCR-ABL, an oncogene encoding a tyrosine kinase4. During the course of CML progression (chronic, accelerated and blast-crisis phases), the gradual amplification of BCR-ABL-driven genomic instability and secondary modifications at the genetic/epigenetic levels are believed to have a major knock-on effect in altering and activating the expression of different mitogenic, anti-differentiating and anti-apoptotic modulators and mediators, with a resultant profound influence on the proteome profiles4,5,6. Analysis of these altered protein profiles in patients and their healthy counterparts is likely to assist in keeping track of the underlying concealed perturbations while expediting the search for novel diagnostic biomarkers and therapeutic targets of the disease.

This study was designed for comparative plasma proteome analysis of chronic-phase chronic myeloid leukemia (CP-CML) patients with two objectives: 1) to identify novel, differentially expressed proteins in peripheral blood plasma with the potential to develop into predictive or therapy-associated biomarkers, and 2) to make a modest yet significant contribution to the international efforts to build a precise, accurately annotated universal plasma proteome map by providing protein data from the South Asian region, in particular Pakistan. The findings of this work present two novel, potential candidate biomarkers of myeloid leukemia.

Results

2D proteome profiling and mass spectrometric analysis

Plasma proteins of the healthy and CP-CML subjects enrolled in this study were individually resolved by 2DE in three independent experiments over a pH range of 4 to 7 (Fig. 1, Tables 1 and 2). On average, 198 ± 76 spots appeared in the healthy and 172 ± 83 spots in the CP-CML plasma when the 2D gels were stained with colloidal Coomassie. Altogether, 68 ± 11 gel spots showed at least one-fold difference in intensity [as assessed by Dymension software (v 3.0.1.2)] and were therefore considered for MS analysis; those exhibiting minor or inconsistent changes were ignored. From the pooled control and the individual CP-CML samples, ~1300 gel spots were subjected to MALDI-TOF MS analysis, which led to the identification of 33 distinct proteins and/or their respective isoforms/subunits (Table 3). To address the reliability of the peptide mass fingerprinting (PMF) identifications, a PMF score was calculated for each identification, using 79 as the cut-off value for positive hits7. The proteomics data were deposited to the ProteomeXchange Consortium8 via the PRIDE partner repository (http://www.ebi.ac.uk/pride/) with the dataset identifier PXD002757.

Figure 1: 2D-gel images of plasma samples derived from normal and CP-CML subjects.

Representative gel images of three independent experiments were merged to produce a composite map in the pH range 4 to 7. Protein spots were visualized by staining with colloidal Coomassie brilliant blue G250 and are numbered with N- and L-labels for normal and CP-CML samples, respectively. Only the protein spots showing at least one-fold difference were considered for identification by MS analysis; those exhibiting minor or inconsistent changes were ignored and are therefore unlabeled.

Table 1 Hematological profile of healthy subjects (n = 50).
Table 2 Clinical manifestations of the patients enrolled in the study.
Table 3 List of proteins identified in the plasma samples of controls and the CP-CML subjects by MALDI-TOF MS.

On analysis, the majority of the identified proteins were represented by multiple spots in both the CP-CML and the control groups (Fig. 1, Table 3). For instance, alpha-1-antitrypsin (AAT) and alpha-1-antichymotrypsin (AACT) were represented by 4 and 7 spots, respectively, in the CP-CML samples, while the same proteins were represented by 3 and 4 spots in the healthy counterparts (Fig. 2). Slight to moderate pI or mass shifts between the theoretical and experimentally calculated values were also noticed. These observations probably result from post-translational modifications such as phosphorylation, glycosylation and/or proteolytic cleavage, which are likely to affect the electrophoretic mobility, stability, folding and interactions of the proteins and may give rise to different protein isoforms. To test whether the discrepancy observed in mass or pI could be due to glycosylation (a widely observed, structurally diverse event), the peptide mass fingerprinting data were subjected to N-linked glycosylation analysis using the NetNGlyc 1.0 webserver (http://www.cbs.dtu.dk/services/NetNGlyc/). Glycosylation was predicted in many of the identified protein sequences, namely AACT, AAT, VDBP, HP, etc., with very high scores (threshold ≥ 0.5), suggesting that the mass and pI shifts in these proteins may be attributed to post-translational glycosylation (Supplementary data 1).
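
As a rough illustration of the first step behind N-linked glycosylation prediction, the sketch below scans a protein sequence for the canonical N-X-S/T sequon (X being any residue except proline). It is not the NetNGlyc algorithm, which additionally scores each sequon with trained neural networks; the function name and the toy peptide are hypothetical and are not taken from this study.

```python
import re

# Canonical N-linked sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead allows overlapping sequons (e.g. "NNSS") to be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    seq = sequence.upper()
    return [match.start() + 1 for match in SEQUON.finditer(seq)]


# Hypothetical toy peptide (illustration only, not a sequence from this study).
toy_peptide = "MKNGSAPNPTLLNATQ"
print(find_sequons(toy_peptide))  # Asn8 is skipped because it is followed by Pro
```

A sequon is necessary but not sufficient for glycosylation, which is why a scoring threshold (≥ 0.5 in the analysis above) is applied on top of such a scan.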

Figure 2: 3D-simulation and mass spectrometry based identification of representative protein spots.

(A) The upper panel shows the encircled 2D-gel spots in healthy and CP-CML subjects, while the lower panel shows their corresponding 3D images obtained using Image Master 2D-Platinum software; the differences in spot intensity between the two study groups are clearly visible. (B) Peptide mass fingerprint data of the selected protein spot obtained by MALDI-TOF/TOF mass spectrometry analysis. (C) Identification of the protein using the online MASCOT program; matched peptide sequences of the identified protein, with a sequence coverage of 55%, are shown in bold red.

Because most proteins were represented by multiple spots, we summed the intensities of all spots belonging to the same protein before identifying the differentially-represented proteins, and then applied the paired t-test followed by false discovery rate (FDR) determination as described previously9. The cut-off value for FDR (the expected proportion of type I errors among the findings declared significant) was set at ≤0.05, so that about 95% of the reported findings are expected to be true positives. Only six proteins met the three-tier criteria set for screening of potential candidate biomarkers [p-value ≤ 0.05, FDR ≤ 0.05, PMF score ≥ 79; Table 3]: AAT, AACT, stress-induced phosphoprotein 1 (STIP1), CD5 molecule-like (CD5L), transthyretin (TTR) and vitamin-D binding protein (VDBP). The former four proteins were found at higher abundance, while the latter two showed decreased levels, in colloidal Coomassie-stained gels of CP-CML samples in comparison with the control group. We selected all six differentially-represented proteins for further validation, along with two statistically insignificant/invariable proteins [haptoglobin (HP) and fibrinogen γ (FGG)] as controls (Table 3).
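
The three-tier screen can be sketched as follows. The FDR step here uses the Benjamini-Hochberg adjustment, which is an assumption made for illustration: the procedure actually followed is the one described in ref. 9 and may differ. The protein labels, p-values and PMF scores are hypothetical placeholders, not data from Table 3.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotonic in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def passes_three_tier(p_value, q_value, pmf_score, alpha=0.05, pmf_cutoff=79):
    """Apply the screen: p-value <= alpha, FDR <= alpha, PMF score >= cut-off."""
    return p_value <= alpha and q_value <= alpha and pmf_score >= pmf_cutoff


# Hypothetical (p-value, PMF score) pairs; labels are placeholders.
candidates = {
    "P1": (0.001, 120),
    "P2": (0.020, 95),
    "P3": (0.040, 60),
    "P4": (0.300, 150),
}
names = list(candidates)
q_values = benjamini_hochberg([candidates[n][0] for n in names])
selected = [
    name
    for name, q in zip(names, q_values)
    if passes_three_tier(candidates[name][0], q, candidates[name][1])
]
print(selected)
```

In this toy input, P3 has a raw p-value below 0.05 but fails both the FDR and the PMF tiers, which shows why the three criteria are applied jointly rather than relying on the p-value alone.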

Immunological validation of candidate marker proteins

Validation of candidate proteins in the pre- and post-treatment CP-CML patients was performed using quantitative ELISA. Blood samples were redrawn from 17 of the 32 initially enrolled patients, who had undergone tyrosine kinase inhibitor (TKI)-based therapy (nilotinib) for one year. The remaining patients (n = 15) could not be included in this follow-up study because they had either died or could not be traced.

Except for VDBP, all the candidate proteins showed differential expression patterns as a manifestation of the disease (Fig. 3). More importantly, the mean plasma concentration of CD5L in the pre-treatment CP-CML subjects was 16.60 ± 7.99 ng/ml, which is about seven times higher than in the control group (2.29 ± 1.23 ng/ml). In the nilotinib-treated CP-CML (PT) cases, however, normal levels of CD5L were restored [(2.77 ± 1.37 ng/ml), Fig. 3B]. Likewise, prior to treatment the patient group showed down-regulated expression of plasma TTR, but normal levels were regained following nilotinib therapy (Fig. 3C). The other candidate proteins, viz. AAT, AACT and STIP1, responded similarly to the above two markers in pre- and post-treatment CP-CML subjects; their ANOVA p-values were, however, higher than 0.0001 (Fig. 3D–F).

Figure 3

ELISA-based quantitative estimations of (A1) fibrinogen gamma (FGG), (A2) haptoglobin (HP), (B) CD5 molecule-like (CD5L), (C) transthyretin (TTR), (D) alpha-1-antitrypsin (AAT), (E) alpha-1-antichymotrypsin (AACT), (F) stress-induced phosphoprotein 1 (STIP1) and (G) vitamin-D binding protein precursor (VDBP) in plasma samples of normal (N) and chronic-phase CML (CP-CML) subjects. PT denotes CP-CML cases that have undergone TKI-based therapy (nilotinib treatment) for one year. One-way ANOVA p-value and F-value were calculated using SPSS program.

In-silico characterization and pathway analysis

The molecular functions and biological processes in which the MS-identified proteins are involved were analyzed according to the Gene Ontology database (Fig. 4A,B). As shown, the majority of the proteins belong to the categories of enzymes (9%), enzyme modulators (18%), transfer/carrier proteins (9%), immunity/defense proteins (15%), receptors (6%) and/or signaling molecules (15%). Interactive links between 11 such proteins could be traced using the STRING and MetaCoreTM programs and are illustrated in the form of a curated pathway (Fig. 4C). This curated pathway was used as a scaffold to establish the association of the elevated levels of plasma CD5L, AAT, AACT, STIP1, etc. with Philadelphia-positive CP-CML cases.

Figure 4

Classification of the identified proteins according to their (A) molecular functions and (B) biological processes. The assignments are based on the Gene Ontology (GO) consortium (www.geneontology.org). (C) Network analysis of the MS-identified differentially-abundant proteins in the dataset. The curated pathway is supported by at least one reference in the literature. Individual proteins are represented as nodes, whose shapes represent the fundamental class to which the proteins belong. Small red circles on top of the protein symbols indicate an up-regulated response. Connecting lines between nodes denote activation (green) and inhibition (red), while their thickness symbolizes the strength of interaction.

Although it is difficult to speculate on the exact correlation of each protein or node, our results reinforce the earlier findings that the constitutive tyrosine kinase activity of BCR-ABL exerts a strong influence on apoptosis- and immunity/defense-related biofunctions10,11. This oncoprotein activates many signaling cascades, including the Janus kinase (JAK)-signal transducers and activators of transcription (STAT) pathway, which is frequently triggered in both acute and chronic forms of myeloproliferative diseases. Besides activating JAK-STAT, BCR-ABL induces the production of JAK2-activating cytokines, viz. interleukin-3 (IL-3), IL-6, granulocyte macrophage colony stimulating factor (GM-CSF), G-CSF, etc. This cytokine-enriched microenvironment is capable of activating the STAT3 and STAT5 signaling pathways via JAK2 in a BCR-ABL-independent fashion10,11,12. Thus, the elevated levels of circulating STIP1, AACT, AAT and CD5L in the present study are likely to be the consequence of aberrant STAT signaling and constitutive activation of STAT3 and STAT5.

Discussion

The myeloproliferative neoplasm CML is clinically diagnosed using a combination of complete blood cell count, molecular/cytogenetic testing, and bone marrow aspiration and biopsy; blood-based protein biomarkers for screening or for monitoring the therapeutic response are, however, lacking. During the past decade, proteomic and metabolomic approaches encompassing comparative analysis of proteins, peptides or small metabolites in healthy and diseased states have aided the discovery of several hundred candidate biomarkers of cancer diagnosis and/or prognosis and hence provided better insight into the disease mechanisms13,14,15. With the objective of identifying robust, clinically applicable, blood-based protein biomarkers of CML, we compared the plasma proteome profiles of CP-CML subjects and their healthy counterparts using 2DE in conjunction with MALDI-TOF MS.

During the initial screening, eighteen proteins were found to be differentially represented with an FDR value ≤ 0.1 (90% confidence for accuracy), and six of these displayed differential staining with a more stringent FDR value ≤ 0.05 (95% confidence for accuracy). These six proteins (Table 3, shown in bold) were selected for further validation, wherein all candidate biomarkers except VDBP showed potential to discriminate the healthy control group from the patients, and the pre-treatment cases from the post-treatment CP-CML group. The AAT, TTR and CD5L proteins, with ANOVA p-values ≤ 0.0001, appear to be of particular interest as they were better able to predict the patients’ clinical behavior and therapeutic response.

AAT, a 54 kDa glycoprotein, is a serine protease inhibitor that has earlier been described as associated with tumor progression and metastasis in a wide spectrum of cancers, including CML16,17. There is, however, no direct evidence in the literature showing an association of myeloid leukemia (either CML or AML) with differential abundance of TTR and/or CD5L. Thus TTR and CD5L, but not AAT, appear to be novel. Of these, TTR is an extracellular protein that is synthesized in the liver and is involved in the transport of thyroxine from blood to brain, besides acting as a carrier of retinol18. In comparison with Chinese healthy subjects, in whom plasma TTR levels have been reported as 129 ± 15.6 μg/ml19, our healthy control group showed significantly lower levels in both males (108 ± 31.99 μg/ml) and females (63.48 ± 24.29 μg/ml). Pronounced gender-associated differences in circulating TTR concentrations are also evident. This is not surprising, as many proteins, including haemoglobin, have shown gender-related differences in clinical settings. The much lower TTR levels in our control group are nevertheless noteworthy, because Liu et al.19 proposed optimal cut-off values of 115 and 88.5 μg/ml, respectively, to discriminate healthy subjects from those suffering from benign lung diseases and from lung cancer. A similar cut-off, if applied to our population, where even healthy females have circulating TTR lower than the threshold value set for lung cancer diagnosis, is likely to result in a large number of false positives. This calls for plasma proteome profiling from diverse population groups of variable ethnicity to ensure the discovery of better and universally acceptable biomarkers.
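
The false-positive argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that TTR concentrations in each healthy group are normally distributed with the mean and standard deviation quoted above; the study does not state the distribution, so the resulting fractions are indicative only.

```python
from statistics import NormalDist


def fraction_below(mean, sd, cutoff):
    """Fraction of a normally distributed group falling below a cut-off."""
    return NormalDist(mu=mean, sigma=sd).cdf(cutoff)


# Cut-off proposed by Liu et al. for lung cancer (ug/ml), as quoted in the text.
LUNG_CANCER_CUTOFF = 88.5

# Healthy-control means and SDs from this study (ug/ml); normality is assumed.
healthy_females = fraction_below(63.48, 24.29, LUNG_CANCER_CUTOFF)
healthy_males = fraction_below(108.0, 31.99, LUNG_CANCER_CUTOFF)

print(f"healthy females below cut-off: {healthy_females:.1%}")
print(f"healthy males below cut-off:   {healthy_males:.1%}")
```

Under this assumption, roughly 85% of healthy females and about 27% of healthy males in our control group would fall below the 88.5 μg/ml threshold and would therefore be flagged if the cut-off were transferred unchanged.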

Another interesting biomarker identified in this study is CD5L, a soluble, secreted protein 347 amino acids in length. It is a member of the SRCR superfamily, which is characterized by the presence of scavenger receptor cysteine-rich (SRCR) domains with critical roles in lipid homeostasis, inflammation and immune responses20. In the validation study, the plasma concentration of CD5L was found to be significantly elevated in the CP-CML group and dropped to normal levels following TKI-based therapy (ANOVA p-value ≤ 0.0001 and F-value = 110.6), suggesting that this candidate marker may also be effective in monitoring the therapeutic response.

In the Human Protein Atlas (http://www.proteinatlas.org), most cancer types, such as breast, colorectal, head and neck, cervical, lung, liver, prostate and ovarian cancer, were found negative for the presence of CD5L, which suggests that it may be a specific biomarker of leukemia. However, it is relevant that CD5L is a secretory protein, and tissue analysis may not accurately portray its expression profile. Moreover, significantly high levels of circulating CD5L have been reported in patients suffering from pulmonary tuberculosis21, liver cirrhosis with HCV infection22,23 and hepatocellular carcinoma with non-alcoholic fatty liver disease24. We have noted that CML patients are generally immune-compromised and that the majority of them suffer from hepatomegaly and/or splenomegaly [Table 2]. The question of whether elevated levels of circulating CD5L in CP-CML reflect a coordinated response to infection and inflammation, or relate to myeloid leukemia through the role of CD5L as an anti-apoptotic factor, calls for large-scale trials enrolling lymphoid- and myeloid-leukemia (ALL, CLL, AML, CML) patients from diverse population groups.

Taken together, in complex diseases such as cancers/leukemia, a single protein or peptide is unlikely to serve as disease biomarker in all population groups. AAT, TTR and CD5L, however, have shown potential to serve as predictive- or therapy-associated CP-CML biomarkers. Further investigation of their specific role and the cross-talk amongst the repertoires of immune- and apoptotic-effectors is likely to provide new clues about the cellular biology of myeloid leukemia.

Methods

Study population

The study population comprised 82 subjects in total: healthy controls (n = 50, Table 1) and BCR-ABL positive CP-CML subjects (n = 32, Table 2), of whom 17 were followed up as post-treatment CP-CML cases after receiving nilotinib therapy for a period of one year. Informed consent was obtained from all subjects prior to their enrolment in the research project. The study design was duly approved by the Ethical Review Committee of the School of Biological Sciences, University of the Punjab, Lahore, Pakistan [Ref. No. 873/12] and was in accordance with the principles of the Declaration of Helsinki for research involving human beings. Peripheral blood samples (3 cc) from healthy donors and CP-CML patients were collected in EDTA-coated tubes, centrifuged at 2,000 × g for 10 minutes to separate plasma and then stored at −80 °C, in 250 μl aliquots, until their use for analysis. The samples were processed within 30 minutes of collection.

Fractionation of proteins by two-dimensional gel electrophoresis

The total protein content of the collected plasma samples was estimated by the Bradford assay25 using bovine serum albumin (BSA) as the standard. Protein fractionation by 2D-gel electrophoresis was performed according to the procedure described previously, with minor modifications26. Briefly, the plasma sample was diluted with rehydration solution [7 M urea, 2 M thiourea, 2% CHAPS, 65 mM DTT and 0.25% Servalyte] to a concentration of 1 μg/μl and applied onto an 18 cm long Servalyte linear immobilized pH gradient (pH 4–7) strip (Serva Electrophoresis, Heidelberg, Germany). The dried strip was subjected to passive rehydration overnight at 20 °C and then focused on an IEF flatbed (IEF-SYS, SciePlas, UK) for a total of 60 kVh.

Following first dimension IEF, strips were successively equilibrated with equilibration buffer-I [6 M urea, 2% SDS, 30% glycerol and 1% DTT in 1.5 mM Tris-Cl (pH 8.8)] and buffer-II [6 M urea, 2% SDS, 30% glycerol, 5% iodoacetamide in 1.5 mM Tris-Cl (pH 8.8)], each for 15 minutes. Equilibrated strips were aligned on 12% SDS-gel and electrophoresed at 80 V initially for 1 hour and then at 160 V until the bromophenol blue tracking dye reached the bottom of the gel. After electrophoresis, the gel was placed in fixative solution (30% ethanol, 10% acetic acid) overnight, stained with Coomassie colloidal blue dye and then destained with deionized water to a clear background. 2D gel images were scanned using Syngene gel documentation system and the individual protein spots were analyzed for pI and molecular weight, followed by their quantification and matching using the Dymension v.3.0.1.2 (Syngene, UK) software program.

In-gel digestion and mass spectrometric analysis

After matching the proteins of healthy and CP-CML subjects, individual gel spots were excised under sterile conditions, washed twice with deionized water, and then destained completely by incubation with 100 μl of 0.2 M ammonium bicarbonate (AB) and 50% acetonitrile solution (1:1) at 37 °C for 30 min. Proteins in the gel spots were thereafter reduced and alkylated by successive incubations with 100 μl of 20 mM tris(2-carboxyethyl)phosphine containing 25 mM AB and 40 mM iodoacetamide containing 25 mM AB, each at 37 °C for 30 min in the dark. The gel pieces were washed with 100 μl of 5 mM β-mercaptoethanol containing 25 mM AB for 15 min at 37 °C and dried completely in a speed vac. For tryptic digestion, the gel slices were rehydrated with 20 μl of 0.02 mg/ml sequencing grade trypsin (Promega, V511A) and left for overnight digestion at 37 °C. The resulting peptides were extracted from the gel by centrifugation, washed with 40 mM AB/acetic acid (incubation at 37 °C for 30 min) and spotted on the target plate for mass spectrometric analysis.

For MALDI analysis, 1 μl of the digested peptides was mixed with an equal volume of a freshly prepared saturated solution of α-cyano-4-hydroxycinnamic acid in 0.1% trifluoroacetic acid/acetonitrile. 1 μl of this mixture was then spotted onto the target plate, air dried until the solvent had evaporated, and analyzed using MALDI-TOF-TOF MS (Ultraflex III, Bruker Daltonics, Germany). A 337 nm nitrogen laser and a 2 GHz digitizer were used at a laser frequency of 100 Hz and an intensity of 60–70%. Spectra were obtained in linear positive ion mode with an accelerating voltage of 25 kV and a lens potential of 6 kV. Delayed extraction was performed at 100 ns, the detector gain was set to 7.5 and the sample rate to 0.5 GS/s. Spectra were obtained in the mass to charge (m/z) range of 1000–5000.

Functional and pathway analysis of proteins using bioinformatics tools

Peptide mass spectra obtained from MS analysis were searched against the SWISS-PROT and NCBInr databases using MASCOT Wizard 1.1.2 from Matrix Science (www.matrixscience.com) and further confirmed with the MS-Fit program from Protein Prospector (www.prospector.ucsf.edu). Proteins identified with a PMF score of 79 or higher were considered acceptable27. Search parameters included Homo sapiens as the species, methionine oxidation as a variable modification, carboxymethylation of cysteine residues as a fixed modification, and an allowable peptide mass tolerance of 50–100 ppm. Enzymatic digestion of proteins was performed with trypsin (V115, Sigma Aldrich, USA). The identified proteins were categorized according to their GO (www.geneontology.com) annotations based on molecular functions. Network construction analyses and canonical pathways were generated through the use of MetaCoreTM (Functional Genomics Centre, University of Zurich, Switzerland) and the protein functional analysis software STRING v.9.1 (string.db.org).
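
To illustrate what a peptide mass tolerance of 50–100 ppm means in practice, the sketch below matches observed peaks against theoretical peptide masses within a ppm window. It is a simplified stand-in for the matching step inside a PMF search and is not MASCOT's scoring algorithm; the function names and all m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed peak in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_peaks(observed, theoretical, tol_ppm=100.0):
    """Pair each observed peak with every theoretical mass within tol_ppm."""
    matches = []
    for obs in observed:
        for theo in theoretical:
            if abs(ppm_error(obs, theo)) <= tol_ppm:
                matches.append((obs, theo))
    return matches


# Hypothetical peak list and theoretical tryptic peptide masses (m/z).
theoretical_masses = [1000.0, 1500.0, 2000.0]
observed_peaks = [1000.05, 1500.30, 2000.10]

for obs, theo in match_peaks(observed_peaks, theoretical_masses, tol_ppm=100.0):
    print(f"{obs:.2f} matches {theo:.2f} ({ppm_error(obs, theo):+.0f} ppm)")
```

Because the tolerance is relative, the same 100 ppm window corresponds to 0.1 Da at m/z 1000 but 0.5 Da at m/z 5000, the upper end of the acquisition range used here.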

Statistical analysis and validation of biomarker candidates

The statistical analyses were performed using the SPSS ver. 20 and/or GraphPad Prism 6.01 software programs. The spot intensity differences obtained from the 2D-gel images of at least 3 different sets of independent plasma samples were analyzed by the non-parametric Mann-Whitney U test. Proteins with high fold-changes (≥2.5) were considered for validation studies by enzyme-linked immunosorbent assay (ELISA). To minimize sample handling bias and pipetting errors, pre-coated ELISA plates against human AAT (also called SERPIN A1), AACT (also known as SERPIN A3), TTR (also called pre-albumin), CD5L, STIP1, VDBP, HP and FGG were obtained from GenWay Biotech Inc., CA, USA and Biomatik Corporation, USA, and used in the validation of candidate biomarkers. All the assays were performed in triplicate according to the supplier's instructions. One-way analysis of variance (ANOVA), together with an unpaired Student's t-test (to check the significance of differences amongst the group mean values), was applied to compare the antibody titer between the test and control groups. Associations with a p-value ≤ 0.05 were considered statistically significant, and those with a p-value ≤ 0.0001 clinically important.
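
For readers unfamiliar with the F-value reported alongside the ANOVA p-value, the following minimal sketch computes the one-way ANOVA F-statistic for three groups (control, pre-treatment and post-treatment). The analyses in this study were run in SPSS and GraphPad Prism; this code only illustrates the underlying arithmetic, and the concentrations are hypothetical, not measured values.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical marker concentrations (ng/ml) for illustration only.
control = [2.0, 3.0, 2.0, 3.0]
pre_treatment = [15.0, 17.0, 16.0, 18.0]
post_treatment = [3.0, 2.0, 3.0, 4.0]

f_value, df_b, df_w = one_way_anova_f([control, pre_treatment, post_treatment])
print(f"F({df_b}, {df_w}) = {f_value:.3f}")
```

A large F-value indicates that the between-group variation dominates the within-group variation; converting it to a p-value requires the F distribution, which is left to a statistics package.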

Additional Information

How to cite this article: Fatima, I. et al. CD5 molecule-like and transthyretin as putative biomarkers of chronic myeloid leukemia - an insight from the proteomic analysis of human plasma. Sci. Rep. 7, 40943; doi: 10.1038/srep40943 (2017).