Introduction

Acute lymphoblastic leukemia (ALL) is the most common cancer among children, accounting for approximately 26% of all pediatric cancers in the USA1. Although the 5-year survival rate of pediatric leukemia patients is as high as 86%, the side-effects of the treatment can severely impact the quality of life of survivors2. In particular, vincristine, a core chemotherapeutic drug administrated as part of the ALL therapy that has been in use for more than 50 years, has a dose-limiting toxicity: peripheral neuropathy. Vincristine-induced peripheral neuropathy (VIPN) is characterized primarily by numbness, tingling, and a painful sensation felt in the hands and feet, muscle weakness, and constipation due to its effect on the sensory, motor, and autonomic nerves3,4,5. In some instances, VIPN can be prolonged and may last even after discontinuation of the treatment, impairing patients’ motor skills6,7,8,9 which results in limitation of their daily life activities for many years after completion of therapy10. While VIPN is severe in a subpopulation of patients, another subpopulation experiences negligible neuropathy. Currently, there are neither established ways of predicting susceptibility to VIPN in patients, nor ways to treat it effectively, resulting in a suboptimal management for both the cohorts. Predicting VIPN susceptibility in patients will enable better dosage decision making tools for physicians and in turn may improve the quality of life of these patients.

Several researchers have studied the association between genomics and VIPN incidence; however, the majority of the results have been controversial. Most of the studied differences are based on race, CYP3A5, ABC transporter, and, more recently, CEP72 expression. While some studies have reported a significant association between race and VIPN incidence11,12,13,14, others could not confirm those results5,15. CYP3A5 metabolizes vincristine and some studies have reported a significant association between CYP3A5 expression and VIPN13,16,17; however, one study could not establish such an association18. Similarly, there is evidence for association between the ABCB1 transporter and VIPN19,20; and there also exists evidence to the contrary18. Additional details on the associations and non-associations found between VIPN and other variables is addressed by van de Velde et al.4 and Mora et al.3. Correlations between pharmacokinetics or other patient-specific co-variates and VIPN have also been evasive due to a large variability in the available interpatient and intrapatient data, making interpretation of these pharmacokinetic studies difficult21,22,23,24,25. Recently, significant correlation was discovered of a SNP in the promoter region of the CEP72 gene with VIPN, during the maintenance phase of the treatment. CEP72 expression was associated with VIPN in human neurons derived from human induced pluripotent stem cells, as well12. While some studies have been able to reproduce this finding during the late phase of the treatment26,27, this association was not found in the earlier phase of therapy28,29. The literature clearly shows that an established predictor for the early stage of the treatment is still lacking.

In this paper, we take a detour from the previous genomics and pharmacokinetics related studies and employ a novel approach using metabolomics to predict VIPN at an earlier stage of the treatment. The need to go beyond genomics has been raised before30, and, specifically, the role of metabolomics in predicting drug response has garnered more attention31,32 in recent years. The recent interest in pairing metabolomics and drug response is a result of phenotypic variations that can be captured at the downstream metabolite level. These phenotypic variations arise due to differences in environment, lifestyle, stochasticity in biochemical reactions, etc., and may be as important as, if not more important than, the upstream genomics to predict drug response30. Information transfer for expression of a disease outcome takes place at various stages, starting from transcription of a gene to mRNA, then translation to a protein, and, finally, to the synthesis of metabolites. Variations at any stage may induce a different response to any drug. Because of this, drug response of a homogeneous population of patients with a similar genome may vary. Moreover, variations at the genetic level may not get transferred to the downstream metabolite level because of the robustness of the metabolic pathways. Therefore, a metabolomics approach is employed to find metabolites that can accurately predict VIPN in a cohort of pediatric patients who underwent ALL treatment, with vincristine as the core chemotherapy drug. We determined small sets of metabolites that can accurately predict VIPN at different stages of the treatment and used them to develop a strategy to identify patients that are highly susceptible to VIPN. Using this strategy, we developed a free user-friendly tool, VIPNp33,34, for physicians to apply our models for prediction of VIPN susceptibility at different stages of a patient’s chemotherapy treatment.

Results

In this section, we elaborate on our workflow and the results. We categorized the neuropathy data obtained from physician evaluations of VIPN severity experienced by patients. We then performed hierarchical clustering to differentiate between metabolite profiles measured at both early and late time points of the treatment process. Based on their metabolite profiles measured at different time points, we found a small subset of biomarker metabolites which was used to build a predictive model of VIPN susceptibility. We were able to identify the molecular structure of some of the biomarkers and then performed pathway analysis.

Pediatric ALL patient data description

A total of 36 patients’ data was collected. These were pediatric patients treated with AALL0932 protocol for B-ALL enrolled in a study from 2010 to 2014 (see Methods for details). Vincristine was part of the core chemotherapeutic treatment. All subjects were phenotyped for peripheral neuropathy using the Total Neuropathy Score – Pediatric Vincristine (TNS©-PV)35. Twenty-four of the 36 patients (67%) were phenotyped as patients experiencing high neuropathy (HN), while the remaining 12 patients (33%) were classified as patients experiencing low neuropathy (LN) after the treatment completion. Patient demographics are described in Table 1. Non-fasting blood samples for these patients were collected prior to administration of drugs at three time points of the treatment: during the induction phase (day 8 and day 29) and the consolidation phase (around 6 months).

Table 1 Patient characteristics were defined according to gender, age, and body mass index (BMI).

Patients were assessed regularly using the TNS©-PV scoring scheme throughout the 2–3 years of treatment. A score less than 3 was considered of low intensity and a score greater than 8 was considered of high intensity. A patient was classified as LN if the TNS©-PV intensity remained low throughout the entire duration of the treatment which lasted 2–3 years, regardless of the drug amount. A patient was classified as HN if the TNS©-PV intensity was high at any time point during the treatment. This classification scheme matched the course of treatment for patients experiencing severe neuropathy. A patient who presents a score of 8 or higher at least once indicates susceptibility for severe neuropathy. The degree of neuropathy may improve during later phases of treatment when the drug is given less frequently or after dose modifications have been made for severe VIPN. However, these adjustments do not make a patient less susceptible to neuropathy in the future; if a patient was susceptible to high neuropathy even once, we classified the patient as HN. As a result of this classification system, patients who never had a TNS©-PV greater than 8 but had a TNS©-PV greater than 3 at some point (medium intensity) were not considered for the purposes of this study. This classification, as well as, threshold scores of 3 and 8 TNS©-PV were delineated to ensure a clear demarcation between patients who were susceptible to severe neuropathy and those who experienced negligible neuropathy.

Figure 1 shows how the neuropathy score changed during the course of the treatment time points considered and how it compared to the overall neuropathy susceptibility of a patient. At day 8, most of the patients had a low TNS©-PV intensity (TNS©-PV less than 3) while the incidence of high neuropathy (TNS©-PV greater than 8) increased with time. However, one patient had low TNS©-PV at day 8, high TNS©-PV at day 29, and low again at the month 6 time point. Figure 1 shows that by 6 months, 17 out of 24 HN (TNS©-PV greater than 8 at least once during the treatment) patients had already experienced high neuropathy, indicating a need for dosage adjustment earlier in the treatment process. Consequently, this data shows it is imperative to find biomarkers that enable the prediction of overall neuropathy susceptibility during the early stage of the treatment. The focus of this study is on finding metabolites that can differentiate between HN and LN during these time points.

Figure 1
figure 1

A bar plot showing distribution of the neuropathy score of patients over time. A TNS©-PV less than 3 corresponds to low, a score between 3 and 8 corresponds to medium, and a score above 8 corresponds to high. The first group shows the overall susceptibility of patients to neuropathy (LN versus HN), based on regular assessment throughout the entire duration of the treatment that lasted 2–3 years. The next three groups show the TNS©-PV intensity at that particular time point. Since patients with an overall medium TNS©-PV intensity were not considered in this study, the number of such patients is zero in the first group. Some HN patients had medium TNS©-PV intensity (TNS©-PV greater than 3 but lesser than 8) at some points during the treatment, as seen in the next three groups.

Longitudinal versus independent analysis of metabolite profiles

Following the collection of patient blood samples, we performed metabolite profiling using liquid chromatography with tandem mass spectrometry (LC-MS/MS) (refer to Methods for description). We then analyzed the metabolite profile across the three time points, initially using hierarchical clustering. The hierarchical clustering dendogram showed that the profiles clustered according to the time points, as shown in Fig. 2. Each branch in the figure corresponds to a sample. Since the samples were clearly demarcated according to the time points, it implies that the metabolite profiles were distinctly expressed at the three time points. Following the hierarchical clustering, we used an algorithm to develop a longitudinal support vector classifier (SVC)36 model to further confirm the distinct expression of metabolites at the three time points. The algorithm was used to estimate a parameter β according to the following equation:

$$Y=f({\tilde{{\bf{X}}}}_{{t}_{1}}+\beta {\tilde{{\bf{X}}}}_{{t}_{2}}),$$
(1)

where, \({t}_{1}\) and \({t}_{2}\) are two time points, Y is the neuropathy response, and \(\tilde{{\bf{X}}}\) is the metabolite expression matrix. This algorithm was used to estimate β by minimizing the error between the predicted and actual response. For different combinations of the time points, β was approximately zero, regardless of the initial guess, implying that no identical set of metabolites can accurately predict neuropathy at different time points. Based on these two results, we hypothesized that different metabolites should be predictive of neuropathy at the different time points leading us to analyze the profiles separately as discussed in the next section.

Figure 2
figure 2

A dendogram created based on the Euclidean distance shows that the metabolite profiles are clustered according to their corresponding time points. Day 8 and day 29 metabolite profiles belong to the same primary branch and are consequently closer to each other.

Metabolite selection and model building

In order to find a small set of metabolites that can accurately predict overall susceptibility to neuropathy, we used recursive feature elimination (RFE) along with cross validation to develop a linear SVC model. Since hierarchical clustering and longitudinal SVC modeling (described in the previous section) indicated that the metabolite profiles are distinctly expressed across the three time points, we applied RFE separately on each of the time point-specific metabolite profile matrix. Our matrix of features included metabolite expression and vincristine concentration at the three time points: 5 metabolites at day 8, 46 metabolites at day 29, and 42 metabolites at the month 6 time point were chosen, with an area under the receiver operating characteristics curve (AUROC) of 0.97, 0.95, and 0.96, respectively. The vincristine concentration was not one of the selected features at any time point. As shown in Table 2A, these selected metabolites performed well at predicting neuropathy at all the three time points, with an average AUROC greater than 0.90 and standard deviation of AUROC less than 0.1.

Table 2 Metrics obtained by performing RFE on the data sets at the three time points.

We also performed RFE to find metabolites that can predict time point-specific neuropathy (i.e., high or low intensity of TNS©-PV) in patients (Table 2B). Since only 1 patient had high TNS©-PV during day 8, we did not perform the analysis on that data set. From the day 29 metabolite profile data, only 2 metabolites were chosen that could predict TNS©-PV intensity at that point, with an AUROC of 0.83 (lower accuracy than that obtained from the previous analysis shown in Table 2A). From the month 6 data, a small set of metabolites could not be selected; 1955 metabolites were chosen (Table 2B) with a lower accuracy (AUROC 0.81) as compared to the previous case (Table 2A). This implies that the metabolites are more effective in classifying patients based on the overall VIPN susceptibility, rather than VIPN intensity at the specific time points.

After selecting the small set of metabolites, that can classify between HN and LN, using RFE, we investigated integration of their chromatogram peaks at the corresponding retention times to ensure that they were correctly integrated and were not chosen because of any potential error in integration. In order to rigorously validate the selected metabolites as biomarkers, we manually reviewed integration of peaks at the retention times of every selected metabolite, for every sample, and at all the time points. We used the Agilent ProFinder software for this purpose. We found the peaks by matching the m/z and retention time from the metabolite profile matrix to the data loaded in ProFinder and then manually visualized each of the peaks. If any integration was not accurate, we re-integrated that peak. This resulted in a polished metabolite profile matrix.

We repeated the RFE with this polished metabolite profile matrix to find metabolites that can classify between HN and LN. Using this polished data, 6 metabolites were selected at day 8, 48 metabolites at day 29, and 45 metabolites at month 6 time point, with AUROC of 0.94, 0.86, and 0.92, respectively. Table 2C shows the metrics obtained from the model fitting. In this case, the day 8 and month 6 metabolites performed better than those from the day 29 data, based on the AUROC and corresponding standard deviation.

To obtain the final list of accurately integrated and predictive metabolites, we chose metabolites that were selected both from the polished (2C) and the unpolished data (2A) using RFE on each of them. We further discarded the metabolites whose peaks looked erroneous and were difficult to properly integrate. This refined procedure provided a final list of 2, 14, and 21 metabolites from day 8, day 29, and month 6 data, respectively. We performed repeated cross validation again to train an SVC model using these final set of metabolites. Repeated cross validation was performed to estimate the predictability of the model and to find the optimal tuning parameter (the “cost” parameter) for the SVC model. The average cross validation AUROC was 0.93, 0.75, and 0.91 for day 8, day 29, and month 6 data, respectively. The average ROC’s for the trained models are shown in Fig. 3.

Figure 3
figure 3

ROC plots for the final trained models at the three time points. (a) Day 8, (b) Day 29, (c) Month 6. AUC: Area Under Curve. CI: Confidence Interval.

After training the model corresponding to each time point using repeated cross validation, we chose an optimal probability threshold to classify patients based on the output from the SVC model. To find the optimal probability threshold, we evaluated the cross validation trained models’ accuracy upon varying the threshold value. Evaluation of an appropriate threshold is needed since our data is unbalanced; keeping a threshold of 0.5 will lead to a bias towards HN. Supplementary Figure S1 shows sensitivity, specificity, Youden’s J statistic, and distance from best possible cutoff (i.e. sensitivity and specificity equal to 1) for the day 8, day 29, and month 6 trained models, where the models were built with the optimal tuning parameters obtained from cross validation. We chose the threshold based on the minimum of distance from the best possible cutoff which led to thresholds of 0.7, 0.65, and 0.7 for the three respective time points. We finally evaluated the cross validated trained models’ performance based on this newly determined threshold. Model performance was based on a confusion matrix (provided in Supplementary Table S2) and metrics calculated as shown in Table 3. The metrics shown in Table 3 show the ability of the models to make predictions when trained using a subset of the data. We have only reported cross validation accuracy metrics since we did not have a test data set to assess the performance of our final trained model. We finally trained the model using the whole data set and the optimal tuning parameters, which can be used to test new patient data.

Table 3 Cross validation accuracy metrics for the models with optimal tuning using the final selected metabolites.

We used multiple metrics to determine the accuracy and validity of the predictive models. First, the p value of the one sided test for an accuracy greater than the No Information Rate is less than 0.05 for all the three time point models. This implies that the models predict better than simply random guessing. However, with accuracy as a metric, day 8 and month 6 data outperform. The balanced accuracy ((sensitivity + specificity)/2) is higher for day 8 and 6 month data as well, indicating that the model for day 8 and month 6 data seems to be more reliable in predicting overall VIPN susceptibility. We have developed a user-friendly interface, named VIPNp, to use our trained models for day 8 and month 6 time points, freely available at GitHub34 and the Shinyapps server33.

Metabolite structure identification

After we obtained the final list of 2, 14, and 21 metabolites at day 8, day 29, and month 6, respectively, we wanted to identify the molecular structure of these metabolites based on their mass-to-charge ratio (m/z), retention times, and tandem mass spectrometry (MS/MS) spectra. MS/MS spectra was available for only a subset of the final list: 1 metabolite from the day 8 list, 5 metabolites from the day 29 list, and 8 metabolites from the month 6 list. MS/MS spectra was compared to the METLIN37 database and the Human Metabolome Database (HMDB)38. No match was found for the day 8 metabolite; for day 29, one metabolite matched with Adenosine 5’-monophosphate (AMP) for all the collision energies; for month 6 data, one metabolite matched with L-pipecolic acid for one of the collision energies. In order to confirm the identity of these metabolites, MS/MS of commercially sourced AMP and pipecolic acid standards was performed to determine the retention times. The retention times of the external compounds matched with that from the patient data (1 min of commercially sourced pipecolic acid versus 0.802 min as identified from the data, 1.4 min versus 1.77 min for AMP) confirming the identity of these two metabolites.

We used the LC-MS CEU Mass Mediator (CMM) search tool39,40 with m/z value, adduct, and retention time inputs in the HMDB database to identify remaining metabolites. The most reasonable options were shortlisted. For example, if one of the searches was a metabolite that could only be found due to consumption of alcohol, it was discarded as a potential option since our focus in this study is on pediatric patients. Except for 1,7-Dimethylguanosine and Phenylalanylproline, all other potentially identified metabolites were at least previously detected in blood, according to HMDB. Using these approaches, we were able to identify 4 and 9 metabolites in the day 29 and month 6 lists, respectively. Tables 4 and 5 define the identified metabolites from the day 29 and month 6 lists (Supplementary Tables S3, S4 and S5 contain the complete list of metabolites, their mass, retention time, and adduct information).

Table 4 Identified metabolites that can accurately predict neuropathy susceptibility at the day 29 time point.
Table 5 Identified metabolites that can accurately predict neuropathy susceptibility at the month 6 time point.

Pathway analysis

In order to investigate the relevance of the chosen and identified metabolites, we used the pathway analysis tool in Metaboanalyst 4.041,42. We selected the hypergeometric test for over representation analysis and the relative-betweeness centrality for pathway topology analysis.

For day 29, two metabolites (Adenosine diphosphate, Adenosine monophosphate) were identified to be part of purine metabolism, and one metabolite (N-Acetylornithine) was identified to be part of the arginine biosynthesis (details in Supplementary Table S6). Since there were very limited metabolites to perform pathway analysis and only one or two of them belonged to a specific pathway which originally consisted of several metabolites, none of the metabolite to pathway associations were statistically significant after adjustment of the p values required to account for multiple testing. Furthermore, we did not find evidence for the role of these pathways in chemotherapy-induced peripheral neuropathy in the existing literature.

For the set of month 6 metabolites, SM(d18:0/16:1(9Z)), Oxidized glutathione, LysoPC(18:1(9Z)), and Pipecolic acid were identified to be part of the sphingolipid metabolism, glutathione metabolism, glycerophospholipid metabolism, and lysine degradation, respectively (details in Supplementary Table S7). Again, none of the pathway and metabolite associations were statistically significant after adjustment of the p values to account for multiple testing. We, however, did find evidence for the role of sphingolipid metabolism in chemotherapy-induced peripheral neuropathy43,44,45. Moreover, glutathione is a popular antioxidant tested as a therapeutic for chemotherapy-induced peripheral neuropathy, with various related studies mentioned in a review by Starobova and Vetter46. There is also evidence for the involvement of glycerophospholipid metabolism47 which indicates that these metabolites may be biologically relevant to VIPN.

Discussion

Peripheral neuropathy is a painful and debilitating side-effect of vincristine, a common chemotherapeutic drug used for treatment of pediatric ALL patients, as well as, many other pediatric and adult cancers. Currently, there is no established way of predicting VIPN during the initial stage of the treatment. Identification of specific biomarkers will aid in adjusting the vincristine dose according to susceptibility of patients to VIPN in order to improve their quality of life. Even though all previous omics related studies have focused on genomics, it is imperative to include phenotypic variabilities to find VIPN predictors given their known impact on drug response. In this study, we investigated the role of metabolites in predicting susceptibility of ALL pediatric patients to VIPN.

We performed metabolite profiling of ALL patients at three time points during the treatment: day 8, day 29, and month 6. First, we found that the metabolite profiles were distinctly expressed at the three time points, as shown by hierarchical clustering (Fig. 2) and longitudinal SVC modeling, indicating that the overall metabolite profile varied as a direct response to the treatment. Second, preliminary analysis using RFE resulted in sets of 5, 46, and 42 metabolites that accurately predicted overall VIPN susceptibility. Since the vincristine concentration was not one of the selected features from RFE, we concluded that regardless of the vincristine concentration at the time points, a small set of metabolites can accurately predict overall neuropathy susceptibility in these patients at these time points. Third, we manually integrated chromatogram peaks of these selected metabolites and performed RFE again with the polished matrix. Subsequently, we chose the metabolites that overlapped from the analysis with unpolished and polished data. Then, final sets of 2, 14, and 21 metabolites were found that could predict overall VIPN susceptibility at day 8, day 29, and month 6, respectively.

Metabolites could not predict time point-specific TNS©-PV intensity as accurately as overall susceptibility to VIPN (LN versus HN). This further strengthened our hypothesis that specific downstream metabolites can be potential biomarkers of overall neuropathy susceptibility due to treatment with vincristine. Furthermore, the final models built with the chosen metabolites were more accurate in predicting neuropathy at day 8 and month 6 as compared to the model built for the day 29 data (Table 3), implying that our developed predictive models can be used to evaluate VIPN susceptibility at the day 8 and month 6 time points during the treatment.

Based on the final developed models at the day 8 and month 6 time points, we present a framework for predicting a patient’s overall VIPN susceptibility during ALL treatment (Fig. 4). On day 8 of the treatment, blood samples can be collected for metabolite profiling, specifically to obtain expression of the 2 chosen metabolites. The expression of these metabolites can be used to predict overall neuropathy susceptibility from our trained model. If the output of the model is a probability greater than the threshold value of 0.7, the physician may need to lower the vincristine dose as that sample might correspond to an HN patient. A follow up evaluation can be performed after 6 months by obtaining expression of the 21 chosen metabolites (Supplementary Table S5). If the output from the two trained models is consistently a probability greater than 0.7, the physician could consider lowering the vincristine dose. This framework can aid in vincristine dose decision making for the physicians. Our user-friendly tool, VIPNp33,34, can be used to execute this strategy.

Figure 4
figure 4

A workflow showing a potential vincristine dose decision making strategy based on the trained SVC models. Blood samples of patients can be collected at day 8 and month 6 time points of the treatment. Samples can then be analyzed using mass spectrometry for metabolite profiling of the selected 2 and 21 metabolites at the day 8 and month 6 time points, respectively. The metabolite profile data can then be used to predict overall neuropathy susceptibility from our trained SVC models. If the model output probability is greater than a threshold value of 0.7, the patient might be susceptible to overall high neuropathy (HN). This strategy enables identification of patients susceptible to HN. The vincristine dose for HN patients may require adjustment.

In the foregoing suggested strategy, a significant caveat is that these models can only classify patients as susceptible to HN or LN. Some patients might be susceptible to “medium” VIPN (TNS©-PV greater than 3 but lesser than 8) during the treatment. Such cases have not been included here, implying that the models have limited predictability and that they should be used with caution. As a result, we only suggest that if the output of the model is greater than 0.7 at both day 8 and month 6 predictions, then there may be a need to adjust vincristine dose, since such patients are most likely classified as HN. It must also be noted that developing a strategy to balance efficacy and toxicity of vincristine would require further investigation before a standard protocol for physicians can be developed, in careful consideration with ALL experts.

Even though we finalized a list of metabolites using RFE (Supplementary Tables S3, S4, and S5), we could not identify all of them. This greatly reduced our capability to find the biological relevance of these chosen metabolites. We could not obtain MS/MS spectra for all of them, which may be a result of their presence in smaller quantities. Even after obtaining the MS/MS spectra, we could not find matches from existing databases for most of these metabolites. As a result, we only used the available m/z value, retention time, and adduct information for the remaining unidentified metabolites, which does not ensure that we have identified the metabolites correctly. For a thorough identification, MS/MS spectra is needed for all of the chosen metabolites and subsequently the retention times of those compounds need to be validated using authentic standards. Since only a few metabolites could be identified and consequently used for pathway analysis, none of the associations between the metabolites and their pathways found were statistically significant. Even though some of the identified metabolites for the month 6 data belonged to pathways that are relevant to chemotherapy-induced peripheral neuropathy, their biological relevance could not be established with certainty.

The peak integration of chromatograms also needs to be interpreted with caution. We observed that some peaks may not be integrated correctly or may simply be difficult to integrate for only a few metabolites. This has two implications: 1) The chosen metabolites may be predictive of neuropathy because they were improperly integrated. We eliminated this possibility by manually integrating and repeating RFE with a polished metabolite profile matrix. 2) Some potential biomarker metabolites might not have been chosen because of poor integration. It is not feasible to manually inspect every peak for every sample given that there were approximately 2000 metabolites and 104 samples (including all the time points), leaving a possibility that all potential predictors might not have been identified. Regardless, the current metabolites found at day 8 and month 6 are certainly potential predictors since they accurately predict VIPN susceptibility.

It should also be noted that the metabolite profile matrix is defined as high dimensional–with several metabolites and very few samples. This leaves a possibility that, despite using feature selection machine learning algorithms, some potential metabolites may have been overlooked. Moreover, we do not have balanced samples; having an equal number of high and low neuropathy patients will allow for a more accurate model. We also do not have a test set to validate our model. It is critical to perform this study with more samples and additional datasets in order to test and validate our model. VIPNp33,34 can be used to test our trained models on new datasets in the future. There is also a need to include “medium” neuropathy patients to accurately predict multiple outcomes of the treatment. Since this is the first metabolomics based pilot study for VIPN, we only focused on HN and LN patients to explore the potential of this approach in predicting VIPN susceptibility.

It should also be acknowledged that the average age of LN and HN patients were different (4.4 and 9.6 years, respectively). It would, thus, be interesting to explore the role of puberty in VIPN incidence. To investigate the effect of puberty on the metabolite profiles, we performed hierarchical clustering which did not show a demarcation of metabolite profiles based on the cutoff age of 10 years (see Supplementary Figs. S2, S3, and S4). We still cannot rule out the role of puberty in this case, which could have been addressed better if all the selected metabolites could be identified. The currently identified metabolites and their corresponding pathways are not directly related to puberty. This would require further investigation in the future.

Despite these limitations, our preliminary study has shown that ALL treatment can alter the metabolite profiles, and a few selected metabolites can accurately predict the overall VIPN susceptibility. We have also provided a strategy (Fig. 4) to adjust the vincristine dose based on VIPN susceptibility at two time points: day 8 and month 6. Our work shows that metabolomics can aid in predicting VIPN susceptibility during the early stage of treatment. Although it must be admitted that our prediction of high and low neuropathy could have had a stronger statistical backing with a larger cohort of patients, we contend that the numbers in this study were not so unreasonably small as to challenge this major conclusion. Based on an exhaustive literature review, all previous studies have only focused on genomics and pharmacokinetics in the prediction of chemotherapy-induced peripheral neuropathy with the exception of one proteomics study48. This pilot study is the first of its kind focused on metabolomics to predict VIPN. In order for this methodology to be more effective, we need a balanced and large number of fasting samples, accurate integration of chromatogram peaks, MS/MS spectra for all the chosen metabolites, and a more exhaustive database to identify these metabolites.

Peripheral neuropathy is not just limited to vincristine but is a dose-limiting toxicity of several other chemotherapy drugs (paclitaxel, taxane, and cisplatin), as well. A problem of this scale requires expertise and continued collaboration of individuals across multiple disciplines. An integrative approach involving better quantitation of all the omics data, large patient cohorts, careful phenotyping of patient data, and state of the art machine learning and statistical techniques is necessary in order to find a robust prediction of VIPN and any other chemotherapy-induced peripheral neuropathy, even before the start of the treatment.

Methods

ALL patient data collection

Children with newly diagnosed precursor B-cell ALL were recruited from four academic medical centers: Indiana University School of Medicine/Riley Hospital for Children, the University of Michigan Comprehensive Cancer Center/Mott Children’s Hospital, Vanderbilt University Medical Center/Monroe Carell Jr. Children’s Hospital, and George Washington University/Children’s National Medical Center49. Participants were between the ages of 1 and 18 at the time of diagnosis and received vincristine according to a Children’s Oncology Group (COG) treatment trial for acute lymphoblastic leukemia. The standard vincristine dosage received was 1.5 mg/m2 (capped at 2-mg maximum dose). Toxicity-based dose modifications were defined according to the specific COG protocol guiding the individual child’s leukemia treatment. Patients were excluded if they had any of the following criteria: baseline peripheral neuropathy score greater than grade 1 per the NCI CTCAE©version 4.0; currently receiving erythropoietin, itraconazole, or vitamin supplement greater than 100% of the recommended daily allowance; Down’s syndrome; pregnancy; or a history of coexisting serious illness that would limit neurological assessments. The protocol was reviewed and approved by the Institutional Review Boards at all four participating centers. All the methods were performed in accordance with the relevant guidelines and regulations. Informed consent was obtained from a parent and/or legal guardian for study participation.

Metabolomics sample preparation and extraction

Protein removal and sample extraction were performed by adding 200 μL methanol to 100 μL of plasma. Solutions were sonicated for 2 minutes, chilled at −20 °C for 1 hour, and centrifuged at 16,000 × g for 8 minutes. The supernatants were transferred to separate vials and evaporated to dryness in a vacuum concentrator. The dried fractions were reconstituted in 75 μL of diluent composed of 95% water and 5% acetonitrile, containing 0.1% formic acid. Solutions were sonicated for 5 minutes, centrifuged at 16,000 × g for 8 minutes, and the supernatants were transferred to plastic HPLC total recovery autosampler vials.

HPLC-MS analysis of metabolomics samples

Separations were performed on an Agilent 1290 system (Palo Alto, CA), with a mobile phase flow rate of 0.45 mL/min50. The metabolites were assayed using a Waters HSS T3 UPLC column (1.8 μm, 2.1 × 100 mm), where the mobile phase A and B were 0.1% formic acid in ddH2O and acetonitrile, respectively. Initial conditions were 100:0 A:B, held for 1 minute, followed by a linear gradient to 70:30 at 16 min, then 5:95 at 21 min. Column re-equilibration was performed by returning to 100:0 A:B at 22 minutes and holding until 27 minutes. The mass analysis was obtained using an Agilent 6545 Q-TOF mass spectrometer with electrospray ionization (ESI) capillary voltage +3.2 kV, nitrogen gas temperature 325 °C, drying gas flow rate 8.0 L/min, nebulizer gas pressure 30 psig, fragmentor voltage 130 V, skimmer 45 V, and OCT RF 750 V. Mass data from m/z 70–1000 scans were collected at 5 Hz using Agilent MassHunter Acquisition software (v. B.06). Mass accuracy was improved by infusing Agilent Reference Mass Correction Solution (G1969-85001). To assist with compound identification, MS/MS was performed in a Data-dependent Acquisition mode. Five precursors per cycle were obtained using fixed collision energies of 10, 20, and 40 eV.

Peak deconvolution, integration, and alignment was performed using Agilent ProFinder (v. B.06). Peak annotations were performed using the METLIN (www.metlin.scripps.edu) and HMDB (www.hmdb.ca) metabolite databases, with a mass error of less than 15 ppm.

Vincristine quantitation

Sample preparation

100 μL plasma was pipetted to a microcentrifuge tube51. Protein precipitation was performed by adding 200 μL cold methanol. The mixture was vortexed for 2 minutes, chilled at −20 °C for 1 hour, and then centrifuged at 13,000 g for 8 minutes. Supernatant was transferred to a fresh microcentrifuge tube and vacuum concentrated to dryness. Reconstituted in 75 μL of diluent (95% water and 5% acetonitrile, with 0.1% formic acid), sonicated for 5 minutes, centrifuged at 13,000 rpm for 8 minutes, and supernatant was transferred to HPLC vials.

HPLC/MS-MS analysis

Vincristine plasma levels were quantitated by HPLC/MS-MS. Separation was performed on an Agilent Rapid Res 1200 HPLC system using an Agilent Zorbax XDB-C18 (2.1 × 50 mm, 3.5 μm) column. Mobile phase A was water with 0.1% formic acid and mobile phase B was acetonitrile with 0.1% formic acid. Initial conditions were 95:5 A:B, held for 0.5 minute, followed by a linear gradient to 0:100 at 8 min, and held until 10 min. Column re-equilibration was performed by returning to 95:5 A:B at 11 minutes and held until 15 minutes. Column flow rate was 0.3 mL/min. Retention time for vincristine was 5.9 minutes.

Analytes were quantified by MS/MS utilizing an Agilent 6460 triple quadrupole mass spectrometer with electrospray ionization (ESI). Quantitation was based on Multiple Reaction Monitoring (MRM). ESI positive mode was used with a transition of 825.2 to 807.7 (quantifier) and 825.2 to 765.3 (qualifier), with a collision energy (CE) of 45 and 40 V, respectively. A fragmentor energy of 135 V and a dwell time of 80 ms was used. Source parameters were as follows: nitrogen gas temperature = 330 °C and flow rate = 10 L/min, nebulizer pressure = 35 psi, sheath gas temperature = 250 °C, sheath gas flow rate = 7 L/min, and capillary potential = 3.5 kV. All the data were collected and analyzed with Agilent MassHunter B.03 software. Quantitation was based on a 6 point standard curve, with concentration range from 0.1 to 500 ng/mL, by spiking vincristine into unmedicated human plasma (Sigma). Standard curves were fit to a quadratic function, with a 1/x curve fit weighting. Correlation coefficients >0.99 were obtained.

Metabolite profiling data analysis

Data analysis was performed in R52. First, metabolites that were not present in at least 75% of the patients in at least one group were discarded. Subsequently, missing data imputation was performed using a modified k-nearest neighbours (KNN) imputation as specified in53. Impute package in R was used to perform the KNN imputation54. Hierarchical clustering was performed using hclust, with ward.D2 as the method and Euclidean distance as the metric. For metabolite selection, recursive feature elimination (RFE) along with repeated 5-fold cross validation support vector classifier was used. Repetition was done 20 times. This was further iterated 100 times and the model was chosen based on highest AUROC and lowest specificity standard deviation. For recursive feature elimination, all sizes of number of predictors between 1 to 50 were allowed. Caret55 package from R was used for this. The method specified in caret for this algorithm is svmLinear2, from the package e107156. Tunelength was kept as 1 and hence the cost parameter was fixed to 0.25. SVM with linear kernel was chosen because more tuned parameters would be needed for nonlinear kernels. Since our dataset had few samples, that may have lead to overfitting. Final model fitting results specified in Table 2 were based on resampling from bootstrapping 25 times, using the selected number of predictors. This was the default model fitting mode in caret while performing RFE.

For model training after finalizing the set of metabolites, we trained using 5-fold repeated cross validation, repeated 20 times. The metric for accuracy was kept as AUROC. Here, tuneLength was kept as 10. Cost function of the SVM model was tuned using this. After model training, for deciding the probability threshold, thresholds between 0.5 and 1 were explored. The one corresponding to minimum distance from perfect sensitivity and specificity (i.e. both of them equal to 1) was chosen. Subsequently, confusion matrix and its metrics were generated using caret package’s confusion matrix generator.