Metabolite biomarker discovery for human gastric cancer using dried blood spot mass spectrometry metabolomic approach

As one of the most common malignancies, gastric cancer (GC) is the third leading cause of cancer-related deaths in China. GC is asymptomatic in its early stages, and most GC mortality is attributable to the late onset of symptoms. Reliable biomarkers for identifying GC are therefore urgently needed to improve outcomes. A combination of dried blood spot sampling and direct infusion mass spectrometry (MS) was used to measure blood metabolic profiles of 166 patients with GC and 183 healthy individuals, and 93 metabolites, including amino acids, carnitine/acylcarnitines and their derivatives, and related ratios, were quantified. Multiple algorithms were used to characterize the changes in metabolic profiles in patients with GC compared to healthy individuals. A biomarker panel was identified in the training set and assessed by tenfold cross-validation and an external test data set. After systematic selection from the 93 metabolites, a biomarker panel consisting of Ala, Arg, Gly, Orn, Tyr/Cit, Val/Phe, C4-OH, C5/C3, and C10:2 showed the potential to distinguish patients with GC from healthy individuals in the tenfold cross-validation model (sensitivity: 0.8750, specificity: 0.9006) and in the test set (sensitivity: 0.9545, specificity: 0.8636). This metabolomic analysis contributes to the identification of disease-associated biomarkers and to the development of new diagnostic tools for patients with GC.

As one of the most common malignancies, gastric cancer (GC) is the third leading cause of cancer-related deaths in China 1. According to GLOBOCAN 2018 data, around 1,034,000 new cases of GC and more than 782,000 deaths from GC occurred in 2018 2. GC develops through a multifactorial, multistep process that begins with active chronic gastritis caused by Helicobacter pylori infection 3. It is often described as a stepwise progression from non-active gastritis via chronic active gastritis into precursor lesions of GC and finally GC 4,5. Most GC is adenocarcinoma, which derives from the glandular epithelium of the gastric mucosa 4. However, GC is asymptomatic in its early stages, and most GC mortality is attributable to the late onset of symptoms. The 5-year overall survival rate for patients with GC diagnosed at advanced stages falls to 20% 6. Among screening methods for early detection of GC, endoscopy is sensitive and the most commonly used 7; nevertheless, the risk of complications and patient discomfort limit its wide use. Furthermore, although traditional circulating cancer biomarkers are available, their diagnostic efficacy for patients with GC is unsatisfactory because of their low sensitivity 8. Finding reliable biomarkers for disease identification is therefore of the highest interest for improving outcomes for patients with GC. Investigating the quantity or type of molecules in organisms via metabolomics can provide a better understanding of the biochemical status of a system or indicate changes that have occurred within the metabolome 9,10. Metabolomics technology can aid in cancer discovery and in building cancer diagnostic tools, and can provide an opportunity to understand the molecular mechanism.
Metabolic changes in blood are key events in the development of carcinoma and can be characterized by mapping global metabolic profiles; this analytic technique has been used to interpret possible mechanisms and to identify novel metabolic biomarkers for GC 11,12. Liquid chromatography mass spectrometry (LC-MS), one of the most commonly used platforms in metabolomic studies, can be applied to detect biomolecules because of its peak resolution, high sensitivity, and sufficient reproducibility 13. The levels of 16 metabolites detected by LC-MS, including Gly, Ala, Pro and hexadecanoic acid, were found to be altered in patients with GC compared to a healthy control group, which showed potential for developing biomarkers and therapeutic interventions for GC 14. Another study showed that the ratio of kynurenine/tryptophan was associated with observed metabolic changes in patients with GC, and that monitoring tryptophan metabolites could be used to identify potential biomarkers for GC 11. However, both were relatively small studies, and further clinical sample analysis is still needed for patients with GC. In addition, few reports have characterized the metabolic profiles of amino acids and carnitine/acylcarnitines in patients with GC. Dried blood spot (DBS) sampling is a microvolume sampling technique in which blood samples are collected by heel or finger puncture. Compared to conventional whole blood sampling, it offers relatively high stability, requires a smaller blood volume, simplifies storage and transfer, and reduces the risk of infection by pathogens, and it can serve as an alternative sampling method for metabolomics studies 15,16. The combination of DBS and MS can provide high-throughput, reliable and stable determination of a broad array of analytes, and can be used to select biomarkers with high specificity and sensitivity for some diseases 17,18.
In the present study, a combination of DBS sampling and MS was used to detect biomarkers based on altered levels of amino acids and carnitine/acylcarnitines in patients with GC compared to healthy individuals. Nine parameters, including 4 amino acids, 2 acylcarnitines and 3 related ratios, were detected as potential biomarkers for GC in the training set and were further used to build a prediction model for distinguishing patients with GC from healthy individuals. Hence, changes in some amino acid and carnitine/acylcarnitine levels might indicate GC invasion, and the study's findings suggest new insights into GC detection.

Blood sample collection and pretreatment. Labeled amino acid and relevant carnitine/acylcarnitine internal standards were each dissolved in pure methanol. Stock solutions were prepared by mixing these dissolved isotope standards and were stored at 4 °C. A 100-fold dilution of the stock solution was used as the working solution. For quality control (QC), a pooled QC sample was obtained by mixing equal volumes (10 μL) of all collected samples. Blood samples were collected from each participant after overnight fasting in order to eliminate the influence of diet. DBS samples were collected by fingertip puncture. After the first drop of blood was wiped off, 3-5 drops were collected onto a DBS card. A disc of 3 mm diameter was punched from each DBS card. The collected discs were placed into a Millipore MultiScreen HV 96-well plate (Millipore, Billerica, MA, USA) for metabolite extraction. A 100 μL volume of working solution was added to each well containing a DBS disc. After 20 min of gentle shaking, the plate was centrifuged at 1500 rpm for 2 min, and the filtrate was then collected into a new flat-bottom 96-well plate. In order to monitor the stability of the MS analysis, 2 low-level and 2 high-level QC sample solutions were randomly placed into 4 blank wells.
The filtrate and QC solutions were dried under a flow of pure nitrogen gas at 50 °C, and the samples were then derivatized with 60 μL of an acetyl chloride/1-butanol mixture (10:90, v/v) at 65 °C for 20 min. After the derivatized solution was dried again, 100 μL of mobile phase solution was added to each dried sample for the subsequent metabolomics analysis.

Metabolomics analysis. Direct injection MS was used for quantitative metabolomic analysis on an AB Sciex 4000 QTrap system (AB Sciex, Framingham, MA) equipped with an electrospray ionization source, and the MS analysis was conducted in positive mode. A sample volume of 20 μL was injected into the system. 80% aqueous acetonitrile was used as the mobile phase. The initial flow rate was set to 0.2 mL/min. The flow rate was decreased to 0.01 mL/min within 0.08 min and held until 1.5 min. Subsequently, the flow rate was returned to 0.2 mL/min within 0.01 min and maintained for 0.5 min. MS parameters were set as follows: ion spray voltage 4.5 kV, curtain gas pressure 20 psi, auxiliary gas temperature 350 °C. Sheath and auxiliary gas pressure was set at 35 psi. The scan modes and scan parameters followed a previous report 19. Analyst 1.6.0 software (AB Sciex) was used to control the system, align spectra, and collect MS data. ChemoView 2.0.2 (AB Sciex) was used for absolute quantification.

Data analysis. Multivariate analysis of the metabolomics data was performed using SIMCA-P 12.0 software (Umetrics AB, Umea, Sweden). Principal component analysis (PCA) was used to survey holistic metabolome alterations between patients with GC and healthy individuals and to inspect the stability of this study. In addition, partial least squares discriminant analysis (PLS-DA) was applied to differentiate patients with GC from healthy individuals and to determine the important variables contributing to this classification based on variable importance in projection (VIP) values. Subsequently, a permutation test was used to evaluate the risk of over-fitting of the PLS-DA model. For parametric variables, the t-test was used to identify differential metabolites between the healthy control (HC) and GC groups. The Wilcoxon-Mann-Whitney test was performed for nonparametric variables. The Benjamini-Hochberg false discovery rate (FDR) procedure was used to adjust p-values for multiple hypothesis testing.
Volcano plots were generated to screen important variables (VIP > 1, fold change (FC) > 1.2 or < −1.2, adjusted p-value < 0.05) in the GC group compared to the HC group. To further investigate metabolite changes in the GC group compared to the HC group, the significance analysis of microarrays (SAM) method was applied. Ultimately, potential biomarkers were selected by a stepwise selection method. The selected potential biomarkers were used to build a binary logistic regression model for distinguishing patients with GC from healthy individuals. The performance of this model was assessed by tenfold cross-validation and an external test set. A receiver operating characteristic (ROC) curve was created to measure the ability of the potential biomarkers to discriminate between patients with GC and healthy individuals. Statistical analysis was performed using SAS software. The online software MetaboAnalyst 5.0 was used for pathway analysis based on the differential metabolites between the HC and GC groups.
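
The Benjamini-Hochberg adjustment named in the data analysis can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the SAS procedure used in the study; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five features (not data from the study).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw p = {p:.3f}, adjusted p = {q:.5f}")
```

In this made-up example the third and fourth features have raw p-values below 0.05 but adjusted values slightly above it, which is why the screening threshold is applied to the adjusted p-value rather than the raw one.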

Results
Demographics of study samples. The workflow for this study is shown in Fig. 1.

Metabolic differences between GC and HC groups. A total of 93 variables, including 23 amino acids, 26 carnitine/acylcarnitines, and 44 derived parameters and related ratios 19, were detected in healthy participants and patients with GC for subsequent univariate and multivariate analyses. All detected variables are provided in Supplementary Table S1. Unsupervised PCA was performed on the metabolomics data from blood samples of the HC and GC groups in order to investigate the altered metabolites. The GC group tended to separate from the HC group based on the 93 parameters in the training set (Fig. 2A). Furthermore, supervised PLS-DA was performed to determine the separation between the HC and GC groups using all 93 variables. The PLS-DA score plot (Fig. 2B) showed an apparent separation between patients with GC and healthy individuals, without overfitting (Fig. 2C), in the training set.
The screening of significantly differential metabolites. Systematic screening of important metabolites was carried out using multiple approaches. Firstly, a total of 29 metabolites with VIP > 1, which contribute to the classification between the HC and GC groups according to the PLS-DA score plot, were selected (Fig. 3A). Secondly, differences for all metabolites were evaluated with the Wilcoxon-Mann-Whitney test or t-test, and the FDR was controlled to adjust significance levels for multiple hypothesis testing. A total of 45 parameters were retained with adjusted p-value < 0.05 (Fig. 3B). Thirdly, FC was calculated, and 30 features with FC > 1.2 or < −1.2 were altered in the GC group compared to the HC group. Together, 25 of these metabolites with VIP > 1, adjusted p-value < 0.05, and FC > 1.2 or < −1.2 exhibited significant alterations in patients with GC compared to healthy individuals in the training set (Fig. 3C). All detected variables and their adjusted p-value, VIP, and FC values are shown in Supplementary Table S1. SAM was used to further examine and define the significant metabolite changes in patients with GC compared to healthy individuals (Fig. 4). Finally, 23 metabolites contributed to the difference between the two groups (Table 2). Among these metabolites, the levels of 15 features were significantly increased and, conversely, the levels of 8 features were significantly decreased in patients with GC compared to healthy individuals.
To further clarify the metabolic pathways that may be affected by GC, the differential metabolites between healthy individuals and patients with GC (Table 2) were imported into MetaboAnalyst 5.0 for pathway analysis. As shown in Fig. 5, 8 metabolic pathways were highlighted, involving amino acid metabolism, the urea cycle, the malate-aspartate shuttle, and lipid metabolism, among others.
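
The three-criterion screen described in this section (VIP > 1, adjusted p-value < 0.05, FC > 1.2 or < −1.2) can be sketched as a simple filter. The text does not define how the signed fold change is computed, so the convention below (GC/HC ratio when it is at least 1, otherwise the negative inverse ratio) is an assumption, and all feature names and values are hypothetical.

```python
def signed_fold_change(mean_gc, mean_hc):
    """Signed fold change of GC relative to HC (assumed convention)."""
    ratio = mean_gc / mean_hc
    # Ratios below 1 are reported as negative inverse ratios, e.g. 0.5 -> -2.
    return ratio if ratio >= 1 else -1.0 / ratio


def passes_screen(vip, adj_p, fc):
    """Apply the three thresholds stated in the text."""
    return vip > 1 and adj_p < 0.05 and abs(fc) > 1.2


# Hypothetical features (placeholders, not values from Supplementary Table S1).
features = {
    "feature_A": {"vip": 1.8, "adj_p": 0.001, "mean_gc": 30.0, "mean_hc": 20.0},
    "feature_B": {"vip": 1.3, "adj_p": 0.010, "mean_gc": 10.0, "mean_hc": 15.0},
    "feature_C": {"vip": 0.7, "adj_p": 0.002, "mean_gc": 26.0, "mean_hc": 20.0},
    "feature_D": {"vip": 1.5, "adj_p": 0.200, "mean_gc": 40.0, "mean_hc": 20.0},
}

selected = sorted(
    name
    for name, f in features.items()
    if passes_screen(
        f["vip"], f["adj_p"], signed_fold_change(f["mean_gc"], f["mean_hc"])
    )
)
print(selected)
```

Only features that satisfy all three thresholds at once are kept, which mirrors how the 29, 45 and 30 features passing the individual criteria narrow to a smaller common set.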
Building prediction model. Stepwise logistic regression was applied to the 23 selected metabolites (Table 2) in the training set. Finally, 9 features were identified: Ala, Arg, Gly, Orn, Tyr/Cit, Val/Phe, C4-OH, C5/C3, and C10:2 (Fig. 6). A logistic regression model was developed as follows: Logit probability = 2.05 − 1.35 × Ala + 5.68 × Orn + 1.80 × Arg + 2.39 × C4-OH − 0.90 × Tyr/Cit − 0.62 × Val/Phe + 1.20 × C5/C3 − 1.35 × C10:2 + 0.76 × Gly. The diagnostic performance of this metabolic panel was evaluated by tenfold cross-validation and an external test set (Table 3). Furthermore, a ROC curve was drawn to assess the potential of the metabolic panel to distinguish between patients with GC and healthy individuals (Fig. 7). The area under the ROC curve (AUC) was 0.9586 (95% CI 0.9384-0.9788) in the training set. The sensitivity and specificity were 0.8611 and 0.9565 in the training set, respectively. For tenfold cross-validation, all samples in the training set were randomly divided into 10 partitions in order to cross-validate the prediction model. Additionally, 44 blood samples, from 22 patients with GC and 22 healthy individuals, were used as a test set to further assess the diagnostic potential of the 9 selected metabolic biomarkers. As shown in Table 3, AUCs of 0.9438 (95% CI 0.9163-0.9714) and 0.9318 (95% CI 0.8525-1.0000) were obtained in tenfold cross-validation and the test set, respectively. Both sensitivity and specificity were also satisfactory in tenfold cross-validation (sensitivity: 0.8750; specificity: 0.9006) and in the test set (sensitivity: 0.9545; specificity: 0.8636).
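
As an illustration of how the fitted model above maps a sample to a probability, the following Python sketch evaluates the reported logit and applies the logistic function. The coefficients are copied from the equation in the text; the units or scaling of the inputs and the decision cutoff are not stated in the text, so the input values here are placeholders, and the function names are hypothetical.

```python
import math

# Coefficients as reported in the logistic regression equation.
INTERCEPT = 2.05
COEFFICIENTS = {
    "Ala": -1.35,
    "Orn": 5.68,
    "Arg": 1.80,
    "C4-OH": 2.39,
    "Tyr/Cit": -0.90,
    "Val/Phe": -0.62,
    "C5/C3": 1.20,
    "C10:2": -1.35,
    "Gly": 0.76,
}


def logit_score(values):
    """Linear predictor: intercept plus coefficient-weighted feature values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


def predicted_probability(values):
    """Logistic transform of the linear predictor, written to avoid overflow."""
    z = logit_score(values)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Placeholder input: all features at zero, so the logit equals the intercept.
baseline = {name: 0.0 for name in COEFFICIENTS}
print(logit_score(baseline), predicted_probability(baseline))
```

A sample would be classified by comparing the predicted probability with a cutoff; because the cutoff and the input scaling are not given in the text, this sketch stops at the probability.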

Discussion
Currently, identification of novel blood biomarkers remains a pivotal goal for GC, and the limitations of modern technology for the detection and treatment of the disease emphasize the necessity of finding novel potential biomarkers. However, few biomarker candidates can be translated into clinical applications because of limited diagnostic performance or limited study cohorts 20. Specific physiological or pathological conditions can perturb blood metabolites, which can be used as potential biological indicators of normal and pathological biological processes. Thus, the detection of perturbed small-molecule metabolites can provide a powerful tool for cancer diagnosis. In the present study, a total of 349 subjects were recruited, including 183 healthy individuals and 166 patients with GC, and were divided into a training set of 305 subjects and a test set of 44 subjects. A combination of DBS sampling and direct injection MS analysis was performed to detect metabolite biomarkers for GC. After systematic selection, there were significant differences in the levels of 23 metabolites between patients with GC and healthy individuals (Table 2). Furthermore, independent predictors were identified by stepwise logistic regression analysis, and a biomarker panel consisting of Ala, Arg, Gly, Orn, Tyr/Cit, Val/Phe, C4-OH, C5/C3, and C10:2 was used to construct a prediction model for GC.

Table 2. The differential parameters between patients with GC and healthy individuals. Columns: No; Parameters; HC (mean ± SD); GC (mean ± SD); Status a; Adjusted p-value. HC, healthy control; GC, gastric cancer; Asp, aspartic acid; Arg, arginine; Gly, glycine; Ser, serine; Orn, ornithine; C3DC, malonylcarnitine; C4-OH, hydroxybutyrylcarnitine; C18:1, octadecenoylcarnitine; Ala, alanine; Cit, citrulline; C2, acetylcarnitine; C0, free carnitine; C10, decanoylcarnitine; C5, isovalerylcarnitine; C3, propionylcarnitine; Pro, proline; Met, methionine; Phe, phenylalanine; Tyr, tyrosine; Val, valine; C10:2, decadienoylcarnitine. a Defined as the increased (upward arrow) or decreased (downward arrow) levels of metabolites in patients with GC compared to healthy individuals.

Figure 5. A pathway impact analysis based on differential metabolites between the GC and HC groups in the training set. Eight perturbed metabolic pathways were indicated for patients with GC.

Metabolic reprogramming is regarded as a central hallmark of cancer. The amino acid and lipid metabolic pathways were disturbed in patients with GC, as revealed by differential pathway analysis (Fig. 5). Identifying how metabolism shifts in patients with cancer can contribute to disease diagnosis and prediction. Amino acid levels are disturbed by an imbalance in protein metabolism arising from host-tumor interactions and the metabolic requirements of tumor cells for specific amino acids 21, and they have shown potential for improving the diagnosis and detection of early-stage cancer 22. In the present study, 23 metabolites were significantly altered with VIP > 1, adjusted p-value < 0.05, and FC > 1.2 or < −1.2 in patients with GC compared to healthy individuals, including Asp, Arg, Gly, Ser, Orn, Ala, and Pro. Interestingly, the levels of Asp, Arg, Gly, Ser, and Orn were increased in patients with GC compared to healthy individuals (Table 2). As a non-essential amino acid, Asp is the basic substrate for the synthesis of pyrimidine and purine nucleosides. Furthermore, Ser is also involved in the synthesis of purine nucleotides via Gly, and it can influence the growth and invasion of cancer cells 23. The increased uptake rates of Asp and Ser imply that these amino acids are needed to fuel nucleoside biosynthesis for tumor proliferation 24. As a semi-essential amino acid, Arg is a key component in the body; it is involved in cell division, the immune system, and hormone biosynthesis, and contributes to immunosurveillance, tumor growth and metastasis 25, which implies that Arg is required to fuel tumor cell metabolism. Its metabolism may be influenced by the overexpression of argininosuccinate synthase 1 (ASS1) in GC 26. High expression of ASS1 can lead to increased NO production, which promotes gluconeogenesis via S-nitrosylation of pyruvate carboxylase and phosphoenolpyruvate carboxykinase 2.
The increased gluconeogenesis may further enhance the levels of Ser and Gly in nucleotide synthesis 27. An increased level of Arg can lead to a decreased Cit/Arg ratio, and this ratio has been found to reflect NO production 28. Orn acts both as a substrate of ornithine decarboxylase (ODC) to produce polyamines and as a substrate of ornithine aminotransferase (OAT) to produce Pro, and all these products are involved in the promotion of cancer progression 29. Therefore, altered levels of Orn and Pro suggest disorders of ODC and OAT metabolism and a requirement for these amino acids in GC progression. The Orn/Cit ratio has been found to reflect a shift in arginine metabolism 30. This ratio is influenced by the increased level of Orn, reconfirming the altered arginine metabolism in GC. The upregulated glycolysis in cancer metabolism, also known as the Warburg effect, promotes compensatory pathways, especially oxidation of fatty acids 31. Carnitine/acylcarnitines, intermediates of fatty acid oxidation, are essential for fatty acid oxidation and energy metabolism, and accumulate as a consequence of metabolic defects. Considering these adaptations, the carnitine pool is uniquely positioned to reflect perturbations of carnitine/acylcarnitine metabolism, and it is useful for discovering the disturbed metabolic pathways during cancer development and progression 32. Carnitine palmitoyl transferase 1 (CPT1) is associated with the metabolism of acylcarnitines by catalyzing the conversion of acyl-CoA into acylcarnitine, and controls the entry of long-chain fatty acids into the mitochondrial matrix for energy production via fatty acid β-oxidation. A recent study reported that CPT1A is upregulated in patients with GC and is involved in GC progression 33, which may account for the distinct accumulation of acylcarnitines in GC.
In addition, the expression of carnitine acetyltransferase (CrAT) has been reported to be upregulated in cancer, which may also promote alterations in carnitine metabolism 34. In this study, the levels of 4 short-chain acylcarnitines (C2, C3DC, C4-OH, C5) and one long-chain acylcarnitine (C18:1) were increased in patients with GC compared to healthy individuals (Table 2), and the accumulation of acylcarnitine metabolites may be due to the abnormal expression of these enzymes. The C2/C0 ratio was elevated in patients with GC, which further indicated increased mitochondrial fatty acyl transport and β-oxidation of fatty acids in GC 35. Furthermore, it has been reported that short-chain carnitine-acylcarnitine translocase in mitochondria and short-chain acylcarnitine levels may be related to the metabolism of branched-chain amino acids (BCAA) 36, and a changed C3/C5 ratio implicates altered flux through BCAA metabolic pathways 37. The abnormal levels of short-chain acylcarnitines may further influence the Val/Phe ratio. Several studies reported that free carnitine and short-, medium-, and long-chain acylcarnitines were disturbed in patients with cancer and showed potential as candidate biomarkers for the development of certain cancers 38,39. Taken together, these findings suggest that the detection of carnitine/acylcarnitine changes may provide a promising new strategy against GC.
A high-performance biomarker panel consisting of Ala, Arg, Gly, Orn, Tyr/Cit, Val/Phe, C4-OH, C5/C3, and C10:2 was identified and validated for separating patients with GC from healthy individuals, as displayed in Table 3. Tenfold cross-validation was performed on the training set data to evaluate classifier performance, and it showed that the diagnostic performance of this biomarker panel was satisfactory (AUC: 0.9438). An independent test data set of 44 subjects, including 22 patients with GC and 22 healthy individuals, was used to assess the reliability of this metabolite biomarker panel. It showed that this biomarker panel can effectively discriminate patients with GC from healthy individuals (AUC: 0.9318). These results highlight that the metabolite biomarker panel may serve as a potentially valuable tool for detecting GC.
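
For readers who want to relate the reported test-set sensitivity and specificity to counts, the following Python sketch shows the calculation. The counts used here are not reported in the text; they are an inference, being the only whole-number counts out of 22 patients and 22 healthy individuals that round to the reported 0.9545 and 0.8636. The function names are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of patients with GC that the panel classifies as GC."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of healthy individuals that the panel classifies as healthy."""
    return true_negatives / (true_negatives + false_positives)


# Inferred counts for the 44-subject test set (assumption, see lead-in).
test_sensitivity = sensitivity(true_positives=21, false_negatives=1)
test_specificity = specificity(true_negatives=19, false_positives=3)
print(round(test_sensitivity, 4), round(test_specificity, 4))
```

With a test set of this size, each misclassified subject shifts sensitivity or specificity by about 0.045, which is consistent with the wide confidence interval reported for the test-set AUC.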
In this single-center case-control study, a combination of DBS sampling and MS was used for high-throughput detection of metabolites. A metabolite biomarker panel with diagnostic potential for GC was identified. However, this study has some limitations. Firstly, since GC is considered to progress stepwise from non-active gastritis 4,5, we believe that this study would be more systematic if a reasonable number of patients with gastritis could be recruited. Secondly, the metabolites detected in this study were limited by the trade-off between coverage and cost, and more metabolites, such as fatty acids, will be measured in order to select more potential biomarkers. Thirdly, more patients with advanced GC will be recruited in order to perform metabolomics analysis of patients at different stages. Finally, a multi-institution study with a larger sample size is still required to further assess this study's results.

Conclusion
In summary, a combination of DBS sampling and direct injection MS technology was used to detect metabolites for patients with GC and healthy individuals. Results obtained displayed significantly altered metabolomic profiles in patients with GC compared to healthy individuals. A metabolite biomarker panel of Ala, Arg, Gly, Orn, Tyr/Cit, Val/Phe, C4-OH, C5/C3, C10:2 was determined as an effective tool with satisfactory sensitivity and specificity for discriminating patients with GC from healthy individuals. Therefore, we believe that these selected metabolites have potential as novel biomarkers in the detection of GC.

Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.