Identification of metabolic biomarkers in patients with type 2 diabetic coronary heart diseases based on metabolomic approach

Type 2 diabetic coronary heart disease (T2DM-CHD) is a serious and complex disease. Great attention has been paid to exploring its mechanism; however, the detailed understanding of T2DM-CHD is still limited. In this research, plasma samples from 15 healthy controls, 13 coronary heart disease (CHD) patients, 15 type 2 diabetes mellitus (T2DM) patients and 28 T2DM-CHD patients were analyzed. Potential biomarkers of CHD and T2DM were detected and screened by 1H NMR-based plasma metabolic profiling and multivariate data analysis. In total, 11 and 12 representative metabolites of CHD and T2DM, respectively, were identified, mainly including alanine, arginine, proline, glutamine, creatinine and acetate. A diagnostic model was then constructed from these CHD and T2DM metabolites to detect T2DM-CHD, with a sensitivity of 92.9%, a specificity of 93.3% and an accuracy of 93.2%, supporting the robustness of 1H NMR-based plasma metabolic profiling as a diagnostic strategy. The results demonstrated that the NMR-based metabolomics approach performed well in identifying diagnostic plasma biomarkers, and that most of the identified metabolites related to T2DM and CHD could be considered predictors of T2DM-CHD as well as therapeutic targets for prevention, which provides new insight into the diagnosis and forecasting of complex diseases.

Metabonomics, a postgenomic approach used to rapidly identify global metabolic changes in biological systems, has been increasingly applied to diagnose diseases, measure the response to treatment, discover biomarkers and identify perturbed pathways [15][16][17]. Nuclear magnetic resonance (NMR) spectroscopy is a rapid, non-destructive and high-throughput analytical method and has been widely used in metabonomic research [18][19][20][21][22]. NMR-based metabolomic approaches, which provide sensitive high-throughput molecular screening, have already shown promising results in diagnosing a variety of diabetes mellitus and cardiovascular system disorders [23][24][25].
In this study, we made a novel attempt to explore the potential biomarkers related to CHD and T2DM and to validate these potential biomarkers as predictors for diagnosing patients with T2DM-CHD based on NMR non-targeted metabolomics. Plasma samples from T2DM and CHD patients were analyzed by NMR metabolic profiling, principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) to screen out potential biomarkers. A logistic regression model was constructed from the biomarkers of T2DM and CHD patients for T2DM-CHD prediction and evaluated by ROC curve analysis. This process may accelerate the understanding of the mechanism of T2DM-CHD occurrence and progression at the metabolic level and provide information for predicting early marker metabolites of T2DM-CHD.

Results
Demographics and Clinical Characteristics. Detailed data about patients and controls are presented in Table 1. There was no significant difference in gender, age, Body Mass Index (BMI), Systolic Blood Pressure (SBP), Diastolic Blood Pressure (DBP), total cholesterol, Blood Urea Nitrogen (BUN) or serum creatinine (SCr) among the four groups based on SPSS analysis (p > 0.05). The levels of triglycerides and HbA1c in T2DM and T2DM-CHD were higher than those of the controls (p < 0.05). Apart from HDL in T2DM (p > 0.05), HDL in the other two groups was higher than that in the controls (p < 0.05). As expected, fasting plasma glucose (FPG), 2 h plasma glucose (2hPG) and fasting insulin (FINS) in T2DM-CHD and T2DM were higher than in the controls (p < 0.05), particularly in T2DM-CHD (p < 0.0001). The levels of LDL in CHD and T2DM were slightly higher than those in healthy controls (HC), perhaps due to the influence of medication such as statins and insulin. Therefore, the findings cannot be attributed to demographic factors.

1H-NMR analysis of plasma samples.
Plasma contains almost all of the low molecular weight species in whole blood and a few high molecular weight compounds, and thus it can provide valuable information on the organism's metabolism. Figure 1 shows representative 600 MHz 1H NMR CPMG spectra of plasma from the healthy controls, T2DM group, CHD group and T2DM-CHD group. The plasma NMR spectra were dominated by LDL/VLDL (δ 0.86, δ 1.26), leucine (δ 0.95, δ 0.97), valine (δ 1.03), lactate (δ 1.33) and other metabolites. Resonances were assigned according to the literature [26][27][28] and an in-house NMR database, and further confirmed by 2D NMR spectroscopy (spectra shown in Fig. S1). Visual inspection of the 1H NMR spectra showed subtle differences in plasma metabolites between groups. In the 1H NMR spectra of plasma samples, the dominant signal changes were detected among low molecular weight metabolites such as leucine, isoleucine, valine, alanine, glutamine, creatine, proline and glucose. Multivariate data analysis was then performed to obtain a more detailed analysis of the metabolic differences between groups.
Multivariate data analysis and the selection of potential biomarkers. PCA was used to obtain an overview of the metabonomic data set, to spot outliers, and then to detect any grouping. This type of analysis is designed to highlight systematic variation across a series of NMR spectra. It results in the calculation of a series of principal component (PC) scores for each sample. The PCA scores plot was used to reveal observations lying outside the 0.95 Hotelling's T² ellipse. The score plot was obtained with the first two PCs, which explained 47.2% and 14.5% of the variance, respectively (Fig. 2A). A PLS-DA model was established to investigate the metabolic differences among the four groups. The PLS-DA score plot displayed a good separation between the HC group and the disease groups (Fig. 2B). Then, two PLS-DA models with satisfactory discriminating ability were established to assess the metabolic differences between each of two disease groups (CHD and T2DM) and the HC group (Fig. 3). According to the score plots of the PLS-DA models, CHD patients and HC were clearly discriminated with R²X = 18.5%, R²Y = 95.2% and Q² = 0.707 (Fig. 3A), and the T2DM patients and HC were discriminated with R²X = 17.7%, R²Y = 96.9% and Q² = 0.675 (Fig. 3C). Both R²Y and Q² exceeded 0.5, which suggested that the PLS-DA models were robust [29]. The validation plots (Fig. 3B,D) demonstrated that the original PLS-DA models were neither random nor overfitted, as both permuted Q² and R² values were significantly lower than the corresponding original values.
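The PCA overview step can be illustrated with a minimal sketch, assuming a samples-by-bins matrix of normalized integrals. The function name `pca_scores` and the random matrix are hypothetical and serve only as an illustration; the study itself used SIMCA-P+, not this code.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PCA scores and the fraction of variance explained per component.

    X is a (samples x variables) matrix; columns are mean-centered first.
    """
    Xc = X - X.mean(axis=0)
    # Singular value decomposition of the centered matrix: Xc = U * S * Vt
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Hypothetical data: 71 samples (the cohort size in the text) x 50 spectral bins
rng = np.random.default_rng(0)
X = rng.normal(size=(71, 50))
scores, explained = pca_scores(X)
```

Plotting the two columns of `scores` against each other gives a score plot of the kind shown in Fig. 2A, and `explained` corresponds to the per-component variance percentages quoted there.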
In order to eliminate the influence of individual differences and gain insight into the changed metabolites responsible for the separation between two groups, the OPLS-DA model was constructed using the first principal component and the first orthogonal component. Fig. 4 shows the OPLS-DA score plots for pairwise comparison of CHD, T2DM and HC group samples, along with the corresponding coefficient plots depicting the major discriminators. A significant biochemical distinction was identified between the CHD patients and HC in the score plot (Fig. 4A, R²Y = 95.2%, Q² = 0.462), and also between the T2DM patients and HC (Fig. 4C, R²Y = 96.9%, Q² = 0.622). The metabolic changes in patients were reflected in the color-coded coefficient plots (Fig. 4B,D). Metabolites exhibiting significant changes (p < 0.05) were identified based on the absolute cutoff value of the correlation coefficients (|r|) and the VIP value, and are listed in Table 2. The resonances assigned to proline and creatine were significantly increased, whereas the levels of isopropanol, alanine, leucine, arginine, acetate, glutamine, glycine, glucose and 3-methylhistidine were significantly decreased in the CHD group compared to those of the HC group. The T2DM group had lower levels of isoleucine, leucine, valine, isopropanol, alanine, arginine, glutamine, proline, creatinine, threonine and tyrosine, but higher levels of glucose, compared to those of the HC group. The potential biomarkers related to T2DM and CHD screened out above were used to predict the process and mechanism of T2DM-CHD.
Hierarchical cluster analysis (HCA) of biomarkers for T2DM-CHD diagnosis. HCA can readily be used to assess the relatedness and distance of any type of samples characterized by any type of descriptors, and the result is displayed as a heatmap. We used the metabolites listed in Table 2 as the variables to conduct the HCA and obtained the heatmap in Fig. 5, which shows the similarity of different metabolites and different samples visually. The heatmap showed that the T2DM-CHD patients and healthy controls were almost completely separated from each other. The metabolic state of T2DM-CHD patients was characterized by decreased levels of isopropanol, glycine, alanine, arginine, proline, glutamine, acetate, creatine, 3-methylhistidine, creatinine, isoleucine, tyrosine, valine, threonine and leucine, as well as elevated levels of VLDL/LDL and glucose. The result of HCA further illustrated that these metabolites could distinguish the T2DM-CHD patients from HC, so these endogenous metabolites could be used as potential biomarkers.
Prediction and the diagnostic test to the T2DM-CHD disease. The 17 potential metabolites responsible for discrimination between T2DM-CHD patients and HC were identified. Table 3 shows the variation of the integrals of the normalized spectral regions responsible for these 17 metabolites and lists the results from the Student's t-test (p < 0.05) for comparison of HC and T2DM-CHD.
As shown in Fig. 6A, T2DM-CHD patients and HC were completely separated in the PLS-DA score plot based on the 17 potential metabolites (R²X = 56.7%, R²Y = 84.9%, Q² = 0.72), suggesting a severe metabolic disturbance of these metabolites in T2DM-CHD patients; the supervised PLS-DA model showed a good fit (Fig. 6B).
Then, ROC curve analysis was performed to validate the clinical value of these potential biomarkers in diagnosing T2DM-CHD. The area under the ROC curve (AUC) is generally considered the method of choice for evaluating the performance of potential biomarkers: the greater the AUC, the better the prediction of the model. Fig. 7A shows a set of ROC curves for SVM models created using different subsets of metabolites selected by the filter approach; six models were developed. When the top 2 important variables (isopropanol and glycine) were used to build the classification model, the AUC value was 0.983 and the 95% confidence interval (CI) was 0.933~1. Using a larger number of variables did not yield a greater AUC: the maximum value was 0.983 (95% CI, 0.933~1), obtained with 2 or 3 metabolites as the variables. Meanwhile, the predictive accuracy reached its maximum of 93.2% when 5 or 7 metabolites were used as the variables (Fig. 7B). The metabolites in Fig. 7C were ranked by their contribution to distinguishing T2DM-CHD from HC. The greater the distance from the Y-axis, the greater the contribution of a particular metabolite in distinguishing cases from controls. This plot also indicated whether the metabolite concentration was increased or decreased in cases relative to controls. The metabolites in Fig. 7C included isopropanol, glycine, alanine, arginine, proline, glutamine, acetate, glucose, creatine, 3-methylhistidine, creatinine, isoleucine, tyrosine, valine, threonine and leucine, in decreasing order of importance, while VLDL/LDL was rejected as it made little contribution to distinguishing T2DM-CHD from HC. The predicted class probabilities (average of the cross-validation) for each sample using the best classifier (based on AUC) are illustrated in Fig. 7D. The verification results showed that 26 of the 28 T2DM-CHD samples and 14 of the 15 HC samples were predicted correctly.
Therefore, the OPLS-DA prediction model exhibited a sensitivity of 92.9% and a specificity of 93.3% for T2DM-CHD diagnosis. On the basis of the selected biomarkers, ROC analysis indicated that T2DM-CHD produces a signature set of biomarkers and that these biomarkers can in turn be used to diagnose the disease.
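As a worked check of the diagnostic figures, the sketch below recomputes sensitivity, specificity and accuracy from the prediction counts given in the text (26 of 28 T2DM-CHD and 14 of 15 HC samples predicted correctly). The function name `diagnostic_metrics` is hypothetical. The sensitivity and specificity obtained this way round to the reported 92.9% and 93.3%; the accuracy from the same counts is 40/43, which rounds to about 93.0%, slightly below the reported 93.2%. The text does not say how the reported accuracy was averaged, so this difference is noted here without an explanation.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Counts from the text: 26 of 28 T2DM-CHD and 14 of 15 HC predicted correctly
sens, spec, acc = diagnostic_metrics(tp=26, fn=2, tn=14, fp=1)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```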

Metabolic Pathway and Function Analysis.
In addition, based on the identified biomarkers, plasma metabolic pathway analysis was performed using MetPA software to reveal the pathways most relevant to T2DM-CHD. Pathways with an impact value above 0.1, calculated from pathway topology analysis, were screened out as potential target pathways. According to the impact value, 4 potential target pathways involving 8 metabolites were identified in this research. These 4 pathways, disturbed when T2DM-CHD occurred (Fig. 8), were arginine and proline metabolism; glycine, serine and threonine metabolism; alanine, aspartate and glutamate metabolism; and pyruvate metabolism, which included more than one target. The details of the pathways are displayed in supplementary Table S1 and Figures S2-S5, Supporting Information.

Discussion
The development of CHD and T2DM in patients is a serious problem that compromises the quality of life and survival of patients. Given the tendency toward population aging observed in recent years, the problem of T2DM-CHD has become even more serious. The precise mechanism linking CHD and T2DM is not completely clear and there are still unknown factors. Biomarkers predicting T2DM-CHD are useful for identifying individuals at high risk of developing T2DM-CHD. Metabolomics is increasingly being applied to the identification of biomarkers for disease diagnosis, prognosis and risk prediction. In the present study, a 1H NMR-based metabonomic approach was used to demonstrate metabolic differences between HC and T2DM-CHD. Analysis of the metabolite profiles of plasma samples from CHD and T2DM patients could distinguish patients from healthy controls and provide a fingerprint of the metabolic changes that characterized the disease, which highlights the potential of metabolomic analysis in the evaluation of a disease condition. Seventeen metabolic biomarkers were likely to be associated with T2DM-CHD and showed good performance in terms of both specificity and sensitivity. These metabolites included isoleucine, valine, isopropanol, alanine, leucine, acetate, proline, glutamine, arginine, trans-aconitate, creatine, creatinine, glucose, glycine, threonine, tyrosine and 3-methylhistidine. The diagnostic model, evaluated using ROC curves, was constructed from the metabolites of CHD and T2DM to predict T2DM-CHD with a sensitivity of 92.9%, a specificity of 93.3% and an accuracy of 93.2%.
In our study, four unique metabolic pathways of arginine and proline metabolism, glycine, serine and threonine metabolism, alanine, aspartate and glutamate metabolism, and pyruvate metabolism are identified from T2DM and CHD patients (Fig. 8). The altered metabolites related to T2DM-CHD are most involved in energy metabolism and amino acids metabolism (Fig. 9).
Energy metabolism. Glucose is the major source material for ATP production in cells. ATP is mainly produced through metabolism of glucose under normoxia condition, which is composed of three relay pathways: citric acid cycle (TCA cycle, Krebs cycle), oxygen-independent pathway of glucose to pyruvate in cytoplasm and oxygen-dependent electron transfer chain, respectively 30 . It is expected that reduced oxygen level in CHD patients will significantly affect the TCA cycle since it is oxygen dependent. The anaerobic glycolysis begins to play a dominant role for ATP production under the conditions of hypoxia, leading to the disorder of glucose.
Creatine, synthesized in the liver and kidney, is transported through the blood and taken up by tissues with high energy demands. It can reflect the changes of energy metabolism in the muscles. Creatinine is derived from creatine and phosphocreatine. Creatine has the ability to increase muscle stores of phosphocreatine, potentially increasing the muscle's ability to resynthesize ATP from ADP to meet increasing energy demands. Therefore, the level of creatine and creatinine also reflect the disorder of energy metabolism in T2DM-CHD patients.
Amino acids metabolism. Leucine, isoleucine and valine are essential amino acids whose carbon structures are marked by branch points (branched-chain amino acids, BCAAs). These three amino acids are critical to human life and are particularly involved in stress, energy and muscle metabolism. BCAAs, especially leucine, can be an important source of calories and are superior as fuel to the ubiquitous intravenous D-glucose; they can also stimulate insulin release by pancreatic β-cells in vitro [31]. As important insulin secretagogues, BCAAs exert a regulatory effect on proteolysis and participate in building body organs [32]. Altered BCAA metabolism is one of the characteristics of T2DM.
As the most abundant amino acid in the serum, glutamine is the most important amino acid gluconeogenic precursor for adding new carbon to the glucose pool 33 . Turer et al. 34 used metabolomic profiling to compare cardiac extraction and plasma substrates, and demonstrated that patients with CHD had decreased concentration of glutamate/glutamine. Alanine is highly concentrated in muscle and is one of the most important amino acids released by muscle, functioning as a major energy source. It is an important participant as well as regulator in glucose metabolism, and its levels always parallel blood sugar levels. And reduced concentrations of glutamine and alanine were also observed in T2DM patients, which illustrated the enhancement of gluconeogenesis in the diabetic state. Some of the amino acids are associated with insulinopenia and thus would be seen to be a normal response to gluconeogenesis. Our results are consistent with previous studies which indicate that the conversion of glutamine and alanine is high in T2DM patients 35 [37][38][39] , however, there is growing evidence that elevated BCAA levels may reflect a state of insulin resistance that is not necessarily specific to T2DM 40 .
Arginine is one of the most versatile amino acids in animal cells, serving as a precursor for the synthesis not only of proteins but also of nitric oxide, urea, polyamines, proline, glutamate, creatine and agmatine 41 . It may stimulate the oxidation of energy substrates (including fatty acids and glucose) in adipocytes, liver, skeletal muscle, heart and whole body. Fu et al. have reported that dietary L-arginine supplementation markedly reduced white-fat mass in Zucker diabetic fatty rats 42 .
Isopropanol belongs to the family of alcohols and polyols compounds. The previous report indicated that isopropanol is one of the products from propanoate metabolism, and the substrate for synthesizing acetone catalyzed by the enzyme isopropanol dehydrogenase 43 . Alcohol dehydrogenase oxidizes alcohols to either aldehydes or ketones, with concomitant reduction of NAD + to NADH 44 . Thus, we suggested that the isopropanol is associated with acetone metabolism, which may be a significant differential metabolite in T2DM.
To our knowledge, this study presented a holistic view of the metabolic changes related to T2DM-CHD and may contribute to its diagnosis. However, limitations of our study included a relatively small sample size in each group, which might prevent the differences in some metabolites from being fully apparent, and imperfect diagnostic approaches of altered metabolites. In addition, our understanding of these altered metabolites and their underlying mechanisms remains at a rudimentary level. Future work will focus on confirming/validating current metabolite findings in larger independent patient cohorts and elucidating the biological mechanisms.

Table 2. Quantitative comparison of metabolites found in plasma of CHD patients, T2DM patients and healthy controls. Column groups: CHD; T2DM. Integral in group a (mean ± std) × 10⁻². The arrows (↑/↓) show that the metabolite level increased/decreased compared with healthy controls. a The relative integrals of metabolites were determined from 1D 1H NMR analysis of plasma of each group. b The values of the correlation coefficient extracted from the correlation plots of OPLS-DA models. c The p values were obtained from Student's t-test. The chemical shifts in boldface were those used in calculating integrals and p values.

Conclusion
In the present study, a 1H NMR-based metabolomics method combined with multivariate data analysis was used to distinguish T2DM-CHD patients from healthy controls with high reliability. Seventeen potential biomarkers related to T2DM-CHD were found, and 16 of the 17 metabolites, used as biomarkers in diagnosing T2DM-CHD, exhibited a sensitivity of 92.9%, a specificity of 93.3% and an accuracy of 93.2%. This approach may be useful in improving the diagnosis of T2DM-CHD and may provide new insights for identifying additional novel biomarkers.

Materials and Methods
Ethical approval. All procedures were designed according to the Declaration of Helsinki's ethical principles.
The study protocol has already been ethically reviewed and approved by Ethics Review Committee of Beijing University of Chinese Medicine and the methods were carried out in accordance with the approved guidelines.
Patients were aware of their involvement and signed a written informed consent agreeing to the use of the resulting information for medical publications.
Subjects and participants. The study was conducted with the approval of the ethical committee of Beijing University of Chinese Medicine and all study participants gave informed consent for the investigation. A total of 71 participants from the affiliated Dongzhimen Hospital of Beijing University of Chinese Medicine were matched for age and gender and divided into four study groups: (i) T2DM patients; (ii) T2DM-CHD patients; (iii) CHD patients; (iv) healthy subjects as controls (HC). Detailed data about the four study groups are listed in Table 1. Diabetes was diagnosed according to the American Diabetes Association criteria (2005), and CHD was diagnosed according to the WHO standard criteria (1979). From January 2013 to December 2014, we consecutively recruited patients who had been referred to the outpatient clinic of the affiliated Dongzhimen Hospital of Beijing University of Chinese Medicine for treatment of diabetes and coronary heart disease. The 15 HC subjects were volunteers from the medical examination center of Dongzhimen Hospital in the same period.
General information, past medical history, family history, personal history, and signs were collected within 24 hours after the patients were admitted. Information obtained through the four traditional Chinese diagnostic methods was also recorded. Patient histories and information from the four traditional diagnostic methods were collected by the relevant professionals. These professionals were required to hold an occupational qualification, to have the rank of attending physician or above, and to have more than two years of relevant clinical experience.
Sample collection and preparation. Fasting blood samples were collected from the subjects in the morning by venipuncture and stored in EDTA-containing green-top tubes. Then the samples were centrifuged at 3 000 × g for 10 min at 4 °C to isolate plasma. The plasma samples were stored at − 80 °C until further processing and analysis.

Table column headers (T2DM-CHD vs HC): HMDB; Chemical shift; Integral in HC group a (mean ± std) × 10⁻²; Integral in T2DM-CHD group a (mean ± std) × 10⁻²; r b (T2DM-CHD vs HC) (|r| ≥ 0.532); VIP.

Plasma samples were thawed and prepared by mixing 200 μL of plasma with 400 μL of 1.5 M deuterated phosphate buffer (NaH2PO4 and K2HPO4, including 0.1% TSP, pH 7.47), adding D2O up to 600 μL if the volume of plasma was insufficient. The mixture was left to stand for 5 min at room temperature and then centrifuged at 13 000 rpm at 4 °C for 15 min. The supernatant solution (550 μL) was then transferred into a 5 mm NMR tube for NMR analysis.
Acquisition of 1H-NMR spectra. All the samples were analyzed at 298 K using a Varian VNMRS 600 MHz NMR spectrometer (Varian Inc, Palo Alto, Calif) operating at 599.871 MHz with a 5 mm inverse-proton (HX) triple resonance probe with z-axis gradient coil. 1H NMR spectra of plasma were recorded using the water-suppressed standard 1D CPMG pulse sequence (RD-90°-(τ-180°-τ)n-ACQ), where a fixed total spin-spin relaxation delay 2nτ of 320 ms was applied to attenuate the broad NMR signals from slowly tumbling molecules (such as proteins) and retain those from low molecular weight compounds and some lipid components. The free induction decays (FIDs) were collected into 64 K data points with a spectral width of 12 000 Hz and 128 scans. The FIDs were zero-filled to double size and multiplied by an exponential line-broadening factor of 0.5 Hz before Fourier transformation (FT). Standard COSY, TOCSY, HMBC and J-resolved spectra were also acquired for selected plasma samples for metabolite identification purposes.
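The FID processing described above (exponential line broadening of 0.5 Hz, zero-filling to double size, Fourier transformation) can be sketched as follows. This is an illustration on a synthetic single-resonance FID, not the spectrometer software's processing; the function name `process_fid`, the 1500 Hz offset, the decay constant and the reduced number of points are all assumptions made for the example.

```python
import numpy as np

def process_fid(fid, sw_hz, lb_hz=0.5):
    """Apodize, zero-fill to double size, and Fourier transform a complex FID."""
    n = fid.size
    t = np.arange(n) / sw_hz                 # sampling times (dwell = 1 / SW)
    window = np.exp(-np.pi * lb_hz * t)      # exponential line broadening
    padded = np.zeros(2 * n, dtype=complex)  # zero-filling to double size
    padded[:n] = fid * window
    spectrum = np.fft.fftshift(np.fft.fft(padded))
    freqs = np.fft.fftshift(np.fft.fftfreq(2 * n, d=1.0 / sw_hz))
    return freqs, spectrum

# Synthetic FID (hypothetical): one resonance at 1500 Hz, decay constant 0.2 s
sw = 12000.0      # spectral width from the text, in Hz
n_points = 4096   # reduced from 64 K to keep the example small
t = np.arange(n_points) / sw
fid = np.exp(2j * np.pi * 1500.0 * t) * np.exp(-t / 0.2)
freqs, spectrum = process_fid(fid, sw)
peak_hz = freqs[np.argmax(np.abs(spectrum))]
```

Multiplying by exp(-π·LB·t) adds LB hertz to each line width, which trades a small loss of resolution for improved signal-to-noise; zero-filling only interpolates the spectrum and adds no new information.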
Data reduction and multivariate pattern recognition analysis. All of the 1H NMR spectra were manually phased and corrected for baseline distortion using MestReNova 7.1.0 software (Mestrelab Research, Spain). All the spectra were referenced to the methyl group of lactate at δ 1.336. In order to exploit all metabolic information embedded in the spectra, all NMR spectra (δ 0.5-9.0) were segmented into equal widths of both 0.01 ppm and 0.001 ppm. Spectral regions of δ 4.68-5.10, δ 3.57-3.65, δ 3.06-3.23, δ 2.66-2.72 and δ 2.53-2.60 were excluded to eliminate variations caused by imperfect water suppression, EDTA, and EDTA metal complexes. The area under the spectrum was then calculated for each segmented region and expressed as an integral value. The integrated data were normalized to the total sum of the spectrum before multivariate statistical analysis to give the same total integration value for each spectrum.
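The data reduction steps (equal-width segmentation, exclusion of the water and EDTA regions, total-sum normalization) can be sketched as below. This is a simplified illustration, not the MestReNova workflow used in the study: bin integrals are approximated by summing intensities, and the function name `bin_and_normalize` and the synthetic spectrum are hypothetical.

```python
import numpy as np

# Excluded regions from the text (ppm): water, EDTA and EDTA-metal complexes
EXCLUDED = [(4.68, 5.10), (3.57, 3.65), (3.06, 3.23), (2.66, 2.72), (2.53, 2.60)]

def bin_and_normalize(ppm, intensity, lo=0.5, hi=9.0, width=0.01):
    """Integrate a spectrum into equal-width bins, drop excluded regions,
    and normalize the remaining integrals to a total sum of 1."""
    n_bins = int(round((hi - lo) / width))
    edges = lo + width * np.arange(n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Sum the intensities falling in each bin (a simple stand-in for integration)
    idx = np.floor((ppm - lo) / width).astype(int)
    inside = (idx >= 0) & (idx < n_bins)
    integrals = np.bincount(idx[inside], weights=intensity[inside], minlength=n_bins)
    keep = np.ones(n_bins, dtype=bool)
    for start, stop in EXCLUDED:
        keep &= ~((centers >= start) & (centers <= stop))
    kept = integrals[keep]
    return centers[keep], kept / kept.sum()

# Hypothetical spectrum: flat baseline plus one peak near the lactate methyl signal
ppm = np.linspace(0.0, 10.0, 20001)
intensity = 1.0 + 50.0 * np.exp(-((ppm - 1.336) / 0.005) ** 2)
centers, normalized = bin_and_normalize(ppm, intensity)
```

With 0.01 ppm bins over δ 0.5-9.0 there are 850 segments, of which 80 fall in the excluded regions, leaving 770 variables per spectrum in this sketch.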
Subsequently, the integral values were imported into SIMCA-P+ 12.0 (Umetrics, Sweden) for multivariate statistical analysis. The data were mean centered for PCA and PLS-DA [45][46][47], and in order to improve the separation between groups and minimize other biological and analytical variation, sample classes were modeled using the OPLS-DA algorithm with unit variance scaling. The PCA and PLS-DA score plots were shown with the first and second principal components, while the OPLS-DA models were visualized with the first principal component and the first orthogonal component. The model coefficients locate the NMR variables associated with the specific intervention as y variables. The model coefficients were then back-calculated from the coefficients incorporating the weight of the variables in order to enhance the interpretability of the model; in the coefficient plot, the intensity corresponds to the mean-centered model (variance) and the color scale derives from the unit variance-scaled model (correlation). Thus, biochemical components responsible for the differences between samples detected in the scores plot can be extracted from the corresponding loadings with the weight of the variable contributing to the discrimination. The coefficient plots were generated with MATLAB scripts (downloaded from http://www.mathworks.com) with some in-house modifications and were color-coded with the absolute value of the coefficients (r).
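The two scaling schemes mentioned above, mean centering (for PCA and PLS-DA) and unit variance scaling (for OPLS-DA), differ only in whether each variable is also divided by its standard deviation. A minimal sketch follows; the function names and the small matrix of integrals are hypothetical, and SIMCA-P+ performs these steps internally.

```python
import numpy as np

def mean_center(X):
    """Subtract each column mean, so variables keep their original variance."""
    return X - X.mean(axis=0)

def unit_variance_scale(X):
    """Mean-center, then divide each column by its sample standard deviation."""
    Xc = mean_center(X)
    sd = X.std(axis=0, ddof=1)
    return Xc / sd

# Hypothetical integrals: 4 samples x 3 spectral bins
X = np.array([[1.0, 10.0, 0.2],
              [2.0, 14.0, 0.1],
              [3.0, 12.0, 0.4],
              [4.0, 16.0, 0.3]])
X_mc = mean_center(X)
X_uv = unit_variance_scale(X)
```

Mean centering lets intense signals dominate the model, whereas unit variance scaling gives every bin equal weight, which is why the coefficient plot described above combines the intensity from one model with the correlation from the other.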