Sex differences in the 1H NMR metabolic profile of serum in cardiovascular risk patients

Personalized diagnosis and risk stratification of cardiovascular diseases would allow therapeutic strategies and lifestyle changes to be optimized. Metabolomics is a promising technique for personalized diagnosis and prognosis; however, various physiological parameters, including sex, influence the metabolic profile, thus hampering its translation to the clinic. Knowledge of the variation in the metabolic profile associated with sex would facilitate metabolomic translation to the clinic. The objective of the present work was to investigate the possible differences in the metabolic 1H NMR profile associated with sex beyond lipoproteins. 1H NMR spectra were obtained from whole serum and methanol-deproteinized samples from 39 patients (22 males, 17 females) aged 55–70 years with suspected coronary artery disease who underwent a stress test that was considered negative. Deproteinized serum could be used to differentiate sex based on higher levels of lactate and glucose in women. The lipoprotein region was the most variable area of the spectra between individuals, but spectra of whole serum were able to differentiate sex based on lipoproteins. There are sex-related differences in the 1H NMR metabolic profile of individuals with suspected cardiovascular disease beyond lipoproteins. These findings may help the translation of metabolomics to the clinic.

the results were not as good as previously reported thus limiting the diagnostic power of 1H NMR metabolic profiling of serum 6 . Later, a meta-analysis showed that metabolomics could indeed be used for cardiovascular risk prediction 7 and identified phenylalanine and mono- and polyunsaturated fatty acids as biomarkers for cardiovascular risk after adjusting for confounding factors including sex. However, at this time, metabolomics offers low diagnostic value for coronary artery disease 8 and has not yet reached clinical application in other pathologies 9 . Although metabolic markers have been described as the most obvious kind of biomarkers for clinical application, changes caused by disease could be masked by physiological factors including sex, age, and diet 10 . Biological sex markedly impacts cardiac metabolism at rest 11 and in response to metabolic diseases 12,13 . Lipoprotein profiles, a risk factor for the development of cardiovascular diseases, are also affected by sex 6,14 . Thus, guidelines recommend that parameters known to be associated with cardiometabolic disease, including sex, should be taken into account in metabolomic studies 15 .
There is a need for a population screening tool for individualized risk stratification and diagnosis of cardiovascular diseases once symptoms appear. A first step towards reaching this goal would be to understand the background sources of variation in the metabolic profile of the population most likely to suffer cardiovascular diseases. The objective of the present work was to evaluate possible differences associated with sex in the 1H NMR metabolic profile of patients with suspected CAD beyond lipoproteins.

Results
Patients. Table 1 shows the relevant epidemiological and clinical data of the patients included in the study.
Men were significantly taller than women while having similar weight; as a result, BMI was higher in women than in men. Women had higher total cholesterol levels than men; however, when lipoprotein subclasses were analyzed individually, differences did not reach statistical significance. All other parameters analyzed, including the incidence of diabetes and pre-test medication, were similar between the groups.
Spectra. NMR spectra of serum samples obtained in this work were similar to our previously published data 16 . CPMG spectra removed part of the signal originating from lipoproteins, thus allowing easier observation of low-molecular-weight compounds that would otherwise be masked by the large and broad peaks of macromolecules; conversely, diffusion edited spectra removed the signal originating from small molecules (Fig. 1).
Spectra of deproteinized samples (Fig. 1) were similar to those published previously 17 . However, in some spectra traces of the methanol used for precipitation were observed even after freeze-drying the samples; the region of the spectra around 3.34 ppm, containing the residual methanol peak, was therefore removed from the analysis.
Pattern Recognition. An OPLS-DA (Orthogonal Projection to Latent Structures-Discriminant Analysis) model differentiating men from women could be obtained using spectra acquired with the CPMG (Carr-Purcell-Meiboom-Gill) pulse sequence (Fig. 2A). The model (R2X = 0.7; R2Y = 0.54; Q2 = 0.307) correctly classified 80% of the samples. Although permutation tests showed that the model is robust (Fig. 2B), CV-ANOVA did not reach statistical significance (p = 0.078), possibly due to the high variation in the lipid region between samples, even in T2 edited spectra.
The variables responsible for the differences between sexes in the discriminant model obtained using T2 edited spectra (Fig. 2A) were found around 1.28 ppm, which corresponds mostly to the methylene peak of lipid acyl chains (Fig. 2C).
It was also possible to obtain a model differentiating sex using diffusion edited spectra (R2X = 0.59; R2Y = 0.25; Q2 = 0.07). Although this model correctly classified 72% of the samples, it did not reach statistical significance in CV-ANOVA (p = 0.65).
Spectra from deproteinized samples show less inter-individual variation than spectra from complete serum. Principal component analysis performed on the spectra showed a tendency towards clustering in the first component, suggesting that the main source of variation within the deproteinized dataset is sex (Fig. 3A). It was possible to obtain a statistically significant OPLS-DA model (R2X = 0.716; R2Y = 0.43; Q2 = 0.25; p = 0.039) that was able to classify the spectra according to sex with 80% accuracy (31/39) (Fig. 3B). The variables responsible for the discrimination were found at around 1.35 and 3.70 ppm, tentatively assigned to lactate and glucose respectively (Fig. 3C).
We also combined CPMG and deproteinized spectra in a single analysis. The statistical model obtained slightly improves on the one obtained using spectra from deproteinized samples alone (R2X = 0.564; R2Y = 0.511; Q2 = 0.302; p = 0.021), reaching 89% accuracy (35/39) in the classification. However, the most important variables in the discrimination arise from the deproteinized spectra.
Finally, using Chenomx software we were able to identify and quantify 19 metabolites from the deproteinized spectra (Table 2). As expected from the whole spectra analysis, lactate and glucose were elevated in women. Also, valine and glycine were elevated in women (

Discussion
Data presented in this work show that there are sex-related differences in the 1H NMR metabolic profile of individuals with suspected cardiovascular disease beyond lipoproteins.
Using NMR spectra of complete serum we have been able to show that women have higher lipoprotein levels than men. This is consistent with findings in the general population 19 and has also been observed in a group of patients with cardiovascular diseases 6 . Moreover, in our study, women had higher total cholesterol levels than men, in accordance with general population findings 20 .
We have previously shown that T2 edited spectra of complete serum provide the best results to predict exercise-induced ischemia 16 . In the present work, the discriminant model to differentiate between men and women based on T2 edited spectra was based on lipoproteins as shown before 6 . However, this model did not reach statistical significance when evaluated using CV-ANOVA.
Diffusion edited spectra show only the resonances corresponding to macromolecules, mainly lipoproteins. Sex is known to be associated with blood lipoproteins 6 , suggesting that diffusion edited spectra would be suitable for differentiating men from women. However, we could not obtain a valid statistical model able to differentiate between men and women, most likely due to the high variability in lipoprotein composition between individuals.
On the other hand, when using deproteinized samples, it was possible to obtain a statistical model able to differentiate men from women, mostly based on the higher levels of glucose and lactate seen in women. In our database, 14 out of 39 individuals were considered diabetic (they were prescribed oral antidiabetics, insulin or both) at the time of analysis, but only one individual (female) showed blood glucose over 6.1 mmol/L (limit to be diagnosed as diabetic) in the sample analyzed. In a study involving healthy human volunteers, women showed higher glucose levels in urine but not in plasma 19 , while in another healthy population of young adults men showed higher glycaemia than women 21 . Metabolite quantification of deproteinized serum samples showed increases in glucose, lactate and also valine and glycine in women. Glycine has been found to be elevated in younger 21 and older 22 healthy women. On the other hand, valine was found to be higher in healthy older men than in women 22 . However, we could find no reports based on individuals with suspected cardiovascular diseases such as those described in this study. There are some reports in the literature describing differences in the metabolic profile associated with sex 19,23-25 . However, this is the first one focusing on the population with suspected cardiovascular diseases, which would benefit most from a non-invasive screening and diagnostic tool.

Study limitations.
We are aware that the number of patients included in this study is limited; however, in well-defined populations we 16 and others 26 have been able to obtain positive results for serum metabolomics in CAD patients. In order to minimize the effect of a limited sample size, we evaluated PLS-DA models using CV-ANOVA, a robust approach that depends on sample number. The percentage of diabetics in our study is similar to what is found in primary care for similar age groups in our environment, with a slightly higher incidence in men than in women for similar age groups 27 . Diabetes is known to influence the metabolic signature; however, when patients with glycated hemoglobin higher than 6 were removed from the analysis it was still possible to obtain a statistical model able to differentiate sex (CV-ANOVA p = 0.029) and there was no change regarding the variables responsible for group discrimination. Nevertheless, further studies with larger populations should be done in order to validate our work.
In conclusion, we have detected differences in the 1H NMR metabolic profile between men and women in a population with suspected cardiovascular disease in deproteinized serum samples. These findings may facilitate the development of 1H NMR based metabolomics approaches in cardiovascular diseases and their translation to the clinic.

Methods
Patients. 39 consecutive patients (22 males, 17 females) between 55 and 70 years old who were referred for a myocardial perfusion SPECT study with stress test at Hospital Vall d'Hebron were included in this study. Patient selection was done prospectively, but only the samples of those studies considered negative in the report were taken; a study was considered negative if the clinical stress test, the ECG, the gammagraphic images and the ventricular function were all normal. Patients who were unable to perform a full stress test or required pharmacological stimulation were excluded from the study.
Patients included are from the METS (Metabolomic Profile of Patients Undergoing Myocardial Perfusion SPECT) study (ClinicalTrials.gov Identifier: NCT02968771). All patients gave their written informed consent and the study was approved by the "Hospital Vall d'Hebron" ethics committee. Methods and procedures were performed in accordance with local guidelines and regulations.
Samples. Blood samples (5 ml) were obtained just before the stress test after overnight fasting. Blood was allowed to clot at room temperature and then centrifuged at 1000 g and 4 °C for 5 minutes. The serum was separated and kept at −80 °C until needed.
Serum deproteinization. Serum was deproteinized using methanol precipitation 17 .
NMR spectroscopy. Prior to NMR spectroscopy, deproteinized samples were reconstituted in 600 μl of PBS (Phosphate Buffered Saline) made up with D2O, containing 0.5 mM TSP (Trimethylsilyl tetradeuteropropionic acid sodium salt) as a concentration and chemical shift reference, and placed in a 5 mm NMR tube. Spectra were acquired at 300 K on a 400 MHz vertical bore magnet interfaced to a Bruker Avance console. Each spectrum consisted of 64 accumulated scans acquired with a NOESYPR1D (One-dimensional Nuclear Overhauser Spectroscopy) pulse sequence with a mixing time of 100 ms. Serum samples (200 μl) were mixed with PBS-D2O (300 μl) just prior to NMR spectroscopy. A series of spectra including pulse-and-acquire, CPMG with an effective T2 delay of 32 ms and diffusion edited spectra were acquired for each sample.
Data analysis. For pattern recognition, each spectrum was manually phase corrected and the area between 0.5 and 9 ppm (excluding the water zone) was divided into bins of equal width of 0.01 ppm. The resulting digitized spectra were normalized to a total area of 1 and fed into SIMCA v14 software (Umetrics, Umeå, Sweden) for further processing. Pareto scaling was applied to the data.
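The preprocessing described above (equal-width binning, normalization to unit total area and Pareto scaling) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the SIMCA implementation: the function names `bin_spectrum` and `pareto_scale` are hypothetical, and the excluded water region (4.5 to 5.0 ppm) is an assumed placeholder because the exact limits are not given in the text.

```python
import numpy as np

def bin_spectrum(ppm, intensity, lo=0.5, hi=9.0, width=0.01,
                 exclude=((4.5, 5.0),)):
    """Integrate a spectrum into equal-width bins and normalize to unit area.

    `exclude` lists ppm ranges to drop; the default water range is an
    assumption for illustration, not a value taken from the paper.
    """
    n_bins = int(round((hi - lo) / width))
    edges = lo + width * np.arange(n_bins + 1)
    centers = edges[:-1] + width / 2
    # Sum the intensities of all points falling inside each bin.
    binned, _ = np.histogram(ppm, bins=edges, weights=intensity)
    keep = np.ones(n_bins, dtype=bool)
    for start, stop in exclude:
        keep &= ~((centers >= start) & (centers <= stop))
    binned = binned[keep]
    # Normalize so that the retained bins add up to a total area of 1.
    return centers[keep], binned / binned.sum()

def pareto_scale(X):
    """Mean-center each column and divide it by the square root of its SD."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # leave constant columns unscaled instead of dividing by 0
    return centered / np.sqrt(sd)
```

Additional ranges, such as the residual methanol region around 3.34 ppm in deproteinized spectra, could be passed through `exclude` in the same way. Pareto scaling sits between no scaling and unit-variance scaling, so large peaks are down-weighted without inflating noise in low-intensity bins.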
General variance within the dataset was analyzed using principal component analysis (PCA) 28,29 . This method reduces the dimensionality of a data set while retaining as much as possible of the variation present in the original data set facilitating the extraction of information. Being an "unsupervised" approach, it does not require input from the observer and, thus, is free from possible bias.
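As an illustration of the unsupervised step, principal component scores can be computed from a samples-by-bins matrix with a singular value decomposition. The sketch below assumes NumPy and a matrix that has already been scaled; `pca_scores` is a hypothetical helper name, and the code is not the algorithm used internally by SIMCA.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return scores, loadings and explained-variance fractions of a PCA.

    X has one row per sample and one column per spectral bin.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]                 # rows are orthonormal directions
    explained = (s ** 2) / np.sum(s ** 2)        # fraction of total variance
    return scores, loadings, explained[:n_components]
```

Plotting the first column of `scores` against the second, colored by sex, gives the kind of score plot in which clustering along the first component can be inspected. No class labels enter the computation.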
In order to assess the capacity of the NMR spectra to differentiate between samples from men and women, a supervised classification analysis was performed. Supervised classification refers to the development of a statistical model able to differentiate two (or more) populations defined in advance. The target is to assign an individual to one of the populations. The information for this classification is provided by a "training set" of correctly classified individuals. In our case, the ability of the classification models was tested by cross-validation, leaving out one in seven samples at a time (a seven-group variant of the "leave-one-out" approach): the withheld samples were not used to define the model and were then used to test how well the classification algorithm worked, and this process was repeated iteratively until every sample had been in both the training and test sets. The supervised approach used in this work was Orthogonal PLS discriminant analysis (OPLS-DA), which highlights the variables responsible for differences among classes 30 . All OPLS-DA models were able to classify the samples better than random grouping but were only considered statistically significant when the CV-ANOVA 31 p-value was <0.05.
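The cross-validation scheme described above, in which one in seven samples is held out in turn, can be sketched in plain Python. How samples are actually assigned to groups is not stated in the text, so the interleaved assignment below is an assumption; `cv_folds` and `cross_validated_accuracy` are hypothetical names, and `fit` and `predict` stand in for whatever classifier is being evaluated (OPLS-DA itself is not implemented here).

```python
def cv_folds(n_samples, n_groups=7):
    """Yield (train, test) index lists; each sample is held out exactly once."""
    indices = list(range(n_samples))
    for g in range(n_groups):
        test = indices[g::n_groups]          # every n_groups-th sample
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test

def cross_validated_accuracy(X, y, fit, predict, n_groups=7):
    """Fraction of held-out samples whose class is predicted correctly.

    fit(X_train, y_train) returns a model; predict(model, X_test) returns
    one predicted label per test sample.
    """
    correct = 0
    for train, test in cv_folds(len(y), n_groups):
        model = fit([X[i] for i in train], [y[i] for i in train])
        predictions = predict(model, [X[i] for i in test])
        correct += sum(p == y[i] for p, i in zip(predictions, test))
    return correct / len(y)
```

Because each prediction is made by a model that never saw the sample in question, the resulting accuracy is a less optimistic estimate than one computed on the training data.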
Metabolite quantification was performed in deproteinized spectra using Chenomx software (Chenomx, Edmonton, Canada) by comparing the areas of the peaks of interest to that of TSP added as an internal standard at a final concentration of 0.5 mM. Metabolite concentrations are given in mmol/L ± standard deviation and were compared using unpaired, two-sided t-tests without correction for multiple comparisons.
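For reference, an unpaired, two-sided t-test of the kind mentioned above can be written with the Python standard library alone. The text does not say whether equal variances were assumed, so the pooled-variance (Student) form used here is an assumption, and the function names are hypothetical. The p-value is obtained by numerically integrating the t density with Simpson's rule rather than from a statistics package.

```python
import math
from statistics import mean, variance

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value for a t statistic, by Simpson integration of the t pdf."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3                   # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

def students_t_test(a, b):
    """Unpaired Student t-test (pooled variance); returns (t, df, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)
```

Applied to each quantified metabolite in turn, with the concentrations of one sex in `a` and the other in `b`, this yields one uncorrected p-value per metabolite.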

Data Availability
The datasets generated during the current study are available from the corresponding author on reasonable request.