Serum and urine 1H NMR-based metabolomics in the diagnosis of selected thyroid diseases

Early detection of nodular thyroid diseases, including thyroid cancer, is still primarily based on invasive procedures such as fine-needle aspiration biopsy. There is therefore a strong need for new diagnostic methods that could provide clinically useful information on thyroid nodular lesions in a non-invasive way. In this study we investigated 1H NMR-based metabolic profiles of paired urine and blood serum samples obtained from healthy individuals and patients with nodular thyroid diseases. The predictive potential of the metabolites was evaluated using chemometric methods, which revealed that both urine and serum carry sufficient information to distinguish patients with nodular lesions from healthy individuals. Data fusion further improved the prediction quality of the models. However, stratification of tumor types and their differentiation from each other was not possible.

1H NMR measurements. The NMR spectra of the serum and urine samples were recorded at 300 K using an Avance II spectrometer (Bruker GmbH, Germany) operating at a proton frequency of 600.58 MHz. The NMR spectra of the serum samples were recorded using a CPMG pulse sequence with water presaturation (cpmgpr1d in Bruker notation). For each sample, 128 consecutive scans were collected with a spin-echo delay of 400 μs, 80 loops, a relaxation delay of 3.5 s, an acquisition time of 2.73 s, a time domain of 64 k, and a spectral width of 20.01 ppm.
The NMR spectra of the urine samples were recorded using a NOESY pulse sequence with water presaturation (noesy-1dpr in Bruker notation), with a relaxation delay of 3.5 s, an acquisition time of 1.36 s, 128 transients, a time domain of 64 k, and a spectral width of 20.01 ppm.
The spectra were processed with a line broadening of 0.3 Hz and were manually phased and baseline corrected using Topspin 1.3 software (Bruker GmbH, Germany). Serum spectra were referenced to the α-glucose signal (δ = 5.225 ppm), while urine spectra were referenced to the TSP resonance (δ = 0.000 ppm). Signal alignment was carried out using the correlation optimized warping algorithm (COW) 39 and the icoshift algorithm implemented in Matlab (v 8.3, Mathworks Inc.) 40. The water region of the spectrum was excluded from the calculations. All of the spectra were normalized using the Probabilistic Quotient Normalization (PQN) method 41, 42.
Preprocessing of variables prior to analysis. The metabolite resonances were identified according to the assignments published in the literature and in on-line databases (Biological Magnetic Resonance Data Bank and Human Metabolome Database). For quantification purposes, integrals of the non-overlapping signal fragments were used. All of the variables (originating from different fluids) were scaled to unit variance.
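The normalization and scaling steps described above can be illustrated with a minimal Python sketch. It is not the Matlab code used in the study: the function names are hypothetical, the preliminary total-area normalization and the median reference spectrum are common PQN conventions that are assumed here, and unit-variance scaling is assumed to include mean-centering.

```python
from statistics import mean, median, stdev

def pqn_normalize(spectra):
    """Probabilistic quotient normalization; rows are samples, columns are variables.

    Assumes strictly positive total intensity in every spectrum.
    """
    n_vars = len(spectra[0])
    # Step 1 (assumed): total-area normalization of every spectrum.
    area_normed = [[v / sum(row) for v in row] for row in spectra]
    # Step 2: reference spectrum = median of each variable across samples.
    reference = [median(row[j] for row in area_normed) for j in range(n_vars)]
    normalized = []
    for row in area_normed:
        # Step 3: quotients against the reference, skipping zero reference points.
        quotients = [row[j] / reference[j] for j in range(n_vars) if reference[j] > 0]
        # Step 4: divide the spectrum by its most probable (median) quotient.
        factor = median(quotients)
        normalized.append([v / factor for v in row])
    return normalized

def unit_variance_scale(matrix):
    """Mean-center each column and divide it by its sample standard deviation."""
    columns = list(zip(*matrix))
    means = [mean(c) for c in columns]
    # A constant column has zero spread; dividing by 1.0 leaves it at zero.
    sds = [stdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in matrix]
```

In this sketch a spectrum with one strongly elevated signal is rescaled so that its unchanged signals line up with those of the other samples, which is the property that distinguishes PQN from plain total-area normalization.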
Data fusion. The data matrices of relative signal integrals obtained from the paired serum and urine samples were combined into one fusion data matrix. For the purpose of the model calculations, the metabolite names were replaced with the structure metabolite_[s] for serum and metabolite_[u] for urine, so that metabolites identified in both fluids would not overlap. All metabolites that occurred in both serum and urine were treated as separate variables.
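The fusion step can be sketched in a few lines of Python. The sketch assumes that each biofluid block is held as a dictionary keyed by sample identifier; the function name, the sample identifiers and the numeric values are hypothetical, and only the _[s] and _[u] suffix convention comes from the text.

```python
def fuse_blocks(serum, urine):
    """Combine paired serum and urine integrals into one fused table.

    serum, urine: dict mapping sample id -> dict of metabolite -> relative integral.
    Only samples present in both blocks (paired samples) are kept.
    """
    fused = {}
    for sample_id in sorted(set(serum) & set(urine)):
        # Suffix every variable with its biofluid so shared metabolites stay distinct.
        row = {f"{name}_[s]": value for name, value in serum[sample_id].items()}
        row.update({f"{name}_[u]": value for name, value in urine[sample_id].items()})
        fused[sample_id] = row
    return fused

# Hypothetical values, for illustration only.
serum_block = {"P01": {"lactate": 1.20, "valine": 0.40}, "P02": {"lactate": 1.05, "valine": 0.38}}
urine_block = {"P01": {"lactate": 0.30, "citrate": 2.10}, "P03": {"lactate": 0.25, "citrate": 1.90}}
fused_block = fuse_blocks(serum_block, urine_block)
```

Here lactate appears twice in the fused row for sample P01, once as lactate_[s] and once as lactate_[u], and the unpaired samples P02 and P03 are dropped.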
Multivariate data analysis. Multivariate data analysis was performed using SIMCA software (v 14.0, Umetrics). The order of the samples in the dataset was randomized. The discriminant version of partial least squares regression (PLS-DA) with the default k-fold cross-validation procedure was used to determine the differences between the groups. Samples were split into two datasets (model and test) using the Kennard and Stone algorithm and randomized.
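For readers unfamiliar with the Kennard and Stone algorithm, the following Python sketch shows its usual max-min selection rule. It is an illustration under stated assumptions, not the procedure run in the study: Euclidean distance is assumed, the function name is hypothetical, and the selected samples are taken to form the model set while the remainder form the test set.

```python
from math import dist

def kennard_stone(samples, n_select):
    """Return the indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist(samples[pair[0]], samples[pair[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # For each candidate, find the distance to its nearest selected sample,
        # then add the candidate for which that distance is largest.
        best = max(remaining, key=lambda i: min(dist(samples[i], samples[s]) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because every new sample is the one farthest from those already chosen, the model set spans the measured variable space evenly, and the test set falls inside that span.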
To improve the obtained models, variable selection was conducted using VIP plots with a jack-knifed confidence interval at a confidence level of 0.95. Variables with a VIP value below 0.8 were removed from the subsequent analysis, step by step, until their removal had a negative influence on the R2 and Q2 parameters of the model. The models were then rebuilt on the basis of the selected variables, and their reliability was tested with CV-ANOVA at a significance level of p < 0.05.
The prediction performance of the VIP-PLS-DA models was estimated from receiver operating characteristic (ROC) curves and area under the curve (AUC) values. For this purpose, the perfcurve function from the Matlab statistical toolbox (Matlab v. 8.3, Mathworks, Inc.) was used. The specificity and sensitivity were determined from the sample class prediction using the 7-fold cross-validated predicted Y values (Y-predcv, as implemented in SIMCA-14 software) for the observations in the model.
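As an illustration of the quantity behind the VIP plots, the sketch below computes VIP scores from a single-response PLS model fitted with the NIPALS algorithm. It is a simplified stand-in, not SIMCA's implementation: the function name is hypothetical, the data are assumed to be centered and scaled already, and the jack-knifed confidence intervals are not reproduced. The score of variable j is sqrt(p * sum_a(SS_a * w_ja^2) / sum_a(SS_a)), where p is the number of variables, w_ja is the normalized weight of variable j in component a, and SS_a is the sum of squares of y explained by component a.

```python
from math import sqrt

def pls1_vip(X, y, n_components):
    """VIP scores from a NIPALS PLS1 fit; X and y must already be centered."""
    n, p = len(X), len(X[0])
    X = [list(row) for row in X]  # copies, because X and y are deflated below
    y = list(y)
    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between X and y.
        w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        # Scores, X loadings and y loading of this component.
        t = [sum(X[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        loadings = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(y[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        for i in range(n):
            for j in range(p):
                X[i][j] -= t[i] * loadings[j]
            y[i] -= q * t[i]
        weights.append(w)
        explained.append(q * q * tt)  # sum of squares of y explained
    total = sum(explained)
    return [
        sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
        for j in range(p)
    ]
```

With these scores, the filtering rule from the text corresponds to keeping the variables for which the score is at least 0.8, for example `keep = [j for j, v in enumerate(vip) if v >= 0.8]`. Since the squared scores always sum to the number of variables, a value of 1 marks a variable of average importance.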
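The AUC, sensitivity and specificity reported for the models can be reproduced in principle from class labels and cross-validated predicted Y values. The following Python sketch is an assumption-laden illustration rather than a port of perfcurve: the function names are hypothetical, classes are coded as 1 (patient) and 0 (control), and the decision threshold of 0.5 is a common PLS-DA convention that the text does not state.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count positive/negative pairs ranked correctly; ties count as one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for lab, s in zip(labels, scores) if lab == 1 and s >= threshold)
    fn = sum(1 for lab, s in zip(labels, scores) if lab == 1 and s < threshold)
    tn = sum(1 for lab, s in zip(labels, scores) if lab == 0 and s < threshold)
    fp = sum(1 for lab, s in zip(labels, scores) if lab == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)
```

The pairwise formulation gives the same area as integrating the empirical ROC curve. It is quadratic in the number of samples, which is acceptable for cohorts of the size considered here.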

Statistical data analysis.
For each metabolite in the serum and urine samples, the percentage difference (PD) and relative standard deviation (RSD) were calculated using STATISTICA 12. The percentage difference was calculated from the mean values of the relative signal integrals in each group. The calculations were performed from left to right. For the chosen metabolites, statistical significance was assessed with the Mann-Whitney-Wilcoxon test (p < 0.05) or Student's t-test (p < 0.05).
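The two descriptive statistics can be sketched in Python as follows. The exact PD formula is not given in the text, so the definition used here (difference of the group means divided by their average, with the left-hand group of the comparison first) is an assumption, as are the function names. The RSD is the standard ratio of the sample standard deviation to the mean.

```python
from statistics import mean, stdev

def percentage_difference(left_group, right_group):
    """PD in percent for a 'left vs right' comparison (assumed symmetric definition)."""
    m_left, m_right = mean(left_group), mean(right_group)
    # Positive values mean the left-hand group has the higher mean integral.
    return (m_left - m_right) / ((m_left + m_right) / 2.0) * 100.0

def relative_standard_deviation(values):
    """RSD in percent: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values) * 100.0
```

Under this assumed definition, a PD for NN vs HC would be positive when the mean relative integral is higher in NN than in HC.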

Results and Discussion
Multivariate analysis in the diagnosis of thyroid lesions. Each pathological state, even at the cellular level, should be reflected in the body fluids, at least in the most abundant ones, urine and blood, which can be collected less invasively 43. The changes that occur on a molecular basis in the thyroid tissue at the genomic and proteomic levels ought to be reflected in variation in the metabolome profile of the biofluids. There, a specific variation in the homeostatic concentration of low-molecular-weight compounds is expected to occur as a characteristic of the existing pathological condition of the thyroid gland 43. In this work, for the first time, we have investigated paired urine and blood serum samples using NMR methodology for healthy controls (HC) and patients who suffer from benign changes as well as those with advanced carcinogenesis.
The representative 1

Discrimination between controls and thyroid lesions.
Initially, all of the obtained NMR data were used to calculate seven discriminatory PLS models, for each type of thyroid lesion and each type of collected biological material. Metabolites were selected by their VIP scores, based on the best parameters of separation between the groups, and were used for further VIP-PLS-DA model calculations (Figs 3 and 4). The calculated parameters of the designed models, as represented by ROC curves, are compiled in Table 2.
Each of the analyzed biofluids exhibited a different discrimination potential (Table 2). Using serum, the best separation between healthy subjects and patients was obtained for the NN vs HC comparison (Q2 = 0.478, AUC test set = 0.83). The two other models, FA vs HC and TC vs HC, were on the same level of discrimination (AUC test set equal to 0.71 and 0.73) but did not pass the test of model significance. The urine models for these groups also showed a p value higher than 0.05, while the predictive potential of the particular comparisons was in the following order: FA vs HC (AUC test set = 1) > TC vs HC (AUC test set = 0.73) > NN vs HC (AUC test set 0.61). Interestingly, in the comparison between healthy subjects and all of the collected thyroid lesions (all patients, P, vs HC), the results differed only slightly between the biofluids, with blood serum (AUC test set 0.84) appearing to be a more appropriate diagnostic material than urine (AUC test set 0.76). Conversely, the pairwise comparison between different thyroid lesions revealed that only one model (FA vs TC based on urine) provides satisfactory predictive power (AUC = 0.76).
In the case of the models obtained on the fusion data, the basic model parameters were significantly better (Table 2). In support of this finding, the results also showed differences in the ROC curves (Fig. 5) and AUC training values (Table 2) for all of the comparisons, which indicates a better fit for the model that uses the selected samples than for the models that were constructed separately on serum or urine. As with the models obtained for each biofluid, the models based on data fusion performed best in the comparisons of HC vs NN/FA/TC and vs the total number of patients (P) with pathological changes (Table 2). For all of the models, all of the values that were obtained were above 0.83 (FA vs TC), while the highest value was 0.99 for the TC vs HC comparison. The possibility of formulating prediction models was also higher for the comparisons of HC with pathological changes of the thyroid (Table 2). The highest predictive value was obtained for FA vs HC, which reached an AUC test of 1.00, while the lowest value was obtained for NN vs HC, with an AUC test of 0.82.
Fusing the data from the serum and urine NMR measurements clearly strengthened the models calculated between the NN, FA, TC, P and the HC groups. This can be attributed to the increased number of variables, which theoretically could add complementary information to the obtained models. The AUC values were between 0.82 and 1, which shows a high predictive potential based on combined information from both biofluids. Moreover, comparisons between different thyroid lesions were enhanced when data fusion was applied. Surprisingly, lesion development did not exhibit better predictive model abilities, as was previously found in the metabolomics investigation of thyroid tissues 37,38,44. This might be explained by vascularization of the tumor tissue, where the size of the tumor could influence the type of vascularization 45 and thus provide less powerful information on the biochemical changes that are widely spread over the biological system.
Metabolic differences in thyroid lesions. In our previous study, we showed that the differentiation of tumor type was possible by analyzing aqueous tissue extracts 37. Based on these findings, we decided to investigate whether similar discrimination can be obtained using biofluids (individually or in combination), which, unlike tissue biopsies, can be easily collected. Considering all low-molecular-weight compounds that were identified in NN vs HC, when comparing serum to tissue extract samples, only lactate and formate were statistically important in both studies and followed the same trend of increasing values of the relative integral in NN. However, tissue lactate increased systematically in the order NN > FA > TC, which is the reverse of the trend in blood serum. In tissue, the lactate upregulation is associated with increasing levels of alanine and glucose, its two main sources, while in blood serum these metabolites are only slightly changed (decreased alanine and increased glucose), and the changes are not statistically significant. This might be evidence of fast utilization from circulating blood in response to, e.g., energy demand. Similarly, strong upregulation of formate was observed in blood serum and tissue; however, owing to the high RSD, it is difficult at this stage to consider this molecule a potential biomarker.
In urine and tissue extract, only two metabolites overlapped: 3-hydroxybutyrate (3-HB) and acetone. Both were statistically important and followed the same type of change, decreasing in NN. Collectively, in the serum and urine samples, 10 metabolites were statistically important in the NN vs HC comparison (Tables 3, 4). In contrast, 16 metabolites with significant changes were found in the tissue study 37.
For the FA vs HC comparison, four metabolites in serum and three in urine matched the tissue study results: valine, citrate, lactate and tyrosine for serum, and citrate, acetone and 3-hydroxybutyrate for urine. Among the serum metabolites that were statistically important in both studies, the percentage differences of valine and tyrosine were decreased in FA, in contrast to the trend in the aqueous tissue extract study. In tissue extracts, citrate (decreasing) and lactate (increasing) changed in opposite directions, whereas in blood serum both metabolites were increased in comparison to the HC group. These data highlight the different changes that occur at the local level (tissue) and in whole-body metabolism in response to the pathological state. Another example is the shifted balance of valine, for which a decreased level is observed in blood serum and an increased level in tissue extract. Additionally, the decreased level of amino acids, especially the relative integral of serum tyrosine, could be related not only to protein biosynthesis but also to the synthesis of catecholamines 46.
In the urine samples, the decreasing trend in citrate, acetone and 3-hydroxybutyrate levels for the FA group, reported in both studies, indicates metabolism directed towards energy demand. In the tissue extract study, a total of 15 metabolites were statistically important, while 17 compounds were found overall for serum and urine, the majority of which (12) were found in urine (Supplementary Tables S1 and S2).
The third comparison was TC vs HC. These most distant groups were expected to show the largest differences in molecular information. In the tissue extract study, 22 of the 26 identified metabolites were statistically important, while, surprisingly, only 17 were significantly changed in serum and urine collectively. Only four of the identified metabolites from serum and two from urine matched the results from the previous study 37. Valine, alanine, creatine and tyrosine in serum samples were decreasing in TC, whereas the aqueous tissue extract showed the opposite trend. Only creatine, a metabolite that replenishes energy in the ADP-ATP cycle and whose increased level is observed in the TC group, matched the changes from the previous study. In urine, citrate and acetone were statistically important, and they followed the same decreasing trend as in the tissue extract. Distinguishing HC from all of the other investigated thyroid lesions appears to be possible based on the serum and urine samples. However, the molecular composition outcome was not as good as that from the aqueous tissue extract in the previous study. As a consequence of the lower number of statistically important metabolites obtained in this study, discrimination of the thyroid nodule types was more difficult. Among all of the comparisons between each type of thyroid lesion, only two common metabolites were identified in serum that matched the results obtained from the aqueous tissue extracts. In the FA vs NN comparison, the valine relative integral was decreased for the FA group in both studies; in TC vs NN, valine and lactate were decreased in serum for the TC group, with the opposite trend in the tissue extract study. For the FA vs TC comparison, only serum lactate was decreased in the TC group, which also contradicts the previous study 37. An increased level of lactate is a common feature of carcinogenesis, so it was unexpected that the level was highest in the NN group.
In conclusion, no significant changes were observed in the total lipid profile. The acetone level identified in the urine samples followed exactly the same trend as was recognized in the tissue extract study. Moreover, in most cases it is positively correlated with its precursor 3-hydroxybutyrate (except tissue TC vs HC). This may indicate that the level of acetone in the urine samples may be a prognostic factor for thyroid nodules. The increased level of lactate in blood and tissue, with the opposite direction of change in urine, may be related to the influence of the hypoxic microenvironment 47 in the tissue and its direct translation to blood serum 48 (comparisons NN and FA vs HC). The lack of lactate level changes in the TC vs HC comparison can be caused by strong local tissue changes and/or the limited flow between these two compartments. Clearly, changes were much more pronounced in the tissue extracts than in serum or urine. This is, however, to be expected, as tissue collected in situ should reflect the metabolic state of the lesion more directly than any biofluid, which reports on whole-organism metabolism.
Table 4. Significantly changed urine metabolites (# - VIP plot selected metabolites; * - statistically significant metabolites).
Taken together, our findings are consistent across the three biological compartments: the data from the tissue extract samples originate directly from the site of the major pathological disturbances. In the case of the serum and urine samples, the visible changes in the relative integrals of the metabolites could be caused by the general response of the biological system to the homeostasis disorder, which could be more subtle than the changes occurring directly in the pathological tissue 37.

Conclusions
The models based on the fusion data have higher parameters and predictive potential compared with most of the models that were calculated separately for each body fluid. That finding indicates that combined datasets exhibit synergy that increases model stability and enhances diagnostic potential. However, it should also be noted that stratification of tumor types and their differentiation in relation to each other could not be obtained.
The VIP-PLS-DA method allows us to identify metabolites that are biomarker candidates and should be investigated in detail in the future.
Our study also allowed us to obtain a model with a 100% prediction for the FA vs HC comparison. The models that were calculated for comparisons between disease entities were of low quality, which could be connected to similar changes in the distribution of metabolites in the organism. This similarity could negatively affect the creation of high-quality diagnostic models based on the proton NMR technique.
Despite the relatively good results in some comparisons, studies must be conducted on a larger cohort of patients in order to confirm the predictive potential of the selected metabolites in diagnosing thyroid lesions.

Declarations
Ethics approval and consent to participate. The study was carried out in accordance with the Declaration of Helsinki. Serum and urine samples were collected from patients who were operated on at the First Department and Clinic of General, Gastroenterological and Endocrinological Surgery of Wroclaw Medical University. The protocol for this study was approved by the Commission of Bioethics at Wroclaw Medical University (Approval no. KB-248/2010).