RCC usually develops and progresses asymptomatically and, when detected, it is frequently at an advanced and metastatic stage, entailing a dismal prognosis. Therefore, there is an obvious demand for new strategies enabling an earlier diagnosis. The importance of metabolic rearrangements for carcinogenesis unlocked a new approach for cancer research, catalyzing the increased use of metabolomics. The present study aimed at the NMR metabolic profiling of RCC in urine samples from a cohort of RCC patients (n = 42) and controls (n = 49). The methodology entailed variable selection of the spectra in tandem with multivariate analysis and validation procedures. The retrieval of a disease signature was preceded by a systematic evaluation of the impacts of subject age, gender, BMI, and smoking habits. The impact of confounders on the urine metabolomics profile of this population is residual compared to that of RCC. A 32-metabolite/resonance signature descriptive of RCC was unveiled, successfully distinguishing RCC patients from controls in principal component analysis. This work demonstrates the value of a systematic metabolomics workflow for the identification of robust urinary metabolic biomarkers of RCC. Future studies should entail the validation of the 32-metabolite/resonance signature found for RCC in independent cohorts, as well as biological validation of the putative hypotheses advanced.
Renal cell carcinoma (RCC) is the most common and lethal malignancy of the kidney, accounting for 87% of all renal tumors and representing 2–3% of all human cancers1. RCC comprises a range of tumors with distinct histopathological and genetic features, these being traditionally diagnosed by detection of renal masses by ultrasound, computed tomography and magnetic resonance imaging1,2. However, many renal masses progress while remaining asymptomatic until the late stages of the disease and, consequently, more than 50% of patients are incidentally diagnosed when examined for other clinical purposes and without the suspicion of genitourinary malignancies2,3. As a result, up to 20–30% of patients present metastatic disease at the time of diagnosis, being less amenable to be cured by surgery and presenting poor prognosis4. Therefore, the development of new diagnostic and follow-up methods should certainly have an important impact in RCC clinical management. Possible new strategies may rely on the understanding and measurement of metabolic deviations accompanying the disease, preferably using non-invasive methods. In this context, metabolomics has been increasingly pursued in relation to cancer, either entailing the analysis of tumor tissue (e.g. melanoma5, breast6,7, ovarian8, colorectal9, brain10,11,12) or of biofluids. The use of biofluid metabolomics in cancer research has consistently increased13, minimally invasive samples (plasma/serum and urine) having been used to study ovarian14,15, colorectal16, kidney17,18,19, breast15,20, lung21,22 and liver23,24,25 cancers, either using Nuclear Magnetic Resonance (NMR) spectroscopy or Mass Spectrometry (MS). These studies have unveiled potential disease biomarkers but also significant stumbling blocks regarding the impact of systemic effects and confounders e.g. diet, lifestyle, population phenotypes, age, gender26,27,28.
In the particular case of RCC, metabolomics-based studies have already been successfully applied to paired cancer and normal renal tissue29,30,31,32,33, showing its potential to discriminate cancer and normal tissue as well as study RCC progression and aggressiveness. A lack of correspondence between altered pathways proposed by genomics and metabolomics was observed highlighting that genetics only describes part of the pathophysiologic metabolic rearrangements of RCC29. Overall, the studies reported high glutathione levels, high glycolytic and pentose phosphate pathway (PPP) activity as well as alterations in the tricarboxylic acid (TCA) cycle and fatty acid metabolism29,33. Nevertheless, biofluids such as blood and urine offer the possibility of non-invasive diagnosis and monitoring and these have been the subject of recent metabolomics studies. Serum metabolomics has unveiled changed levels of glucose, glutamine, pyruvate and lactate34,35 consistent with enhanced glycolytic and glutaminolytic activities, showed potential in nephrectomy follow-up34 and patient staging36, and more recently produced a possible biomarker cluster of 7 metabolites (alanine, creatine, choline, isoleucine, lactate, leucine, valine) for early RCC37. Urine is a particularly suited biofluid concerning kidney cancer, and RCC in particular, due to its intimate contact with the urinary system. A number of MS-based metabolomic studies of RCC urine have been carried out to illustrate the potential of urine metabolic profiling to differentiate RCC patients from controls, two initial reports having considered small sample groups (<10)17,38 and another MS study addressing the impacts of gender, age and geographical origin on urine metabolic profile and discrimination of cancer from control samples39. The latter showed that geographic origin was important whereas the impact of gender and age was residual. 
Subsequent MS work, using age-, gender- and ethnic-matched cohorts, identified 13 metabolites with statistically different urinary levels between RCC and control samples, and the proposed altered pathways included glycolysis, fatty acid metabolism, amino acid (alanine, aspartate, glutamate) metabolism and nicotinate and nicotinamide metabolism18. Further work addressed a xenograft mouse model of highly metastatic RCC and showed that urine is less reflective of tumoral metabolomic changes than tissue or serum19; nevertheless, increased urinary levels of some acylcarnitines were found in RCC cases, consistent with previous work40.
Notwithstanding the important developments described above, NMR-based metabolomics of RCC urine has been relatively underexplored41,42. NMR metabolomics importantly complements MS-based methodologies since, in spite of its lower sensitivity, it enables a wide range of compound families to be detected, with high reproducibility, in a single experiment, thus being particularly suited to untargeted analysis.
This paper reports an NMR profiling study of human urine from a cohort of RCC patients (n = 42) and controls (n = 49), building on previous work42 by adding a systematic evaluation of the impacts of subject age, gender, body mass index (BMI) and smoking habits, to be later considered when defining an RCC metabolic signature. Furthermore, our methodology entailed variable selection of the 1H NMR spectra43, in tandem with multivariate analysis and validation procedures, to more efficiently retrieve meaningful profiles of metabolite variations related to confounders and disease. Notably, the RCC signature thus obtained was shown to have classification power in unsupervised analysis, for the first time to our knowledge, thus proving its potential in the classification of new subjects prior to or irrespective of clinical diagnosis. Finally, a preliminary study of clear cell RCC (ccRCC) compared to other RCC subtypes, and of different RCC stages, was performed.
Figure 1 shows the average 1H NMR spectra of the urine of controls and RCC subjects. A total of 70 metabolites have been identified (Table S1), confirming previous reports of urine NMR assignment44,45,46,47. Visual comparison of the spectra in Fig. 1 suggests apparent decreases in acetate, p-cresol sulfate (p-CS), trimethylamine N-oxide (TMAO), hippurate, indoxyl sulfate (IS) and trigonelline, in RCC samples. However, at this stage it is unclear if such changes are statistically meaningful and/or reflect the effect of confounders, in addition to disease. Indeed, differences in subject age (controls: mean age 67, RCC: mean age 60), gender (controls: 34/15 F/M; RCC: 17/22 F/M) or lifestyle factors (e.g. smoking habits, dietary habits) may impact on urine composition. The effect of these may be decreased using matched cohorts, however, this leads to (1) no applicability of models to subjects falling outside the matched range and (2) reduction in sample numbers and, hence, model robustness. In this study, we have used spectral variable selection to reduce the impact of random uncontrolled factors (such as diet and other lifestyle characteristics) and compared the results obtained using an unmatched cohort and an age- and gender-matched sub-cohort (Table 1), while identifying the specific effects of each recognized confounder.
Initial PCA of the full spectra of the unmatched cohort (not shown) did not separate controls from RCC subjects and PLS-DA provided a model with a very low predictive power (median Q2 0.31) (Fig. 2a, Table S2). This result was improved when variable-selected spectra were considered (median Q2 0.59) (Fig. 2b, Table S2), reflecting a reduction of random variability effects, as also expressed by a visible improvement in the Q2 distributions plot and ROC curve (Figure S1). When an age- and gender-matched RCC sub-cohort was considered (Table 1), the robustness of the PLS-DA model improved marginally (median Q2 0.67) (Fig. 2c, Table S2), thus indicating that age and/or gender play a minor role in the present context. Nevertheless, the impact of these variables on urine profile was quantified in matched sub-cohorts (Figure S2a,b, Table S2) and a corresponding list of metabolite variations was obtained (Table 2). Generally, age-related changes were indeed small (effect sizes <1), healthy controls over 60 years excreting slightly higher levels of IS, and lower levels of citrate, cis-aconitate, creatine, creatinine, hypoxanthine and trigonellinamide and some unassigned resonances. Decreased cis-aconitate, creatine and creatinine confirmed previous reports48,49,50,51 but the remaining characteristics seem to be specific to the particular cohort considered here. In addition, healthy males were observed to excrete higher amounts of 1-methylhistidine, isoleucine, tartrate and unassigned resonances (Un) 3, 4 and 6 (at δ 0.92, 1.15, 6.19 respectively) and less glutamate, 1,6-anhydroglucose and Un 5 (δ 2.05) (Table 2). Interestingly, these variations are, to our knowledge, novel compared to reported gender-related alterations in urine48,49,50,51, thus confirming the need to establish confounders’ impact for each new (unmatched) cohort under study.
Furthermore, since the RCC cohort was heterogeneous regarding smoking habits and BMI (Table 1), the impact of these on the metabolic profile of patients was also verified (Table S2). When age- and gender-matched sub-cohorts were used, the robustness of the PLS-DA model related to smoking habits (SH) was not affected whereas the BMI model was improved (median Q2 = 0.57; sensitivity = 96%; specificity = 86% and classification rate = 91%). According to the unmatched PLS-DA models, some metabolite variations emerged as possibly related to smoking habits (increased trigonelline, hypoxanthine and a number of unassigned resonances, and decreased 3-hydroxy-butyrate (3-HBA), 4-deoxythreonic acid, allantoin, betaine, guanidinoacetate (GAA), glucose, lactose and scyllo-inositol) and to BMI (increased trimethylamine-N-oxide (TMAO) and decreased 3-hydroxy-isovalerate (3-HIVA), citrate and succinate) (Table 2). Taking the above confounders into account, both unmatched and matched disease cohorts were studied through the corresponding PLS-DA loadings (Fig. 2d, for the unmatched cohort) and spectral integration, while possible biases due to confounders were identified. Overall, variations in 45 metabolites were identified, in addition to changes in several unassigned resonances (Table 3). A total of 32 features exhibited statistical relevance (p < 0.05). These features comprised 20 assigned compounds and were mostly observed in both matched and unmatched cohorts. However, changes in GAA, scyllo-inositol and unassigned resonances at δ 4.29, 9.05 were only revealed in the age- and gender-matched cohort, which suggests such changes may become masked in unmatched cohorts. On the other hand, 2-ketoglutarate (2-KG), 4-hydroxyphenylacetate, fumarate, lactate, threonine and trigonellinamide were only observed to change in the unmatched cohort, probably benefiting from the higher sample numbers.
A PLS-DA model computed using only the set of 32 metabolites/resonances was comparable in performance to that obtained with all selected spectral variables (Table S2), thus electing the 32-resonance subset as a good descriptor of RCC, when compared to controls. This was confirmed by PCA of the 32-metabolite subset (Fig. 3a), which showed very good group separation. Upon removal of bias resonances (those noted with c, e, f, g in Table 3), the remaining 23 integrals remarkably retained classification power (median Q2 0.75), compared to the 32-resonance model (median Q2 0.68) (Table S2), while PCA improved only very slightly (Fig. 3b), thus again demonstrating that age, gender or smoking habits discrepancies do not hinder RCC classification, based on the urine profiles of the population under study.
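For readers less familiar with the Q2 statistic quoted for the models above, it is conventionally defined from cross-validated predictions as shown below. This is the standard textbook definition, given only as a reminder; the exact computation performed by the software used in this study is not detailed in the text, and the symbols are introduced here purely for illustration.

```latex
Q^2 = 1 - \frac{\mathrm{PRESS}}{\mathrm{TSS}}
    = 1 - \frac{\sum_{i}\left(y_i - \hat{y}_{(i)}\right)^2}
               {\sum_{i}\left(y_i - \bar{y}\right)^2}
```

Here y_i is the class label of sample i, ŷ_(i) is its prediction from a model trained without that sample, and ȳ is the mean label. Values close to 1 indicate good predictive power, whereas values near or below 0 indicate none.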
Overall, RCC patients were shown to excrete higher levels of 2-KG, N-methyl-2-pyridone-5-carboxamide (2-Py), bile acids (tentative assignment), galactose, hypoxanthine (possible confounder), isoleucine (possible confounder), pyruvate, succinate and valine; and lower levels of 4-hydroxyhippurate, 4-hydroxyphenylacetate, acetone, GAA, glycine, hippurate, malonate, phenylacetylglutamine (PAG), tartrate and trigonelline (Table 3). These results partially confirm previous suggestions of a 7-metabolite urinary signature of early RCC42, particularly regarding the increases in lactate (not statistically relevant here) and pyruvate (here with p-value 5.43 × 10−7, compared to 0.010 in ref. 42) and the decrease in hippurate (here with p-value 4.60 × 10−6, compared to 0.023 in ref. 42). Previously observed variations in creatine, alanine, betaine and citrate were either not confirmed in this cohort or found to have a confounder contribution (namely, decreased citrate was here related to higher BMI, Table 2).
In addition, metabolic differentiation between distinct RCC subtypes and stages was attempted by PLS-DA (Table S2). Unfortunately, the low sample numbers for each of the non-ccRCC types only enabled a preliminary comparison between ccRCC and all other types put together. This revealed small changes (effect sizes <1) in ccRCC cases (higher levels of trimethylamine (TMA), taurine and unassigned doublets at δ 0.75, 0.78 and 1.25, and lower levels of creatine and trigonellinamide), compared to other types (Table 2). The age-matching of ccRCC and other subtypes samples resulted in a slightly improved PLS-DA model (median Q2 = 0.78; sensitivity = 95%; specificity = 93% and classification rate = 91% compared to median Q2 = 0.63; sensitivity = 92%; specificity = 90% and classification rate = 91% of the unmatched cohort).
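The sensitivity, specificity and classification rate quoted for the PLS-DA models follow directly from a confusion matrix. The sketch below shows the usual definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, classification rate) as fractions.

    tp/fn: cases correctly/incorrectly classified (true class = positive)
    tn/fp: controls correctly/incorrectly classified (true class = negative)
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    classification_rate = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, classification_rate


# Hypothetical counts, for illustration only (not data from this cohort).
sens, spec, cr = classification_metrics(tp=19, fn=1, tn=14, fp=1)
print(f"sens={sens:.2%}  spec={spec:.2%}  CR={cr:.2%}")
```

In a Monte Carlo cross-validation setting, these quantities would typically be computed for each iteration and then summarized, for example by their median.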
The variable selection strategy used here enabled a 32-resonance urinary signature to be identified as characteristic of RCC urinary profile, compared to controls, and putative biochemical pathways are hereby advanced relating the metabolite changes observed (Fig. 4). Changes in hippurate levels may arise from sources as varied as diet, oxidative stressors and gut microflora and, while hippurate decreases have been reported in a recent account of RCC urine metabolomics42 and lung cancer27, some inconsistency is found in other cancer studies52,53,54. This indicates the importance of diet/lifestyle parameters in determining the relationship of hippurate and cancer. Furthermore, lower levels of excreted hippurate have also been correlated with type-2 diabetes mellitus55, obesity55,56,57 and high blood pressure58, the latter two factors being recognized as risk factors of RCC2,4 and, indeed, characterizing many of the subjects comprised in our RCC group (n = 9 obese subjects, n = 21 overweight subjects, n = 27 subjects with high blood pressure). Trigonelline (or N-methylnicotinate) may relate to particular dietary products (e.g. coffee) but may also arise from endogenous niacin methylation. Reduced excretion of this compound was reported in liver cancer patients59, ovarian cancer patients15, patients with pancreatic ductal adenocarcinoma60 and lung cancer27. Together with the decreasing tendency of trigonellinamide (or 1-methylnicotinamide), this change suggests some impairment of nicotinate and nicotinamide metabolism61. This may impact on the conversion of nicotinamide to 2-Py, catalyzed by poly(ADP-ribose) polymerase (PARP), an enzyme also involved in DNA repair mechanisms, replication, chromatin condensation, cellular response to stress and regulation of apoptosis62, thus possibly explaining the increased excretion of 2-Py. 
Furthermore, altered purine metabolism may lead to elevated hypoxanthine levels, as observed in the present study and previously in the urine of patients with Non-Hodgkin lymphoma63. An additional note on the above mentioned reduction in excreted trigonelline relates to this compound being a byproduct of the conversion of S-adenosylmethionine to S-adenosylhomocysteine in the methionine cycle. Indeed, its reduction may also indicate S-adenosylmethionine depletion due to its redirection to help replenish reduced glutathione (GSH) levels64, an important cellular antioxidant molecule, believed to be associated with increased production of reactive oxygen species (ROS) in cancer cells. The increased level of galactose may be related to its participation in glycolysis. In fact, increased glycolysis activity is hereby made clear by the higher levels of excreted pyruvate and subsequent altered levels of several tricarboxylic acid (TCA) cycle intermediates, namely 2-KG and succinate. Increased glycolytic flux and altered TCA cycle function are well known hallmarks of cancer, affecting not only cellular energetic efficiency but also anabolic/biosynthetic efficiency65,66,67, since intermediates in these pathways are diverted towards the synthesis of proteins, nucleic acids, lipids, and cholesterol synthesis66,68,69,70,71,72, and generally aid in the maintenance of cellular redox69,71, genetic and epigenetic status69,73 required for cancer cells proliferation. In connection to enhanced glycolysis, increased circulating and excreted alanine and lactate levels in RCC have been interpreted as evidence of the expected Warburg effect in cancer cells37,42, however, in this cohort we have only observed a small increasing tendency for lactate. This suggests that such direct evidence may depend on phenotype and/or become masked by inter-subject variability. 
In addition, the decreased acetone urinary levels found suggest the inhibition of ketogenesis, consistent with the reported preferential use of acetyl coenzyme A (acetyl-CoA) by cancer cells as precursor of lipids/cholesterol/isoprenoids over the ketogenic pathway74. This hypothesis is consistent with the decreased amounts of both acetoacetate and 3-hydroxy-butyrate (3-HBA), also components of ketone bodies, found in cancer patients compared to controls (metabolites not discriminant according to their VIP value). Furthermore, a ketogenic diet (high fat, low carbohydrate content) has shown to have a protective/therapeutic effect on cancer probably because it diminishes the glycolytic flux in cancer cells75,76.
Metabolic deregulation in RCC seems to impact importantly on the levels of glycine (decreased), isoleucine (increased) and valine (increased). Glycine is an amino acid highly consumed by fast proliferating cancer cells77,78, decreased levels of glycine having also been detected in the urine of prostate cancer patients79. Like serine, glycine sustains the one-carbon metabolism (folate and methionine cycles) that provides precursors for the biosynthesis of several biomolecules80 and the increased uptake of glycine has been suggested to relate to oncogenesis and malignancy in several cancer cell lines78. Regarding the branched-chain amino acids valine and isoleucine, their urinary levels have also been found increased in human colorectal cancer81 and gastric cancer patients82. Both studies have revealed that the levels of these amino acids seem dependent on cancer staging (due to distinct extents of proteolysis74) although we have not detected relevant changes in this respect. Other amino acid-related changes comprise those affecting PAG and 4-hydroxyphenylacetate, both decreased. PAG is a downstream metabolite of phenylacetic acid and glutamine and it has been found elevated in the urine of colorectal cancer patients83 and decreased in the urine of human bladder cancer84 and lung cancer27 patients. Lower PAG excretion, such as that observed here, may reflect lower glutamine availability due to enhanced glutaminolysis, a well-known metabolic adaptation of cancer cells to sustain TCA cycle activity and amino acid and lipid synthesis85,86. 4-Hydroxyphenylacetate is a product of tyrosine metabolism and has also been reported as significantly decreased in the urine of breast cancer patients87 and elevated in the plasma of dialysis patients88. It is possible that the decreased excretion of this compound in cancer patients may relate to kidney impairment as a result of cancer progression.
Furthermore, GAA is an intermediate in the biosynthesis of creatine, being synthesized mainly in the kidneys and then converted to creatine in the liver89. The combination of arginine and glycine to form GAA is considered the rate-limiting step of creatine synthesis; thus, the decreased GAA in urine may be due to impaired renal function, and its urinary excretion has been found decreased in a variety of renal diseases90,91,92. This is also compatible with the decreased creatinine and creatine excretion found in RCC patients compared to controls, which further suggests impaired renal function and, consequently, impaired clearance of these compounds. Furthermore, GAA was also found decreased in the urine of patients with glioblastoma multiforme93 and bladder cancer94 compared to controls, which may also be suggestive of altered muscle energy metabolism as a systemic effect of cancer95. Finally, the tentative assignment of peaks at δ 0.54 and 0.57 as bile acid resonances suggests an impact of RCC on the endogenous cholesterol metabolism. Bile acids are considered in general as markers of liver injury96 and have indeed been found elevated in both serum and urine of patients with hepatocellular carcinoma97. However, the particular relationship of these compounds with RCC remains unclear at this stage.
The results presented here firstly confirm the importance of evaluating the effects of age and gender (as well as of relevant lifestyle parameters) on urine composition, as part of an initial phenotypic characterization of the specific cohort under study. Based on the evaluation of the impacts of age, gender, smoking habits and BMI on urinary profile, we concluded that the impact of these potential confounders is residual, in this population, in terms of RCC classification, although it determines the exact nature of the retrieved disease signature. Hence, the use of unmatched cohorts was found not to hinder successful RCC classification within the present cohort. The results also show that RCC may be successfully described by changes in a total of 32 compounds/resonances (or 23 resonances, if possible biased variables are excluded). Putative interpretation of the identified metabolites changing in RCC, compared to controls, identified possible unspecific effects involving hippurate, trigonelline and trigonellinamide, thus reflecting the importance of diet and gut microflora, as well as nicotinate and nicotinamide metabolism and anti-oxidative mechanisms as less specific systemic effects of cancer. Additional metabolic effects accompanying RCC, probably of a more specific nature, include disturbances in galactose metabolism, probably in association with the expected enhanced glycolysis activity34,35. The latter was seen to be accompanied by adaptations of the TCA cycle, ketogenesis, selected amino acid metabolism (glycine, isoleucine and valine), creatine and creatinine metabolism and, possibly, endogenous cholesterol metabolism. These biochemical hypotheses will require future validation through complementary biological measurements, a process which would also benefit from further efforts in the assignment of many important still unassigned NMR resonances. 
In addition, natural follow-ups of these findings comprise external validation of the 32-resonance RCC signature found here, using external independent sets of subjects of the same geographical origin as the training cohort used here. In fact, the possible dependence of the metabolic signature on geographical origin and, in general, on population phenotype needs to be investigated, as it may explain discrepancies between independent studies and, most importantly, unveil the need to define distinct metabolic biomarkers for different populations.
The cohort enrolled in this study comprised thirty-nine patients diagnosed with primary RCC (17 females and 22 males; age range 35–79, average age 60) and forty-nine healthy control (cancer-free) subjects (34 females and 15 males; age range 38–86, average age 67) (Table 1). These groups did not include diabetic patients or subjects suffering from other acute conditions. Table 1 also shows the histopathological types of the RCC tumors diagnosed, TNM staging2, and information on subject age, gender, and, in the case of RCC patients, smoking habits and BMI (information not available for control subjects). All subjects signed informed consent and the study was approved by the Ethics Committee of the Portuguese Oncology Institute-Porto (CES76/2012). All the experiments were performed in accordance with the relevant approved guidelines and regulations. The patients provided urine samples preoperatively, none having undergone radiation or chemotherapy treatment. Each subject, either patient or healthy volunteer, provided a first-void urine sample (after overnight fasting) in a sterile cup. All samples were then centrifuged (4000 rpm, 20 min, 4 °C), split into several aliquots, transferred into cryovials and stored at −80 °C until NMR analysis.
Prior to NMR analysis, urine samples were thawed (room temperature) and centrifuged (8000 rpm, 5 min, 4 °C) to remove cells and other precipitated material. Then, 60 μL of buffer solution (1.5 M phosphate buffer pH 7.0 in D2O) containing 0.1% of 3-trimethylsilyl-propionate (TSP), used as chemical shift reference, were added to 540 μL of each urine sample. Fine pH readjustment to 7.00 ± 0.02 was carried out with 4 M solutions of KOD or DCl. The resultant mixture was centrifuged (8000 rpm, 5 min, 4 °C) and 550 μL were transferred to a 5 mm NMR tube.
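As a quick check on the sample preparation described above, the final composition of each NMR sample follows from simple dilution arithmetic. The sketch below uses only the volumes and buffer concentration stated in the text and assumes volumes are additive; the variable names are illustrative.

```python
# Volumes and concentrations as stated in the text.
BUFFER_UL = 60.0            # phosphate buffer in D2O added per sample
URINE_UL = 540.0            # urine volume per sample
BUFFER_PHOSPHATE_M = 1.5    # phosphate concentration of the stock buffer
BUFFER_TSP_PERCENT = 0.1    # TSP content of the stock buffer

# Assumption: volumes are additive on mixing.
total_ul = BUFFER_UL + URINE_UL
buffer_fraction = BUFFER_UL / total_ul

final_phosphate_m = BUFFER_PHOSPHATE_M * buffer_fraction   # about 0.15 M
final_tsp_percent = BUFFER_TSP_PERCENT * buffer_fraction   # about 0.01 %
d2o_percent_v_v = 100.0 * buffer_fraction                  # about 10 % (v/v)

print(f"total volume: {total_ul:.0f} uL")
print(f"final phosphate: {final_phosphate_m:.3f} M")
print(f"final TSP: {final_tsp_percent:.3f} %")
print(f"D2O content: {d2o_percent_v_v:.0f} % (v/v)")
```

Under these assumptions each sample is a 9:1 urine-to-buffer mixture, of which 550 μL is transferred to the NMR tube.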
NMR spectra were acquired at 300 K on a Bruker Avance DRX-500 spectrometer operating at 500.13 MHz for proton and equipped with a 5 mm TXI probe. For each sample, a standard 1D 1H NMR spectrum was acquired, using a ‘noesypr1d’ (Bruker library) pulse sequence with water suppression during the relaxation delay (4 s) and mixing time (100 ms). Other acquisition parameters were as follows: spectral width (SW) 10000 Hz, 64 k data points and 128 transients. All free induction decays (FID) were multiplied by a 0.3 Hz line-broadening factor prior to Fourier Transformation (FT). Spectra were manually phased, baseline corrected, and internally referenced to TSP at δ 0.00 ppm (TopSpin 3.2). 2D homonuclear and heteronuclear spectra were recorded for selected samples to aid spectral assignment. Specifically, 1H-1H NMR total correlation spectroscopy (TOCSY) spectra (‘dipsi2phpr’ pulse sequence) were acquired in phase sensitive mode using time proportional phase incrementation (TPPI) and the MLEV17 pulse sequence for spin locking, with acquisition parameters: 4096 data points in dimension 1 (F1) and 256 data points in dimension 2 (F2), 40 scans and SW 8012.82 Hz in both dimensions, relaxation delay 1.5 s, mixing time of MLEV spin lock 80 ms. 1H-13C phase sensitive (echo/antiecho) heteronuclear single quantum correlation (HSQC) experiments were recorded with inverse detection and 13C decoupling (‘hsqcetgp’ pulse sequence), a total of 2048 data points in both dimensions, 40 scans and SW 8012.82 Hz (F1) and 20831.98 Hz (F2). A relaxation delay of 1.5 s was employed and a refocusing delay equal to 1/4 1Jc-H (1.72 ms) was used. For both TOCSY and HSQC spectra, zero-filling to 1024 data points and forward linear prediction were used in f1 and multiplication by a shifted sinebell-squared apodization function was applied in both dimensions prior to FT and phasing. 
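A few derived quantities can be recovered from the acquisition parameters listed above, which is a useful consistency check when reproducing the experiment. The sketch below assumes that "64 k" means 65536 points and that the Bruker convention AQ = TD / (2 × SW) applies to the 1D experiment; both are assumptions, not statements from the text.

```python
# Parameters as stated in the text.
PROTON_FREQ_MHZ = 500.13    # spectrometer frequency for 1H
SW_HZ = 10000.0             # spectral width of the 1D experiment
REFOCUS_DELAY_S = 1.72e-3   # HSQC refocusing delay, stated as 1/(4 * 1J(C-H))

# Assumption: "64 k data points" means 64 * 1024 points.
TD_POINTS = 64 * 1024

# Spectral width in ppm: 1 ppm corresponds to PROTON_FREQ_MHZ hertz.
sw_ppm = SW_HZ / PROTON_FREQ_MHZ

# Assumption: Bruker convention, where TD counts real plus imaginary points.
acquisition_time_s = TD_POINTS / (2.0 * SW_HZ)

# One-bond C-H coupling implied by the refocusing delay: delay = 1 / (4 J).
j_ch_hz = 1.0 / (4.0 * REFOCUS_DELAY_S)

print(f"spectral width: {sw_ppm:.2f} ppm")
print(f"acquisition time: {acquisition_time_s:.4f} s")
print(f"implied 1J(C-H): {j_ch_hz:.1f} Hz")
```

The implied coupling of roughly 145 Hz is a typical value for one-bond C-H couplings, which is consistent with the stated delay.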
1D and 2D spectra were compared to reference spectra in the BBIORFCODE-2-0-0 database (Bruker Biospin, Rheinstetten, Germany), as well as other existing databases98 and literature reports44,46,47. Statistical Total Correlation Spectroscopy (STOCSY) was also used to aid peak assignment, based on the fact that the method identifies correlated peak intensities arising from the same molecule, as well as from biochemically related molecules99.
Multivariate Statistical Analysis
The spectral region between 9.40–0.50 ppm was considered for multivariate analysis, after exclusion of residual water (5.48–6.16 ppm) and urea (4.62–5.06 ppm) spectral regions (AMIX software Bruker GmbH). Spectra were aligned using recursive segment-wise peak alignment100, normalized by probabilistic quotient normalization (PQN)101 (Matlab 7.12.0, The MathWorks, Inc) and scaled to unit variance (UV) (SIMCA-P 11.5, Umetrics, Umea, Sweden). Principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA) were applied to the NMR spectra (SIMCA-P 11.5, Umetrics, Umea, Sweden). PLS-DA model robustness was assessed by Monte Carlo Cross Validation (MCCV) using 500 iterations. For each of the 500 randomly generated classification models, Q2 values (predictive power), number of Latent Variables (LV), and confusion matrices of original and randomly permuted classes were retrieved. Sensitivity (sens), specificity (spec) and classification rates (CR) were computed and the predictive power of each model was further assessed using a Receiver Operating Characteristic (ROC) map, a function of the true positive rate (TPR or sensitivity) and false positive rate (FPR or 1-specificity). PLS-DA models were considered robust when minimal overlap of the original (alternative hypothesis) and randomly permuted (null hypothesis) Q2 distributions was obtained. For variable selection studies, spectral variables were selected through the intersection of three conditions: VIP > 1, VIP/VIPcvSE > 1 and |b/bcvSE| > 1 (ref. 43). After variable selection, PLS-DA was reapplied and resubmitted to MCCV. For all models computed, the relevant peaks/metabolites contributing to class discrimination were integrated in the original spectra (AMIX software Bruker GmbH) and PQN normalized (Matlab 7.12.0, The MathWorks, Inc).
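The preprocessing and variable selection steps described above can be illustrated with a minimal NumPy sketch. It is not the Matlab or SIMCA-P code used in the study: the function names are hypothetical, the median spectrum is assumed as the PQN reference, and the optional total-area normalization that often precedes PQN is omitted for brevity.

```python
import numpy as np


def pqn_normalize(spectra, reference=None):
    """Probabilistic quotient normalization (rows = spectra, columns = variables).

    Assumption: the reference is the median spectrum unless one is supplied.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = np.median(spectra, axis=0)
    valid = reference > 0                        # avoid division by zero
    quotients = spectra[:, valid] / reference[valid]
    factors = np.median(quotients, axis=1)       # one dilution factor per spectrum
    return spectra / factors[:, None]


def uv_scale(spectra):
    """Mean-center each variable and scale it to unit variance."""
    spectra = np.asarray(spectra, dtype=float)
    centered = spectra - spectra.mean(axis=0)
    std = spectra.std(axis=0, ddof=1)
    std[std == 0] = 1.0                          # leave constant variables at zero
    return centered / std


def select_variables(vip, vip_cv_se, b, b_cv_se):
    """Boolean mask for VIP > 1, VIP/VIPcvSE > 1 and |b/bcvSE| > 1."""
    vip, vip_cv_se, b, b_cv_se = (
        np.asarray(a, dtype=float) for a in (vip, vip_cv_se, b, b_cv_se)
    )
    return (vip > 1) & (vip / vip_cv_se > 1) & (np.abs(b / b_cv_se) > 1)


# Synthetic example: the same spectrum at three different dilutions.
base = np.array([1.0, 5.0, 2.0, 8.0])
diluted = np.outer([1.0, 2.0, 4.0], base)
normalized = pqn_normalize(diluted)              # rows become identical

# Hypothetical VIP values, coefficients and their cross-validated standard errors.
mask = select_variables(
    vip=[1.5, 0.5, 2.0], vip_cv_se=[0.5, 0.1, 3.0],
    b=[0.2, 0.3, 0.4], b_cv_se=[0.1, 0.1, 0.1],
)
```

In the synthetic example, PQN removes the dilution differences entirely, and only the first variable satisfies all three selection conditions.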
All integrals were compared through the two-samples Student t-test or the nonparametric analogue Wilcoxon rank sum test (statistical relevance considered for p < 0.05, confidence level 95%). Additionally, the Benjamini-Hochberg false discovery rate (BH-FDR) correction method102 was used to adjust p-values for multiple comparisons (corrected p-value = p-value*(n/(n-2)), where n = number of metabolites/resonances tested) and a significance cut-off equal to 0.05 considered. Furthermore, effect size values were calculated following the definition given in Berben et al.103.
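For comparison, the standard Benjamini-Hochberg step-up adjustment can be sketched as follows. This is the textbook procedure, in which each sorted p-value is multiplied by n divided by its rank; it is offered as an illustration only and may differ from the exact correction formula quoted above. The function name and example p-values are hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order (standard step-up)."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                        # indices of ascending p-values
    scaled = p[order] * n / np.arange(1, n + 1)  # p * n / rank
    # Enforce monotonicity: each adjusted value is the minimum of itself
    # and all adjusted values at higher ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


# Hypothetical p-values, for illustration only.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = adjusted < 0.05
```

With the significance cut-off of 0.05 used in the text, all four hypothetical p-values in this example would remain significant after adjustment.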
How to cite this article: Monteiro, M. S. et al. Nuclear Magnetic Resonance metabolomics reveals an excretory metabolic signature of renal cell carcinoma. Sci. Rep. 6, 37275; doi: 10.1038/srep37275 (2016).
This work received financial support from the European Union (FEDER funds POCI/01/0145/FEDER/007728) and National Funds (FCT/MEC, Fundação para a Ciência e Tecnologia and Ministério da Educação e Ciência) under the Partnership Agreement PT2020 UID/MULTI/04378/2013. It was also developed within the scope of the project CICECO-Aveiro Institute of Materials, POCI-01-0145-FEDER 007679 (FCT Ref. UID /CTM /50011/2013), financed by national funds through the FCT/MEC and when appropriate co-financed by FEDER under the PT2020 Partnership Agreement. The authors acknowledge the Portuguese National NMR Network (RNRMN), supported by Fundação para a Ciência e Tecnologia (FCT) and M. Spraul, Bruker BioSpin, Germany, for access to software and spectral databases. M.S.M. and J.P. acknowledge FCT for PhD grants SFRH/BD/80518/2011 and SFRH/BD/73343/2010, respectively.