
# Magnetic Resonance Spectroscopy-based Metabolomic Biomarkers for Typing, Staging, and Survival Estimation of Early-Stage Human Lung Cancer

## Abstract

Low-dose CT has shown promise in detecting early stage lung cancer. However, concerns about the adverse health effects of radiation and high cost prevent its use as a population-wide screening tool. Effective and feasible screening methods to triage suspicious patients to CT are needed. We investigated human lung cancer metabolomics from 93 paired tissue-serum samples with magnetic resonance spectroscopy and identified tissue and serum metabolomic markers that can differentiate cancer types and stages. Most interestingly, we identified serum metabolomic profiles that can predict patient overall survival for all cases (p = 0.0076), and more importantly for Stage I cases alone (n = 58, p = 0.0100), a prediction which is significant for treatment strategies but currently cannot be achieved by any clinical method. Prolonged survival is associated with relative overexpression of glutamine, valine, and glycine, and relative suppression of glutamate and lipids in serum.

## Introduction

Despite extensive research over the past decade to improve lung cancer (LuCa) detection and treatment, LuCa presents persistent clinical challenges. LuCa is the leading cause (>26%) of cancer death in the United States for both men and women, and it accounts for as many deaths as the next four leading causes of cancer death combined: breast, prostate, colon, and pancreatic cancer. LuCa is usually diagnosed at late stages, with >70% of patients dying from the disease; the corresponding proportions for breast and prostate cancer are about 16% and 17%, respectively1. This reality is largely attributable to the lack of a widespread, early screening test for LuCa. In its absence, the vast majority of patients seeking medical advice for symptoms of LuCa harbour locally advanced or metastatic disease.

At present, advanced radiological examinations, especially low-dose spiral CT (LDCT), can detect small LuCa lesions2,3,4,5. The many reports published by the National Lung Screening Trial (NLST) evaluating LDCT efficacy nonetheless question its cost-effectiveness as a screening tool6 and raise the issue of potential over-diagnosis7 and the impact of screening on quality of life8. Recently, the American Thoracic Society and American College of Chest Physicians published a joint official policy statement to guide the safe, effective development of LDCT screening programs9. Nevertheless, implementation of LDCT as a LuCa screening tool would entail considerable logistic and scientific concerns, ranging from high cost10,11,12,13 to, most importantly, possible radiation hazard for screened populations14,15,16,17. Thus, while LDCT enables detection of small lung nodules, its implementation as a screening tool for the general population is not feasible. Therefore, novel, low-cost, and safe LuCa tests that can prompt patients with suspicious screening results to seek further radiological evaluation are needed.

Current investigations of circulating blood biomarkers to develop LuCa screening methods rest on the physiological fact that all cardiac output passes through the lungs, with 20% of blood in the lungs at any given time. Thus, LuCa-associated molecules would likely be carried from lung lesions into the circulating blood. Alternatively, since metabolites in the circulating blood provide necessary nutrients for all biological and pathological processes throughout the entire body, consumption of specific metabolites by LuCa to sustain and enhance malignant processes may be measurable in blood. As a result, metabolites produced or consumed by lung cancer lesions could serve as biomarkers.

Previously, studies of blood LuCa biomarkers have reported LuCa-associated microRNA18,19,20 and RNA fragments21 detected by quantitative real-time-PCR and mass spectrometry as promising markers. Inspired by the achievements in genomics, proteomics, and transcriptomics, cancer metabolomics, which reflect the functional read-outs of these upstream biological processes, can yield measures of global metabolite profiles associated with various metabolic pathways influenced by oncological developments.

To evaluate LuCa, particularly low grade LuCa and identify potential LuCa biomarkers, we investigated paired tissue and serum samples obtained from the same patients. These analyses were carried out with the special technique of high resolution magic angle spinning magnetic resonance spectroscopy (HRMAS MRS), which we developed for metabolomic analysis of intact biological tissue22,23 and complex fluids. This technique allows subsequent histopathology analyses of the same tissue samples, enabling spectroscopic data to be interpreted according to tissue pathologies. Since HRMAS MRS can also measure the complex biofluid of serum and obtain spectra of high resolution, tissue and serum metabolomic measures can be correlated in order to investigate the associations between metabolites of potential LuCa biomarkers measured from paired tissue and serum samples.

## Results

This study included tissue and serum pairs from 93 patients of the two major types of non-small cell LuCa (NSCLC): squamous cell carcinoma (SCC, n = 42, F = 15, M = 27, age = 67.7 ± 8.3) and adenocarcinoma (Adeno, n = 51, F = 26, M = 25, age = 63.7 ± 8.7), as well as 29 serum samples from healthy controls (Ctrl, F = 10, M = 19, Age = 66.8 ± 12.6). The patients were recruited as previously described from an ongoing study of lung cancer survival24. We selected tumour samples that had at least 70 percent tumour cellularity, with histology of tissue samples confirmed by a pathologist after MRS analyses. With a specific emphasis on studying early stage LuCa, the studied patient population included more low grade Stage I (n = 58, SCC = 27, Adeno = 31, F = 24, M = 34) cases than the more advanced stages (II, III, and IV, n = 35, F = 17, M = 18) combined (Supplementary Table S1 lists patient clinical and demographic information). We further randomly sub-divided Stage I (n = 58) and control (n = 29) cases into Training (SCC = 14, Adeno = 19, Ctrl = 14) and Testing (SCC = 13, Adeno = 12, Ctrl = 15) cohorts and tested when needed. From HRMAS MRS measurements of these samples, we identified 32 spectral regions (width: 0.026 ± 0.010 ppm) that presented measurable spectral intensities in more than 90% of spectra for both tissue and serum samples. We present results using spectral regions, rather than individual metabolites, since each region may include contributions from multiple metabolites, and conversely, a single metabolite can contribute to multiple spectral regions. However, we discuss major possible contributing metabolites when relevant. Principal component analyses (PCA) can be used to reduce data dimensions by identifying PCs that have eigenvalues greater than 1.0 and can be further analysed. 
PCA performed independently on these 32 regions for the tissue and serum MRS data sets produced eight PCs with eigenvalues greater than 1.0 for each data set. The eight PCs cumulatively represent 79.2% and 77.2% of variance for tissue and serum, respectively. All the statistically significant results presented below were verified by co-variance analyses of age and smoking status (packyear).
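As a rough illustration of the eigenvalue-greater-than-1.0 retention rule, the sketch below selects components from a list of eigenvalues and reports the share of total variance they carry. The function name `retain_components` and the five eigenvalues are hypothetical and are not taken from the study data; the cut-off of 1.0 is usually applied to eigenvalues of a correlation matrix, which is assumed here rather than stated in the text.

```python
def retain_components(eigenvalues, threshold=1.0):
    """Keep eigenvalues above `threshold`; return them with their variance share (%)."""
    ordered = sorted(eigenvalues, reverse=True)
    kept = [ev for ev in ordered if ev > threshold]
    # Share of total variance carried by the retained components.
    share = 100.0 * sum(kept) / sum(ordered)
    return kept, share


# Hypothetical eigenvalues for a 5-variable data set (not study data).
kept, share = retain_components([2.5, 1.5, 0.5, 0.25, 0.25])
print(kept, share)  # [2.5, 1.5] 80.0
```

Applied to the 32 spectral regions, the same calculation would correspond to the reported eight retained PCs and their cumulative 79.2% (tissue) and 77.2% (serum) of variance.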

Results presented here will include the following four aspects: (1) serum MRS data that differentiate LuCa from healthy controls and among LuCa types and stages; (2) tissue MRS data that differentiate LuCa types and stages; (3) correlations between serum and tissue MRS measurements; and (4) predictions of LuCa overall survival with metabolomics.

### Serum MRS – identifying LuCa from controls and differentiating LuCa types and stages

We defined the serum relative spectral intensity, RelInt(Ser), for spectral region m (m = 1, 2, … 32) and sample s (s = 1, 2, … 93) as:

$$\mathrm{RelInt}_{\mathrm{m},\mathrm{s}}(\mathrm{Ser})\,(\%) = \mathrm{Exp}_{\mathrm{m},\mathrm{s}} \times 100 \Big/ \sum_{i=1}^{32} \mathrm{Exp}_{i,\mathrm{s}}$$
(1)

where $$\mathrm{Exp}_{\mathrm{m},\mathrm{s}}$$ represents the experimental value for spectral region m in sample s, and $$\sum_{i=1}^{32} \mathrm{Exp}_{i,\mathrm{s}}$$ represents the sum of measured values over all 32 spectral regions for that sample, evaluated for each of the 93 samples.
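The normalization in Eq. 1 can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB program: the function name `relative_intensities` and the four-region toy sample are hypothetical, and a real sample would carry 32 values.

```python
def relative_intensities(exp_values):
    """Express each region's intensity as a percentage of the sample total (Eq. 1)."""
    total = sum(exp_values)
    if total <= 0:
        raise ValueError("total spectral intensity must be positive")
    return [100.0 * value / total for value in exp_values]


# Hypothetical intensities for one serum sample (4 regions instead of 32).
sample = [2.0, 1.0, 0.5, 0.5]
rel = relative_intensities(sample)
print(rel)  # [50.0, 25.0, 12.5, 12.5]
```

By construction the relative intensities of a sample sum to 100%, so differences between samples reflect the shape of the profile rather than the absolute signal.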

Results from serum MRS showed significant differences in spectral relative intensities between groups of interest. Figure 1 summarizes the observed statistical significances of relative spectral intensities for 19 among 32 analysed spectral regions and 5 out of 8 PCs measured from serum samples that differentiate healthy controls from different LuCa groups (central column), as well as 8 spectral regions and 4 PCs that differentiate among different groups of LuCa types and stages (right column), according to Student’s t-test (for normal distributions with or without equal variance) or Mann-Whitney-Wilcoxon test (for non-normal distributions). The notations of statistical significance levels in Fig. 1, and for the rest of the report, are as follows: “*”, p < 0.05; “**”, p < 0.005; and “***”, p below the Bonferroni-corrected threshold of 0.0016 for the 32 individual regions or 0.0063 for the 8 principal components (PCs). The star symbols in this figure, and in the figures hereafter, denote statistical significance values after adjustment by false discovery rate (FDR) analyses.
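The two Bonferroni thresholds quoted above follow from dividing α = 0.05 by the number of tests (32 regions or 8 PCs). The sketch below reproduces that arithmetic and adds a Benjamini-Hochberg step-up procedure as one common way to control the FDR. The text does not say which FDR procedure was used, so Benjamini-Hochberg is an assumption, and the function names and p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Flag p-values that pass the Benjamini-Hochberg step-up rule at FDR level q.

    Assumption: BH is used only as an example of an FDR procedure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered p-value satisfies p_(k) <= k * q / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            max_rank = rank
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            passed[idx] = True
    return passed


print(bonferroni_threshold(0.05, 32))  # 0.0015625, quoted as 0.0016
print(bonferroni_threshold(0.05, 8))   # 0.00625, quoted as 0.0063

# Hypothetical p-values for five spectral regions.
flags = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.20])
print(flags)  # [True, True, False, False, False]
```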

Multiple spectral regions showed statistical significance in differentiating LuCa from controls in Fig. 1. In Fig. 2a, Stage I cases are compared with control cases, with three panels presenting examples of three significant regions: lactate (4.10–4.11 ppm), glutamate (2.05–2.07 ppm), and GPC & PC (3.21–3.23 ppm). While these metabolic regions can significantly differentiate Stage I LuCa cases from controls, both for the entire tested population and for the Training and Testing cohorts separately, overlap between control and LuCa samples is also obvious, as shown by the modest receiver operating characteristic (ROC) curves (area under the curve, AUC = 71–83%) and by the closeness of the two 3D ellipsoids presented in Fig. 2b. Invoking the metabolomic concept of multi-dimensional comparisons (in contrast to a single metabolite evaluation), we performed leave-one-case-out (LOCO) cross-validated linear discriminator (LD) analyses involving all 19 spectral regions presented in Fig. 1. In Fig. 2c, drastically improved differentiation was observed between LuCa and control (vertical panel), presented as well-separated 3D ellipsoids (ROC AUC = 98.9%), as well as among all three groups (horizontal panel). Figure 2d,e further detail metabolomic differentiation among all three groups and between LuCa and controls, respectively. The indication of metabolomic differentiation between LuCa and controls led us to further test the validity of the observation with the above-defined Training and Testing cohorts. Figure 2f demonstrates the significant LuCa and control differentiation results measured with LD canonical correlation analysis for the Testing cohort by using analytic parameters obtained from the Training cohort.
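For readers less familiar with the AUC values quoted above, the sketch below computes an ROC AUC directly from its rank interpretation: the fraction of case/control pairs in which the case scores higher, with ties counted as one half. The function name `roc_auc` and the scores are hypothetical, and the convention that cases score higher than controls is an assumption made for this example.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical discriminant scores (not study data).
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
auc = roc_auc(cases, controls)
print(round(auc, 3))  # 0.889
```

An AUC of 0.5 corresponds to no separation and 1.0 to perfect separation, which is the scale on which the single-region values (71–83%) and the multi-region value (98.9%) are compared.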

### Tissue MRS – differentiating LuCa types and stages

Unlike serum samples, which are homogeneous fluids, tissues are heterogeneous mixtures of diseased and healthy components. Metabolite levels vary among pathological features, so tissue MRS results must be interpreted in the context of tissue pathologies. The most significant advantage of HRMAS MRS – its ability to preserve tissue architecture for subsequent pathological evaluations – enables us to conduct pathological analyses after MRS measurement to calibrate contributions of tissue pathologies, and their inherent metabolic differences, towards the observed MRS values. For the studied LuCa tissues, four major pathology features were quantified for each specimen after HRMAS MRS: vol% of LuCa (in 58/93 measured samples, with Max: 94.9%, Median: 26.1%), Fibrosis/Inflammation (FI, 89/93, Max: 100%, Median: 74.4%), Necrosis (Nec, 31/93, Max: 100%, Median: 25%), and Cartilage/Normal (CN, 8/93, Max: 89.7%, Median: 47.5%).

We determined the relationship between tissue MRS and pathologies using a least-square regression of an over-determined linear model (LSR-ODLM), which includes 93 linear equations composed of four pathology features and the experimentally measured value ($$\mathrm{Exp}_{\mathrm{m},\mathrm{s}}$$) for each of the 32 spectral regions m according to the following equation, for samples s = 1, 2, … 93:

$$\begin{array}{c}[{{\rm{C}}}_{{\rm{LuCa}},{\rm{m}}}\times {\rm{LuCa}}{ \% }_{{\rm{s}}}]+[{{\rm{C}}}_{{\rm{FI}},{\rm{m}}}\times {\rm{FI}}{ \% }_{{\rm{s}}}]+[{{\rm{C}}}_{{\rm{Nec}},{\rm{m}}}\times {\rm{Nec}}{ \% }_{{\rm{s}}}]\\ \,+[{{\rm{C}}}_{{\rm{CN}},{\rm{m}}}\times {\rm{CN}}{ \% }_{{\rm{s}}}]+{{\rm{a}}}_{{\rm{m}}}={\mathrm{Exp}}_{{\rm{m}},{\rm{s}}},\end{array}$$
(2)

where the contribution coefficients ($$\mathrm{C}_{\mathrm{x},\mathrm{m}}$$) of pathology feature (x = LuCa, FI, Nec, and CN) percentage towards the experimental value in region m are determined solely from the spectral data without any additional assumptions or weighting for the quantified pathological features; $$\mathrm{a}_{\mathrm{m}}$$ is a spectral region-specific constant. The contribution coefficients from each of these four pathological features for the 32 analysed regions can be found in Supplementary Fig. S1.
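A minimal sketch of fitting Eq. 2 for one spectral region is given below. It solves the least-squares problem through the normal equations with a small Gaussian-elimination routine; this is one standard way to solve an over-determined linear model and is not claimed to be the authors' implementation. The function names, the seven toy samples, and the "true" coefficients are all hypothetical. The toy percentages deliberately do not sum to a constant: if the four features summed to 100% in every sample, the intercept column would be collinear with them and the system would be singular.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_region(pathology, exp_values):
    """Least-squares fit of Exp = sum_x C_x * pct_x + a for one spectral region."""
    rows = [list(p) + [1.0] for p in pathology]  # last column is the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, exp_values)) for i in range(k)]
    solution = solve_linear_system(xtx, xty)
    return solution[:-1], solution[-1]


# Hypothetical pathology percentages (LuCa, FI, Nec, CN) for 7 samples.
pathology = [
    (50, 40, 10, 0), (20, 70, 0, 5), (0, 90, 5, 0), (80, 10, 0, 0),
    (30, 30, 30, 10), (10, 60, 20, 0), (0, 50, 0, 40),
]
# Noise-free toy intensities generated from known coefficients.
true_coefs, true_intercept = [0.02, 0.01, -0.005, 0.0], 1.0
exp_values = [sum(c * p for c, p in zip(true_coefs, row)) + true_intercept
              for row in pathology]

coefs, intercept = fit_region(pathology, exp_values)
print([round(c, 6) for c in coefs], round(intercept, 6))
```

With noise-free toy data the fit recovers the generating coefficients up to rounding error; with real spectra the residuals are what the subsequent calibration step works with.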

To evaluate these coefficients, we calculated the estimated spectral intensity ($$\mathrm{Est}_{\mathrm{m},\mathrm{s}}$$) for each spectral region, m, and each sample, s, based on the pathological compositions of the sample and the contribution coefficients:

$$\begin{array}{c}[{{\rm{C}}}_{{\rm{LuCa}},{\rm{m}}}\times {\rm{LuCa}}{ \% }_{{\rm{s}}}]+[{{\rm{C}}}_{{\rm{FI}},{\rm{m}}}\times {\rm{FI}}{ \% }_{{\rm{s}}}]+[{{\rm{C}}}_{{\rm{Nec}},{\rm{m}}}\times {\rm{Nec}}{ \% }_{{\rm{s}}}]\\ \,+[{{\rm{C}}}_{{\rm{CN}},{\rm{m}}}\times {\rm{CN}}{ \% }_{{\rm{s}}}]+{{\rm{a}}}_{{\rm{m}}}={{\rm{Est}}}_{{\rm{m}},{\rm{s}}}\end{array}$$
(3)

After calibration for pathology contributions, the difference between the experimental and estimated values, $$\mathrm{Exp}_{\mathrm{m}} - \mathrm{Est}_{\mathrm{m}}$$, can be considered independent of tissue pathological composition. This is supported by comparing linear regression analyses of tissue pathological compositions (vol%) against $$\mathrm{Exp}_{\mathrm{m}}$$ with those against $$(\mathrm{Exp}_{\mathrm{m}} - \mathrm{Est}_{\mathrm{m}})$$. For instance, linear regression between LuCa vol% and $$\mathrm{Exp}_{\mathrm{m}}$$ values for tissue samples showed statistically significant (p < 0.050) correlations for 22 of the 32 spectral regions, with p values ranging from <0.001 to 0.045 (mean = 0.010 ± 0.003). However, when linear regression was evaluated between LuCa vol% and $$(\mathrm{Exp}_{\mathrm{m}} - \mathrm{Est}_{\mathrm{m}})$$ values, no significant linear correlation was seen: the p values for the same 22 regions were between 0.110 and 1.00 (mean = 0.710 ± 0.074).

Therefore, after calibration for tissue pathological contributions, the values of $$\mathrm{Exp}_{\mathrm{m}} - \mathrm{Est}_{\mathrm{m}}$$, or of the ratio $$\mathrm{Exp}_{\mathrm{m}}/\mathrm{Est}_{\mathrm{m}}$$ (used to avoid negative values), can be attributed to tissue MRS results that largely reflect patient disease status rather than pathology features. The total calibrated spectral intensity from 32 regions for sample s is:

$$\mathrm{TotalInt}_{\mathrm{s}}(\mathrm{Tis}) = \sum_{i=1}^{32} \left(\mathrm{Exp}_{i,\mathrm{s}} / \mathrm{Est}_{i,\mathrm{s}}\right).$$
(4)

Then, the calibrated tissue relative spectral intensity, $$\mathrm{RelInt}_{\mathrm{m},\mathrm{s}}(\mathrm{Tis})$$, for spectral region m and sample s is:

$$\mathrm{RelInt}_{\mathrm{m},\mathrm{s}}(\mathrm{Tis})\,(\%) = \left(\mathrm{Exp}_{\mathrm{m},\mathrm{s}} / \mathrm{Est}_{\mathrm{m},\mathrm{s}}\right) \times 100 \big/ \mathrm{TotalInt}_{\mathrm{s}}(\mathrm{Tis}).$$
(5)
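Equations 3 to 5 chain together as follows: estimate each region from the pathology composition, divide the measured value by the estimate, and renormalize the ratios to percentages. The sketch below illustrates that chain for a single sample. The function names and the numbers are hypothetical, and only three regions are shown instead of 32.

```python
def estimated_intensity(coefs, intercept, pathology_pct):
    """Eq. 3: estimated intensity from pathology percentages and fitted coefficients."""
    return sum(c * p for c, p in zip(coefs, pathology_pct)) + intercept


def calibrated_relative_intensities(exp_values, est_values):
    """Eqs. 4 and 5: Exp/Est ratios rescaled so that they sum to 100% per sample."""
    if any(est == 0 for est in est_values):
        raise ValueError("estimated intensity must be non-zero")
    ratios = [exp / est for exp, est in zip(exp_values, est_values)]
    total = sum(ratios)  # TotalInt_s(Tis) in Eq. 4
    return [100.0 * ratio / total for ratio in ratios]


# Hypothetical measured and estimated intensities for one tissue sample.
exp_values = [2.0, 3.0, 1.0]
est_values = [1.0, 2.0, 2.0]
rel_tis = calibrated_relative_intensities(exp_values, est_values)
print(rel_tis)  # [50.0, 37.5, 12.5]

# Eq. 3 for one region with two hypothetical features and coefficients.
est = estimated_intensity([0.02, 0.01], 1.0, [50, 40])
```

A ratio above 1 marks a region whose measured intensity exceeds what the pathology composition alone would predict, which is the quantity carried forward into the group comparisons.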

Using the same conventions established in Fig. 1 for serum results, Fig. 3 presents differentiation of LuCa groups according to tissue pathological feature-calibrated MRS results as defined in Eq. 5. Here, 9 of 32 spectral regions present significant differentiation among various LuCa groups. The effects of pathology calibrations on tissue metabolites can be appreciated by an example of alanine (Ala) differentiating Stage I SCC from Adeno groups, as shown in Fig. 4, where significant differentiation was only observed after the applied pathological feature calibration.

### Correlating serum and tissue metabolomic profiles for LuCa differentiation

Studying tissue-serum pairs from the same patient enabled us to use the tissue data set as a training cohort to investigate correlations between serum and tissue MRS data.

The successes demonstrated in Fig. 5a,b further guided us to test with the randomly determined Training and Testing cohorts, including Stage I LuCa and control cases, previously presented in Fig. 2a,f. In Fig. 5c,d, linear discriminant canonical correlation analyses were conducted with the 19 spectral regions (Fig. 1) for tissue and serum MRS data of the Training cohort, respectively. The capabilities of the resulting canonical scores in differentiating SCC from Adeno groups were then tested with the Testing cohort. Both scores showed indications of differentiation, but the serum result (p = 0.0502) was just above the level of significance. However, the study design of paired tissue and serum samples obtained from the same patients permitted us to conduct a further canonical analysis including both tissue and serum canonical scores for the Training cohort. When tested on the Testing cohort, the resulting canonical score showed improved statistical significance (p = 0.009) compared with the score obtained from serum data alone.

### Predictions of LuCa overall survival with MRS metabolomics

Clinical records (1997–2012) for the studied LuCa patients indicate the average survival time after surgery to be 41.3 ± 4.6 months (nAdeno = 27, Mean: 43.4 ± 6.4 months, and nSCC = 27, Mean: 39.2 ± 6.7 months). Using 41.3 months as a threshold to define short vs. prolonged survival, we observed a number of tissue and serum spectral regions that can differentiate between the two groups. Most importantly, some of these spectral regions from both tissue and serum can further provide statistically significant Kaplan-Meier estimates of 10-year overall patient survival for the entire population, as well as for certain subgroups (see Fig. S4).

To systematically evaluate the prognostic potential of serum metabolomics, we randomly divided 93 cases into eight groups (maximum number of cases per group: 14; minimum: 9). We combined seven groups to form the training cohort, and used the one group left out as the corresponding testing cohort. We iterated the leave-one-group-out process eight times to cover all eight groups (i.e. the eight corresponding testing cohorts). For each training cohort, we identified regions among the 32 spectral regions that can differentiate short vs. prolonged survival groups with statistical significance (p < 0.05). The numbers of spectral regions thus identified ranged from one to seven (average: 3.75 ± 0.88) for the eight training cohorts. For each training cohort, a canonical correlation analysis (CCA) including the identified spectral regions was first conducted to discover CCA loadings that discriminate short from prolonged survival groups within the cohort. The loadings obtained from the training cohort were applied to the cases in the corresponding testing cohort to obtain their CCA scores. Upon the completion of all eight iterations, the CCA scores that each case received when it belonged to a testing cohort were combined into a single ensemble. The median value of the ensemble was determined and used as the threshold to evaluate the 10-year overall survival based on the Kaplan-Meier curves, which displayed statistical significance between short (red) and prolonged (green) survival groups (p = 0.0325), as shown in Fig. 6a.

The effects of tissue pathology calibration can also be seen when the measured tissue metabolic intensities are used to predict patient overall survival. For instance, pathology-calibrated spectral intensities of the 3.91–3.90 ppm region are sensitive to patient overall survival for the entire tested population, for SCC cases, and particularly for Stage I cases of SCC. However, this significant separation for a single disease stage, which cannot be differentiated by currently known clinical parameters, would be invisible without the pathology calibrations (Fig. S4).
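The leave-one-group-out bookkeeping described above can be sketched as follows. The sketch covers only the splitting, the pooling of held-out scores, and the median dichotomization; the region selection and the CCA itself are replaced by a placeholder score. All function names are hypothetical, and dealing cases into near-equal groups is an assumption of this example (the study reports group sizes between 9 and 14).

```python
import random
from statistics import median


def split_into_groups(case_ids, n_groups, seed=0):
    """Shuffle case IDs and deal them into n_groups roughly equal groups."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i::n_groups] for i in range(n_groups)]


def leave_one_group_out(groups):
    """Yield (training_ids, testing_ids), holding out each group exactly once."""
    for i, testing in enumerate(groups):
        training = [c for j, g in enumerate(groups) if j != i for c in g]
        yield training, testing


def median_split(scores):
    """Label each case 'high' or 'low' relative to the median of the pooled scores."""
    threshold = median(scores.values())
    return {case: ("high" if s > threshold else "low") for case, s in scores.items()}


groups = split_into_groups(range(93), n_groups=8)
pooled_scores = {}
for training, testing in leave_one_group_out(groups):
    # Placeholder: a real analysis would derive loadings from `training`
    # and apply them to `testing`; here each held-out case gets a dummy score.
    for case in testing:
        pooled_scores[case] = float(case)

labels = median_split(pooled_scores)
print(sorted(len(g) for g in groups))  # [11, 11, 11, 12, 12, 12, 12, 12]
```

Because every case is scored only while it is held out, the pooled ensemble contains one out-of-sample score per case, and the two labels define the groups compared by the Kaplan-Meier curves.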

While the leave-one-group-out analyses indicated the potential existence of metabolomic discriminators between short and prolonged LuCa patient survival, the method cannot provide a single set of parameters able to evaluate the status of a future case. Nevertheless, this proof of the potential existence of survival-related intrinsic LuCa metabolomics encouraged us to further analyse all cases in a single data set. With all 93 cases, we identified nine serum spectral regions (Fig. 6d) that show significant differentiation between short and prolonged survival groups. By including these nine spectral regions in a canonical analysis to discriminate (p < 0.0001) short (<41.3 months, nSCC = 15, nAdeno = 14, CCA score = −0.652 ± 0.186; nSCC,St=I = 9, nAdeno,St=I = 5, CCA score = −0.837 ± 0.254) from prolonged (>41.3 months, nSCC = 12, nAdeno = 13, CCA score = 0.756 ± 0.200; nSCC,St=I = 8, nAdeno,St=I = 10, CCA score = 0.766 ± 0.224) survival (Supplementary Table S1), we were able to predict 10-year Kaplan-Meier overall survival estimates for both the entire LuCa population (SCC = 42, Adeno = 51) (Fig. 6b) and the Stage I cases alone (SCC = 27, Adeno = 31) (Fig. 6c) by using their respective medians of the canonical scores as the discriminators. Prolonged survival is associated with relative overexpression of Gln, Val, and Gly, and relative suppression of Glu and lipids in serum. This last conclusion, obtained from sera of Stage I LuCa patients, is of critical importance for its potential utility in the clinic. At present, the criteria used by LuCa clinicians for patient assessment are mostly based on clinical experience accumulated from symptomatic and late-stage patients, and cannot be applied to the assessment of asymptomatic patients with early stage disease, which is now detected through advanced radiological tests. Thus, new prediction parameters for survival of Stage I LuCa will assist the advancement of the LuCa clinic. The heatmap in Fig. 6d illustrates that metabolite intensities for the prolonged survival group (P) are close to those of either the control group (C) or the short survival group (S), whereas the intensities for the C and S groups are noticeably different.

## Discussion

The aim of the current study is to evaluate potential human blood serum LuCa metabolomic markers that may be used to screen high-risk individuals for advanced imaging for detection of LuCa at early and asymptomatic stages. However, while identification of cancer blood screening biomarkers is extremely attractive due to the less invasive nature of specimen collection, research in this area is often challenged by low specificity, since blood circulates throughout the entire body. To associate serum metabolites with LuCa, we designed the study to include paired LuCa tissue and serum samples from the same patients. With such an experimental set-up, metabolites quantified in the serum samples could be investigated in conjunction with those measured from tissue samples.

Each analysed sample, either a tissue specimen of ~10 mg or a drop of serum of ~10 µl, produces a single MRS spectrum. However, a tissue sample, even as small as on a mg scale, represents a mixture of various pathological components, such as cancer, inflammation, fibrosis, and necrosis. Metabolite values are affected by not only the stage of disease from which the tissue is acquired but also the amount of cancerous and other cells present in the tissue. Thus, understanding metabolite concentrations in tissue requires consideration of pathology variations, and HRMAS MRS technology allowed us to quantify these amounts. We then calibrated MRS-measured metabolite values according to varying amounts of pathological components in each sample, rather than merely make qualitative comparisons between pathology percentages and metabolites25,26,27,28.

This calibration adjustment generated stronger results for tissue and serum analyses, but the interpretation of the observed reversed relationship when comparing SCC with Adeno for tissue and serum data sets (Fig. 5) requires caution. The apparent inverted slopes shown in Supplementary Fig. S3 were presented to explain the reversed relationship seen in Fig. 5a, but are not to be interpreted as the presentation of reversed metabolite concentrations between SCC and Adeno cases when comparing their metabolite concentrations in tissues with those in sera. To compare the selected 32 spectral regions for tissue and serum at a similar intensity level, we elected to analyse relative spectral intensities. With tissue, further calibrations of the relative spectral intensities according to tissue pathological compositions were implemented. Therefore, the above-mentioned apparent inverted slopes represent only the relationship seen with these calibrated relative spectral intensities, and cannot be simply extended to indicate metabolic concentrations. Additional analyses that will allow for quantification of metabolites from different sources will be necessary to understand these observations in detail. Furthermore, since blood is the main nutrient source for all physiological and pathological processes, it cannot simply be viewed as the “dumping ground” of cancer metabolisms active in tissue lesions. Therefore, the metabolomic profiles presented by blood serum cannot be expected to mimic those measured from cancerous tissues. Nevertheless, analyses of the similarities and differences of cancer metabolomic profiles measured from paired tissue and serum samples will improve understanding of cancer metabolism both for patient prognostication and design of treatment strategies.

MRS measurements of tissue and serum samples present snapshots of cellular metabolites. In the case of blood, metabolite levels measured in a LuCa patient may reflect altered output or uptake by cancer cells. For instance, when comparing serum profiles between prolonged and short survival groups, the prolonged survival cases favoured elevated expressions of glutamine, valine, and glycine (positive CCA loadings), and suppressed expressions of glutamate and lipid droplets (negative CCA loadings). The alterations of glutamine and glutamate may be interpreted through their metabolic mechanism. Glutamine is an essential metabolite to support anabolic metabolism in tumour cells29, and high consumption of glutamine has been reported for cancer cells30. Cancer cells use the enzyme glutaminase to convert glutamine to glutamate, and to form precursors for the processes of anaplerosis, glutathione synthesis, and fatty acid production, which allow for tumorigenesis31. Glutamine itself is also an important energy source for cancer cells when glucose availability is limited32. These biological realities of cancer support the finding of higher glutamine in prolonged survival cases. Furthermore, blood maintains high levels of glutamine as a ready source of carbon and nitrogen to support biosynthesis, energetics, and cellular homeostasis, and cancer cells may hijack this supply for tumour growth33, which may be particularly true for the fast-growing LuCa of the short survival cases. On the other hand, in prolonged survival cases (which closely match the healthy controls for glutamine levels in Fig. 6d), the consumption of glutamine for conversion to glutamate is less than in the short survival cases. Less glutamine consumption results in less production of lipid droplets and glutamate for these longer-surviving cancer cases. 
Similarly, for the essential amino acid valine, studies have shown that non-small cell LuCa tumours displayed a significant increase in valine uptake34. Thus, the elevated blood valine levels seen in the prolonged survival cases may reflect less uptake of valine when compared with short survival cases. The same reasoning can be extended to the observed elevated serum glycine levels when comparing prolonged with short survival cases, where glycine provides the carbon units to fuel the one-carbon metabolism for the synthesis of proteins, lipids, nucleic acids, etc.30. The association between higher levels of glycine and poorer prognoses reported for human breast cancer agrees well with our measurements described here35.

In this exploratory study, our current results are limited by the scale of the study. First, we only analysed tissue and serum samples from a LuCa tumour bank enrolling cancer-positive patients who presented with symptoms or whose cancer was found incidentally. The biomarkers thus obtained may only apply to the studied patient populations, and may not be extrapolated to other patient populations, such as asymptomatic patients. Second, this study only investigated two major types of non-small-cell LuCa, so again the conclusions are not applicable to other types of LuCa without further validation on enlarged patient populations.

Nevertheless, our proof-of-concept MRS exploratory study of paired human LuCa tissue and serum samples demonstrates the potential of a physical chemistry approach for the discovery of human serum LuCa metabolomic markers. While the reported LuCa markers have been tested under a training and testing cohort design, the limitations of small case numbers and the need for analysing more diverse patient populations argue for more comprehensive studies to be conducted. Success in these investigations can propel biomarkers towards clinical trials and towards the ultimate goal – to indicate cancer and screen patients to advanced radiological imaging when warranted.

## Materials and Methods

### Study design

#### Experimental design

This study was approved by the Partners Human Research IRB (Protocol 2009P000982), and all research was performed in accordance with relevant guidelines and regulations. Serum and tissue samples were obtained from the Harvard/MGH Lung Cancer Susceptibility Study Repository. Informed consent was obtained from LuCa patients and healthy controls prior to banking samples and after the nature and possible consequences of the study were explained. The objective of this retrospective, paired tissue-serum investigation was to discover biomarkers in LuCa tissue of early stage LuCa which can also be measured in serum. Based on our initial, preliminary evaluation of lung cancer biomarkers published previously36, we designed this exploratory study to analyse ~100 samples. After evaluation of 101 samples, it was determined that the spectral resolution for 8 samples was not sufficient for further analysis, and only 93 were included in the current study.

#### Study population

Patient information: Detailed information on the studied patient population can be found in Supplementary Table S1. Researchers were blinded to the status of the samples during all measurement and experimental steps.

#### Intact tissue MRS

Samples were stored at −80 °C until analysis. High resolution magic angle spinning magnetic resonance spectroscopy (HRMAS MRS) measurements were performed using our previously developed method on a Bruker Avance (Billerica, MA) 600 MHz spectrometer. Measurements were conducted at 4 °C with a spin-rate of 3600 ± 2 Hz and a Carr-Purcell-Meiboom-Gill (CPMG) sequence with and without continuous-wave water suppression. Ten µL of serum or 10 mg of tissue were placed in a 4 mm Kel-F zirconia rotor with 10 µL of D2O added for field locking. HRMAS MRS spectra were processed using a laboratory-developed MATLAB-based program, and peak intensities from 4.5–0.5 ppm were curve fit. Relative intensity values were obtained by normalizing peak intensities by the total spectral intensity between 4.5–0.5 ppm. Resulting values that were less than 1% of the median of all curve fit values were considered as noise and eliminated. Spectral regions were defined as regions where 90% or more of samples had a detectable value, resulting in 32 regions.

#### Quantitative histopathology

Following MRS measurement, tissues were formalin-fixed and paraffin-embedded. Serial sectioning was performed by cutting 5 µm-thick slices at 100 µm intervals throughout the tissue, resulting in 10–15 slides per piece. After hematoxylin and eosin (H&E) staining, a pathologist with >25 years of experience read the slides and estimated, to the nearest 10%, the percentages of the following pathological features: cancer, inflammation/fibrosis, necrosis, and cartilage/normal.

### Statistical analysis

Statistical analyses were performed using JMP Pro 13 and MATLAB 2017a. Univariate statistical tests included Student’s t-test (for spectral regions with normal distribution according to the Shapiro-Wilk W test) or the Mann-Whitney-Wilcoxon test (MWW, for spectral regions with non-normal distributions) for binary comparisons, and analysis of variance (ANOVA, for normal distributions) or the Kruskal-Wallis-Wilcoxon test (KWW, for non-normal distributions) for comparisons among three or more groups. Multivariate analyses included principal component analysis, linear discriminant analysis, and canonical correlation analysis. Associations between canonical correlation scores and survival were assessed using Kaplan-Meier survival curves and log-rank tests. Additionally, MRS spectral measurements from tissue were calibrated to account for the contributions from varying amounts of pathological components in each sample, using an over-determined linear regression model solved by least squares. In addition to reporting comparisons at an alpha level of 0.05, false discovery rate (FDR) analysis and Bonferroni corrections were applied to account for multiple testing of 32 spectral regions and 8 principal components. Except where noted and explained, two-sided testing was used. All statistically significant results presented were verified by covariance analyses of age and smoking status (pack-years).
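To make the multiple-testing step concrete, the following sketch applies a Bonferroni correction and a false discovery rate procedure to a list of p-values. The text does not name the FDR procedure used, so the Benjamini-Hochberg step-up method here is an assumption, as are the function names and the toy p-values; the actual analyses were run in JMP Pro and MATLAB.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject where p <= alpha / m, with m the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method).

    Sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected


# Hypothetical p-values for four tests, not taken from the study.
p = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(p)
bh = benjamini_hochberg(p)

# Per-test Bonferroni threshold for the 32 spectral regions in the text.
threshold_32 = 0.05 / 32
```

With 32 spectral regions, the Bonferroni per-test threshold is 0.05 / 32 = 0.0015625, which is why the less conservative FDR analysis is a useful complement in an exploratory study.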

## Data Availability

Data reported in this paper are available at our repository through the Martinos Center (http://www.nmr.mgh.harvard.edu/~cheng/MRSbenign/).

## References

1. Siegel, R. L., Miller, K. D. & Jemal, A. Cancer Statistics, 2017. CA: Cancer J. Clin. 67, 7–30, https://doi.org/10.3322/caac.21387 (2017).

2. National Lung Screening Trial Research Team et al. Results of initial low-dose computed tomographic screening for lung cancer. N. Engl. J. Med. 368, 1980–1991, https://doi.org/10.1056/NEJMoa1209120 (2013).

3. Kovalchik, S. A. et al. Targeting of low-dose CT screening according to the risk of lung-cancer death. N. Engl. J. Med. 369, 245–254, https://doi.org/10.1056/NEJMoa1301851 (2013).

4. Tammemagi, M. C. et al. Selection criteria for lung-cancer screening. N. Engl. J. Med. 368, 728–736, https://doi.org/10.1056/NEJMoa1211776 (2013).

5. Garcia-Velloso, M. J. et al. Assessment of indeterminate pulmonary nodules detected in lung cancer screening: Diagnostic accuracy of FDG PET/CT. Lung Cancer 97, 81–86, https://doi.org/10.1016/j.lungcan.2016.04.025 (2016).

6. Curl, P. K., Kahn, J. G., Ordovas, K. G., Elicker, B. M. & Naeger, D. M. Understanding Cost-Effectiveness Analyses: An Explanation Using Three Different Analyses of Lung Cancer Screening. Am. J. Roentgenol. 205, 344–347, https://doi.org/10.2214/AJR.14.14038 (2015).

7. Patz, E. F. Jr. et al. Overdiagnosis in low-dose computed tomography screening for lung cancer. JAMA Intern. Med. 174, 269–274, https://doi.org/10.1001/jamainternmed.2013.12738 (2014).

8. Gareen, I. F. et al. Impact of lung cancer screening results on participant health-related quality of life and state anxiety in the National Lung Screening Trial. Cancer 120, 3401–3409, https://doi.org/10.1002/cncr.28833 (2014).

9. Wiener, R. S. et al. An official American Thoracic Society/American College of Chest Physicians policy statement: implementation of low-dose computed tomography lung cancer screening programs in clinical practice. Am. J. Respir. Crit. Care Med. 192, 881–891, https://doi.org/10.1164/rccm.201508-1671ST (2015).

10. Cressman, S. et al. Resource utilization and costs during the initial years of lung cancer screening with computed tomography in Canada. J. Thorac. Oncol. 9, 1449–1458, https://doi.org/10.1097/JTO.0000000000000283 (2014).

11. Goulart, B. H., Bensink, M. E., Mummy, D. G. & Ramsey, S. D. Lung cancer screening with low-dose computed tomography: costs, national expenditures, and cost-effectiveness. J. Natl. Compr. Cancer Netw. 10, 267–275, https://doi.org/10.6004/jnccn.2012.0023 (2012).

12. Mauchley, D. C. & Mitchell, J. D. Current estimate of costs of lung cancer screening in the United States. Thorac. Surg. Clin. 25, 205–215, https://doi.org/10.1016/j.thorsurg.2014.12.005 (2015).

13. Rasmussen, J. F. et al. Healthcare costs in the Danish randomised controlled lung cancer CT-screening trial: a registry study. Lung Cancer 83, 347–355, https://doi.org/10.1016/j.lungcan.2013.12.005 (2014).

14. Huber, A. et al. Performance of ultralow-dose CT with iterative reconstruction in lung cancer screening: limiting radiation exposure to the equivalent of conventional chest X-ray imaging. Eur. Radiol. 26, 3643–3652, https://doi.org/10.1007/s00330-015-4192-3 (2016).

15. McCunney, R. J. & Li, J. Radiation risks in lung cancer screening programs: a comparison with nuclear industry workers and atomic bomb survivors. Chest 145, 618–624, https://doi.org/10.1378/chest.13-1420 (2014).

16. Murugan, V. A., Kalra, M. K., Rehani, M. & Digumarthy, S. R. Lung Cancer Screening: Computed Tomography Radiation and Protocols. J. Thorac. Imaging 30, 283–289, https://doi.org/10.1097/RTI.0000000000000150 (2015).

17. Christiani, D. C. Radiation risk from lung cancer screening: glowing in the dark? Chest 145, 439–440, https://doi.org/10.1378/chest.13-2588 (2014).

18. Hennessey, P. T. et al. Serum microRNA biomarkers for detection of non-small cell lung cancer. PLOS ONE 7, e32307, https://doi.org/10.1371/journal.pone.0032307 (2012).

19. Leidinger, P. et al. High-throughput qRT-PCR validation of blood microRNAs in non-small cell lung cancer. Oncotarget 7, 4611–4623, https://doi.org/10.18632/oncotarget.6566 (2016).

20. Montani, F. et al. miR-Test: a blood test for lung cancer early detection. J. Natl. Cancer Inst. 107, djv063, https://doi.org/10.1093/jnci/djv063 (2015).

21. Kohler, J. et al. Circulating U2 small nuclear RNA fragments as a diagnostic and prognostic biomarker in lung cancer patients. J. Cancer Res. Clin. Oncol. 142, 795–805, https://doi.org/10.1007/s00432-015-2095-y (2016).

22. Cheng, L. L. et al. Enhanced resolution of proton NMR spectra of malignant lymph nodes using magic-angle spinning. Magn. Reson. Med. 36, 653–658, https://doi.org/10.1002/mrm.1910360502 (1996).

23. Cheng, L. L. et al. Quantitative neuropathology by high resolution magic angle spinning proton magnetic resonance spectroscopy. Proc. Natl. Acad. Sci. U.S.A. 94, 6408–6413, https://doi.org/10.1073/pnas.94.12.6408 (1997).

24. Zhai, R., Yu, X., Shafer, A., Wain, J. C. & Christiani, D. C. The impact of coexisting COPD on survival of patients with early-stage non-small cell lung cancer undergoing surgical resection. Chest 145, 346–353, https://doi.org/10.1378/chest.13-1176 (2014).

25. Cheng, L. L. et al. Quantification of microheterogeneity in glioblastoma multiforme with ex vivo high-resolution magic-angle spinning (HRMAS) proton magnetic resonance spectroscopy. Neuro-Oncol. 2, 87–95, https://doi.org/10.1093/neuonc/2.2.87 (2000).

26. Cheng, L. L., Wu, C., Smith, M. R. & Gonzalez, R. G. Non-destructive quantitation of spermine in human prostate tissue samples using HRMAS 1H NMR spectroscopy at 9.4 T. FEBS Lett. 494, 112–116, https://doi.org/10.1016/s0014-5793(01)02329-8 (2001).

27. Esteve, V., Celda, B. & Martinez-Bisbal, M. C. Use of 1H and 31P HRMAS to evaluate the relationship between quantitative alterations in metabolite concentrations and tissue features in human brain tumour biopsies. Anal. Bioanal. Chem. 403, 2611–2625, https://doi.org/10.1007/s00216-012-6001-z (2012).

28. Tzika, A. A. et al. Biochemical characterization of pediatric brain tumors by using in vivo and ex vivo magnetic resonance spectroscopy. J. Neurosurg. 96, 1023–1031, https://doi.org/10.3171/jns.2002.96.6.1023 (2002).

29. De Vitto, H., Perez-Valencia, J. & Radosevich, J. A. Glutamine at focus: versatile roles in cancer. Tumor Biol. 37, 1541–1558, https://doi.org/10.1007/s13277-015-4671-9 (2016).

30. Antonov, A. et al. Bioinformatics analysis of the serine and glycine pathway in cancer cells. Oncotarget 5, 11004–11013, https://doi.org/10.18632/oncotarget.2668 (2014).

31. Martinez-Outschoorn, U. E., Peiris-Pages, M., Pestell, R. G., Sotgia, F. & Lisanti, M. P. Cancer metabolism: a therapeutic perspective. Nat. Rev. Clin. Oncol. 14, 11–31, https://doi.org/10.1038/nrclinonc.2016.60 (2017).

32. Koizume, S. & Miyagi, Y. Lipid Droplets: A Key Cellular Organelle Associated with Cancer Cell Survival under Normoxia and Hypoxia. Int. J. Mol. Sci. 17, e1430, https://doi.org/10.3390/ijms17091430 (2016).

33. Altman, B. J., Stine, Z. E. & Dang, C. V. From Krebs to clinic: glutamine metabolism to cancer therapy. Nat. Rev. Cancer 16, 749, https://doi.org/10.1038/nrc.2016.114 (2016).

34. Mayers, J. R. et al. Tissue of origin dictates branched-chain amino acid metabolism in mutant Kras-driven cancers. Science 353, 1161–1165, https://doi.org/10.1126/science.aaf5171 (2016).

35. Giskeodegard, G. F. et al. Lactate and glycine-potential MR biomarkers of prognosis in estrogen receptor-positive breast cancers. NMR Biomed. 25, 1271–1279, https://doi.org/10.1002/nbm.2798 (2012).

36. Jordan, K. W. et al. Comparison of squamous cell carcinoma and adenocarcinoma of the lung by metabolomic analysis of tissue-serum pairs. Lung Cancer 68, 44–50, https://doi.org/10.1016/j.lungcan.2009.05.012 (2010).

## Acknowledgements

We thank J.A. Fordham for editorial assistance. Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under award numbers R01CA115746 and R21CA162959 (Cheng) and U01CA209414 (Christiani). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We gratefully acknowledge the support of the Massachusetts General Hospital Athinoula A. Martinos Center for Biomedical Imaging.

## Author information

### Contributions

Y.B. developed the spectroscopy quantification tool and the pathology calibration model, guided statistical analysis, wrote the manuscript, and edited and approved the final manuscript. L.A.V. interpreted the data, wrote the manuscript, and edited and approved the final manuscript. I.W. performed the spectroscopy measurements and histopathological preparations, and edited and approved the final manuscript. L.S. collected and provided the samples, performed pathology work and obtained clinical data, and edited and approved the final manuscript. J.K. performed the spectroscopy measurements and histopathological preparations, and edited and approved the final manuscript. A.S. performed the spectroscopy measurements and histopathological preparations, and edited and approved the final manuscript. S.S.D. interpreted the data, conceived of and created figures, and edited and approved the final manuscript. P.H. interpreted the data, and edited and approved the final manuscript. J.N. interpreted the data, and edited and approved the final manuscript. E.M. interpreted tissue pathology, and edited and approved the final manuscript. M.J.A. guided statistical analysis, and edited and approved the final manuscript. D.C.C. designed the experiment, provided funds, supervised collection of the samples and clinical data, interpreted the data, wrote the manuscript, and edited and approved the final manuscript. L.L.C. designed the experiment, provided funds, supervised spectroscopy measurements and histopathological preparations, conducted statistical analysis, interpreted the data, wrote the manuscript, and edited and approved the final manuscript.

### Corresponding authors

Correspondence to David C. Christiani or Leo L. Cheng.

## Ethics declarations

### Competing Interests

The authors declare no competing interests.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Supplementary information

### 41598_2019_46643_MOESM1_ESM.pdf

Magnetic Resonance Spectroscopy-based Metabolomic Biomarkers for Typing, Staging, and Survival Estimation of Early-Stage Human Lung Cancer

Berker, Y., Vandergrift, L.A., Wagner, I. et al. Magnetic Resonance Spectroscopy-based Metabolomic Biomarkers for Typing, Staging, and Survival Estimation of Early-Stage Human Lung Cancer. Sci Rep 9, 10319 (2019). https://doi.org/10.1038/s41598-019-46643-5
