Main

Prostate cancer is the most frequently diagnosed malignancy and the third leading cause of cancer mortality after lung and colorectal cancer among men in developed populations (Torre et al, 2015). Its aetiology remains largely unknown, with few established risk factors, other than age, race, family history of prostate cancer, and a number of low-penetrance genetic variants (Bostwick et al, 2004; Chan et al, 2005; Wolk, 2005; Kote-Jarai et al, 2011; Barbieri et al, 2012; Allott et al, 2013). Recent advances in liquid or gas chromatography, mass spectrometry, and nuclear magnetic resonance methods have facilitated measurement of hundreds or thousands of low molecular weight, 80–1000 Da metabolites in biological samples such as serum, urine, and tissue. Measurement of the metabolome provides an integrated assessment of exogenous and endogenous exposures, and the host response to them, and has the potential to elucidate novel disease associations and biological mechanisms.

Two recent studies (Mondul et al, 2014, 2015) in the Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) cohort identified metabolites related to aggressive prostate cancer in sera collected two decades prior to diagnosis. Lipid and energy metabolites including alpha-ketoglutarate (AKG), citrate, inositol-1-phosphate, several glycerophospholipids, and fatty acids, were lower in aggressive cases than in controls (Mondul et al, 2015). The present investigation examines the prospective metabolomic profile of prostate cancer risk in a case–control study nested within the Prostate, Lung, Colorectal, and Ovarian Cancer Screening (PLCO) Trial.

Materials and methods

Study population

The PLCO trial was a large randomised-controlled trial designed to evaluate the efficacy of screening methods for prostate, lung, colorectal, or ovarian cancer. Participants without a history of these malignancies, ages 55 to 74 years, were enrolled from 10 centres in the USA between 1993 and 2001, and were randomly assigned to either the screening or non-screening arm (Gohagan et al, 2000; Prorok et al, 2000; Hayes et al, 2005). Blood samples were collected from participants in the screening arm only, between 0700 hours and 1600 hours without regard to fasting status, and were aliquoted within 2 h and stored at −70 °C. The cancer screenings included chest x-ray, flexible sigmoidoscopy, and annual serum total prostate-specific antigen (PSA) measurement (for 6 years) and digital rectal examination (DRE; for 4 years). Men with an elevated PSA of >4 ng ml−1 or a DRE suspicious for prostate cancer were referred to their medical-care providers for further prostate cancer work-up. Information for prostate cancer diagnoses based on abnormal screening results was retrieved from medical and pathology records. Additional prostate cancer diagnoses, including post-screening trial cases, were identified through annual self-reported questionnaire responses and through the National Death Index, followed by medical record review and confirmation. All men (n=28 243) were followed from their initial prostate cancer screen to the date of prostate cancer diagnosis, date of death (National Death Index), censor date (26 April 2012), or loss to follow-up, whichever occurred first.

Participants completed self-administered questionnaires at enrolment that included data regarding sociodemographics, height, weight, smoking behaviour, family history of cancer, and other diseases, physical activity, use of selected medications, and recent history of screening exams. All participants provided written informed consent, and the PLCO study was approved by the institutional review boards of the U.S. National Cancer Institute and the 10 PLCO screening centres.

Case identification and control selection

In order to increase time from blood collection to clinical diagnoses, 380 prostate cancer cases were randomly chosen from participants diagnosed during the post-screening trial period (4.4–17.0 years after baseline). On the basis of previous study findings (Mondul et al, 2014, 2015), aggressive cases were oversampled (n=298) and defined as those diagnosed with stage III or IV based on the tumour-node-metastasis staging system (Fleming et al, 1997), or a biopsy Gleason score ≥8. Cases with a Gleason score sum of 7 and stage II were considered to be of an intermediate level of aggressiveness (n=82). Because baseline blood samples had been depleted by many previous prostate cancer studies, we analysed serum collected at the first or second screening year visit. Using the method of incidence-density sampling without replacement, we randomly selected 380 controls that were free from any cancer at the time of case diagnosis, and individually matched them to cases by age (within 5 years), race, study centre, study year, and date (within 30 days) of blood collection.
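The control selection described above can be illustrated with a minimal sketch of incidence-density sampling without replacement. This is not the study's actual selection code: the record fields (`id`, `age`, `race`, `centre`, `cancer_time`) and the function name are hypothetical, only three of the matching factors are shown, and censoring before the case's diagnosis date is ignored for brevity.

```python
import random

def select_controls(cases, cohort, seed=0):
    """Pick one matched control per case from that case's risk set.

    Each record is a dict with hypothetical keys: id, age, race, centre,
    and cancer_time (years from baseline to first cancer, or None).
    """
    rng = random.Random(seed)
    used = set()      # sampling without replacement: a control is used once
    matches = {}
    for case in sorted(cases, key=lambda c: c["cancer_time"]):
        risk_set = [
            p for p in cohort
            if p["id"] != case["id"]
            and p["id"] not in used
            # cancer-free at the time the case was diagnosed
            and (p["cancer_time"] is None
                 or p["cancer_time"] > case["cancer_time"])
            and p["race"] == case["race"]
            and p["centre"] == case["centre"]
            and abs(p["age"] - case["age"]) <= 5
        ]
        if risk_set:
            control = rng.choice(risk_set)
            used.add(control["id"])
            matches[case["id"]] = control["id"]
    return matches
```

Note that under this scheme a participant who is diagnosed later can still serve as a control for an earlier case, because eligibility is assessed at the case's diagnosis date.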

Metabolite assays

Serum metabolomic profiling was conducted on a high-resolution accurate mass platform of ultrahigh-performance liquid chromatography/mass spectrometry and gas chromatography/mass spectrometry (GC-MS; Metabolon, Durham, NC, USA). The workflow, including extraction of raw data, peak identification, and quality control (QC) processing on the assay platform, has been described previously (Evans et al, 2009; Dehaven et al, 2010). Metabolon measured values for a total of 722 identified metabolites. We excluded 27 metabolites for which >500 participants (66%) had missing values (below limit of detection), leaving 695 identified compounds for analysis. Metabolites were categorised as belonging to one of eight mutually exclusive chemical classes: amino acids and amino acid derivatives (subsequently referred to as ‘amino acids’), carbohydrates, cofactors and vitamins, energy metabolites, lipids, nucleotides, peptides, or xenobiotics. Each batch included blinded quality control samples (9%) from two male individuals. To assess the technical reliability of the data, coefficients of variation (median=0.2, interquartile range=0.12–0.32) and intraclass correlation coefficients (median=0.87, interquartile range=0.64–0.98) were calculated as measures of QC. The cases and their matched controls were assayed within the same batches in order to avoid any effect of batch differences on the risk estimates.
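As an illustration of two steps in this paragraph, the following sketch shows a missing-value exclusion rule and a coefficient of variation computed from replicate QC measurements. The function names and the representation of missing values as `None` are assumptions made for the example; the exact QC formulas used by the laboratory are not given in the text.

```python
from statistics import mean, stdev

def filter_metabolites(data, max_missing):
    """Keep metabolites whose count of missing values (None) is <= max_missing.

    With max_missing=500 this mirrors the rule of excluding metabolites
    for which >500 participants had values below the limit of detection.
    """
    return {
        name: values
        for name, values in data.items()
        if sum(v is None for v in values) <= max_missing
    }

def coefficient_of_variation(replicates):
    """CV = sample standard deviation / mean of repeated QC measurements."""
    return stdev(replicates) / mean(replicates)
```

For example, replicate QC values of 8, 10, and 12 have a mean of 10 and a sample standard deviation of 2, giving a CV of 0.2.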

Statistical analysis

Baseline characteristics of cases and controls were compared by either Wilcoxon rank sum or χ2-tests, for continuous or categorical variables, respectively. The signal of each metabolite was normalised within a given batch to standardise the batch variability, and for missing (below limit of detection) values, we imputed the minimum non-missing value. Metabolite values were log-transformed for analysis. Conditional logistic regression was used to estimate odds ratios (ORs) and their 95% confidence intervals (CI) for the association between prostate cancer and each log-metabolite signal with an 80th percentile increase. Only matching factors were included in the final models. In sensitivity analyses, we additionally adjusted for BMI (<25 kg m−2, 25–30 kg m−2, or ≥30 kg m−2) and for several other potential confounding factors. These included smoking status (never, former, or current), diabetes (yes or no), height (<175 cm, 175–180 cm, or ≥180 cm), physical activity (<1 h per week, 1–3 h per week, or ≥4 h per week), alcohol consumption (<0.05 drinks per day, 0.05–1 drinks per day, or ≥1 drinks per day), processed meat consumption (<6.6 g per day, 6.6–16.8 g per day, or ≥16.8 g per day), red meat consumption (<20.4 g per day, 20.4–43.7 g per day, or ≥43.7 g per day), and total fat intake (<60.3 g per day, 60.3–88 g per day, or ≥88 g per day). According to a Bonferroni correction for 695 tests, the threshold for statistical significance in our analysis was P=0.000072 (0.05/695). However, this threshold is highly stringent due to the inter-correlations between many metabolites. We therefore also used principal component analysis (Jolliffe, 2005), and explored whether the grouped metabolites could distinguish case–control status. The top 10 principal components of metabolite measurements were calculated, and the same approach of conditional logistic regression (log-level with an 80th percentile increase) was applied to examine whether these components were associated with overall or aggressive prostate cancer. We also used a false-discovery rate (FDR) of 20% to define the significance threshold. We used gene-set analysis (GSA), a standard pathway analysis, to examine whether pre-defined metabolic super- and sub-pathways were associated with prostate cancer (Subramanian et al, 2005).
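The preprocessing and multiple-testing steps above can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: batch normalisation is shown as division by the batch median, the natural logarithm is used, and the FDR criterion is implemented with the Benjamini-Hochberg step-up procedure. All function names are hypothetical.

```python
import math
from statistics import median

def normalise_batch(values):
    """Divide observed values by the batch median (assumed scheme)."""
    centre = median(v for v in values if v is not None)
    return [None if v is None else v / centre for v in values]

def impute_min_and_log(values):
    """Replace missing values with the minimum observed value, then log."""
    floor = min(v for v in values if v is not None)
    return [math.log(floor if v is None else v) for v in values]

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def benjamini_hochberg(p_values, q):
    """Return a list of booleans: True where the test passes FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank          # largest rank meeting the criterion
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected
```

With `alpha=0.05` and `n_tests=695`, `bonferroni_threshold` gives approximately 0.000072, the threshold quoted above.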

We performed additional analyses restricting to aggressive or non-aggressive cases, non-Hispanic white men, black men, and men of other races, and stratified by age at enrolment (<65 vs 65+ years) and follow-up time (<10 years, ≥10 years). The analysis by race was conducted, in part, for comparison with prior studies of Caucasian populations (Mondul et al, 2014, 2015).

Analyses were performed with SAS software version 9.3 (SAS Institute, Cary, NC, USA), and the GSA analysis was performed with the R statistical language version 3.2.3 (Vienna, Austria). All reported P-values are two-sided.
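To make the reported effect scale concrete, the sketch below converts a per-unit log-odds coefficient into an OR for an 80th percentile increase, interpreted here as the contrast between the 90th and 10th percentiles of the log-metabolite values. This is an illustration under stated assumptions rather than the SAS code used in the study: the function name is hypothetical, a Wald interval with z=1.96 is assumed, and percentiles are taken from Python's default quantile method.

```python
import math
from statistics import quantiles

def percentile_contrast_or(beta, se, log_values, z=1.96):
    """Return (OR, lower, upper) for a 10th-to-90th percentile contrast.

    beta and se are the coefficient and standard error per one-unit
    increase in the log-metabolite value, e.g. from a conditional
    logistic regression fitted elsewhere.
    """
    deciles = quantiles(log_values, n=10)   # nine cut points
    delta = deciles[-1] - deciles[0]        # 90th minus 10th percentile
    return tuple(math.exp((beta + k * z * se) * delta) for k in (0, -1, 1))
```

Scaling every metabolite to the same percentile contrast puts the ORs for compounds with very different measurement ranges on a comparable footing.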

Results

Median time from blood collection to prostate cancer diagnosis was 10 years (inter-decile range=7.2–13.0 years) and median age at diagnosis was 72 years. Compared with controls, cases were more likely to have higher serum PSA and a family history of prostate cancer, and were more physically active at age 40 years, but were similar with respect to other characteristics at study baseline (Table 1).

Table 1 Selected baseline characteristics of the cases and controls, PLCO study

Metabolites related to risk of overall, aggressive and non-aggressive prostate cancer with a nominal P-value of <0.05 are shown in Tables 2, 3, and 4, sorted by chemical class, sub-pathway and P-value. All metabolite associations with overall prostate cancer risk are presented in Supplementary Table 1. None of the P-values fell below the Bonferroni-corrected threshold for multiple comparisons or met the FDR<20% criterion. The amino acids pyroglutamine (pGLU), phenylpyruvate, and N-acetylcitrulline, as well as the peptide gamma-glutamylphenylalanine, yielded the strongest signals, being inversely associated with overall prostate cancer (0.001≤P<0.004, Table 2), whereas the acylcarnitine metabolite stearoylcarnitine was positively associated (OR=1.74, Table 2). Findings were similar for aggressive disease, with the peptide class of compounds being inversely associated (Table 3, P=0.027), but not for non-aggressive cancer. For non-aggressive disease, alpha-tocopherol, primary bile acid, and steroid hormone metabolites were well-represented and inversely associated with risk, as was cyclic AMP (Table 4). Stratification based on median time to diagnosis of aggressive disease revealed long-chain fatty acids, monohydroxy fatty acids, acylcarnitines, monoacylglycerol lipids, and lysolipids being positively associated among cases diagnosed within 10 years, and citrate, sphingolipid, steroid hormone, and glutathione metabolites associated among cases diagnosed 10–20 years after blood collection (data not shown).

Table 2 OR and 95% CI from conditional logistic regression of overall prostate cancer, comparison of the 90th and 10th percentiles for serum metabolites (P<0.05), sorted by chemical class, sub-pathway, and P-value
Table 3 OR and 95% CI from conditional logistic regression of aggressive prostate cancer, comparison of the 90th and 10th percentiles for serum metabolites (P<0.05), sorted by chemical class, sub-pathway, and P-value
Table 4 OR and 95% CI from conditional logistic regression of non-aggressive prostate cancer, comparison of the 90th and 10th percentiles for serum metabolites (P<0.05), sorted by chemical class, sub-pathway, and P-value

These findings were not materially changed when adjusted for BMI (Supplementary Table 2) or for BMI plus several additional factors including smoking status, diabetes, height, physical activity, alcohol consumption, processed and red meat consumption, and total fat intake (Supplementary Table 3). These findings were also not substantially altered when adjusted for serum total PSA (data not shown), and were essentially similar in non-Hispanic white, black, and other races combined groups, although with some differences in the latter two populations that are probably due to small sample sizes (Supplementary Tables 4–6). We found no material interactions between the top 27 metabolites and other risk factors, including age, BMI, and smoking (data not shown).

Although none of the top 10 principal components were significantly associated with overall or aggressive prostate cancer (all P-values>0.025; the significance level was P<0.005), the metabolite sub-pathway analysis revealed that branched and medium-chain fatty acid metabolites, as well as the tocopherol metabolites, were inversely associated with overall prostate cancer (P=0.016, 0.032, and 0.017, respectively, Table 5). Aggressive disease was positively associated with the branched-chain fatty acids (P=0.015), and inversely related to the tryptophan and urea cycle/arginine/proline metabolites (P=0.037 and 0.046, respectively). The tocopherol and primary bile acid metabolite subclasses were inversely related to non-aggressive cancer (P=0.014 and 0.021, respectively, Table 5).

Table 5 GSA for sub-pathway of serum metabolites and prostate cancer (P<0.05)

Among the top metabolites related to prostate cancer in our earlier investigation (Mondul et al, 2015) that served as our a priori hypotheses for the present analysis, only three associations with overall prostate cancer replicated in the present data: the lipids 1-palmitoleoyl-2-linoleoyl-GPC (16:1/18:2) (OR=0.60, 95% CI: 0.40–0.88, P=0.0096) and tauro-beta-muricholate (OR=0.68, 95% CI: 0.48–0.97, P=0.033), and the nucleotide 2′-deoxyuridine (OR=1.47, 95% CI: 1.07–2.03, P=0.019; Table 2). By contrast, the previously reported inverse associations between aggressive prostate cancer and the energy metabolites AKG and citrate (Mondul et al, 2015) did not replicate, and the associations appeared to be in the opposite direction (AKG OR=1.45, P=0.098, and citrate OR=1.48, P=0.077). It is noteworthy that these positive associations appeared stronger among cases diagnosed 10–20 years after blood collection (AKG OR=1.83, 95% CI: 0.99–3.38, P=0.053; citrate OR=2.04, 95% CI: 1.06–3.91, P=0.032) compared with those diagnosed within 10 years (OR=1.11, P=0.76 for AKG, and OR=1.13, P=0.68 for citrate). The present data also did not replicate the previous positive associations with aggressive prostate cancer for phenylpyruvate, thyroxine, and arginine (Table 3), and trimethylamine-N-oxide (data not shown).

Discussion

Top metabolite signals for overall prostate cancer risk observed here were pGLU, gamma-glutamylphenylalanine, phenylpyruvate, N-acetylcitrulline, and stearoylcarnitine. Peptide metabolites were associated with aggressive disease, whereas most of the previously reported lipid and energy metabolites were not. These profiles appeared similar across racial/ethnic groups, although our sample size was small for non-white men. By contrast, risk of non-aggressive disease was related to tocopherols, sex steroids, and primary bile acids.

Pyroglutamine is a key amino acid residue for thyrotrophin-releasing hormone (TRH) and TRH-like peptides (Huber et al, 1998) that promote fertilisation by enhancing sperm capacitation (Cockle et al, 1994; Huber et al, 1998), act as paracrine factors influencing prostate cell growth and differentiation (Bilek et al, 1992), and stimulate thyroid hormone secretion (Bilek et al, 1991; Bilek, 2000; Maran et al, 2001). Gamma-glutamylphenylalanine and gamma-glutamylglycine peptides were also inversely associated with aggressive prostate cancer, similar to the gamma-glutamylhistidine finding in the ATBC Study (Mondul et al, 2015). These metabolites are glutathione (GSH) degradation products of gamma-glutamyl transpeptidase activity (Zhang et al, 2005) that regenerates intracellular GSH (Wu et al, 2004; Zhang et al, 2005) for detoxification of reactive oxygen species (Wu et al, 2004; Maher, 2005; Zhang et al, 2005). Several N-acetyl amino acids including N-acetylcitrulline, N-acetylarginine, N-acetyltryptophan, and N-acetylkynurenine were also reduced in men later diagnosed with prostate cancer, possibly indicating increased post-translational protein N-terminal acetylation (Aksnes et al, 2015) and aminoacetylase activity, the latter having been related to increased cell proliferation and carcinogenesis (Aksnes et al, 2015), and histone-chromatin gene regulation (Eberharter and Becker, 2002). Also inversely associated were tryptophan metabolites that influence the role of inflammation in cancer progression (Prendergast et al, 2014; Santhanam et al, 2016). Tryptophan and kynurenine pathway metabolites have been previously associated with several, particularly gastrointestinal, cancers (Uyttenhove et al, 2003; Witkiewicz et al, 2008; Liu et al, 2009; Koblish et al, 2010; Balachandran et al, 2011; Zhang et al, 2011a, 2011b; Ferdinande et al, 2012; Zhang et al, 2013).

The proliferative anabolic state of prostate adenocarcinomas requires active biosynthesis of amino acids, nucleic acids, peptides, and lipids (Gang et al, 2016), with dynamic metabolomic phenotype stages being likely. Aberrations in lipid metabolites and especially fatty acid metabolism have been extensively studied with respect to prostate carcinogenesis (Harvei et al, 1997; Norrish et al, 1999; Mannisto et al, 2003; Chavarro et al, 2007; Crowe et al, 2008; Bassett et al, 2013; Crowe et al, 2014; Wu et al, 2014). De novo fatty acid biosynthesis is upregulated in aggressive cancers (Nomura et al, 2010, 2011; Carracedo et al, 2013), and the present data appear consistent with this only with respect to branched-chain fatty acids. The lower serum lipid metabolomic profile previously demonstrated for aggressive prostate cancer risk encompassed not only fatty acids, presumably to provide energy for cell proliferation and tumour growth, but also a large number of lysolipids, inositols, and sphingomyelins that implicated cell membrane biosynthesis and signalling (Kuemmerle et al, 2011). Fatty acids rather than the latter compounds were more prominent in the present study and positively associated with risk, possibly indicating triglyceride mobilisation from adipose tissue and increased fatty acid synthase activity leading to increased acylcarnitine-mediated long-chain fatty acid transport into the mitochondrial matrix for β-oxidation and ATP synthesis (Carter et al, 1995; Crowe et al, 2008). Despite laboratory evidence that eicosapentaenoic acid can inhibit prostate tumourigenesis (Cave, 1991; Rose, 1997; Astorg, 2004; Larsson et al, 2004; Kobayashi et al, 2006), we found this omega-3 fatty acid to be related to higher risk of aggressive prostate cancer, which is consistent with two studies (Crowe et al, 2008, 2014).

Our findings regarding non-aggressive prostate cancer are also of interest and differ substantially from those for aggressive disease. Notably, we detected an inverse risk association signal for alpha-tocopherol metabolites, consistent with both significantly lower prostate cancer incidence in one low-dose trial (Heinonen et al, 1998) and a recent pooled analysis of 15 international cohorts (Key et al, 2015). Several sex steroids also showed inverse risk associations with non-aggressive prostate cancer, including androgen precursors (e.g., DHEA-sulfate) and metabolites (e.g., 4-androstene-3beta,17 diol-sulfate). Testosterone was also inversely associated, but more weakly (OR=0.76; P=0.11). By contrast, only one sex steroid was related to aggressive disease and in a positive direction, more in line with the role androgens are thought to have in promoting prostate carcinogenesis. Whether this stage difference represents true underlying biological variation in the influence of androgens on early vs aggressive disease will require further study.

Two other findings of inverse associations with non-aggressive disease risk deserve mention. 2-Aminoadipate (as well as pipecolate) is involved in lysine metabolism, and has been found elevated in malignant (T2 and T3) vs adjacent normal prostate tissue and related to earlier disease recurrence (Jung et al, 2013). One of our strongest signals, cyclic AMP, is synthesised from ATP by plasma membrane-associated, hormonally responsive adenylate cyclase, and serves as an intracellular second messenger that mediates several hormones.

As compared with data from the ATBC Study that showed inverse prostate cancer associations with the energy and lipid compounds, including AKG, citrate, inositol-1-phosphate, glycerophospholipids, and fatty acids (Mondul et al, 2015), the present findings portray a very different metabolite-risk profile that features primarily amino acids and peptides. Additional prospective investigations will be needed to interpret the two distinct patterns, but several differences in study designs and populations are likely responsible. Importantly, cases in PLCO were diagnosed within 15 years post-trial following up to four to six prostate screening examinations (i.e., DRE and serum PSA), whereas the cases in ATBC were clinically diagnosed within 20 years of serum collection without systematic PSA and DRE screening. Mean baseline total PSA concentrations of cases were 2 ng ml−1 and 8 ng ml−1 in PLCO and ATBC, respectively, and even the PLCO cases diagnosed with aggressive disease had a mean baseline PSA concentration of only 2 ng ml−1. These and other study differences likely resulted in examination of developmentally distinct metabolic phases of growing prostate adenocarcinomas related to primary tumour size and disease extent.

Time of serum collection and fasting status also differed markedly between the studies; that is, between 0700 hours and 1600 hours in PLCO without regard to time of last meal, vs 0800 hours to 1100 hours after an overnight fast in ATBC. Although the effects of non-fasting status on most of the circulating metabolites we examined have not been rigorously evaluated, some previous studies indicate minimal changes in PSA (Tuncel et al, 2005), lipids (Langsted et al, 2008; Mora et al, 2008), and amino acids such as homocysteine (Fokkema et al, 2003), and one report suggests that fasting status is not an important source of variability in measurements of over 200 metabolites (Townsend et al, 2016). The different populations, US vs Finland, as well as the fact that ATBC included only smokers of at least five cigarettes per day at study entry may also have influenced the divergent findings.

The major strengths of this investigation include use of serum samples that were collected up to two decades prior to prostate cancer diagnosis and good laboratory reproducibility for several hundred metabolites. Our sample size was moderate, albeit the largest to date for a prospective study of the circulating metabolome and risk of prostate cancer. The impact of the PLCO prostate cancer screening on the stages of cases diagnosed and the non-fasting serologic status may have limited the robustness and sensitivity of replicating the previous ATBC findings. In addition, although we cannot rule out the possibility of unmeasured confounding factors explaining some of the present findings, including some related to selection of trial post-screening cases, the materially unchanged results from sensitivity analyses that adjusted for several potential risk factors argue against such biases.

In conclusion, the present study demonstrates a unique metabolomic profile associated with post-screening prostate cancer in the PLCO cohort. The inverse associations for amino acid and peptide metabolites, and positive associations for lipids, contrast with the energy/lipid profile previously reported (Mondul et al, 2014, 2015). This is likely the result of studying prostate cancer cases with serum samples collected at an earlier point in the natural history of the disease due to the highly screened nature of the study population, rather than more clinically advanced, yet still undiagnosed, malignancies. The two distinct metabolite profiles may represent molecular species influencing risk of prostate cancers at different points in their development, including early initiating or tumour promoting metabolic states. Whether the inverse association with pGLU is related to a role for thyrotropin status of early prostate cancer will require further study. The metabolomic profiles identified to date should be re-examined in additional prospective analyses of prostate cancer in larger studies and consortia.