Mounting evidence suggests that whole grain (WG) intake plays an important role in chronic disease prevention. However, numerous human studies have failed to produce clear-cut conclusions on this topic. Here, a combination of non-targeted and targeted metabolomics approaches, together with kinetic studies, was used to investigate biomarkers of WG wheat intake and further explore the diet-disease associations. Via these integrated approaches, forty-one compounds were identified as the most discriminating endogenous metabolites after WG versus refined grain (RG) wheat bread consumption. The corresponding biological assessment of these endogenous changes suggests that, in contrast to RG consumption, WG wheat consumption may facilitate antioxidant defense systems and moderate the risk factors of cancer, cardiovascular diseases, and other chronic diseases. A panel of urinary markers consisting of seven alkylresorcinol metabolites and five benzoxazinoid derivatives as specific biomarkers, as well as five phenolic acid derivatives, was also established to cover multiple time points and longer time periods for correctly and objectively monitoring WG wheat intake. Through these findings, we have established a comprehensive biomarker pool to better assess WG wheat consumption, and to monitor the endogenous changes that are linked to health effects of WG wheat consumption.
Metabolomics is the comprehensive analysis of all metabolites in a biological system1, and has been applied in various areas to quantitatively assess biochemical fluxes and metabolites that are indicative of unusual biological or environmental perturbations2. Metabolic analysis is typically categorized into two complementary methods: targeted and non-targeted. The non-targeted analysis measures all small molecules including endogenous and exogenous metabolites in biological samples and then identifies potential and putative metabolites of interest for further tests. In contrast, a targeted approach mainly focuses on the identification and quantification of selected metabolites3. Recently, nutrimetabolomics, which refers to metabolomics in nutritional sciences, has been developed to explore the complex relationships between the dietary consumption and health outcomes in animals and to also investigate the endogenous changes after dietary intake4.
Whole grains (WG) contain endosperm, germ, and bran, in contrast to refined grains (RG) which have the germ and bran removed during the milling process5. Modern nutritional epidemiology indicates that WG intake, but not RG intake, is inversely associated with the risk of major chronic diseases, such as cancer6,7,8, cardiovascular diseases (CVD)7,9,10, type 2 diabetes6,7,11, and obesity12,13,14. However, the outcomes of large-scale prospective cohort studies or human intervention studies testing the causality of these relationships have often proved inconclusive or have failed to demonstrate causality of cancers15,16, CVD17,18, and diabetes18,19. Some metabolomics studies have tried to predict the diet-disease associations by interpreting the putative links between the risk factors of diseases and certain endogenous changes20,21. However, in these studies only spot urine or plasma samples were collected, and the determined endogenous metabolites alterations were limited as well, which may have clouded the diet-disease association. The accumulation and excretion of the postprandial metabolites can be monitored by kinetic studies22,23. Therefore, non-targeted metabolomics approaches coupled with kinetic analysis in a diet-controlled trial may enhance the determination of changes in numerous endogenous metabolites and thus facilitate the estimation for health effects of WG intake.
It is challenging to accurately measure WG intake with the traditional self-assessment approaches typically used in large observational studies such as food journals and food frequency questionnaires due to inherent limitations24. Moreover, translation of food intake into energy, nutrients, and bioactive food components is heavily dependent on food composition tables25. Measurement errors associated with assessment methods further compound the problem of dietary estimates and may also obscure the diet-disease associations. For this reason, there is a pressing need for dietary biomarkers to better capture exposure. To date, WG alkylresorcinols (ARs) and their metabolites 3,5-dihydroxybenzoic acid (3,5-DHBA) and 3-(3,5-dihydroxyphenyl)-1-propanoic acid (3,5-DHPPA) have been developed as major potential exposure biomarkers for WG wheat and rye intake in epidemiological studies26,27. In addition, benzoxazinoid (BX) derivatives, such as 2-hydroxy-N-(2-hydroxyphenyl)acetamide (HHPAA) and N-(2-hydroxyphenyl)acetamide (HPAA), have recently been identified as alternative biomarkers to discriminate WG wheat and rye consumers from control group28. However, there are limitations for ARs, BXs, or their metabolites when as single use exposure biomarkers of WG wheat and rye intake in cohort studies. ARs are merely short- to medium-term biomarkers of intake of WG wheat and rye, with estimated apparent half-lives and absorption half-lives in plasma at ~5 h and 6–8 h, respectively29. The poor/moderate reproducibility for AR metabolites, 3,5-DHBA and 3,5-DHPPA30, may also limit the use of single measurements of these metabolites in cohort studies, and BXs are also found in WGs of maize, wild barleys, and other human plant food31. Therefore, the discovery of more specific biomarkers for WG consumption would give birth to a better assessment of compliance in large-scale human studies.
Whole wheat is one of the top ten largest-selling baked goods in supermarkets in USA32. Determination of specific WG wheat biomarkers helps to better assess whole wheat consumption in epidemiological studies. In the present study, a non-targeted metabolomics approach was applied to analyze all metabolites, including dietary exposures and endogenous biomarkers, in urine samples collected from WG wheat bread- and RG wheat bread-consumers, and a targeted metabolomics approach was utilized to further investigate the metabolism of specific WG wheat phytochemicals. Concomitantly, kinetic studies were applied with both approaches. The objective of this study was to understand the impact of dietary WG wheat intake on the endogenous metabolome. Another aim was to generate new insights into biological mechanisms underlying the health-related effects of WG wheat consumption on major chronic diseases, including cancers, CVD, and diabetes. Efforts were also made to identify more specific biomarkers that precisely define dietary exposures and thus provide better estimates of diet-disease associations.
Non-targeted metabolomics: endogenous and exogenous urinary metabolites
The metabolomics data was analyzed by three different Orthogonal Partial Least Squares (OPLS) analyses, including 1) regular OPLS discriminant analysis (OPLS-DA), 2) OPLS effect projections (OPLS-EP), an OPLS analysis for paired data, and 3) OPLS-DA based batch processing analysis, an OPLS method for multiway data.
The regular OPLS-DA was applied to investigate differences in the urinary metabolomics profiles between the WG and RG samples. All of the intervals were included in one analysis to examine the overall pattern of the data sets (Fig. 1a). Altogether, 579 metabolites were profiled. Between the two interventions, the best separations were observed from urine samples collected at 0–2 h, 2–4 h, and 4–6 h followed by samples at 6–9 h and 9–12 h (Fig. 1a,b). The data from urine samples collected at 12–24 h were overlapped between the two treatments (Fig. 1a,b). To verify the validity of the separation at each interval between the WG and RG groups, OPLS-DA was further performed at each interval separately, and then permutation-based validation was used to determine if the separation was statistically valid. As a result, only separations at 2–4 h (Fig. 2a) and 4–6 h (Fig. 2c) passed permutation-based validation and were found to be valid. The values of R2Y = 1 and Q2Y = 0.688 for interval of 2–4 h, and R2Y = 1 and Q2Y = 0.596 for interval of 4–6 h, indicate good model fit and predictive ability. The predictive ability of this model was found to be better than at least 95% of the models built using the 999 permutated datasets, therefore the probability that the separations at 2–4 h and 4–6 h were due to chance was less than 5%, i.e. the separations were found to be significant (Fig. 2b,d).
Because of the repeated measures design, OPLS-EP, a multivariate data analysis method specifically designed for paired data33, was applied to compare WG and RG samples at respective intervals of 2–4 h and 4–6 h. The model fit and predictive ability of the OPLS-EP models for 2–4 h (R2Y = 1, Q2Y = 0.792) and 4–6 h (R2Y = 0.96, Q2Y = 0.725) were found to be even better than OPLS-DA. To identify the most discriminating urinary metabolic differences after WG versus RG consumption, a cut-off of Variability Influence on Projection (VIP) >1.5 in OPLS-EP was applied to the two intervals of 2–4 h and 4–6 h, separately. The metabolites with VIP >1.5 and p ≤ 0.05 at the 2–4 h and/or 4–6 h intervals were considered as the most discriminating metabolites. On the basis of clustered mass features, exact masses for in-source fragments, and experimental MS/MS fragmentations, 62 compounds were identified (Table 1). Their super/subpathways and statistical parameters including fold changes, VIP scores from OPLS-EP, and p and q values from repeated measures ANOVA are summarized in Table 1. Meanwhile, kinetic curves for 62 metabolites in the two groups were generated as box plots (Figs 3,4 and 5 and Supplementary Fig. 1). These discriminating metabolites were classified into two groups: endogenous metabolites and food-derived metabolites (exogenous metabolites). Among them, 41 metabolites are endogenous with seven elevated and two decreased at 2–4 h of interval, two elevated and 25 decreased at 4–6 h of interval, and five decreased at both 2–4 h and 4–6 h in WG samples; and 21 metabolites are food-derived with six increased at 2–4 h of interval, three decreased at 4–6 h of interval, and nine elevated and three decreased at both 2–4 h and 4–6 h in WG samples (Table 1).
To explicitly include the time dimension in the analysis, OPLS-DA based batch processing analysis was applied. Batch processing analysis has been used to analyze multiway data, including the analysis of time series metabolomic data34,35. This analysis operates on two levels. At the lower level, OPLS regression was performed to study trajectories of samples along the time dimension. At the upper level (batch level), OPLS-DA was performed to differentiate WG and RG samples when all of the time points were considered. The trajectories were not found to be markedly different between WG and RG samples (data not shown). This is not unexpected, because the lower level OPLS analysis was intended to find metabolic variations related to time, and was not aimed to differentiate WG and RG samples. On the other hand, the upper level OPLS-DA, which was aimed to provide differentiation, was able to separate WG and RG samples (Supplementary Fig. 2a). However, the separation did not pass permutation-based validation (Supplementary Fig. 2b). Altogether, the batch analysis suggests that there was some overall difference between WG and RG samples when all of the time points were considered. However, the overall difference was not strong enough to reach statistical significance. The significant difference between WG and RG was only observed at intervals of 2–4 h and 4–6 h, separately, as suggested by regular OPLS-DA (Fig. 2) and OPLS-EP (Table 1).
Changes of endogenous metabolites
Non-targeted metabolomics is a powerful tool to determine the changes in endogenous metabolites between two interventions. The urinary endogenous changes determined in the present study could be classified into three groups as follows.
Accumulation of cysteine and related thiol compounds within the WG group
When the WG group was compared to the RG group, significant increases in urinary cysteine were seen across the intervals (0–2, 2–4, 4–6, 6–9, 9–12, and 12–24 h) of the analysis (Table 1 and Fig. 3), regardless of the presence of cysteine in the body endogenously and exogenously36. Although cysetine-related thiol compounds, N-acetylcysteine (NAC), and cysteinylglycine, did not pass the cut-off VIP >1.5 in OPLS-EP analysis, their VIP scores were greater than 1.2 at 2–4 h after WG versus RG consumption (Supplementary Table 1). The excretion of urinary NAC in the WG group was found to be much higher than that in the RG group during all intervals (Supplementary Table 1 and Fig. 3). Elevated cysteinylglycine in the WG group was also observed in 12 h after intake compared to the RG group (Supplementary Table 1 and Fig. 3). These observations suggest that WG consumption accumulates cysteine and related thiol compounds in the body.
Significant differences with regards to nitrogen balance between WG and RG groups
It is largely accepted that nitrogen balance reflects both protein and energy intake from the diet37. The WG consumers exhibited significant decreases in several urinary metabolites related to nucleosides following WG wheat consumption compared to the RG group (Table 1 and Fig. 4). As shown, the levels of adenine, uridine, thymine, and adenosine in the WG group were reduced at the 0–2, 2–4, and 4–6 h intervals (Fig. 4). Moreover, decreases in pseudouridine (Pseu) in the WG group were extended to 9–12 h when compared to the control group (Fig. 4). Likewise, the excretion of 7-methylguanine (m7Gua) in the WG consumers was reduced at 12 h after consumption (Fig. 4). The WG consumption also decreased the levels of N1-methyladenosine (m1A), N6-carbamoylthreonyladenosine (t6A), N2,N2-dimethylguanosine (m22G), and 3-methylcytidine, from 0–2 to 9–12 h (Table 1 and Fig. 4). Decreases in urate in the WG group were also found from 0–2 to 6–9 h after intake (Fig. 4). In addition to metabolites related to nucleosides, WG intake also inversely correlated with metabolites of some amino acids, such as tryptophan and proline (Table 1 and Fig. 5). Decreases in indolelactate in the WG group were seen across all intervals (Table 1 and Fig. 5). Decreased levels of trans-4-hydroxyproline in the WG consumers were observed from the starting interval (0–2 h) to the late interval (9–12 h) (Fig. 5). On the other hand, the WG group displayed significant increases in dopamine (DA) sulfate in the WG consumers in the early intervals (<9 h) after consumption (Fig. 5).
Metabolism of lipids and carbohydrates
The WG consumption displayed an increase in urinary azelaic acid (AzA) in 12 h after intake (Table 1 and Fig. 5). When the samples of the WG and RG groups were compared, a decrease in ribulose/xylulose in the WG consumers were also observed (Table 1 and Fig. 5).
Changes of exogenous metabolites
With regard to food-derived metabolites, significant increases in 3,5-DHBA, and HPAA sulfate in the WG group (Table 1 and Supplementary Fig. 1) were attributed to wheat phytochemicals ARs and BXs38,39. Other altered compounds linked to WG consumption are phenolic acids and related metabolites, such as ferulic acid (FA) sulfate, 1,2,3-benzenetriol sulfates degraded from gallic acid40, 3-methoxycatechol sulfate derived probably from 3-methylgallic acid40, 3-methylcatechol41, and 4-acetylphenol42 (Table 1 and Supplementary Fig. 1). Tartarate, derived from food preparation43, also exhibited significant increases in the WG group in relation to the RG group (Table 1 and Supplementary Fig. 1).
Targeted metabolomics: urinary metabolites of WG wheat phytochemicals
Identification of urinary metabolites of WG wheat phytochemicals
The identification of metabolites that are not included in mass spectral libraries is the current bottleneck in non-targeted LC-MS-based metabolomics studies2. Most of the metabolites originated from specific components in food are not commercially available and are unlikely to be included in the libraries of non-targeted metabolomics. These specific metabolites can be investigated by targeted approach based on the chemical properties of their known sources2. WG wheat is a rich source of various phytochemicals such as ARs, BXs, phytosteroids, lignans, and phenolic acids44. In the present study, a targeted LC-tandem MS in data-dependent acquisition was applied to explore the metabolites of these specific wheat phytochemcials. Urinary metabolites of phytochemicals were identified by comparison of their retention times and fragmentation patterns with standards or data in literature, eventually resulting in the discovery of 26 metabolites (Table 2). Among them, 20 metabolites were identified by targeted approaches only. These metabolites were chemically categorized into five groups: AR metabolites, BX derivatives, phenolic acid derivatives, phytosterol-related compounds, and metabolites of lignans. Particularly, three metabolites, 7-(3,5-dihydroxyphenyl)heptanoic acid (DHPHTA) sulfate (7), 3-hydroxy-N-(2-hydroxyphenyl)propanamide (HHPPA) sulfate (13), and glycochenodeoxychol-5-en-24-oic acid glucuronide (25) were identified as new metabolites (Table 2, and Fig. 6).
The urinary excretion of AR metabolites, 3,5-DHBA (1), 3,5-DHPPTA (2), and 3,5-DHBA glycine (3) (Table 2 and Fig. 7), was confirmed by comparing their retention times and MS/MS fragmentations with standards. The presence of AR metabolite 3,5-DHBA has also been granted by the non-targeted metabolomics approach above. Metabolites 4–6 were identified as 3,5-DHBA sulfate (4), 3,5-DHPPA sulfate (5), and 3,5-DHPPTA sulfate (6) (Table 2 and Fig. 7), as reported in a previous study22. Metabolite 7 had a molecular ion at m/z 317 (237+80) [M−H]− under negative ESI/MS mode, which was 28 units higher than that of 3,5-DHPPTA sulfate (6), indicating that two more –CH2 groups are present in the structure of metabolite 7 than in 6. Similar to 6, metabolite 7 had fragment ions at m/z 219 [M−sulfate−H2O−H]−, m/z 193 [M−sulfate−CO2−H]−, and m/z 177 [M−sulfate−HOAc−H]− (corresponding to a neutral loss of HOAc from aglycone through γ-H shift) in its MS3 spectrum (Fig. 6a). Further breakdown of the daughter ion at m/z 219 produced the major fragment at m/z 177 in MS3 spectrum of 7, which was formed by McLafferty rearrangement via a neutral loss of CH≡COH (Fig. 6a). All of these spectra features suggest that metabolite 7 is 7-(3,5-dihydroxyphenyl)heptanoic acid (DHPHTA) sulfate (Fig. 6a), a new metabolite of ARs. In addition, metabolite 8 had a major MS3 fragment ion at m/z 123 [M−sulfate−CH2OH−H]−, suggesting that 8 is 3,5-dihydroxyphenylethanol sulfate (Table 2), a known metabolite of WG rye in human45.
The structure of N-(2-hydroxyphenyl)acetamide (HPAA) (9) (Table 2 and Fig. 7) was confirmed by comparison of its retention time and MS/MS fragmentation pattern with those of the authentic standard. HPAA sulfate (10), 2-aminophenol sulfate (11), and HHPAA sulfate (12) (Table 2 and Fig. 7) have been reported as biomarkers of WG rye bread exposure45. Metabolite 13 had a molecular ion at m/z 260 (180+80) [M−H]−, which was 14 units higher than that of 12 (Table 2), indicating that one more –CH2 group appears in the structure of metabolite 13 than in metabolite 12. Similar to 12, 13 had daughter ions at m/z 162 [M−sulfate−H2O−H]−, m/z 150 [M−sulfate−HCHO−H]− (neutral loss of HCHO via γ-H shift), m/z 118 [M−sulfate−CH3CHO−H2O−H]− (formed by loss of CH3CHO via α-cleavage of amide followed by loss of H2O after cyclization of intermediate), and m/z 108 (corresponding to the breakdown of amide bond) in its MS3 spectrum (Fig. 6b). These spectra features demonstrated that metabolite 13 is 3-hydroxy-N-(2-hydroxyphenyl)propanamide (HHPPA) sulfate (Fig. 6b), a new derivative of BXs. Isopropyl-2-hydroxyphenylcarbamate (14) (Table 2 and Fig. 7), a metabolite of a herbicide46, was identified from wheat bread consumption for the first time in the present study.
Phenolic acid derivatives
All phenolic acids (15–21) in Table 2 were identified by comparing their MS/MS fragmentation data with those given in the literature4,28,47. The majority of phenolics metabolites detected in urine were the end products of bacterial fermentation in the colon, such as ferulic acid (FA) sulfate (15), caffeic acid sulfate (16), and vanillic acid sulfate (17) (Table 2 and Fig. 8)47. Homovanillic acid sulfate (18) could also be a result of bacterial bioconversions4. Urinary metabolites including dihydroferulic acid (DHFA) sulfate (19), feruloyglycine (20) and feruloyglycine sulfate (21) (Table 2 and Fig. 8), were associated with the metabolic conversion of dietary FA by microbial enzymes or endogenous enzymes in liver47. Phenolic acids, such as FA, caffeic acid, and vanillic acid, are predominantly found in the outer bran layer of wheat grains48. This could be an explanation of the higher levels of phenolics and related metabolites in the WG group than RG group.
Glycine conjugated bile acids, 22–25 (Table 2 and Fig. 8), were identified by comparison of their MS/MS fragmentations with data reported previously49,50. Metabolite 25 had a molecular ion at m/z 622 (446+176) [M−H]−, which was 176 units (corresponding to a glucuronide moiety) higher than that of glycochenodeoxychol-5-en-24-oic acid (23), indicating 25 is a glucuronide metabolite of 23. The MS3 fragmentation patterns of the major daughter ion at m/z 446 of 25 is the same as the MS2 spectra of metabolite 23 (Table 2), further supporting that metabolite 25 is glycochenodeoxychol-5-en-24-oic acid glucuronide, a new metabolite found in human urine. It was reported that plant sterols could be converted into bile acids such as cholic and chenodeoxycholic acids in the liver in animals and in men51,52. Bile acids are then secreted into the small intestine where they undergo several enzymatic bacterial transformations53, producing a large and conjugated hydrophilic bile acid pool in the host54. Thus, the host and gut microbiome appear to modify bile acids derived from sterols in wheat grains, that can potentially create novel conjugated bile acid profiles.
Metabolite 26 had a molecular ion at m/z 473 (297 + 176) [M−H]−, suggesting that it is a conjugation of glucuronide. Daughter ions at m/z 253, 189, 165 and 121 in MS3 spectra (Table 2) agreed that the aglycone is enterolactone55. Therefore, metabolite 26 was recognized as enterolactone glucuronide, which is known as the mammalian fermentation product of plant lignans56.
Kinetics of urinary metabolites of WG wheat phytochemicals
The urinary changes of specific dietary exposures in two groups were indicated by kinetic curves (Figs 7 and 8) as well as fold changes at each interval (Table 2). In this study, short-, medium-, and long-term refer to half-life (t1/2) < 6 h, 6 h < t1/2 < 12 h, and 12 h < t1/2, respectively. Similar to 3,5-DHBA (1), apparent increases in other AR metabolites 2–7 in the WG group at each interval were observed (Table 2 and Fig. 7a), indicating 3,5-DHBA along with other AR metabolites, 2–7, can be used as specific exposure biomarkers for WG wheat and rye intake. As indicated in Fig. 7a and Table 2, the urinary excretions of AR metabolites 1–7 were rapidly elevated after 2 h of administration and up to maximum around the period of 6–12 h, and thereafter decreased gradually. The levels of these metabolites were remarkably elevated in the WG group across all intervals when compared to the RG group. In particular, increases in metabolites 1–6 in the WG group were significantly different (p < 0.05) after the 0–2 h interval (Table 2). Moreover, the levels of metabolites 1–6 at the 12–24 h interval were still higher than those at 0–2 h (Fig. 7a), indicating the excretion of these metabolites may last longer than 24 h. All these observations featured AR metabolites 1–7 as medium- to long-term urinary biomarkers for WG wheat consumption.
In addition to HPAA sulfate (10), four more BX derivatives, HPAA (9), 2-aminophenol sulfate (11), HHPAA sulfate (12), and HHPPA sulfate (13) were identified in non-targeted metabolome analysis and exhibited comparable increases in the WG group at each interval, with particularly significant increases at interval of 6–9 h (p < 0.05) for all (Table 2 and Fig. 7b), suggesting that BX derivatives 9–13 can be used as alternative biomarkers to distinguish WG consumption over RG group. The kinetic curves indicate that BX derivatives 9–13 are excreted into urine and climbed up to maximum around 6–12 h after interventions (Fig. 7b), indicating that BX derivatives can be used as medium- to long-term urinary biomarkers to reflect WG wheat intake.
As for phenolic acids and their related compounds, elevated levels of FA sulfate (15), caffeic acid sulfate (16), DHFA sulfate (19), and feruloyglycine (20) and its sulfate (21) were observed at each interval in the WG group (Table 2 and Fig. 8a), suggesting that these phenolic acid metabolites may aid in discriminating WG wheat consumption from other food intake. Trajectory curves of phenolic acid derivatives showed that excretion of FA sulfate (15) and feruloyglycine sulfate (21) reached maximum concentrations in 4 h while caffeic acid sulfate (16), DHFA sulfate (19), and feruloyglycine (20) climbed up to maximum in 9 h (Supplementary Fig. 1 and Fig. 8a), indicating these metabolites may be short- to medium-term urinary biomarkers. Since phenoilic acids are abundant in the bran part of WG wheat45, these derivatives of phenolic acids may be used as biomarkers to distinguish WG consumers over RG group. However, due to non-specificities of phenolic acids in human food57, their derivatives can be an aid of AR metabolites and BX derivatives in measuring WG consumption in epidemiological studies.
Urinary excretion of phytosterol-related derivatives (22, 24, and 25) between WG and RG groups was not significantly different at each interval in this study, and moreover, trajectory curves of these derivatives (22–25) between two groups showed multiple points of intersection (Table 2 and Fig. 8b), despite the reports that glycine conjugated and unconjugated bile acids may be responsible for the discrimination between venous and arterial plasma samples after WG wheat intake in pigs49. Similarly, the lignan metabolite, enterolactone glucuronide (26), has been reported as a biomarker to distinguish WG rye consumption from RG group39. However, our data showed that the urinary excretion of enterolactone glucuronide in the WG group is not significantly different from the RG group (Table 2 and Fig. 8b).
As a consequence, we propose a panel of urinary biomarkers consisting of seven AR metabolites (1–7) and five BX derivatives (9–13) as more specific medium- to long-term biomarkers, as well as five phenolic acid derivatives (15, 16, 19–21), to estimate WG consumption (Supplementary Fig. 1; and Figs 7a,b and 8a). This combination may facilitate the objective monitoring of WG consumption for a longer period in the intervention studies.
In this study, a non-targeted metabolomics approach coupled with three OPLS analyses was used to investigate the metabolic difference between WG and RG samples. The results from the three analyses suggest that there was some overall difference between WG and RG samples when all of the time points were considered, but a statistically significant difference was only observed at 2–4 h and 4–6 h after the WG or RG consumption.
Endogenous differences determined by non-targeted metabolomics may reveal unexplored mechanisms responsible for the observed beneficial effects of WG consumption in relation to RG consumption in epidemiological studies, and thus provide better estimates of diet-disease associations4. Our results suggest that endogenous cysteine and related thiol compounds accumulate in the body after WG consumption (Table 1 and Fig. 3). Extracellular cysteine constitutes a major and independent component in antioxidant defense through cysteine/cystine redox cycle58. This thiol redox cycle may protect the body against oxidative stress that has been implicated in the etiology or the progression of a number of human diseases, such as cancers, CVD, and diabetes59. It has been revealed that cysteine/cystine redox signaling plays a vital role in preventing CVD onset or progression60. Recent studies have demonstrated lower blood sugar levels in normal subjects or diabetic patients after supplementation with cysteine or NAC61. Proteins containing oxidation-sensitive cysteines in tissue may prevent the development of tumorigenesis in colorectal cancer patients62. Therefore, WG wheat consumption may stimulate the antioxidant defense system to protect the body against the damage of oxidative stress when compared to RG consumption. However, additional studies are needed to further validate these findings.
A pool of endogenous metabolites related to nucleotides was extensively decreased in the WG consumers compared to the RG consumers (Table 1 and Fig. 4). Levels of urinary nucleotides, including adenine63, adenosine63,64, 3-methylcytidine64, uridine65,66, thymine67, Pseu65,68, m7Gua69, m1A65,66,68, t6A66, m22G65,66, and urate70, are reported to be significantly elevated in patients with various cancers. Increased urate can also be a risk factor for hypertension and CVD in patients71. Changes on metabolites of amino acids, such as tryptophan and proline (Table 1 and Fig. 5), are also related to risk of diseases. Indolelactate, a metabolite of tryptophan72, reportedly plays an important role in the progression of cancers and CVD73. The level of trans-4-hydroxyproline in cancer patients was significantly higher than those in healthy subjects74. Dopamine in clinical use has been proposed to have a role in the treatment of gastric cancer75. On the other hand, WG consumption may also produce health effects through lipid and carbohydrate metabolism (Table 1 and Fig. 5). AzA exerted antitumor properties76. Serum xylulose was reported to correlate to diabetics patients77. These results indicate that consumption of WG wheat may attenuate risk factors of cancers, CVD, and diabetes, compared to those in the RG group. However, the mechanisms underlying the protective effects of WG wheat consumption are still a future research subject.
Kinetics studies of metabolites provide detailed information of kinetic parameters, and are a better approach to validate novel metabolites as potential biomarkers for dietary exposure than those using single spot or 24-h samples in epidemiological studies. However, all of the previous metabolomics studies on WG wheat/rye investigate biomarkers of dietary exposure using merely single spot or 24-h samples39,78. This is the first study to identify the appropriate biomarkers as a panel to estimate WG wheat consumption using kinetics parameters. In the present study, the most discriminative food-derived metabolites that linked to WG wheat consumption mainly originated from ARs, BXs, and phenolic acids (Tables 1 and 2). Wheat ARs have been revealed to play an important role in the prevention of cancers79, diabetes80, and CVD80. Similarly, cereal BXs showed cytotoxic effects81, and could also significantly decrease lipid accumulation in 3T3-L1 preadipocytes82. In addition to the direct radical scavenging activity83,84, simple phenolic acids, such as ferulic acid, caffeic acid, gallic acid, and cinnamic acid, possess antidiabetic properties85. Consumption of products rich in phenolic acids correlates with a reduced risk of CVD86. Thus, the health-promoting effects of WG wheat consumption observed in epidemiological studies may be attributed in part to the unique phytochemical contents of ARs, BXs, and phenolic acids in this whole grain.
The fingerprinting of small molecules by non-targeted metabolomics approaches heavily relies on public databases such as HMDB, Metlin, and MassBank, and metabolites originating from specific food components are unlikely to be included in those databases. Therefore, non-targeted metabolomics approaches have limitations regarding the discovery of novel metabolites that derive from specific food components in biofluids, which could be more specific and sensitive exposure biomarkers. In contrast, targeted metabolomics is a useful tool to investigate the metabolism of specific food components using non-commercially available standards, such as synthetic products in lab and naturally occurring isolates. Therefore, a joint use of non-targeted and targeted metabolomics approaches can compensate for each other, and expedite the discovery of better biomarkers to measure WG consumption and to better predict the correlation between WG intake and the risk of chronic diseases. The biomarkers identified in this study will be used to create a library for the development of automated quantification methods to quantify these metabolites in large interventional or observational clinical studies in order to further validate the usefulness of these biomarkers. In addition, it is worthwhile to further study the mechanisms by which WG wheat intake affects the levels of these endogenous metabolites.
The study design was published elsewhere22. In brief, twelve healthy participants had a three-day washout period to avoid foods containing phenolic compounds and all cereal bran-related products before the study and throughout the study for 5d. After the initial 3-d washout period, all participants ingested a single dose (196 g, fresh) of RG wheat bread plus 21 g of butter, and 30 min were allotted for the ingestion of the RG bread and butter. Urine samples were collected for the following 24 h in six intervals: 0–2, 2–4, 4–6, 6–9, 9–12, and 12–24 h. The following day, the same participants ingested a single dose (208 g, fresh weight) of WG wheat bread plus 21 g of butter in 30 min. In a fashion similar to RG ingestion, urine samples at six intervals in the following 24 h after WG consumption were collected. Hydrochloric acid was added to urine samples (0.4% of total urine volume). Samples were then portioned into aliquots and stored at −80 °C until analysis. Participants provided informed consent, and the study procedures were approved by the institutional review board at North Carolina Agricultural and Technical State University, Greensboro, NC (IRB#:11–0026). The experiment was conducted and data was collected according to the approved procedures at North Carolina Research Campus in Kannapolis, NC. The pharmacokinetics of several AR metabolites from these samples has been determined in our previous studies22. Based on our previous data, three subjects were excluded for this metabolomics study due to either high levels of AR metabolites in the RG samples or in the WG samples.
Non-targeted Metabolomics Platform
Sample preparation was carried out at Metabolon Inc. (Durham, NC), as described previously87,88. In brief, several recovery standards were added prior to the first step in the extraction process for quality control (QC) purposes. To remove protein, dissociate small molecules bound to protein or trapped in the precipitated protein matrix, and to recover chemically diverse metabolites, proteins were precipitated with methanol under vigorous shaking for 2 min (Glen Mills Genogrinder 2000) followed by centrifugation.
Three types of controls were analyzed in concert with the experimental samples: a pooled matrix sample generated by taking a small volume of each experimental sample as a technical replicate; extracted water samples as process blanks; and a cocktail of standards spiked into every analyzed sample for instrument performance monitoring. These QC samples and standards were described in Supplementary Tables 2 and 3. Instrument variability was determined by calculating the median relative standard deviation (RSD) for the standards that were added to each sample prior to injection into the mass spectrometers. Overall process variability was determined by calculating the median RSD for all endogenous metabolites present in 100% of the urine technical replicates. Experimental samples were randomized across the platform run, as outlined in Supplementary Fig. 3.
The ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) platform utilized a Waters Acquity UPLC and a Thermo Scientific Q-Exactive high resolution/accurate mass spectrometer interfaced with a heated electrospray ionization (HESI-II) source and Orbitrap mass analyzer operated at 35,000 mass resolution. The sample extract was dried then reconstituted in acidic or basic LC-compatible solvents, which contained a series of standards at fixed concentrations to ensure injection and chromatographic consistency. Two aliquots were analyzed under acidic, positive ion-optimized conditions and a third one under basic, negative ion-optimized conditions, using a Waters UPLC BEH C18 column (2.1 × 100 mm, 1.7 μm). Two extracts reconstituted in acidic conditions were gradient eluted using water and MeOH containing 0.05% perfluoropentanoic acid (PFPA) and 0.1% formic acid (FA) for hydrophilic compounds and MeOH, MeCN, and water containing 0.05% PFPA and 0.01% FA for hydrophobic compounds, respectively, whereas the basic extracts used water and MeOH containing 6.5 mM ammonium bicarbonate at pH 8. A fourth aliquot was analyzed via negative ionization following elution from a HILIC column (Waters UPLC BEH Amide 2.1 × 150 mm, 1.7 μm) using a gradient consisting of H2O and MeCN with 10 mM ammonium formate at pH 10.8. The MS analysis alternated between MS and data-dependent MS/MS scans using dynamic exclusion, and the scan range varied from 70 to 1000 m/z.
Data extraction and compound identification
Raw data was extracted, peak-identified, and QC processed. Metabolites were identified by automated comparison to library entries of purified standards. Biochemical identifications were based on three criteria: retention index (RI) within a narrow RI window of the proposed identification, accurate mass match to the library +/− 10 ppm, and the MS/MS forward and reverse scores between the experimental data and authentic standards. The MS/MS scores are based on a comparison of the ions present in the experimental spectrum to the ions present in the library spectrum.
Metabolite quantification and data normalization
Peaks were quantified using area-under-the-curve (AUC). For studies spanning multiple days, a data normalization step was performed to correct variation resulting from instrument inter-day tuning differences. Essentially, each compound was corrected in run-day blocks by registering the medians to equal one (1.00) and normalizing each data point proportionately (Supplementary Fig. 4). For studies that did not require more than one day of analysis, no normalization is necessary, other than data visualization. In all instances, biochemical data have been normalized to osmolality (milliosmoles of biochemicals per kg of urine, mOsm/kg) to account for differences in metabolite levels due to differences in the amount of material present in each sample.
Targeted Metabolomics Platform
Urine samples were prepared according to our previous method with slight modifications22. Briefly, 50 μL urine samples from each interval were diluted 1:1 with methanol to precipitate proteins. The mixture was vortexed for 1 min, and then centrifuged at 17,000 × g for 10 min. 50 μL of the supernatant was transferred into vials for LC-MS analysis.
LC-MS analysis was carried out with a Thermo-Finnigan Spectra System that consists of a Dionex XRS pump, a Dionex XRS open autosampler, and a LTQ Velos Pro ion trap mass detector (Thermo Electron, San Jose, CA, USA) incorporated with an electrospray ionization (ESI) interface. A Gemini C18 column (150 mm × 4.6 mm i.d., 5 μm, Phenomenex) was used to analyze metabolites with a flow rate of 0.3 mL/min. The binary mobile phase system consists of 5% aqueous methanol with 0.1% formic acid (FA) as phase A and methanol with 0.1% FA as phase B. The column was eluted by a gradient progress (5% B from 0 to 7 min; 5 to 30% B from 7 to 15 min; 30–50% B from 15 to 25 min; 50–80% B from 25 to 35 min; 80–100% B from 35 to 45 min; 100% B from 45 to 55 min, and then 5% B from 55 to 60 min). The injection volume was 10 μL for each sample. The LC elute was introduced into the ESI interface. The negative ion polarity mode was set for an ESI ion source with the voltage on the ESI interface maintained at approximate 3.6 kV. Nitrogen gas was used as the sheath gas at a flow rate of 3 AU and the auxiliary gas at 10 AU, respectively. Optimized parameters include ESI interface temperature, capillary voltage, ion spray voltage, sheath gas flow rate, tube lens offset voltage, and ion optics settings.
The data-dependent experiment was set up with the following scan events: 1) scan event 1 to collect the full MS spectrum of all the ions in the sample; 2) scan event 2 to collect the MS/MS spectra of the most intense ion from the MS spectrum in scan event 1; and 3) scan event 3 collect the MS/MS spectra of the second intense ion from the MS spectrum in scan event 1. In the dynamic exclusion setting, the repeat count for each ion was 5, the repeat duration was 30 s, the exclusion list size was 50, and the exclusion duration was 180 s. The collision-induced dissociation was conducted with an isolation width of 1.0 Da and normalized collision energy of 35 for MS/MS analysis. The mass range was measured from 50 to 800 m/z. Data acquisition was performed with Xcalibur version 2.0 (Thermo Electron, San Jose, CA).
Identification and quantification of metabolites and data normalization
Metabolites were identified by either directly comparing the ion features in the urine samples to authentic standards in our lab or interpreting MS/MS fragmentation patterns of ions in the urines according to published data in literature. Targeted peaks were quantified using AUC. Biochemical data was normalized to the same osmolality (mOsm/kg) as used in non-targeted metabolomics analysis.
Data are expressed as mean ± standard error of measurement (SEM) unless otherwise stated. P < 0.05 was considered to indicate significance. Fold changes across intervals were calculated by dividing the mean of the normalized intensity in the WG group by the mean scaled intensity of the same compound in the RG consumption. For the non-targeted metabolomics data, raw area counts for each metabolite in each sample were normalized to osmolality. Missing values were imputed with the observed minimum after normalization. Following log transformation of the normalized data, a 2 (treatments) × 6 (time points) repeated measures ANOVA was performed to identify metabolites exhibiting significant treatment × time interaction effect. Post hoc tests were performed to compare WG and RG groups at each interval. An estimate of the false discovery rate (q-value) was calculated to take into account the multiple comparisons. Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA), OPLS-effect projections (OPLS-EP), and OPLS-DA based batch processing analysis in SIMCA (Umetrics, Umeå, Sweden) were used to detect a panel of metabolites that best distinguishes RG samples and WG samples. The default 7-round cross-validation in the SIMCA software package was applied with 1/7 of the samples being left out from the mathematical model in each round. Permutation-based validation was used to prevent overfitting. Variable Influence on Projection (VIP) score was calculated for each metabolite based on OPLS weight and variability explained in OPLS-EP. Metabolites with VIP >1.5 were considered important metabolites in separating WG and RG groups.
Regarding targeted metabolic profiling, raw area counts for each metabolite in each sample were normalized to the same osmolality as used in non-targeted analysis. The same repeated measures ANOVA model described above was also used to analyze the targeted metabolite data.
How to cite this article: Zhu, Y. et al. Urinary Biomarkers of Whole Grain Wheat Intake Identified by Non-targeted and Targeted Metabolomics Approaches. Sci. Rep. 6, 36278; doi: 10.1038/srep36278 (2016).
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was partially supported by USDA-NIFA grants 2012–38821–20012 and 2014–67018–21751 to S. S. The authors wish to thank Mr. Aaron Yerke who assisted in the editing of the manuscript.