Introduction

Adverse blood pressure (BP) (prehypertension and hypertension) is a major independent risk factor for epidemic cardiovascular diseases (CVDs), affecting ~40% of the adult population worldwide.1, 2, 3, 4 Its causation is multifactorial, encompassing both environmental—diet and other aspects of lifestyle—and genetic factors. Public health policies aiming to improve the prevention of high BP and/or maintain optimal BP levels typically involve efforts to tackle known modifiable risk factors such as reduction of high salt intake, moderation of alcohol intake, maintenance of normal weight and increased physical activity. Although large-scale genome-wide association studies have identified common variants, novel loci and pathways associated with BP,5, 6, 7, 8, 9 the genetic contribution to BP variation in the population is modest.5, 7, 8 Despite the effectiveness of non-pharmacological approaches to BP control, many people with high BP are reliant on antihypertensive drugs, or their BP remains untreated or poorly controlled.10, 11

The INTERnational study of MAcro/micronutrients and blood Pressure (INTERMAP) is a cross-sectional epidemiological study of 4680 men and women aged 40–59 years from Japan, the People's Republic of China, the United Kingdom and the United States.12 The main aim of the INTERMAP Study is to investigate the etiology of adverse BP with an emphasis on studying diet–BP associations (Figure 1).12, 13 Participants were selected randomly from population lists, stratified by age/sex. Each participant attended four visits: visits 1 and 2 on consecutive days and visits 3 and 4 on consecutive days, on average 3 weeks later. At each of the four visits, BP was measured twice with a random zero sphygmomanometer, and dietary data were collected by a trained certified interviewer with use of the in-depth multipass 24-h recall method.14 Height and weight were measured twice on the first and third visits. At the first visit, questionnaire data were obtained on demographic factors, medical history, medication intake and other possible confounders. Each participant provided two 24-h urine collections, with the beginning and end timed at the research center. The two timed 24-h urine collections from each of the 4680 INTERMAP participants are a pivotal feature of the study design that enabled the introduction of the metabolome-wide association (MWA) study (see later section). The study received institutional ethics committee approval for each site, and all participants gave written informed consent.

Figure 1

INTERnational study of MAcro/micronutrients and blood Pressure (INTERMAP) metabolome-wide association (MWA) studies. A full color version of this figure is available at the Hypertension Research journal online.12, 13

In the past decade, the INTERMAP Study has added new knowledge to the limited data previously available on the effects of nutritional factors on BP.15 These advances include the inverse relationship between BP and intakes of vegetable protein,16 glutamic acid,17 total and insoluble fiber,18 raw and cooked vegetables,19 total polyunsaturated fatty acids and linoleic acid,20 oleic acid from vegetable sources,21 total omega-3 fatty acids and linolenic acid,22 phosphorus (P), calcium (Ca) and magnesium (Mg),23 non-heme iron (Fe) and total Fe,24 and starch25 and the direct associations of sugars (fructose, glucose and sucrose),26 cholesterol,27 glycine and alanine,28 raw fruits29 and oleic acid from animal sources.21 The INTERMAP Study showed that diet-induced metabolic acidosis was positively associated with BP (not significant after controlling for body mass index, BMI),30 whereas a cohort of over 61 000 persons reported a positive association between the metabolic acid load (such as serum bicarbonate, urine acidity) and cardiovascular mortality.31 While the INTERMAP Study reported a small nonsignificant inverse relationship between urinary Mg and BP,32 the World Health Organization-coordinated Cardiovascular Diseases and Alimentary Comparison (CARDIAC) Study showed that urinary Mg-to-creatinine ratio was inversely associated with CVD risk factors such as BMI and BP.33 Although important advances have been made by the INTERMAP Study in improving understanding of the etiology of high BP, together with other research into the physiology of BP regulation such as the control of kidney fluid and salt balance via the renin–angiotensin–aldosterone system,34, 35, 36 sympathetic nervous system activity37, 38 and the role of the structure and function of blood vessels,39 gaps remain in our knowledge of the causes and mechanisms of adverse BP levels. New approaches are needed to enhance the understanding of the multifactorial etiopathogenesis of BP.

High-resolution proton nuclear magnetic resonance (NMR) spectroscopy has been successfully applied for the investigation of drug metabolism and toxicology, as well as disease development within a biological system, using biological fluids such as urine, plasma/serum, bile, cerebrospinal fluids and dialysates.40, 41 Recent studies have shown the importance of the gut microbiome in the etiology of a number of chronic diseases such as atherosclerosis, diabetes and the metabolic syndrome, obesity and elevated BP.42, 43, 44, 45, 46, 47 Metabolic phenotyping of biological fluids using spectroscopic methods enables the investigation of gene–environmental–gut microbiome interactions on disease risk and is thus an attractive approach for gaining new insights into BP mechanisms and its associated pathways.45, 48, 49, 50 The INTERMAP Study has capitalized on the evolving technologies in metabolic phenotyping to enhance its rich nutrient and anthropometric data by incorporating this top-down system approach to investigate the association of BP and urinary markers that are linked to environmental exposures, including diets and xenometabolomes.48, 49, 50, 51, 52 This review demonstrates the key progress made by the INTERMAP Study and the introduction of MWA studies on diet and BP.13 However, readers may wish to refer to other recent reviews on the principles of metabolic phenotyping.53, 54, 55

Analytical method development

Metabolic phenotyping involves the application of high-throughput advanced spectroscopic methods, such as proton nuclear magnetic resonance (1H NMR) spectroscopy and mass spectrometry (MS), to biological samples. In INTERMAP, comprehensive metabolic phenotyping by 1H NMR spectroscopy of the two 24-h urine samples from each participant (N=4630) has been performed.13 The repeatability, accuracy and stability of the 1H NMR spectral profiles were evaluated, and the overall analytical reproducibility of the 1H NMR was found to be >98%.56 In INTERMAP, boric acid (borate, a preservative known to bind covalently to vicinal diols and some amino acids) was added to the urine collection jars in the field to prevent bacterial overgrowth. The effect of the boric acid on the 1H NMR urinary spectra was also assessed.37, 38 It was shown that the overall changes in the urinary 1H NMR spectral profile caused by borate addition were negligible compared with the physiological and metabolic differences between individuals.57 These studies have led to recommendations for improved sample preparation and handling of urine samples, as well as processing methods for large-scale epidemiologic research.58

In addition to metabolic phenotyping via 1H NMR spectroscopy, we obtained extensive data on 20 urinary amino acids using conventional cation-exchange chromatography followed by postcolumn derivatization (Biochrom 20 and Biochrom 30). These data provide a unique population-based data set on urinary amino-acid excretion levels in different populations. We then applied gas chromatography mass spectrometry (GC-MS) and liquid chromatography-tandem mass spectrometry (LC-MS/MS) for high-throughput urinary amino-acid analysis and compared their sample preparation, run time, number of analytes amenable to quantification, cost, limit of quantification, reproducibility and validity with conventional amino-acid analysis.59 The amounts of urine needed for GC-MS and LC-MS/MS were 40–50 μl, much less than the 200 μl required for the amino-acid analyzer. Moreover, the run time for the amino-acid analyzer was ~5 to 6 times longer than that of GC-MS and LC-MS/MS. The Pearson's correlation coefficients of amino acids measured by GC-MS and the amino-acid analyzer ranged from 0.80 (tryptophan) to 0.98 (glycine); correlation coefficients comparing LC-MS/MS with the amino-acid analyzer ranged from 0.56 (arginine) to 0.95 (lysine). Our findings showed that GC-MS offered higher reproducibility and completely automated sample pre-treatment, whereas LC-MS/MS and conventional amino-acid analysis covered more amino acids and related amines.

We also applied ultra-performance liquid chromatography-triple quadrupole-tandem mass spectrometry for simultaneous detection (both positive and negative electrospray ionization modes) and quantification of three gut microbial cometabolites, phenylacetylglutamine, 4-cresyl sulfate and hippurate in the urinary specimens from 2000 US INTERMAP participants.60 This targeted high-throughput ultra-performance liquid chromatography-triple quadrupole-tandem mass spectrometry method was developed specifically to measure these gut microbial cometabolites, which may be implicated in obesity61 and other chronic diseases such as cardiovascular and kidney diseases.62, 63 Following the US Food and Drug Administration guidelines, the imprecision (coefficient of variation) and inaccuracy (recovery) of the method were assessed using replicates of quality control urine samples at different concentration levels. The coefficient of variation and recovery were found to be within the acceptable limits of the Food and Drug Administration guidelines. The study demonstrated the applicability of metabolic phenotyping by ultra-performance liquid chromatography-triple quadrupole-tandem mass spectrometry in a large-scale epidemiological study.
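
The two validation quantities named above can be illustrated with a minimal sketch. It assumes the usual definitions: coefficient of variation as the sample standard deviation divided by the mean, and recovery as the mean measured value divided by the nominal value. The replicate values, the nominal concentration and the function names are hypothetical and are not taken from the INTERMAP assay.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Imprecision: sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def recovery_percent(values, nominal):
    """Inaccuracy: mean measured value as a percentage of the nominal value."""
    return 100.0 * mean(values) / nominal


# Hypothetical replicate measurements of one quality control level (arbitrary units).
qc_replicates = [9.8, 10.1, 10.3, 9.9, 10.4]
nominal = 10.0  # hypothetical nominal concentration of the QC sample

cv = coefficient_of_variation(qc_replicates)
rec = recovery_percent(qc_replicates, nominal)
print(f"CV = {cv:.1f}%, recovery = {rec:.1f}%")
```

In practice the same two numbers would be computed separately for each concentration level and each analyte before being compared with the acceptance limits of the applicable guideline.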

Statistical method development

Metabolomic data pose special challenges for statistical analysis, including high dimensionality, strong colinearity, nonlinear and highly complex spectral profiles, and the presence of structured and unstructured noise (due to within- and between-individual variability).64, 65, 66, 67, 68 Spectroscopic data are first preprocessed, including spectral alignment and normalization of the full-resolution spectral data. Multiple statistical strategies are then applied, including the use of both unsupervised and supervised multivariate data analysis techniques to extract information from the data.69 Principal component analysis is routinely used to identify the main sources of variation in the data and detect outlying values with the goal of providing the most compact representation of the data.64, 67, 68, 69 Hierarchical cluster analysis is another method that is widely used in exploratory data analysis to provide an overview of the data by grouping phenotypes according to their similarity, without assuming any prior knowledge of the data.64, 68, 69 Other techniques, including partial least-squares (PLS), orthogonal partial least-squares (OPLS) and orthogonal partial least-squares discriminant analysis (OPLS-DA),70, 71, 72 aim to extract discriminatory metabolic signals from the data sets. They are often applied after the initial exploratory analysis, whereas statistical spectroscopy methods that accommodate the complex spectral data structure and correlations, such as statistical total correlation spectroscopy,73, 74 iterative-statistical total correlation spectroscopy,75 cluster analysis statistical spectroscopy76 and subset optimization by reference matching,77 are used to aid in the biomarker identification process. Statistical heterospectroscopy is a statistical strategy for the analysis of multiple spectroscopic data sets, for example, 1H NMR and ultra-performance liquid chromatography-mass spectrometry on the same samples. This method has enabled the characterization of drug metabolites (xenometabolome) in an epidemiological study.78 Other recent statistical spectroscopic methods include statistical homogeneous cluster spectroscopy79 and automatic spectroscopic data categorization by clustering analysis.80 These methods aim to remove the influence of irrelevant interferences within the data set to enhance the biomarker selection process79 and to differentiate reliably potential discriminatory markers from non-discriminatory markers in a biological data set,80 respectively. Figure 2 shows the steps of statistical analysis of INTERMAP NMR metabolic phenotyping data.13, 81, 82
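
The core idea behind statistical total correlation spectroscopy can be sketched in a few lines: the intensity at one chosen "driver" spectral variable is correlated, across samples, with the intensity at every other variable, so that signals arising from the same molecule stand out through their high correlation. The sketch below is a simplified illustration under that assumption, not the published implementation. The toy spectra and the function names `pearson` and `stocsy_1d` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def stocsy_1d(spectra, driver):
    """Correlate the driver variable with every spectral variable across samples."""
    columns = list(zip(*spectra))  # one tuple of intensities per spectral variable
    return [pearson(columns[driver], col) for col in columns]


# Toy data (hypothetical): 5 samples x 4 spectral variables.
# Variable 2 is always half of variable 0, as if both came from one metabolite.
spectra = [
    [1.0, 5.0, 0.5, 2.0],
    [2.0, 3.0, 1.0, 2.1],
    [3.0, 4.0, 1.5, 1.9],
    [4.0, 1.0, 2.0, 2.2],
    [5.0, 2.0, 2.5, 2.0],
]

corr = stocsy_1d(spectra, driver=0)
print([round(c, 2) for c in corr])
```

Variables that track the driver closely (here variable 2) are candidates for belonging to the same compound, which is how this family of methods supports structural assignment. A real analysis would operate on thousands of aligned, normalized spectral variables and would need to handle constant columns, which this sketch does not.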

Figure 2

Steps involved in the statistical analysis of INTERnational study of MAcro/micronutrients and blood Pressure (INTERMAP) nuclear magnetic resonance (NMR) metabolic phenotyping data. A full color version of this figure is available at the Hypertension Research journal online.13, 81, 82

In MWA studies, hundreds to thousands of biomarkers are assayed, leading to data that are highly multivariate and colinear. To detect statistically significant relationships between molecular variables and phenotype, we defined the metabolome-wide significance level, a threshold required to control the family-wise error rate through a permutation approach.83 Using the spectra of the INTERMAP Chinese participants (N=836) as the reference population, we investigated the influence of spectral resolution and the number of variables in the NMR spectra, and we examined population heterogeneity by repeating the analysis in the INTERMAP US population samples (N=2164). The results showed that metabolome-wide significance levels of 2 × 10−5 and 4 × 10−6 for family-wise error rates of 0.05 and 0.01, respectively, could be used as benchmarks for NMR-based MWA studies of human urine. For the subsequent INTERMAP MWA studies, a metabolome-wide significance level of 4.0 × 10−6 is used to identify significant metabolic features as a conservative estimate taking into account the high degree of colinearity in urinary NMR spectral data.81, 82 A similar approach may be used for MS-based MWA studies.
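
A permutation-based threshold of this kind can be sketched as follows. This is a generic minimum-p-value permutation procedure, written under stated assumptions rather than as the procedure of the cited work: the phenotype is shuffled to break any true association, the smallest p-value across all variables is recorded for each shuffle, and the threshold is taken as the alpha quantile of those minima. The p-values use a Fisher z normal approximation, the data are simulated, and all function names are hypothetical.

```python
import random
from math import atanh, sqrt
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def p_value(r, n):
    """Two-sided p-value for a correlation (Fisher z, normal approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def permutation_mwsl(columns, phenotype, alpha=0.05, n_perm=200, seed=0):
    """Alpha quantile of the minimum p-value under a permuted phenotype."""
    rng = random.Random(seed)
    n = len(phenotype)
    min_ps = []
    for _ in range(n_perm):
        shuffled = list(phenotype)
        rng.shuffle(shuffled)  # breaks any real variable-phenotype association
        min_ps.append(min(p_value(pearson(col, shuffled), n) for col in columns))
    min_ps.sort()
    k = max(int(round(alpha * n_perm)), 1)
    return min_ps[k - 1]


# Simulated colinear data (hypothetical): 30 variables driven by 3 latent factors.
rng = random.Random(1)
n_samples, n_vars = 40, 30
latent = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(3)]
columns = [
    [latent[j % 3][i] + 0.5 * rng.gauss(0, 1) for i in range(n_samples)]
    for j in range(n_vars)
]
phenotype = [rng.gauss(0, 1) for _ in range(n_samples)]

threshold = permutation_mwsl(columns, phenotype)
print(f"permutation threshold at FWER 0.05: {threshold:.4g}")
```

Because the permutations preserve the correlation among variables, the resulting threshold reflects colinearity in a way that a plain Bonferroni correction does not. As a point of arithmetic, dividing each family-wise error rate by the corresponding benchmark reported above, 0.05/(2 × 10−5) and 0.01/(4 × 10−6), gives 2500 in both cases, which can be read as a Bonferroni-equivalent number of independent tests.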

Novel biomarker discovery

Identification of unknown discriminatory metabolites is a key bottleneck in MWA studies as the human metabolome is still largely unknown. Elucidating the chemical structure of unknown metabolites is often labor-intensive and requires a series of analytical experiments. Within INTERMAP, we have identified a number of metabolites, including metabolites derived from dietary and drug intake.84, 85, 86, 87

Ethyl glucoside

The INTERMAP MWA study has discovered novel metabolites related to the intake of alcohol among the Chinese and Japanese population samples. From the 1H NMR urinary spectra, a doublet at δ4.93 corresponding to ethyl glucoside was observed, and it was derived following the ingestion of rice wine and sake.84 This doublet was not observed among the western population samples in the INTERMAP Study.

Proline betaine

We used metabolic phenotyping by 1H NMR spectroscopy to detect an increased excretion of proline betaine, tartaric acid and hippurate after fruit consumption compared with baseline diet in a dietary intervention study.85 We then measured the concentrations of proline betaine in selected fruit and commercially available fruit juices by 1H NMR spectroscopy optimized for quantification of this compound. All citrus fruit tested contained proline betaine; concentrations varied from 75 mg l−1 in orange squash to 1316 mg l−1 in orange juice from concentrate. After the consumption of 250 ml orange juice, we found a singlet peak at δ3.11, representing the CH3 moiety of proline betaine. Most proline betaine excretion occurred in the first 14 h after consumption, peaked at the 2-h postintervention urine collection and declined to almost baseline levels after 24 h.

The 24-h dietary recalls of INTERMAP UK participants were then assessed to validate the use of proline betaine excretion as a biomarker of citrus fruit consumption and as a surrogate marker of healthier eating patterns.85 Proline betaine excretion differed significantly between individuals with no recorded citrus consumption in their 24-h dietary recall data and individuals with recorded citrus consumption (P<0.0001). Those who reported citrus consumption and had higher levels of proline betaine excretion also showed a healthier nutrient profile, with higher intake of vegetable protein; lower intakes of total fat, trans fatty acids, cholesterol and animal protein; and lower urinary sodium–potassium (Na/K) ratio, BMI and systolic BP compared with non-citrus consumers. These findings provide proof that metabolic phenotyping can discover novel dietary biomarkers that can be used to validate dietary assessment in large-scale epidemiologic data sets.

Acetaminophen and ibuprofen

Although the metabolism of commonly used analgesics such as acetaminophen and ibuprofen has been widely described,40, 41, 88, 89 within the INTERMAP Study, we showed that we could detect urinary metabolite signatures related to commonly used analgesics such as acetaminophen and ibuprofen,86 enabling the use of metabolic signatures to verify the self-reported data.87 We applied principal component analysis to the 1H NMR spectra of US participants and identified 413 urine samples containing acetaminophen and its metabolites (acetaminophen users) and then applied OPLS-DA analysis to a subset of 70 urine samples from acetaminophen users and 70 urine samples from acetaminophen non-users.86 The OPLS-DA loading coefficient plot showed that differentiation between acetaminophen users vs. non-users primarily resulted from the presence of acetaminophen and its metabolites acetaminophen glucuronide, acetaminophen sulfate and the acetylcysteine conjugate of acetaminophen. Similarly, a principal component analysis model was constructed to identify urine samples of ibuprofen users, and OPLS-DA was performed on a subset of urine samples. The OPLS-DA loading coefficient plot showed that participants who had ingested ibuprofen were differentiated from non-users by the presence of 2-hydroxy, carboxy and glucuronide conjugates of ibuprofen in the urinary NMR spectra.

We then used these metabolic signatures to verify the self-reported analgesic use of INTERMAP participants.87 Urinary spectra of UK (Belfast, N=216) and US (Chicago, N=280) participants were inspected for the presence or absence of metabolites in spectral regions containing acetaminophen and ibuprofen.87 These spectra were used to construct prediction models (sensitivity >98%) based on self-reported analgesic use. The overall rates of concordance between questionnaire data and urinary spectra were high for both populations: 83.8% (95% confidence interval: 78.9, 88.7) in Belfast and 81.1% (95% confidence interval: 76.5, 85.7) in Chicago. Overall rates of underdetection of acetaminophen and ibuprofen were low (~1%) and were comparable for both Belfast and Chicago. We then applied these prediction models to 9260 urine spectra to evaluate reported analgesic use from a self-report questionnaire. High-level concordance was observed between self-reported analgesic use and 1H NMR-detected urinary acetaminophen and/or ibuprofen metabolites for all Western population samples, an overall concordance of 70.5% (95% confidence interval: 68.7, 72.2). Our findings demonstrated the efficacy of an objective 1H NMR-based method for validation of self-reported data on analgesic use, detecting an underreporting rate of ~15% in the INTERMAP Study. This MWA approach has demonstrated the potential of metabolic phenotyping in reducing recall bias and other biases in epidemiologic studies for a range of substances, including pharmaceuticals, dietary supplements and foods.
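
As a worked check of the two site-level intervals, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion from the reported concordance rate and sample size. The choice of the Wald interval is an assumption: the text does not state how the intervals were derived. The function name `wald_ci` is hypothetical.

```python
from math import sqrt


def wald_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion p observed in n units."""
    half_width = z * sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


# Concordance rates and sample sizes as reported in the text.
sites = [("Belfast", 83.8, 216), ("Chicago", 81.1, 280)]

results = {}
for site, percent, n in sites:
    lo, hi = wald_ci(percent / 100.0, n)
    results[site] = (round(100 * lo, 1), round(100 * hi, 1))
    print(f"{site}: {percent}% ({results[site][0]}, {results[site][1]})")
# prints:
# Belfast: 83.8% (78.9, 88.7)
# Chicago: 81.1% (76.5, 85.7)
```

Under this assumption the sketch reproduces the intervals quoted for Belfast and Chicago, which is consistent with, though not proof of, a normal approximation having been used.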

INTERMAP MWA study

INTERMAP is the first large-scale human population MWA study on diet and BP, using an exploratory analytical approach to investigate metabolic phenotype variation across and within 17 population samples in the East (China and Japan) and West (United Kingdom and United States) based on 1H NMR spectroscopy.13 Using a hierarchical clustering algorithm, we investigated the similarity/dissimilarity between populations based on their urinary profiles. East Asian and Western populations had well-differentiated metabolic phenotypes (Figure 3). Among the East Asian samples, Japan was differentiated from China and within China, North China was differentiated from the South despite a similar genetic background. Using OPLS-DA,70, 73 we reported significant differences among the metabolic profiles of these populations; discriminatory metabolites included gut microbial–host cometabolites (hippurate, phenylacetylglutamine and methylamines), amino acids (alanine, lysine and taurine), dietary-related metabolites (e.g., ethyl glucoside and trimethylamine-N-oxide), compounds related to energy metabolism (acetylcarnitine) and tricarboxylic acid cycle intermediates (succinate and citrate). Four discriminatory metabolites reflecting diet and gut microbial activities—alanine, formate, hippurate and N-methylnicotinate—were then quantified from the 1H NMR urinary spectral profiles. We found that alanine was highly correlated with 2-oxoglutarate (metabolic linkage via glutamate-pyruvate transaminase activity) and with formate (pyruvate/Co-A metabolism), and hippurate was highly correlated with N-methylnicotinate (renal transporter/secretion mechanisms). In multiple linear regression models, both formate and hippurate were inversely associated with systolic and diastolic BP, and alanine was positively associated with BP.

Figure 3

Hierarchical cluster analysis (HCA) of proton nuclear magnetic resonance (1H NMR) urine spectra, the INTERnational study of MAcro/micronutrients and blood Pressure (INTERMAP) Study.13 The HCA algorithm produces a dendrogram showing the overall similarity/dissimilarity between population samples. Similarity index is normalized to intercluster distance. Each branch of the dendrogram defines a subcluster; population samples within subclusters are more similar to each other than to those in other subclusters. The dendrogram shows clustering based on country and geographical location or gender. A full color version of this figure is available at the Hypertension Research journal online.
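
The way a dendrogram such as the one described in this caption is built can be sketched with a small agglomerative clustering routine. The sketch assumes Euclidean distance and average linkage, neither of which is specified in the text, and it runs on invented two-variable profiles with hypothetical labels rather than on INTERMAP spectra.

```python
from math import dist


def hierarchical_clusters(profiles):
    """Average-linkage agglomerative clustering.

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram displays.
    """
    clusters = [[name] for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean pairwise distance between members of the two clusters.
                total = sum(
                    dist(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical mean profiles for four population samples (two variables each).
profiles = {
    "East-1": [1.0, 0.2],
    "East-2": [1.1, 0.3],
    "West-1": [0.2, 1.0],
    "West-2": [0.3, 1.2],
}

merges = hierarchical_clusters(profiles)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.3f}")
```

Each merge corresponds to one branch point of the dendrogram, and the distance at which two groups join indicates how dissimilar they are. In this toy example the two "East" samples join first, the two "West" samples second, and the two groups last.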

More detailed analysis was later performed using the Chinese population samples.81 We found that urinary metabolites were significantly different between the northern and southern Chinese, reflecting variations in dietary patterns as well as CVD risk between these two populations. Those that were higher in northern than southern Chinese populations included dimethylglycine, alanine, lactate, branched-chain amino acids (isoleucine, leucine, valine), N-acetyls of glycoprotein fragments (including uromodulin), N-acetyl neuraminic acid, pentanoic/heptanoic acid and methylguanidine; metabolites that were significantly higher in the south included gut microbial–host cometabolites (hippurate, 4-cresyl sulfate, phenylacetylglutamine, 2-hydroxyisobutyrate), succinate, creatine, scyllo-inositol, proline betaine and trans-aconitate. Compared with the south, northern Chinese had higher BMI, less favorable diet including lower Ca, Mg and P intakes, higher 24-h urinary Na excretion, higher urinary sodium–potassium ratio of excretion and higher BP (Table 1).90 The significant north–south differences in BP, BMI and diet90, 91, 92 are reflected in geographic variations in both CVD incidence and mortality rates, with higher rates in the north than the south.93, 94, 95 The INTERMAP MWA study indicates the likely importance of environmental influences (e.g., diet), endogenous metabolism and mammalian–gut microbial cometabolism in helping to explain north–south differences in Chinese CVD risk.

Table 1 Descriptive statistics, mean or prevalence (%), North and South China and P-value of the differences90

The INTERMAP Study confirmed that African Americans (AA) had higher systolic and diastolic BP compared with non-Hispanic white Americans (NHWAs),82 and this BP difference was due, in part, to less favorable multiple nutrient intake by AA, with lower intakes of fruits, vegetables and dairy products and lower intakes of vegetable protein, starch, fiber, K, Ca, Mg and P, compared with those of whites. In addition, there was greater obesity prevalence among black women compared with white women.82, 96 1H NMR spectra of the INTERMAP US participants showed that urinary metabolites that were significantly higher in AA compared with that in NHWA included creatinine, 3-hydroxyisovalerate, N-acetyls of glycoprotein fragments, dimethylglycine, lysine, N-acetyl neuraminic acid, leucine, dimethylamine, taurine and 2-hydroxy-isobutyrate; metabolites significantly higher in NHWA included trimethylamine, N-methylnicotinate, hippurate and succinate (Figure 4).82 The mean values of urinary hippurate (2.9 mmol per 24 h for AA men vs. 4.1 mmol per 24 h for NHWA men, P<5 × 10−9) and N-methylnicotinate (0.24 mmol per 24 h for AA men vs. 0.42 mmol per 24 h for NHWA men, P<3 × 10−10) were significantly lower in AA compared with NHWA. Multiple linear regression was used to examine these AA-NHWA differences in dietary and urinary metabolites in relation to BP; multiple foods, nutrients and metabolites accounted for part of the higher BP among AA.

Figure 4

The median urinary proton nuclear magnetic resonance (1H NMR) spectrum of African Americans (AAs) and non-Hispanic white Americans (NHWAs).82 Top: median urinary 1H NMR spectrum of INTERnational study of MAcro/micronutrients and blood Pressure (INTERMAP) US AA and NHWA participants, based on the first urine collection (N=1455). Bottom: Manhattan plot indicating the significant spectral variables. Metabolites higher in AA individuals compared with NHWA individuals are shown in red, and metabolites higher in NHWA individuals compared with AA individuals are shown in blue. 1, leucine; 2, 3-hydroxyisovalerate; 3, 2-hydroxyisobutyrate; 4, N-acetyls of glycoprotein fragments; 5, N-acetyl neuraminic acid; 6, succinate; 7, dimethylamine; 8, trimethylamine; 9, dimethylglycine; 10, lysine; 11, creatinine; 12, hippurate; 13, N-methyl nicotinic acid.

Summary

In recent years, major advances have been made in the metabolic phenotyping of epidemiological samples. Advances in analytical techniques and the development of new statistical data analysis tools have enabled the identification of novel metabolic phenotypes associated with diet (including Na intake), xenobiotics and BP. The findings of the INTERMAP MWA study may provide insights into molecular pathways underlying complex biological processes such as adverse BP levels. We envisage that future studies will include the generation of testable hypotheses based on the findings from the INTERMAP MWA study. Moreover, the increasing number of population-based cohort studies that also apply the MWA approach will contribute to our understanding of the key mechanisms that are associated with CVD. As noted above, metabolomic data with hundreds to thousands of biomarkers being assayed are highly multivariate, colinear and noisy, with the potential for false-positive findings. It is also possible, although unlikely, that the phenotypes are specific to INTERMAP Study populations and not generalizable; replication studies are needed. Nonetheless, it is reasonable to state at this juncture that INTERMAP findings to date have demonstrated significant independent relationships of several nutrients/foods/eating patterns/metabolites to BP, thereby moving the field forward in exciting and unprecedented ways.