A systematic comprehensive longitudinal evaluation of dietary factors associated with acute myocardial infarction and fatal coronary heart disease

Milanlouei, Soodabeh; Menichetti, Giulia; Li, Yanping; Loscalzo, Joseph; Willett, Walter C.; Barabási, Albert-László

doi:10.1038/s41467-020-19888-2

Download PDF

Article
Open access
Published: 27 November 2020

A systematic comprehensive longitudinal evaluation of dietary factors associated with acute myocardial infarction and fatal coronary heart disease

Nature Communications volume 11, Article number: 6074 (2020) Cite this article

17k Accesses
30 Citations
22 Altmetric
Metrics details

Subjects

Abstract

Environmental factors, and in particular diet, are known to play a key role in the development of Coronary Heart Disease. Many of these factors were unveiled by detailed nutritional epidemiology studies, focusing on the role of a single nutrient or food at a time. Here, we apply an Environment-Wide Association Study approach to Nurses’ Health Study data to explore comprehensively and agnostically the association of 257 nutrients and 117 foods with coronary heart disease risk (acute myocardial infarction and fatal coronary heart disease). After accounting for multiple testing, we identify 16 food items and 37 nutrients that show statistically significant association – while adjusting for potential confounding and control variables such as physical activity, smoking, calorie intake, and medication use – among which 38 associations were validated in Nurses’ Health Study II. Our implementation of Environment-Wide Association Study successfully reproduces prior knowledge of diet-coronary heart disease associations in the epidemiological literature, and helps us detect new associations that were only marginally studied, opening potential avenues for further extensive experimental validation. We also show that Environment-Wide Association Study allows us to identify a bipartite food-nutrient network, highlighting which foods drive the associations of specific nutrients with coronary heart disease risk.

Large-scale diet tracking data reveal disparate associations between food environment and diet

Article Open access 18 January 2022

Exposome-wide ranking of modifiable risk factors for cardiometabolic disease traits

Article Open access 08 March 2022

Genomic analysis of diet composition finds novel loci and associations with health and lifestyle

Article Open access 11 May 2020

Introduction

The prevalence of heart disease, the leading cause of death throughout the world, is strongly influenced by diet and eating habits^1,2,3,4. For example, a recent CDC (Centers for Disease Control and Prevention) report⁵, focusing on death rates caused by heart disease across the United States, documented substantial regional differences compatible with different eating patterns. Similarly, while among individuals of Japanese descent, coronary heart disease (CHD) incidence rates are only 1.6 per person-years in Japan, it increases to 3.0 in Hawaii, and 3.7 in San Francisco⁶, differences that cannot be explained by genetic factors, documenting the key role dietary and other environmental factors play in the development of the disease.

Much of our knowledge about the role of food on health comes from epidemiological association studies in which a single or limited number of exposure(s) is/are analyzed in relation to a phenotype, representing a hypothesis-driven path towards understanding diet–disease relationships. Yet, diet is not a simple sum of several nutrients, as each food product consists of a mixture of nutrients associated with multiple compounds of limited or unknown nutritional value⁷. Accordingly, the effect of each dietary compound on human health should not be investigated in isolation, but in the presence of other associated chemical compounds and relevant food sources. For example, Kolonel et al.⁸ initially reported that beta-carotene consumption was positively associated with the risk of prostate cancer. While this finding was worrisome, in a subsequent analysis of foods, intake of carrots, the largest source of beta-carotene, was not related to the risk of prostate cancer; the observed association was due to intake of papaya⁹. Thus, the analysis of foods provided evidence against the effect of beta-carotene and suggested that some factors specific to papaya might be responsible for the original finding¹⁰. An alternative approach is dietary pattern analysis that focuses on the effects of the overall diet^11,12,13, rather than a single or a few nutrients. While dietary pattern analyses are ideal in the development of nutritional guidelines¹⁴, they are insufficient for the agnostic discovery of new signals for further experimental or mechanistic validation.

As an alternative to the traditional epidemiological studies, environment-wide association studies (or EWAS) were proposed to identify new environmental factors in disease and disease-related phenotypes in an unbiased manner. EWAS is inspired by the analytical procedures developed in genome-wide association studies (GWAS)¹⁵ in which a panel of “exposures” (genotype variants) is studied in relation to a phenotype of interest. For example, using the National Health and Nutrition Examination Survey dataset, an EWAS study explored the associations of 543 environmental attributes with type 2 diabetes, identifying five statistically significant associations validated across independent cohorts¹⁵. Wulaningsih et al.¹⁶ investigated 182 nutrition and lifestyle factors in relation to abdominal obesity, finding a statistically significant association of obesity with five factors in men and seven factors in women. Merritt et al.¹⁷ used European Prospective Investigation into Cancer and Nutrition (EPIC) data to evaluate endometrial cancer risk associations for the dietary intake of 84 foods and nutrients, concluding that only coffee intake had a statistically significant inverse relationship.

Despite the recent success of the EWAS methodology in unveiling multiple nutritional factors that together may contribute to our health, its widespread use is undermined by several factors and limitations¹⁸. Indeed, failing to achieve adequate statistical power in association detection, EWAS studies could not always recover known environment–disease associations confirmed by large, prospective cohort studies and randomized trials^17,19,20. As we show below, these failures are not inherent in the EWAS methodology, but are mainly rooted in the limited size, limited variability, and lack of repeated measurements of the datasets to which EWAS has been applied thus far. While the statistical power of the EWAS study approach is a legitimate concern, the magnitude of the statistical power depends on multiple factors, including the nature and the size of the dataset, as well as the statistical tools/models used for the analysis. Indeed, as we show here, if we apply a wide-association study approach to an adequately sized longitudinal cohort dataset with sufficient variability, we consistently recover prior knowledge about diet–disease relationships.

The EWAS methodology may be particularly useful for diseases for which nutritional associations are unknown¹⁸. While the effect of dietary exposures on heart disease has been extensively studied and the causal effects of many of these associations confirmed, the diet–disease literature occasionally demonstrates conflicting findings^2,21,22,23, limiting our understanding of the true effect of dietary exposures on diseases. We will show that the wide-association study approach can provide comparable insights in an efficient manner by applying an unbiased standardized set of analytical tools.

Here we implement an EWAS methodology, aiming to identify dietary factors associated with CHD systematically and comprehensively, focusing on both nutrient intake and food consumption. To overcome the limited statistical power of previous studies, we apply our methodology to the Nurses’ Health Study (NHS), a longitudinal prospective study designed to investigate the longitudinal effects of nutrition on health and disease development. While there are larger cohorts available for studying chronic diseases, such as the UK Biobank²⁴ and the China Kadoorie Biobank²⁵, both with around 500,000 participants, NHS is unique, owing to comprehensive longitudinal dietary data collection. Beginning in 1976, NHS gathered registered female nurses, ages 30–55 years, from across the United States, initially designed to investigate the use of oral contraception in relation with risk of breast cancer. Participants are asked to complete questionnaires every 2 years, and in 1980 a Food Frequency Questionnaire (FFQ) has been included, designed to capture dietary behaviors. Follow-up dietary questionnaires were administered in 1984, 1986, and every 4 years since then. Questionnaires used from 1984 and thereafter included about 130 foods plus detailed information about brands and types of margarine, breakfast cereals, multiple vitamins, and types of fat used for cooking and baking. As health professionals, nurses were chosen for their ability to complete the health-related questionnaires thoroughly and accurately²⁶. To date, NHS has been expanded to NHS II and NHS III to cover a younger population of nurses²⁷. These three cohorts resulted in an extensive published body of research on the relationships of environmental and genetic factors to various diseases²⁸. The dietary drivers of CHD have been extensively studied within NHS data, most analyses primarily focusing on a single or limited number of exposures, while controlling for an appropriate set of adjusting variables. Some of these findings, as those for trans-fats, have inspired experimental studies and were confirmed to have a causal effect on developing heart diseases^29,30,31.

To have a broad picture of the existing knowledge about diet–disease associations in the NHS data, we mined the literature to identify all studies exploring the dietary determinants of heart-related diseases in original NHS and successive cohorts. The resulting knowledge graph (Fig. 1) shows that the most extensively studied cardiovascular phenotype using NHS data is CHD. Here, we use the term negative when a higher level of exposure is associated with a lower CHD risk. Similarly, we use positive term when a higher level of an exposure is associated with CHD risk. We made this choice for simplicity, and it should not be confused with a causal relationship.

**Fig. 1: The Knowledge Graph of the Dietary Factors Associated with Cardiovascular Disease in the NHS Data.**

Excluding studies of biomarkers and tissues, the neighborhood of CHD comprises nutrients, food items, and dietary scores. Cumulatively, 120 associations were studied, documenting 63 negative associations and 22 positive relations with CHD; in the remaining cases, there was no significant association between dietary exposure and CHD risk, as in the case of dietary magnesium³². The space of studied exposures is rather heterogeneous and is often driven by either the researcher’s interests or evidence from animal or mechanistic studies.

As illustrated in the knowledge graph, the single-association studies using NHS data have broadened our understanding of the dietary determinants of CHD. Some non-significant associations were found to be significant after the application of new statistical approaches in larger datasets. For example, while a study using the original NHS data found no association between fruit fiber intake and risk of CHD³³, a pooled analysis of three NHS cohorts found that the higher consumption of dietary fiber from fruits was negatively associated with risk of CHD³⁴.

Here, we show that a wide-association approach allows us to scan efficiently and systematically the dietary determinants of CHD, bypassing the problem with missing significant associations in epidemiological studies. By applying EWAS methodology to the NHS data, we find that a wide-association approach not only recovers the existing knowledge on diet–disease association, but also facilitates the discovery of novel associations, potentially inspiring future follow-up studies.

Results

Main findings

During the follow-up period, 2774 incident cases of non-fatal MI or fatal CHD were documented in NHS. The baseline average of total caloric intake and body mass index (BMI) among participants who later developed CHD were slightly higher than in those who did not develop CHD. In addition, prospective case subjects on average had lower physical activity compared with the non-case population (Supplementary Table 1). We examined the effect of 374 exposures on CHD risk, including 257 nutrients and 117 food items. The descriptive characteristics of these exposures are shown in Supplementary Table 2.

For the first phase of EWAS, using Cox regression models we collected the estimated effect size, the variance of effect size, hazard ratio (HR) for one standard deviation, P value, the 95% confidence interval (CI) for the HR, P value regarding the proportionality assumption, and the variance inflation factor (VIF). Consider, for example, the results of the long-term effect of isorhamnetin—an O-methylated flavon-ol from the class of flavonoids (Fig. 2b). The estimated HR (0.91, 95% CI: 0.87–0.95; P value 1.59 × 10⁻⁵) implies that one standard deviation higher consumption of Box–Cox-transformed isorhamnetin is associated with 91% lower CHD risk. The P value regarding the proportionality assumption indicates that the use of the Cox model is appropriate. The VIF equal to 1.27 suggests that there is no severe multicollinearity among the variables involved in the isorhamnetin test.

**Fig. 2: Ewas Methodology Description and Output.**

Figure 3a shows the distribution of Cox model P values for all investigated exposures. Exposures are ordered by the estimated HR, so that exposures with HR > 1.0 have harmful effects on CHD risk and exposures with HR < 1.0 are expected to be beneficial. Insignificant associations are mainly distributed around an HR of one, and exposures with smaller P values are scattered at the two ends of the distribution. Using the permutation procedure to account for multiple testing, we found 53 significant associations, including 16 food items and 37 nutrients. All significant associations had VIF < 5. For all significant associations, except for phytate, the P value regarding the proportionality assumption is >0.05. The minimum statistical power for detecting the smallest absolute effect size was 0.59, which is considered to be a moderate to a high level of power in clinical studies. A list of exposures that have a statistically significant association with CHD risk, together with their estimated HRs, is shown in Table 1. A list of both significant and non-significant associations is provided in Supplementary Table 3. We analyzed the correlations among the significant exposures, the result of which is shown in Supplementary Fig. 3. To help interpret our findings, we calculated the HR of each quintile of exposure intake compared with the first quintile, as a reference group (Supplementary Table 6).

**Fig. 3: P value Distribution of Dietary Factors and the Bi-partite Food-Nutirent Network.**

Table 1 EWAS results.

Full size table

Traditional epidemiological studies are limited to the detection of a single exposure (food or nutrient) in relation to CHD; however, the developed EWAS methodology allows us to explore the space of food/nutrient associations related to the disease. That is, in addition to unveiling which nutrient shows significant association with CHD, EWAS also helps us understand which nutrient in connection with which food is responsible for the effect. To demonstrate this principle, we use the food composition table of NHS to extract the contribution of each significant food to the total amount of a significant nutrient in the food supply (Fig. 3a). Using a force-directed layout algorithm, we represent this information as a bipartite network, allowing us to explore the significant inter-dependencies among nutrients and food items. In Fig. 3b, negatively associated nutrients and foods are color coded as green and positively associated nutrients and foods are shown in red. We retrieve two clearly distinct clusters, negatively associated nutrients and foods on one hand, and positively associated nutrients and foods on the other. We also find that several food items, such as white bread and yogurt are connected to both negatively and positively associated nutrients. As expected, foods high in negatively associated nutrients are also related to lower CHD risk and foods containing positively associated nutrients are related to higher CHD risk.

In the network shown in Fig. 3b, each node has its own estimated HR. For each nutrient, we compare the estimated HR with the expected geometric hazard ratio ${\langle H{R}_{i}^{N}\rangle }_{f}$, determined by Eq. (1),

$${\langle {\mathrm{HR}}_{i}^{N}\rangle }_{f}={e}^{\mathop{\sum }\nolimits_{k = 1}^{j}{w}_{k}^{\prime}{\beta }_{k}^{F}}=\mathop{\prod }\limits_{k = 1}^{j}{({\mathrm{HR}}_{k}^{F})}^{{w}_{k}^{\prime}},$$

(1)

where ${w}_{k}^{\prime}$ is the normalized weight of the link connecting nutrient i to food k, ${\beta }_{k}^{F}$ is the estimated effect size of food k, ${\mathrm{HR}}_{k}^{F}$ is the estimated hazard ratio of food k, and ${\mathrm{HR}}_{i}^{N}\lessgtr {\langle {\mathrm{HR}}_{i}^{N}\rangle }_{f}$ (Fig. 4a).

**Fig. 4: Comparison between Actual Nutirent HR and Weighted Food HR.**

As we show in Fig. 4b, for the majority of nutrients, the actual nutrient HR has the same directionality as the weighed food HR, indicating that EWAS captures both important nutrients and their main drivers in the food system. Whole-grain carbohydrate and milled whole-grain carbohydrate were found to be negatively associated with CHD risk; however, among food items high in these two exposures, we only found doughnuts to relate with higher CHD risk statistically significantly. Consequently, the weighted food HR for these exposures has an opposite directional effect. Yet, the correlation analysis (Supplementary Fig. 3) shows that these two exposures have negative correlations with positively associated foods, such as white bread and processed meats, and positive correlations with negatively associated foods, such as cold breakfast cereal and raw carrots. Moreover, myristic acid, trans-18:2, sodium, and total saturated fat were found to be positively associated with higher CHD risk themselves, but the weighted food HR for them indicates an opposite direction. These nutrients are not only distributed among positively associated foods, but also negatively associated foods. Myristic acid is in both processed meats and yogurt, trans-fatty acid 18:2 is present in salad/oil, vinegar dressing, and doughnuts, sodium is spread among salad/oil and vinegar dressing and processed meats, and total saturated fat is found in yogurt and processed meat. However, the consumption of these exposures is positively correlated with positively associated foods (Supplementary Fig. 3). The observed disparity indicates that for most of the nutrients, the structure of the food system determines the amount of nutrients in the diet, while for some other nutrients individual choices drive the nutrient amount in the diet. The signal determined by significant foods is a strong driver, but not always sufficient to capture exhaustively the nutrient associations with CHD. Even though some of the nutrients are not well captured by the food approximation (yellow points), with this approach we tend to correctly estimate the sign of the association, while underestimating the effect size. This observation indicates that solely looking at food items, one would underestimate the effect of those nutrients whose consumption is strongly determined by the behavioral aspect and not mainly by their average amount in food. Additionally, this observation can be partially explained by the higher resolution in the calculation of nutrient intake for breakfast cereals, margarine, and types of fats used for cooking and baking, for which we asked separate questions about their type and brand that linked with extensive databases for detailed composition values.

Comparison with the literature

Applying the EWAS methodology to NHS data allowed us to identify 53 dietary exposures that show statistically significant association with CHD risk. Next, we discuss our results in the context of the previous literature, offering a direct validation of our findings, also helping detect novel knowledge, and to generate new hypotheses. We found that in most cases, EWAS strengthens the existing knowledge about the effect of diet on CHD, and in some cases, it sheds a light on exposures that have not been thoroughly studied in the literature.

Most of our statistically significant findings were in agreement with the previous literature, strengthening the prior findings and supporting the robustness of the EWAS platform. Among food items, we found that white wine, red wine, and liquor, but not beer, have a negative association with higher CHD risk, aligned with previous findings^35,36. Despite recovering an inverse association between alcohol intake and CHD risk (HR 0.88; 95% CI: 0.84–0.91; P value < 5 × 10⁻¹¹), it is worth noting that the overall level of alcohol consumption is not very high in this population study, and such inverse association has not been replicated in other quasi-experimental Mendelian randomization studies^37,38.

Moreover, we found salad/oil and vinegar dressing, yogurt, cold breakfast cereal, raw carrots, raisins or grapes, and peanuts have a negative association with higher CHD risk, in line with previous studies^{39,40,41,42,43,44}. Moreover, we found that total processed meat consumption, hot dogs, apple juice or cider, beverages with sugar, and white bread have positive association with higher CHD risk, as previously shown by other studies^45,46,47,48.

Many compounds from lipid and fatty acid groups have been studied previously in relation to CHD risk. In line with previous studies^{49,50,51,52,53,54,55}, we found that higher consumption of cholesterol, trans-fatty acid 16:1, trans-fatty acid 18:1, trans- fatty acid 18:2, total saturated fat, animal monounsaturated fatty acids (MUFA), myristic acid, palmitoleic acid, palmitic acid, and stearic acid are associated with a higher risk of developing CHD (P value < 1 ×10⁻³). These dietary factors are mainly distributed among animal-based foods. By contrast, we found that plant MUFA can be protective against CHD development (HR 0.90; 95% CI: 0.85–0.95; P value < 2 × 10⁻⁴), which has also been shown by Zong et al.⁵³. Plant MUFAs are abundant in salad/oil and vinegar dressing and peanuts, food items that we also found to be statistically significantly associated with lower CHD risk (salad/oil and vinegar dressing: HR 0.90; 95% CI: 0.87–0.94; P value < 4 × 10⁻⁷; peanuts: HR 0.93; 95% CI: 0.89–0.97; P value < 5 × 10⁻⁴).

Whole grains are composed of endosperm, germ, and bran, in contrast with milled whole grains in which only endosperm is retained. We found that higher consumption of carbohydrate from whole grains is associated with lower CHD risk, similar to the findings in ref. ⁵⁶. While the milling process removes several valuable compounds in whole grains, we interestingly detected a similar protective effect for carbohydrate from milled whole grains. In addition, we showed that both natural bran and added bran are negatively associated with CHD risk, consistent with previous studies^57,58. We also documented a negative association for cereal fiber with CHD (HR 0.91; 95% CI: 0.87–0.95; P value < 5 × 10⁻⁵), in agreement with ref. ⁵⁹. One of the food groups that can be rich in natural bran and germ, added bran, and cereal fiber is cold breakfast cereal, which we also found to be negatively related to CHD (HR 0.94; 95% CI: 0.90–0.97; P value < 2 × 10⁻³).

While we rediscovered the negative association of manganese consumption with CHD risk (HR 0.92; 95% CI: 0.88–0.97; P value < 1 × 10⁻³)⁶⁰, we also found that higher supplemental selenium is associated with lower CHD risk (HR 0.92; 95% CI: 0.88–0.96; P value < 5 × 10⁻⁴). A specific cardiomyopathy responsive to selenium supplementation has been observed in domestic animals⁶¹ and among Chinese persons with Keshan disease⁶². However, more recent studies found no association between selenium supplementation and primary prevention of cardiovascular disease (CVD), for which reason it is not recommended for CVD prevention^63,64. Moreover, our observed positive association of sodium (HR 1.14; 95% CI: 1.08–1.21; P value < 6 × 10⁻⁶) is also consistent with prior studies⁶⁵.

Our findings indicate that higher consumption of dietary folate (HR 0.90; 95% CI: 0.85–0.95; P value < 9 × 10⁻⁵) and folic acid (HR 0.92; 95% CI: 0.88–0.97; P value < 7 × 10⁻⁴) are related to lower CHD risk, again aligned with previous findings^66,67. While the beneficial effect of natural vitamin B₆ was previously documented⁶⁶, we interestingly found that increased synthetic vitamin B₆ consumption is related to lower CHD risk (HR 0.94; 95% CI: 0.90–0.97; P value < 2 × 10⁻³). Moreover, our results reveal that beta-tocopherol, total dietary tocopherol intake, alpha-tocotrienol, and beta-tocotrienol are negatively associated with CHD risk (HR < 0.92; P value < 6 × 10⁻⁴). Earlier studies reported similar effects regarding alpha-tocopherol, total tocopherol intake, and alpha-tocotrienol^68,69. While the antioxidant and anti-inflammatory effects of some of the vitamin E isomers have been documented, little is known about the effect of beta-tocopherol and beta-tocotrienol on cardiovascular health. These two compounds are mainly found in whole-grain products and nuts. In the liver, beta-tocopherol undergoes omega-hydroxylation, oxidation, and beta-oxidation to generate 13′-hydroxychromanols/carboxychromanols, which have potential antioxidant properties^70,71. This fact strengths our findings regarding the protective effect of beta-tocopherol against CHD. Moreover, beta-tocotrienol, in particular, was shown to be inversely related to the risk of type 2 diabetes mellitus⁷², but has not been well-studied regarding its effects on the cardiovascular system. However, among different forms of tocotrienols, beta-tocotrienol has the highest antioxidant activity⁷³, in support of the protective effect against CHD that we observed in EWAS.

We found that heme iron (HR 1.08; 95% CI: 1.03–1.14; P value < 2 × 10⁻³) is statistically significantly associated with higher CHD risk. Similar effects were detected in ref. ⁷⁴. We also found isorhamnetin and apigenin (HR < 0.92; P value < 2 × 10⁻⁴) to be inversely associated with CHD risk, in line with refs. ^75,76 (Supplementary Table 7). Moreover, in EWAS, we found that higher dietary hydroxyproline intake is associated with higher CHD risk (HR 1.12; 95% CI: 1.06-1.17; P value < 2 × 10⁻⁵). Hydroxyproline is a nonessential amino acid derivative and a major component of the protein collagen mainly found in animal-based food products, such as beef, chicken, and pork. Increased hydroxyproline levels in the urine and/or serum are normally associated with degradation of connective tissue and Marfan syndrome⁷⁷, and were also found to be related to Paget disease⁷⁸. The decrease in various hydroxyproline fractions in aortic tissue of rabbits has been shown to be a risk factor for the progression of atherosclerosis⁷⁹. Nonetheless, serum hydroxyproline is mainly associated with peptides released from the breakdown of collagen, and dietary hydroxyproline intake does not considerably affect serum hydroxyproline levels unless consumed in the form of gelatin⁸⁰. While non-prescription hydroxyproline supplements are available as l-hydroxyproline and N-acetyl-l-hydroxyproline, there seems to be no evidence for the effectiveness of oral hydroxyproline supplements in the prevention or treatment of osteoarthritis, osteoporosis, rheumatoid arthritis, skin ulcers, sports injuries, and wrinkled skin, or in promoting muscle growth or weight loss⁸¹. The lack of effectiveness of dietary hydroxyproline is probably a consequence of its failure to be incorporated into collagen: only proline is bioavailable for this purpose. Proline only becomes hydroxylated during a later stage of collagen formation in order to facilitate the strengthening of the collagen helix. Once a collagen helix forms, it does so irreversibly in mammals⁸². The ambiguous role of dietary hydroxyproline, along with the positive association with higher CHD risk that we observed in EWAS, emphasizes the need for exploring the metabolic role that this amino acid plays in health and disease. A potential mechanism that can explain the positive association between hydroxyproline consumption and CHD risk is its metabolic reaction with succinate and CO₂, producing 2-oxoglutarate. Chen et al.⁸³ showed that increased serum 2-oxoglutarate is associated with high myocardial energy expenditure and poor prognosis in chronic heart failure patients.

Validation in NHS II

In the second phase of our study, we used NHS II in order to validate statistically significant associations that we found in the original NHS. During 20 years of follow-up in NHS II, 90,861 participants were followed and 604 CHD incidents were documented (Supplementary Methods 3.1). We examined the relationship between 53 exposures found in NHS with CHD risk and deemed an exposure tentatively validated if it had achieved a false discovery rate (FDR) <0.05 significance in NHS and achieved nominal statistical significance in NHS II (P value < 0.05). Tentatively validated exposures in NHS II had the same directional association with CHD risk as in the original NHS (Table 2). A list of validated and non-validated associations is provided in Supplementary Table 5.

Table 2 Validation in NHS II.

Full size table

Discussion

Our analysis of the dietary determinants of CHD has several limitations based on the nature of the data we used in our analysis. First, the study subjects are only women with a specific occupation (nurses), which restricts the generalizability of the findings to populations comprising males, as well as to more heterogeneous occupational groups and socio-economic backgrounds (Supplementary Figs. 4 and 5). Second, the present study only focused on the effect of dietary factors on CHD risk. Even though diet is an important part of an individual’s environmental exposure, it does not cover the entire exposome⁸⁴. The environment also includes persistent organic pollutants, plastic-associated chemicals, bacterial and viral infections, air quality, stress, and social network effects⁸⁵, as well as the endogenous microbiome. Moreover, our diet is not limited to nutrient content; it also carries food additives and other chemicals added during the packaging process, which are absent in the food composition databases, and hence are not included in this analysis. Including these chemicals is necessary for a more comprehensive picture of the effect of the diet on health^86,87. The only way to overcome these limitations is to include more environmental factors, offering a more comprehensive understanding of the global environment’s effects on health and disease. Moreover, the FFQ used in NHS covers a number of food items that includes the large majority of those consumed by Americans, but the diet of some participants may not be completely represented. Third, while we included the confounding variables that were previously used in other studies on CHD relying on NHS data, our study remains limited to the common confounding variables usually considered when exploring diet–CHD associations. Hence, residual confounding by unmeasured variables cannot be excluded. The obtained results do not unveil causal effects, but, rather, help us generate new hypotheses, which need to be examined in more detail in these and other prospective cohorts and experimental studies. We must also investigate carefully and mechanistically the influence of these dietary factors on human metabolism, exposures that require detailed measurements in terms of dietary bioavailability, hence they can serve as targets for further investigation for mechanism-based analysis.

Our overall goal was to apply GWAS-like analytical approaches to study the dietary determinants of CHD. The methodology allowed us to explore both food items and nutrients, offering a more comprehensive picture of the effect of diet on CHD and helping us visualize the obtained relationships using network tools. Our study not only reproduced the prior knowledge in the diet–CHD domains, but also led to novel associations. While some of the previous EWAS studies failed to achieve adequate statistical power in association detection^19,20, our positive results suggest that these failures were often related to the cohort size and the absence of repeated longitudinal dietary assessments. Indeed, our use of a large longitudinal dataset with a long follow-up period and a sufficient number of subjects helped us achieve sufficient statistical power to detect even relatively small effect sizes. However, enabling a wide-association study to investigate environmental factors requires careful consideration in designing cohort studies, and detailed, comprehensive exposure assessment methods to ensure that the effect of the environment is fully captured. While selectively testing and reporting one or a few associations has been argued to be a source of biased results and false positives^88,89, there is clearly a role for testing specific etiologic hypotheses as this allows greater statistical power and a more detailed examination of an exposure–outcome relationship. An environment-wide association study is a complementary approach that allows us to rank the associations and report transparently both significant and non-significant associations. It also allowed us to generate new hypotheses that can be further investigated in single-association studies and mechanism-based studies. It is worth mentioning that until recently NIH was unlikely to fund research proposals without well-developed hypotheses, restricting the possibility of conducting wide-association studies and this analysis was only possible because of the accrual of over 30 years of follow-up in this large cohort.

In the present study, we explored the effect of only 374 dietary exposures. Yet, when it comes to the chemical composition of the food we consume, these nutritional components represent only a tiny fraction of the thousands of distinct definable biochemicals that have been identified in foods⁹⁰. While many of these chemicals have well-documented or potential implications for health, they remain largely unquantified in any systematic fashion across different individual foods. Their invisibility to experimental, clinical, epidemiological, and demographic studies—turning them into the virtual dark matter of nutrition research—represents a roadblock toward a better, more consistent, more reproducible understanding of how diet affects health^91,92. In the high-resolution diet description space, the conventional single-association approach is even more impractical and lacks scalability. The EWAS methodology, however, would be able to test higher order of magnitude of dietary compounds, to identify significant associations with a disease of interest or with a prescription for health.

Methods

Knowledge graph

To create the knowledge graph, we firstly identified in PubMed all papers that have NHS or Health Professionals Follow-up Study in the title or abstract, along with papers co-authored by the main PIs of the NHS. We manually filtered papers that studied the association between dietary exposures and cardiovascular complications, such as coronary heart disease, stroke, and hypertension. Since not all papers are indexed on PubMed, we searched the web using the same criteria to find the remaining papers. Next, we manually examined the abstracts of the obtained papers and extracted the exposure–phenotype relations, the associations found, the effect size, and other related information (Table 3). More than one association might be studied in a paper. Overall, we found 292 studied associations documented in 91 papers, altogether 124 negative and 45 positive associations were documented in relation to cardiovascular complications. In the remaining cases, there was no significant association between an exposure and a phenotype of study. The obtained data are shown in a knowledge graph (Fig. 1) where each link represents an association. The space of studied exposures is rather heterogeneous and is often driven by the researcher’s interests and experience. For example, the effects of some nutrients were studied with respect to replacement with other nutrients, such as the effect of replacing trans-fat with MUFAs. In some cases, the intake ratio of two nutrients, for example, the ratio of polyunsaturated fat to trans-fat, or of two food items was examined. Moreover, according to the exposure of interest, the set of adjusting variables used to account for confounding effects varied from one study to another. The raw data and code used for constructing the knowledge graph are available at ref. ⁹³.

Table 3 Knowledge graph data preparation.

Full size table

Population

Using the year 1986 as the baseline because the dietary questionnaires have been unchanged since, we followed women who were healthy and free of chronic diseases up to 2014. In the baseline year, participants with a history of CVD, diabetes mellitus, and cancer were excluded. We also excluded women whose demographic data were missing, whose reported average energy intake was <600 or >3500 kcal/day, or left >70 questions in the FFQ unanswered^{10,94,95,96,97}. Participants received one questionnaire every 2 years to report their medical data and one questionnaire every 4 years to document their dietary data. At any point within the follow-up period, if a participant reported development of non-fatal myocardial infarction (MI) or fatal CHD^98,99,100, she will be classified as a case, with no further update of her dietary records. If she developed other diseases, such as diabetes mellitus or cancer, she would still be classified as a non-case, with no further update of her dietary records (Supplementary Methods 3.1 and Supplementary Fig. 1). These exclusion criteria were chosen by virtue of minimizing reverse causation bias and reducing the impact of measurement errors and missing data. In total, we included 62,811 subjects in the analysis, representing 2774 cases (4%) and 60,037 controls (96%).

Ascertainment of diet

NHS uses an internally designed FFQ, with documented reproducibility and validity^10,101,102. The FFQ has been regularly updated to adapt to changes in the food market and to capture additional food items¹⁰³. For each food item, the FFQ specified a commonly used unit or portion size, asking each subject how often, on average, she had consumed that quantity during the past year¹⁰⁴. Nine responses were possible, ranging from “almost never” to “six or more times per day.” We converted the frequency responses to the number of servings per day for each food item. We calculated daily intake of nutrients by multiplying the frequency of consumption of each item by its nutrient content and summing the nutrient contributions of all foods on the basis of Harvard University Food Composition Database derived from US Department of Agriculture sources¹⁰⁵ and other resources, including published reports, data from manufacturers, and in-house analyses of fatty acid composition¹⁰⁶. We looked into several food items more closely. For example, we used an algorithm designed by Jacobs et al.¹⁰⁷ to classify breakfast cereals into wholegrain and refined grain. We also collected detailed information on the type of fat or oil used in food preparation and brand or type of margarines to calculate the fatty acid consumption.

Ascertainment of CHD

We ascertained incident cases of CHD (non-fatal MI or fatal CHD) that occurred after the return of the 1986 questionnaire but before June 1, 2014. Physicians, unaware of the self-reported risk factor status, systematically reviewed the medical records of those who reported having an MI on each biennial questionnaire. MI was classified as confirmed if the World Health Organization criteria, that is, symptoms, electrocardiographic changes, or elevated cardiac enzyme concentrations, were met¹⁰⁸. Fatal CHD was confirmed by either hospital records or through an autopsy if CHD was listed as the cause of death on the death certificate, if it was listed as an underlying and most plausible cause of death, or if evidence of previous CHD was available. Deaths were identified from state vital statistics records and the National Death Index, or were reported by the families and the postal system⁴⁷.

Statistical analysis

Figure 2a shows a brief snapshot of the statistical approaches used in this paper. We used the extended Cox model for time-dependent variables to associate each exposure with the time to occurrence of CHD. The underlying time for the Cox model is the time on study for each participant. We used the cumulative average (Supplementary Methods 3.2) of the food intakes from baseline to the start of each 2-year follow-up interval, which represents the long-term habitual intake and reduces random within-person variation^{10,109,110,111,112,113,114,115,116} (for analyses in which the time-dose effects are taken into account see Supplementary Table 4 and Supplementary Fig. 6). We adjusted the analyses for potential risk factors and confounders, including age (Supplementary Fig. 2), BMI, physical activity, and total caloric intake as continuous covariates; and ethnicity, smoking status, multivitamin use, vitamin E supplement use, post-menopausal hormone use, aspirin use, high blood pressure¹¹⁷, elevated cholesterol¹¹⁸, and family history of MI and high blood pressure as categorical variables. We selected the set of confounding variables based on their potential effects on both exposures and the outcome. Dietary exposures entered the analysis as continuous variables. We used Box–Cox transformation to stabilize the variance and improve the validity of measures of association. Later, the exposures were z-transformed in order to compare the effect sizes from many regressions.

To examine the validity of the EWAS results, we assessed the proportionality assumption for each test. We also tested whether there was severe multicollinearity among the variables in each test by calculating the VIF, which can potentially make effect size estimates unstable, reduce or eliminate statistical power, and cause the coefficients to switch signs¹¹⁹. Ultimately, to control for type I error due to multiple hypotheses testing, we calculated the FDR, the estimated proportion of false discoveries made versus the number of real discoveries at a given significance level¹²⁰ (Supplementary Methods 3.3). To estimate the number of false discoveries, we created a null distribution of Cox model P values by randomly shuffling the CHD status 1000 times and recomputing the P values. Accordingly, we estimated the FDR to be the ratio of the proportion of results that were called significant at a given level α in the null distribution and the proportion of results called significant from real tests. Since in the FDR estimation, we utilize the data themselves, we naturally consider the correlated structure of the data, given the intrinsic dependencies among dietary factors¹²¹. Since the confounding effect of the adjusting covariates on CHD risk exists, certain subjects have greater odds of developing CHD. Therefore, we maintained the confounding role of the adjusting covariates in each permuted dataset while the association between the exposure and CHD has been eliminated¹²². We used Cox regression to estimate the odds of developing CHD as a function of adjusting covariates. Next, we permuted the CHD cases among the subjects as taking a random sample from a biased pool. Furthermore, we re-ran the analyses and calculated the null P values. Repeating this procedure 1000 times, we measured the FDR as the ratio of the proportion of results that were called significant at a given level α in the null distribution to a proportion of results called significant from our real tests.

Study protocol

The study protocol was approved by the institutional review board (IRB) of the Brigham and Women’s Hospital, and the IRB allowed participants’ completion of questionnaires to be considered as implied consent. Written informed consent was obtained from participants to release medical records documenting the incidence of coronary heart disease.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The authors declare that all data supporting the findings of this study are available upon request to Nurses’ Health Study (NHS) and when the request for data access is approved. Access is restricted due to participant confidentiality and privacy concerns. Individuals who want to request access to NHS data must first submit an online request form. If the project is approved, completion of a data use agreement, completion of CITI training demonstrating ethical training in using human subjects’ data, and a small provision of funds to support the computing system will also be required. Further information including the procedures to obtain and access data from the NHS is described in ref. ¹²³ (contact email: nhsaccess@channing.harvard.edu). The Food Frequency Questionnaires used in NHS are available in ref. ¹⁰⁴. Harvard University Food Composition Database can be accessed in ref. ¹⁰⁶. Source data are provided with this paper.

Code availability

The programming materials are available on GitHub and Zenodo platforms⁹³. R version 3.4.0 was used for data analysis and Python version 2.7.16 and MATLAB 2019a were used for data visualizations.

References

Siscovick, D. S. et al. Dietary intake and cell membrane levels of long-chain n-3 polyunsaturated fatty acids and the risk of primary cardiac arrest. JAMA 274, 1363–1367 (1995).
Article CAS PubMed Google Scholar
Estruch, R. et al. Primary prevention of cardiovascular disease with a Mediterranean diet. N. Engl. J. Med. 368, 1279–1290 (2013).
Article CAS PubMed Google Scholar
Li, S. et al. Better diet quality and decreased mortality among myocardial infarction survivors. JAMA Intern. Med. 173, 1808–1819 (2013).
Article ADS PubMed Google Scholar
Mente, A., de Koning, L., Shannon, H. S. & Anand, S. S. A systematic review of the evidence supporting a causal link between dietary factors and coronary heart disease. Archiv. Intern. Med. 169, 659–669 (2009).
Article CAS Google Scholar
Division for Heart Disease and Stroke Prevention. Heart Disease and Stroke Maps www.cdc.gov/dhdsp/maps (2016).
Robertson, T. L. et al. Epidemiologic studies of coronary heart disease and stroke in Japanese men living in Japan, Hawaii and California: coronary heart disease risk factors in Japan and Hawaii. Am. J. Cardiol. 39, 244–249 (1977).
Article CAS PubMed Google Scholar
Capuano, E., Oliviero, T. & van Boekel, M. A. Modeling food matrix effects on chemical reactivity: challenges and perspectives. Crit. Rev. Food Sci. Nutr. 58, 1–15 (2017).
Kolonel, L. N., Yoshizawa, C. N. & Hankin, J. H. Diet and prostatic cancer: a case-control study in Hawaii. Am. J. Epidemiol. 127, 999–1012 (1988).
Article CAS PubMed Google Scholar
Marchand, L. L., Hankin, J. H., Kolonel, L. N. & Wilkens, L. R. Vegetable and fruit consumption in relation to prostate cancer risk in Hawaii: a reevaluation of the effect of dietary beta-carotene. Am. J. Epidemiol. 133, 215–219 (1991).
Article PubMed Google Scholar
Willett, W. Nutritional Epidemiology (Oxford Univ. Press, 2012).
Cespedes, E. M. & Hu, F. B. Dietary patterns: from nutritional epidemiologic analysis to national guidelines. Am. J. Clin. Nutr. 101, 899–900 (2015).
Article CAS PubMed PubMed Central Google Scholar
Hu, F. B. Dietary pattern analysis: a new direction in nutritional epidemiology. Curr. Opin. Lipidol. 13, 3–9 (2002).
Article CAS PubMed Google Scholar
Shimazu, T. et al. Dietary patterns and cardiovascular disease mortality in Japan: a prospective cohort study. Int. J. Epidemiol. 36, 600–609 (2007).
Article PubMed Google Scholar
Tapsell, L. C., Neale, E. P., Satija, A. & Hu, F. B. Foods, nutrients, and dietary patterns: interconnections and implications for dietary guidelines. Adv. Nutr. 7, 445–454 (2016).
Article CAS PubMed PubMed Central Google Scholar
Patel, C. J., Bhattacharya, J. & Butte, A. J. An environment-wide association study (EWAS) on type 2 diabetes mellitus. PLoS ONE 5, e10746 (2010).
Article ADS PubMed PubMed Central CAS Google Scholar
Wulaningsih, W. et al. Investigating nutrition and lifestyle factors as determinants of abdominal obesity: an environment-wide study. Int. J. Obes. 41, 340 (2017).
Article CAS Google Scholar
Merritt, M. A. et al. Investigation of dietary factors and endometrial cancer risk using a nutrient-wide association study approach in the EPIC and Nurses’ Health Study (NHS) and NHS II. Cancer Epidemiol. Prev. Biomarkers 24, 466–471 (2015).
Article CAS Google Scholar
Forman, J. P. & Willett, W. C. Nutrient-wide association studies: another road to the same destination. Circulation 126, 2447–2448 (2012).
Tzoulaki, I. et al. A nutrient-wide association study on blood pressure. Circulation 126, 2456–2464 (2012).
Article CAS PubMed PubMed Central Google Scholar
McGinnis, D. P., Brownstein, J. S. & Patel, C. J. Environment-wide association study of blood pressure in the national health and nutrition examination survey (1999–2012). Scientific Rep. 6, 30373 (2016).
Article ADS CAS Google Scholar
Mubarak, A., Hodgson, J. M., Considine, M. J., Croft, K. D. & Matthews, V. B. Supplementation of a high-fat diet with chlorogenic acid is associated with insulin resistance and hepatic lipid accumulation in mice. J. Agric. Food Chem. 61, 4371–4378 (2013).
Article CAS PubMed Google Scholar
Onakpoya, I., Terry, R. & Ernst, E. The use of green coffee extract as a weight loss supplement: a systematic review and meta-analysis of randomised clinical trials. Gastroenterol. Res. Pract. 2011, 382852 (2011).
Vogel, R. A., Corretti, M. C. & Plotnick, G. D. The postprandial effect of components of the Mediterranean diet on endothelial function. J. Am. College Cardiol. 36, 1455–1460 (2000).
Article CAS Google Scholar
Sudlow, C. et al. Uk biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Article PubMed PubMed Central Google Scholar
Chen, Z. et al. China kadoorie biobank of 0.5 million people: survey methods, baseline characteristics and long-term follow-up. Int. J. Epidemiol. 40, 1652–1666 (2011).
Article PubMed PubMed Central Google Scholar
Colditz, G. A. The Nurses’ Health Study: a cohort of us women followed since 1976. J. Am. Med. Women’s Assoc. 50, 40 (1995).
CAS Google Scholar
Colditz, G. A., Manson, J. E. & Hankinson, S. E. The Nurses’ Health Study: 20-year contribution to the understanding of health among women. J. Women’s Health 6, 49–62 (1997).
Article CAS Google Scholar
Colditz, G. A. & Hankinson, S. E. The Nurses’ Health Study: lifestyle and health among women. Nat. Rev. Cancer 5, 388 (2005).
Article CAS PubMed Google Scholar
Willett, W. C. et al. Intake of trans fatty acids and risk of coronary heart disease among women. Lancet 341, 581–585 (1993).
Article CAS PubMed Google Scholar
Mensink, R. P., Zock, P. L., Kester, A. D. & Katan, M. B. Effects of dietary fatty acids and carbohydrates on the ratio of serum total to HDL cholesterol and on serum lipids and apolipoproteins: a meta-analysis of 60 controlled trials. Am. J. Clin. Nutr. 77, 1146–1155 (2003).
Article CAS PubMed Google Scholar
Mozaffarian, D. & Clarke, R. Quantitative effects on cardiovascular risk factors and coronary heart disease risk of replacing partially hydrogenated vegetable oils with other fats and oils. Eur. J. Clin. Nutr. 63, S22 (2009).
Article CAS PubMed Google Scholar
Chiuve, S. E. et al. Dietary and plasma magnesium and risk of coronary heart disease among women. J. Am. Heart Assoc. 2, e000114 (2013).
Article PubMed PubMed Central CAS Google Scholar
Wolk, A. et al. Long-term intake of dietary fiber and decreased risk of coronary heart disease among women. JAMA 281, 1998–2004 (1999).
Article CAS PubMed Google Scholar
Pereira, M. A. et al. Dietary fiber and risk of coronary heart disease: a pooled analysis of cohort studies. Arch. Intern. Med. 164, 370–376 (2004).
Article PubMed Google Scholar
Mukamal, K. J. et al. Roles of drinking pattern and type of alcohol consumed in coronary heart disease in men. N. Engl. J. Med. 348, 109–118 (2003).
Article PubMed Google Scholar
Estruch, R. et al. Moderate consumption of red wine, but not gin, decreases erythrocyte superoxide dismutase activity: a randomised cross-over trial. Nutr. Metab. Cardiovasc. Dis. 21, 46–53 (2011).
Article CAS PubMed Google Scholar
Millwood, I. Y. et al. Conventional and genetic evidence on alcohol and vascular disease aetiology: a prospective study of 500 000 men and women in china. Lancet 393, 1831–1842 (2019).
Article PubMed PubMed Central Google Scholar
Holmes, M. V. et al. Association between alcohol and cardiovascular disease: Mendelian randomisation analysis based on individual participant data. BMJ 349, g4164 (2014).
Article PubMed PubMed Central CAS Google Scholar
Hu, F. B. et al. Dietary intake of α-linolenic acid and risk of fatal ischemic heart disease among women. Am. J. Clin. Nutr. 69, 890–897 (1999).
Article CAS PubMed Google Scholar
Rice, B. H. Dairy and cardiovascular disease: a review of recent observational research. Curr. Nutr. Rep. 3, 130–138 (2014).
Article CAS PubMed PubMed Central Google Scholar
Djoussé, L. & Gaziano, J. M. Breakfast cereals and risk of heart failure in the physicians’ health study I. Arch. Intern. Med. 167, 2080–2085 (2007).
Article PubMed Google Scholar
Gaziano, J. M. et al. A prospective study of consumption of carotenoids in fruits and vegetables and decreased cardiovascular mortality in the elderly. Ann. Epidemiol. 5, 255–260 (1995).
Article CAS PubMed Google Scholar
Puglisi, M. J. et al. Raisins and additional walking have distinct effects on plasma lipids and inflammatory cytokines. Lipids Health Dis. 7, 14 (2008).
Article PubMed PubMed Central CAS Google Scholar
Hu, F. B. et al. Frequent nut consumption and risk of coronary heart disease in women: prospective cohort study. BMJ 317, 1341–1345 (1998).
Article CAS PubMed PubMed Central Google Scholar
Micha, R., Michas, G. & Mozaffarian, D. Unprocessed red and processed meats and risk of coronary artery disease and type 2 diabetes—an updated review of the evidence. Curr. Atheroscler. Rep. 14, 515–524 (2012).
Article CAS PubMed PubMed Central Google Scholar
Pase, M. P., Grima, N., Cockerell, R. & Pipingas, A. Habitual intake of fruit juice predicts central blood pressure. Appetite 84, 68–72 (2015).
Article PubMed Google Scholar
Fung, T. T. et al. Sweetened beverage consumption and risk of coronary heart disease in women. Am. J. Clin. Nutr. 89, 1037–1042 (2009).
Article CAS PubMed PubMed Central Google Scholar
Sieri, S. et al. Dietary glycemic load and index and risk of coronary heart disease in a large italian cohort: the epicor study. Arch. Intern. Med. 170, 640–647 (2010).
Article CAS PubMed Google Scholar
Connor, W. E. & Connor, S. L. Dietary cholesterol and coronary heart disease. Curr. Atherosclerosis Rep. 4, 425–432 (2002).
Article Google Scholar
Mozaffarian, D., Aro, A. & Willett, W. C. Health effects of trans-fatty acids: experimental and observational evidence. Eur. J. Clin. Nutr. 63, S5 (2009).
Article CAS PubMed Google Scholar
Sun, Q. et al. A prospective study of trans fatty acids in erythrocytes and risk of coronary heart disease. Circulation 115, 1858–1865 (2007).
Article CAS PubMed Google Scholar
Zong, G. et al. Intake of individual saturated fatty acids and risk of coronary heart disease in us men and women: two prospective longitudinal cohort studies. BMJ 355, i5796 (2016).
Article PubMed PubMed Central Google Scholar
Zong, G. et al. Monounsaturated fats from plant and animal sources in relation to risk of coronary heart disease among us men and women. Am. J. Clin. Nutr. 107, 445–453 (2018).
Article PubMed PubMed Central Google Scholar
Djoussé, L., Weir, N. L., Hanson, N. Q., Tsai, M. Y. & Gaziano, J. M. Plasma phospholipid concentration of cis-palmitoleic acid and risk of heart failureclinical perspective. Circ. Heart Fail. 5, 703–709 (2012).
Article PubMed PubMed Central CAS Google Scholar
Hu, F. B., Manson, J. E. & Willett, W. C. Types of dietary fat and risk of coronary heart disease: a critical review. J. Am. Coll. Nutr. 20, 5–19 (2001).
Article PubMed Google Scholar
Li, Y. et al. Saturated fats compared with unsaturated fats and sources of carbohydrates in relation to risk of coronary heart disease: a prospective cohort study. J. Am. Coll. Cardiol. 66, 1538–1548 (2015).
Article CAS PubMed PubMed Central Google Scholar
Wu, H. et al. Association between dietary whole grain intake and risk of mortality: two large prospective studies in US men and women. JAMA Intern. Med. 175, 373–384 (2015).
Article PubMed PubMed Central Google Scholar
Jensen, M. K. et al. Intakes of whole grains, bran, and germ and the risk of coronary heart disease in men. Am. J. Clin. Nutr. 80, 1492–1499 (2004).
Article CAS PubMed Google Scholar
AlEssa, H. B. et al. Carbohydrate quality and quantity and risk of coronary heart disease among us women and men. Am. J. Clin. Nutr. 107, 257–267 (2018).
Article PubMed PubMed Central Google Scholar
Rosique-Esteban, N., Guasch-Ferré, M., Hernández-Alonso, P. & Salas-Salvadó, J. Dietary magnesium and cardiovascular disease: a review with emphasis in epidemiological studies. Nutrients 10, 168 (2018).
Article PubMed Central CAS Google Scholar
Burk, R. F. in Human and Veterinary Nutrition, Biochemical Aspects of Nutrients, Vol. 30, 88–106 (Karger Publishers, 1978).
Chen, X. et al. Studies on the relations of selenium and keshan disease. Biol. Trace Element Res. 2, 91–107 (1980).
Article CAS Google Scholar
Stranges, S. et al. Effects of selenium supplementation on cardiovascular disease incidence and mortality: secondary analyses in a randomized clinical trial. Am. J. Epidemiol. 163, 694–699 (2006).
Article PubMed Google Scholar
Flores-Mateo, G., Navas-Acien, A., Pastor-Barriuso, R. & Guallar, E. Selenium and coronary heart disease: a meta-analysis. Am. J. Clin. Nutr. 84, 762–773 (2006).
Article CAS PubMed Google Scholar
Aburto, N. J. et al. Effect of lower sodium intake on health: systematic review and meta-analyses. BMJ 346, f1326 (2013).
Article PubMed PubMed Central Google Scholar
Rimm, E. B. et al. Folate and vitamin b6 from diet and supplements in relation to risk of coronary heart disease among women. JAMA 279, 359–364 (1998).
Article ADS CAS PubMed Google Scholar
Malinow, M. R. et al. Reduction of plasma homocyst(e)ine levels by breakfast cereal fortified with folic acid in patients with coronary heart disease. N. Engl. J. Med. 338, 1009–1015 (1998).
Article CAS PubMed Google Scholar
Li, G. et al. Circulating tocopherols and risk of coronary artery disease: a systematic review and meta-analysis. Eur. J. Prev. Cardiol. 23, 748–757 (2016).
Article PubMed Google Scholar
Prasad, K. Tocotrienols and cardiovascular health. Curr. Pharm. Des. 17, 2147–2154 (2011).
Article CAS PubMed Google Scholar
Jiang, Q. Natural forms of vitamin E: metabolism, antioxidant, and anti-inflammatory activities and their role in disease prevention and therapy. Free Radic. Biol. Med. 72, 76–90 (2014).
Article CAS PubMed PubMed Central Google Scholar
Mathur, P., Ding, Z., Saldeen, T. & Mehta, J. L. Tocopherols in the prevention and treatment of atherosclerosis and related cardiovascular disease. Clin. Cardiol. 38, 570–576 (2015).
Article PubMed PubMed Central Google Scholar
Montonen, J., Knekt, P., Järvinen, R. & Reunanen, A. Dietary antioxidant intake and risk of type 2 diabetes. Diabetes Care 27, 362–366 (2004).
Article CAS PubMed Google Scholar
Shahidi, F. Antioxidants in food and food antioxidants. Food/Nahrung 44, 158–163 (2000).
Article CAS PubMed Google Scholar
Yang, W. et al. Is heme iron intake associated with risk of coronary heart disease? A meta-analysis of prospective studies. Eur. J. Nutr. 53, 395–400 (2014).
Article CAS PubMed Google Scholar
Gao, L. et al. Isorhamnetin protects against cardiac hypertrophy through blocking PI3K–AKT pathway. Mol. Cell. Biochem. 429, 167–177 (2017).
Article CAS PubMed Google Scholar
Zhang, K., Song, W., Li, D. & Jin, X. Apigenin in the regulation of cholesterol metabolism and protection of blood vessels. Exp. Ther. Med. 13, 1719–1724 (2017).
Article CAS PubMed PubMed Central Google Scholar
Laitinen, O., Nikkilä, E. & Kivirikko, K. Hydroxyproline in the serum and urine normal values and clinical significance. Acta Med. Scand. 179, 275–284 (1966).
Article CAS PubMed Google Scholar
Wheeless, C. R. Wheeless’ Textbook of Orthopaedics (C.R. Wheeless, M.D., 1996).
Abdelhalim, M. A. K., Siddiqi, N., Alhomida, A. & Al-Ayed, M. S. The changes in various hydroxyproline fractions in aortic tissue of rabbits are closely related to the progression of atherosclerosis. Lipids Health Dis. 9, 26 (2010).
Article PubMed PubMed Central CAS Google Scholar
Prockop, D. J. & Sjoerdsma, A. Significance of urinary hydroxyproline in man. J. Clin. Invest. 40, 843–849 (1961).
Article CAS PubMed PubMed Central Google Scholar
US Food and Drug Administration. New Dietary Ingredient Notification for N-Acetyl-l-Hydroxyproline (US Food and Drug Administration, 2001).
Barbul, A. Proline precursors to sustain mammalian collagen synthesis. J. Nutr. 138, 2021S–2024S (2008).
Article CAS PubMed Google Scholar
Chen, P. et al. Increased serum 2-oxoglutarate associated with high myocardial energy expenditure and poor prognosis in chronic heart failure patients. Biochim. Biophys. Acta 1842, 2120–2125 (2014).
Article CAS PubMed Google Scholar
Rappaport, S. M. & Smith, M. T. Environment and disease risks. Science 330, 460–461 (2010).
Article ADS CAS PubMed PubMed Central Google Scholar
Christakis, N. A. & Fowler, J. H. The spread of obesity in a large social network over 32 years. N. Engl. J. Med. 357, 370–379 (2007).
Article CAS PubMed Google Scholar
Bouvard, V. et al. Carcinogenicity of consumption of red and processed meat. Lancet Oncol. 16, 1599 (2015).
Article PubMed Google Scholar
Tonacchera, M. et al. Relative potencies and additivity of perchlorate, thiocyanate, nitrate, and iodide on the inhibition of radioactive iodide uptake by the human sodium iodide symporter. Thyroid 14, 1012–1019 (2004).
Article CAS PubMed Google Scholar
Ioannidis, J. P., Loy, E. Y., Poulton, R. & Chia, K. S. Researching genetic versus nongenetic determinants of disease: a comparison and proposed unification. Sci. Transl. Med. 1, 7ps8–7ps8 (2009).
Article PubMed Google Scholar
Boffetta, P. et al. False-positive results in cancer epidemiology: a plea for epistemological modesty. J. Natl Cancer Inst. 100, 988–995 (2008).
Article PubMed PubMed Central Google Scholar
The Metabolomics Innovation Centre (TMIC). FooDB http://foodb.ca (2018).
Barabási, A.-L., Menichetti, G. & Loscalzo, J. The Nutritional Dark Matter: the unmapped chemical complexity of our diet. Nat. Food 1, 33–37 (2019).
Article Google Scholar
Hooton, F., Menichetti, G. & Barabási, A.-L. Exploring food contents in scientific literature with FoodMine. Scientific Reports 10, (2020)
Milanlouei, S. & Menichetti, G. soodimilanlouei/EWAS-NHS: v1.0.1 (Version v1.0.1). Zenodo. https://doi.org/10.5281/zenodo.4038928 (2020).
Michels, K. B. & Willett, W. C. Self-administered semiquantitative food frequency questionnaires: patterns, predictors, and interpretation of omitted items. Epidemiology 20, 295 (2009).
Article PubMed PubMed Central Google Scholar
Gates, M. A. et al. A prospective study of dietary flavonoid intake and incidence of epithelial ovarian cancer. Int. J. Cancer 121, 2225–2232 (2007).
Article CAS PubMed Google Scholar
Conen, D. et al. Caffeine consumption and incident atrial fibrillation in women. Am. J. Clin. Nutr. 92, 509–514 (2010).
Article CAS PubMed PubMed Central Google Scholar
Guasch-Ferré, M. et al. Nut consumption and risk of cardiovascular disease. J. Am. Coll. Cardiol. 70, 2519–2532 (2017).
Article PubMed PubMed Central CAS Google Scholar
Hu, F. B. et al. Dietary fat intake and the risk of coronary heart disease in women. N. Engl. J. Med. 337, 1491–1499 (1997).
Article CAS PubMed Google Scholar
Hu, F. B. et al. Fish and omega-3 fatty acid intake and risk of coronary heart disease in women. JAMA 287, 1815–1821 (2002).
Article CAS PubMed Google Scholar
Liu, S. et al. Whole-grain consumption and risk of coronary heart disease: results from the Nurses’ Health Study. Am. J. Clin. Nutr. 70, 412–419 (1999).
Article CAS PubMed Google Scholar
Yuan, C. et al. Relative validity of nutrient intakes assessed by questionnaire, 24-hour recalls, and diet records as compared with urinary recovery and plasma concentration biomarkers: findings for women. Am. J. Epidemiol. 187, 1051–1063 (2017).
Article PubMed Central Google Scholar
Yuan, C. et al. Validity of a dietary questionnaire assessed by comparison with multiple weighed dietary records or 24-hour recalls. Am. J. Epidemiol. 185, 570–584 (2017).
Article PubMed PubMed Central Google Scholar
Oh, K., Hu, F. B., Manson, J. E., Stampfer, M. J. & Willett, W. C. Dietary fat intake and risk of coronary heart disease in women: 20 years of follow-up of the Nurses’ Health Study. Am. J. Epidemiol. 161, 672–679 (2005).
Article PubMed Google Scholar
Harvard, T. H. Chan School of Public Health Nutrition Department. Nurses’ Health Study: Food Frequency Questionnaires. https://www.nurseshealthstudy.org/participants/questionnaires (2020).
US Department of Agriculture. Composition of Foods, 1976 to 1992. Agricultural Handbook No. 8 Series (Department of Agriculture, Washington, 1989).
Harvard T.H. Chan School of Public Health Nutrition Department. Harvard University Food Composition Database https://regepi.bwh.harvard.edu/health/nutrition.html (2020).
Jacobs, D. R., Meyer, K. A., Kushi, L. H. & Folsom, A. R. Whole-grain intake may reduce the risk of ischemic heart disease death in postmenopausal women: the Iowa Women’s Health Study. Am. J. Clin. Nutr. 68, 248–257 (1998).
Rose, G. A. et al. Cardiovascular Survey Methods, Vol. 56 (WHO, Geneva, 1982).
Kahn, H. A. & Dawber, T. R. The development of coronary heart disease in relation to sequential biennial measures of cholesterol in the Framingham study. J. Clin. Epidemiol. 19, 611–620 (1966).
CAS Google Scholar
Larsson, S. C., Giovannucci, E. & Wolk, A. Coffee consumption and stomach cancer risk in a cohort of Swedish women. Int. J. Cancer 119, 2186–2189 (2006).
Article CAS PubMed Google Scholar
Kim, E. H. et al. Dietary fat and risk of postmenopausal breast cancer in a 20-year follow-up. Am. J. Epidemiol. 164, 990–997 (2006).
Article PubMed Google Scholar
Haring, B. et al. Dietary protein intake and coronary heart disease in a large community based cohort: results from the Atherosclerosis Risk in Communities (ARIC) Study. PLoS ONE 9, e109552 (2014).
Halton, T. L. et al. Low-carbohydrate-diet score and the risk of coronary heart disease in women. N. Engl. J. Med. 355, 1991–2002 (2006).
Article CAS PubMed Google Scholar
Streppel, M. T., Ocké, M. C., Boshuizen, H. C., Kok, F. J. & Kromhout, D. Dietary fiber intake in relation to coronary heart disease and all-cause mortality over 40 y: the Zutphen Study. Am. J. Clin. Nutr. 88, 1119–1125 (2008).
Article CAS PubMed Google Scholar
Eshak, E. S. et al. Soft drink intake in relation to incident ischemic heart disease, stroke, and stroke subtypes in Japanese men and women: the Japan Public Health Centre–based study cohort I. Am. J. Clin. Nutr. 96, 1390–1397 (2012).
Article CAS PubMed Google Scholar
Hu, F. B. et al. Dietary fat and coronary heart disease: a comparison of approaches for adjusting for total energy intake and modeling repeated dietary measurements. Am. J. Epidemiol. 149, 531–540 (1999).
Article CAS PubMed Google Scholar
Lawes, C. M., Bennett, D. A., Lewington, S. & Rodgers, A. in Seminars in Vascular Medicine, Vol. 2, 355–368 (Thieme Medical Publishers, Inc., New York, 2002).
Huxley, R., Lewington, S. & Clarke, R. in Seminars in Vascular Medicine, Vol. 2, 315–324 (Thieme Medical Publishers Inc., New York, 2002).
Belsley, D. A. in Encyclopedia of Statistical Sciences, Vol. 2 (Wiley, 2004).
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
Article ADS MathSciNet CAS PubMed MATH PubMed Central Google Scholar
Noble, W. S. How does multiple testing correction work? Nat. Biotechnol. 27, 1135 (2009).
Article CAS PubMed PubMed Central Google Scholar
Epstein, M. P. et al. A permutation procedure to correct for confounders in case-control studies, including tests of rare variation. Am. J. Hum. Genet. 91, 215–223 (2012).
Article CAS PubMed PubMed Central Google Scholar
Harvard T.H. Chan School of Public Health Nutrition Department. External Collaboration Request https://www.nurseshealthstudy.org/researchers (2020).

Download references

Acknowledgements

We were supported by grants from NIH (grants P01 HL132825, UM1 CA186107, P01 CA87969, R01 CA49449, R01 HL034594, R01 HL088521, UM1 CA176726, R01 CA67262, U54 HL119145, U01 HG007690, and P50 GM107618) and AHA (grants 151708, 414110-68953, and D700382). A.-L.B. was supported by NIH 1P01HL132825, Rockefeller Foundation 2109 FOD 026, and the European Union’s Horizon 2020 research and innovation programme under grant agreement No 810115 - DYNASNET.

Author information

These authors contributed equally: Soodabeh Milanlouei, Giulia Menichetti.

Authors and Affiliations

Center for Complex Network Research, Northeastern University, Boston, MA, USA
Soodabeh Milanlouei, Giulia Menichetti & Albert-László Barabási
Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Yanping Li & Walter C. Willett
Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
Joseph Loscalzo, Walter C. Willett & Albert-László Barabási
Center for Network Science, Central European University, Budapest, Hungary
Albert-László Barabási

Authors

Soodabeh Milanlouei
View author publications
You can also search for this author in PubMed Google Scholar
Giulia Menichetti
View author publications
You can also search for this author in PubMed Google Scholar
Yanping Li
View author publications
You can also search for this author in PubMed Google Scholar
Joseph Loscalzo
View author publications
You can also search for this author in PubMed Google Scholar
Walter C. Willett
View author publications
You can also search for this author in PubMed Google Scholar
Albert-László Barabási
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

S.M. performed data query and integration, statistical modeling, network analysis, and programming and contributed to writing the manuscript. G.M. contributed to network analysis, statistical modeling, programming, and writing the manuscript. Y.L. contributed to data query and programming. J.L. contributed to interpreting the results and writing the manuscript. W.C.W. contributed to data collection, analyzing the results, and writing the manuscript. A.-L.B. contributed to the conceptual design of the study and writing the manuscript.

Corresponding author

Correspondence to Albert-László Barabási.

Ethics declarations

Competing interests

A.-L.B. is founder of Nomix and Foodome, and J.L. and A.-L.B. are founders of Scipher Medicine, companies that explore the use of network-based tools in health. The remaining authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Ramon Estruch, Paolo Vineis and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Supplementary Software

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Milanlouei, S., Menichetti, G., Li, Y. et al. A systematic comprehensive longitudinal evaluation of dietary factors associated with acute myocardial infarction and fatal coronary heart disease. Nat Commun 11, 6074 (2020). https://doi.org/10.1038/s41467-020-19888-2

Download citation

Received: 21 June 2019
Accepted: 29 October 2020
Published: 27 November 2020
DOI: https://doi.org/10.1038/s41467-020-19888-2

This article is cited by

Nutritional redundancy in the human diet and its application in phenotype association studies
- Xu-Wen Wang
- Yang Hu
- Yang-Yu Liu
Nature Communications (2023)
Macrophages in cardiac remodelling after myocardial infarction
- Jonathan Yap
- Jason Irei
- William A. Boisvert
Nature Reviews Cardiology (2023)
Machine learning prediction of the degree of food processing
- Giulia Menichetti
- Babak Ravandi
- Albert-László Barabási
Nature Communications (2023)
Nutrient concentrations in food display universal behaviour
- Giulia Menichetti
- Albert-László Barabási
Nature Food (2022)
Application of Medical Knowledge Graphs in Cardiology and Cardiovascular Medicine: A Brief Literature Review
- Hong Wang
- Quannan Zu
- Jiawang Ding
Advances in Therapy (2022)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Main findings

Comparison with the literature

Validation in NHS II

Discussion

Methods

Knowledge graph

Population

Ascertainment of diet

Ascertainment of CHD

Statistical analysis

Study protocol

Reporting summary

Data availability

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Supplementary information

Source data

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links