Introduction

Elevated low-density lipoprotein (LDL)-cholesterol is an important risk factor for atherosclerosis (a disease of the large arteries), which is the primary cause of atherosclerotic cardiovascular diseases (CVDs)1. Nutrient trials have shown that intake of saturated fatty acids increases serum LDL-cholesterol relative to intake of unsaturated fatty acids and carbohydrates2. As whole-fat dairy products have a high saturated fat content, food-based dietary guidelines recommend consumption of low-fat dairy products in place of high-fat dairy products3. However, the health effects of saturated fat vary depending on the specific fatty acid and possibly by the specific food source4,5.

The role of dairy foods in CVD prevention has been reviewed and summarized6,7,8,9. Because dairy is a heterogeneous food group of solid, semi-solid and liquid fermented or non-fermented products, differing in nutrients and probiotics found in fermented dairy products, the focus has been on the intake of specific types of dairy products in relation to CVD development rather than on total intake of dairy products. In a systematic review and meta-analysis from 2017 by Guo et al.9, associations between intake of milk, yogurt and cheese and risk of coronary heart disease (CHD) were summarized. No associations between intake of any of the investigated dairy product subgroups and CHD were observed. In a systematic review and meta-analysis from 2016 by de Goede et al.6, associations between intake of milk, yogurt, cheese and butter and risk of stroke were summarized. A higher intake of milk was found to be associated with a lower risk of stroke, and a higher intake of high-fat milk was found to be associated with a higher risk of stroke. No associations between the intake of low-fat milk, yogurt, cheese or butter and stroke were observed. In subgroup analyses by stroke type (ischemic and hemorrhagic), no associations were observed with milk intake. Associations between stroke type and intake of low-fat milk or high-fat milk were not considered; nor were associations with other dairy product subgroups.

We conducted a systematic review and meta-analysis of cohort studies to summarize the findings on the associations between total intake of dairy products and intake of dairy product subgroups (milk, yogurt, cheese and butter) and the risk of major atherosclerotic CVDs (total (the sum of fatal and nonfatal) CHD, total ischemic stroke and peripheral artery disease) in the general adult population. Data synthesis was also differentiated according to the fat content of the dairy product subgroups. The focus of this systematic review is on atherosclerotic CVDs because dairy products may play an important role in prevention of atherosclerotic CVDs. We included only studies considering total CHD and total ischemic stroke as outcomes. Studies on total CHD and total ischemic stroke may be seen as studies on etiology of CHD and ischemic stroke, respectively, whereas studies on only fatal outcomes may be seen as studies exploring combined etiology and prognosis. The associations between dairy and major atherosclerotic CVDs were investigated using high versus low intake and dose–response (linear and non-linear) meta-analysis.

Methods

Protocol and registration

This systematic review and meta-analysis was planned and conducted according to the preferred reporting items for systematic reviews and meta-analyses (PRISMA)10,11. The systematic review protocol, including review question, search strategy, eligibility criteria (participants, exposure, comparator, outcome and study design (PECOS) items) and risk of bias assessment, was registered in PROSPERO International Prospective Register of Systematic Reviews (http://www.crd.york.ac.uk/PROSPERO, identifier CRD42019125455) ahead of conducting the review. The PRISMA checklist has been submitted to the journal as an attachment to this article (see Supplementary Table S1 online).

Eligibility criteria

Participants: Participants recruited from the general adult population.

Exposure: Considering at least one of the exposures total intake of dairy products, intake of low-fat dairy products, intake of high-fat dairy products, intake of milk, intake of low-fat milk (fat content lower than whole-fat milk), intake of high-fat milk (fat content equalling whole-fat milk), intake of yogurt/other soured milk products (referred to hereinafter as yogurt), intake of low-fat yogurt/other soured milk products (fat content lower than whole-fat yogurt/other soured milk products, referred to hereinafter as low-fat yogurt), intake of high-fat yogurt/other soured milk products (fat content equalling whole-fat yogurt/other soured milk products, referred to hereinafter as high-fat yogurt), intake of cheese, intake of low-fat cheese, intake of high-fat cheese and intake of butter.

Comparator: When the exposure is total intake of dairy products, the comparator is a lower total intake of dairy products, and when the exposure is intake of a dairy product subgroup, the comparator is a lower intake of that particular dairy product subgroup.

Outcome: Considering at least one of the outcomes total CHD, total ischemic stroke and peripheral artery disease.

Study design: Cohort study that provides evidence about associations between exposures and incidence of hard outcomes.

Only published articles were considered. There were no restrictions on language.

Search

Studies were identified by searching the bibliographic databases PubMed and Embase (through 15 August 2019). The search included only terms related to exposures and outcomes. The full literature search strategy for PubMed is shown in Supplementary Methods online. Additionally, we checked the reference lists of the included studies. Previous reviews and meta-analyses were also used as an information source.

Study selection

An eligibility assessment was performed independently in a standardised manner by two reviewers (M.U.J. and M.O. or M.U.J. and E.T.). Titles and/or abstracts of records identified through the search were screened against the eligibility criteria. The full text of articles that appeared to meet the study eligibility criteria, or where there was any uncertainty about eligibility, was retrieved and assessed for eligibility. Any disagreement between the two reviewers over the eligibility assessment of a particular study was resolved by discussion. The PRISMA flow diagram11 was used to summarize the study selection processes.

Data collection process and data items

A data extraction sheet (developed a priori by the reviewers and pilot-tested) was used to extract data from the included studies. One reviewer (M.U.J.) extracted the following data items from the studies and a second reviewer (M.O., E.T., M.G.G., C.B.L. or S.K.V.) checked the extracted data: First author’s last name, publication year, cohort name, study origin, recruitment year/period, gender, age at entry, sample size, exclusion criteria, exposure and exposure definition, exposure frequency and quantity (M.O.); method for collecting information on exposure (E.T.); outcome, method for ascertainment of outcome (S.K.V.); length of follow-up, lost to follow-up, adjustment variables; total number of events (M.O.); number of events per exposure level, number of participants or person-years per exposure level (M.G.G. or C.B.L.); point risk estimates for associations and their confidence limits, funding sources (M.O.). When risk estimates from more than one multivariable analysis were reported, we extracted data from the analysis adjusting for the largest number of confounders. Moreover, if risk estimates from specified substitution analyses were reported, only risk estimates for substitutions between dairy products were extracted. Any disagreement between the two reviewers over the extracted data of a particular study was resolved by discussion. We contacted (by e-mail) 12 authors12,13,14,15,16,17,18,19,20,21,22,23 of the included studies for further information on study methods (exposure definition, quantitative amount) and results (number of events per exposure category, number of participants or person-years per exposure category). In case a study was published more than once, we included the most comprehensive study in terms of exposures considered or sample size if exposure was the same.

Risk of bias in individual studies

Risk of bias judgements of the included studies were based on the Newcastle–Ottawa scale (NOS) tool24. Two reviewers (M.U.J. and E.T. or M.U.J. and S.K.V.) independently assessed the internal validity of the included studies by assessing risk of selection bias, information bias and confounding. Any disagreement between the two reviewers over the risk of bias judgements (“low risk” of bias, “high risk” of bias, “unclear risk” of bias) of a particular study was resolved by discussion. A cross-tabulation of study by quality item was used to present the risk of bias judgements. A study was classified as being at low risk of bias in general only if ≤ 2 of the 8 items established a “high risk” or “unclear risk” of bias.
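The overall classification rule can be sketched in a few lines of Python. This is only an illustration of the threshold described above (at most 2 of the 8 items judged "high risk" or "unclear risk"); the function name, the label "not low" for the remaining studies and the example judgements are hypothetical and are not taken from the review.

```python
def overall_risk_of_bias(judgements, max_flagged=2):
    """Classify a study from its item-level judgements.

    judgements: list of 8 strings, each "low", "high" or "unclear".
    A study counts as low risk overall only if at most `max_flagged`
    items are judged "high" or "unclear".
    """
    if len(judgements) != 8:
        raise ValueError("expected judgements for 8 items")
    flagged = sum(1 for j in judgements if j in ("high", "unclear"))
    # "not low" is a placeholder label chosen for this sketch.
    return "low" if flagged <= max_flagged else "not low"


# Hypothetical study with two flagged items: still low risk overall.
example = overall_risk_of_bias(["low"] * 6 + ["high", "unclear"])
```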

Data processing and statistical analysis

A summary table was used to present characteristics of the included studies. The measures of association in the included studies were hazard ratios (the most common measure) and odds ratios, but both measures were treated as risk ratios (RRs) in the meta-analysis25. Standard error estimates for point risk estimates were recovered from confidence limits. Studies for which information on point risk estimates and confidence limits was not reported were left out. Where results were reported in all participants and in men and women separately, we used results in men and results in women.
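Recovering a standard error from confidence limits is usually done on the log scale, where the interval of a ratio measure is symmetric. The Python sketch below shows this standard calculation under the assumption of a 95% Wald-type interval; the function name and the example numbers are illustrative only and do not reproduce the authors' R code.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Standard error of log(RR), assuming the interval is
    exp(log(RR) +/- z * SE)."""
    if lower <= 0 or upper <= lower:
        raise ValueError("need 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Illustrative input: a ratio reported with 95% CI 1.01 to 1.33.
se_log_rr = se_from_ci(1.01, 1.33)
```

The same function applies whether the reported measure is a hazard ratio or an odds ratio, since both were treated as RRs.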

When intake by exposure category was provided as a point estimate, we used this for the corresponding risk estimate. Where a range of intake was provided, we calculated the midpoint. Where a lower intake range was open-ended, we calculated the midpoint between zero and the given lower boundary; and when an upper intake range was open-ended, we assumed that its width was the same as the adjacent category. In case this category was a point estimate, then the nearest intake range was used. For studies where intake was provided only as serving size (and where the quantitative amount was not specified), we used conversion standards (see Supplementary Table S2 online).
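The dose assignment rules above can be sketched as follows. This is a simplified Python illustration: it assumes categories are given as ordered (lower, upper) bounds in g/day with `None` marking an open end, and it does not cover the special case in which the adjacent category is itself a point estimate. The function name and the example categories are hypothetical.

```python
def assign_doses(categories):
    """Return one representative intake (g/day) per exposure category.

    categories: ordered list of (lower, upper) bounds; use None for an
    open-ended bound. At least two categories are assumed.
    """
    doses = []
    for i, (lo, hi) in enumerate(categories):
        if lo is None:
            # Open-ended lowest category: midpoint between zero and the bound.
            doses.append(hi / 2)
        elif hi is None:
            # Open-ended highest category: assume the same width as the
            # adjacent (previous) category.
            prev_lo, prev_hi = categories[i - 1]
            width = prev_hi - (prev_lo if prev_lo is not None else 0.0)
            doses.append(lo + width / 2)
        else:
            # Closed range: plain midpoint.
            doses.append((lo + hi) / 2)
    return doses


# Hypothetical milk intake categories in g/day.
doses = assign_doses([(None, 100), (100, 200), (200, 400), (400, None)])
```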

Quantitative data synthesis was conducted where studies had used similar exposure with similar outcome. For high versus low intake and dose–response meta-analysis, a random-effects model was used to calculate summary RR estimates and corresponding 95% confidence intervals (CIs), which incorporated both within- and between-study variability26. The inverse variance method was used for weighting the studies and the DerSimonian & Laird estimator was used to estimate the between-study variance26. In high versus low intake meta-analysis, the reference category was the lowest intake category (for most studies) or no intake as reported in the included studies.

Linear dose–response meta-analysis was carried out as a two-stage procedure. In the first stage, a slope (trend) for each study was calculated, and in the second stage, the study-specific slope estimates were combined using the random-effects model26. For studies where a linear dose–response trend was reported, we used this for the analysis. For studies not reporting a linear dose–response trend, we calculated study-specific linear trends and corresponding CIs from the natural logs of the point risk estimates of the exposure categories and their confidence limits. The covariance between the log risk estimates within each study was approximated using the method of Greenland & Longnecker27. This method requires information on the distribution of cases and the number of participants or person-years for each exposure category. Thus, studies for which this information was not provided were left out. However, in case the distribution of participants was not provided in studies where the exposure was defined in quantiles, the distribution was calculated by dividing the total number of participants by the number of quantiles. For studies with only two exposure categories, the covariance matrix between the log risk estimates only consisted of the variance of the log risk estimate for the non-referent exposure category. Thus, no covariance matrix had to be approximated by the method of Greenland & Longnecker27, resulting in the exception that when only one non-referent intake category was present, the study was not left out even though the number of participants and the number of cases for the exposure categories were not provided. For linear dose–response meta-analysis, the combined trend was reported as risk per higher intake in grams per day as follows: for milk 200 g/day, yogurt 100 g/day, cheese 20 g/day, butter 6 g/day. For linear dose–response meta-analysis of substitutions between dairy product subgroups, however, the combined trend was reported as risk per 1 serving/day substituted. Studies were left out when information on quantitative amount of exposure categories was not provided.

Non-linear dose–response meta-analysis was also carried out as a two-stage procedure. In the first stage, a restricted cubic spline model using three knots at 10%, 50% and 90% of the total distribution of the reported intake was applied for each study with at least three exposure categories, and in the second stage, the study-specific estimates were combined using the random-effects model26,28. The method of Greenland & Longnecker27 was also used for this analysis. A Wald test was used to test whether the non-linear model could be reduced to a linear model.
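The second-stage pooling step, inverse variance weighting with the DerSimonian & Laird estimate of the between-study variance, can be sketched in plain Python as below. This is an independent illustration of the textbook estimator, not the authors' code (they report using the R packages meta and dosresmeta), and it does not cover the first-stage Greenland & Longnecker covariance approximation. The function name and the input values are hypothetical.

```python
import math


def dersimonian_laird(log_rrs, ses, z=1.959964):
    """Random-effects summary of study-specific log RRs (or log slopes)."""
    w = [1.0 / se ** 2 for se in ses]            # inverse variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rrs)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rrs))
    df = len(log_rrs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_rrs)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)),
        "tau2": tau2,
        "q": q,
    }


# Made-up study-level log RRs and standard errors, for illustration only.
summary = dersimonian_laird([0.10, -0.05, 0.20], [0.08, 0.10, 0.15])
```

When the estimated between-study variance is zero, the random-effects weights reduce to the inverse variance weights and the result coincides with the fixed-effects summary.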

We used Cochran’s Q test and calculated the I2 statistic to explore statistical heterogeneity between studies29. Furthermore, in order to identify potential sources of heterogeneity, we performed pre-specified subgroup meta-analysis by study characteristics (gender and continent (Asia, Europe and North America)). The random-effects model was used within each subgroup, whereas a fixed-effects model was used across subgroups. The qualitative data synthesis revealed that the age of the study population and the length of follow-up time were rather homogeneous across studies; thus, pre-specified subgroup analyses by age at entry (< 50 years, ≥ 50 years) and follow-up time (< 10 years, ≥ 10 years) were not undertaken.
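For reference, Cochran's Q and the I2 statistic can be computed from study-level log RRs and standard errors as in the Python sketch below. It uses the common definition I2 = max(0, (Q − df)/Q) × 100%; the function name is hypothetical, and the P-value for Q is omitted because it needs a chi-square distribution function that the Python standard library does not provide.

```python
def cochran_q_i2(log_rrs, ses):
    """Return (Q, I2 in percent) for a set of study estimates."""
    w = [1.0 / se ** 2 for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w, log_rrs)) / sum(w)
    # Q: weighted sum of squared deviations from the fixed-effects mean.
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, log_rrs))
    df = len(log_rrs) - 1
    # I2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Two made-up, strongly conflicting studies.
q_stat, i2_percent = cochran_q_i2([0.0, 1.0], [0.1, 0.1])
```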

We performed sensitivity analyses excluding studies where > 2 of the quality items in the risk of bias assessment established a “high risk” or “unclear risk” of bias. Furthermore, comparative fixed-effects models were performed as a basis for providing evidence for the necessity of random-effects modelling. Potential publication bias (risk of bias across studies) was assessed both formally with Egger’s test, if at least three studies were available30, and visually using funnel plots of the study-specific point risk estimates by the inverse of their standard errors.
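Egger's test is commonly described as a regression of the standardized effect (estimate divided by its standard error) on precision (inverse of the standard error), where an intercept far from zero suggests funnel plot asymmetry. The Python sketch below computes the intercept and its t statistic under that formulation; it is an assumption-laden illustration rather than the routine used in the review, the function name is hypothetical, and the P-value (from a t distribution with n − 2 degrees of freedom) is not computed.

```python
import math


def egger_intercept(log_rrs, ses):
    """Return (intercept, se_intercept, t) from Egger's regression."""
    n = len(log_rrs)
    if n < 3:
        raise ValueError("at least three studies are required")
    x = [1.0 / se for se in ses]                       # precision
    y = [yi / se for yi, se in zip(log_rrs, ses)]      # standardized effect
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)          # residual variance
    se_int = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t


# Made-up inputs, for illustration only.
intercept, se_int, t_stat = egger_intercept([0.5, 0.25, 0.6], [0.5, 0.25, 0.2])
```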

Statistical analyses were performed using R, version 3.6.131, with packages dmetar32, meta33 and dosresmeta34. Two-sided P-values < 0.05 were considered statistically significant. P-values < 0.1 were given with one significant digit.

We used the NutriGrade35 scoring system to evaluate the quality of evidence of the linear dose–response meta-analyses (i.e. meta-evidence) for each dairy product subgroup. The NutriGrade scoring system (maximum of 10 points), which has shown good agreement and reliability, includes the following eight items: risk of bias assessment of cohort studies (maximum of 2 points), precision (maximum of 1 point), heterogeneity (maximum of 1 point), directness (maximum of 1 point), publication bias (maximum of 1 point), funding bias (maximum of 1 point), effect size (maximum of 2 points) and dose–response (maximum of 1 point). On the basis of this scoring system, four categories to judge the meta-evidence are recommended: high meta-evidence (≥ 8 points), moderate meta-evidence (6–7.99 points), low meta-evidence (4–5.99 points) and very low meta-evidence (0–3.99 points).
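The mapping from a total NutriGrade score to the four recommended categories follows directly from the cut-offs above and can be written as a small Python function. The function name is hypothetical; only the thresholds come from the text.

```python
def nutrigrade_category(score):
    """Map a NutriGrade score (0 to 10 points) to a meta-evidence category."""
    if not 0 <= score <= 10:
        raise ValueError("NutriGrade score must lie between 0 and 10")
    if score >= 8:
        return "high"
    if score >= 6:
        return "moderate"
    if score >= 4:
        return "low"
    return "very low"


# Hypothetical score, for illustration only.
category = nutrigrade_category(6.5)
```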

Associations for total intake of dairy products are not readily comparable between studies because total intake of dairy products represents variable products with variable serving sizes. Therefore, data synthesis was constrained to qualitative data synthesis, without a summary RR estimate.

Results

Out of the 7,768 records identified through database searching and other sources, 88 full-text articles were assessed for eligibility and 33 studies met our eligibility criteria12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56 (see Supplementary Fig. S1 online). The entire list of the 55 full-text articles excluded, with reasons, can be found in Supplementary Table S3 online. Among the studies that met our eligibility criteria, three studies46,49,54 were subsequently excluded due to duplicate publications (Nurses’ Health Study12,54, Caerphilly13,49, Health Professionals Follow-up Study46,55). In three other studies, there was overlap between the study populations (Malmö Diet and Cancer cohort (one of the Swedish contributions to the European Prospective Investigation Into Cancer and Nutrition (EPIC) cohort)16, EPIC-NL cohort (the Dutch contribution to the EPIC cohort)52, EPIC21) and the exposures considered. The three studies were retained but only the results on the common exposures (milk, yogurt, cheese) from the EPIC cohort21 were included in the data synthesis. From the Malmö Diet and Cancer cohort16, the results on total dairy and butter were included in the data synthesis, and from the EPIC-NL cohort52, the results on total dairy, low-fat dairy and high-fat dairy were included in the data synthesis because these exposures differed from those in the EPIC cohort21. Thus, 30 studies were included in the qualitative data synthesis12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,47,48,50,51,52,53,55,56 (20 studies for CHD12,13,16,17,18,19,20,21,22,37,38,39,40,45,48,50,51,52,53,56 and 12 for ischemic stroke13,14,15,23,36,40,41,42,43,44,47,55). Included in the meta-analysis were 18 studies12,13,14,15,16,17,18,19,20,21,22,37,38,40,42,43,44,45 (13 studies for CHD12,13,16,17,18,19,20,21,22,37,38,40,45 and 7 for ischemic stroke13,14,15,40,42,43,44). No studies on peripheral artery disease were identified.

Table 1 shows the characteristics of the 20 studies on CHD12,13,16,17,18,19,20,21,22,37,38,39,40,45,48,50,51,52,53,56. Among the 20 studies, 5 presented gender-specific results17,20,21,50,53, 3 comprised only men13,45,56 and 3 comprised only women12,18,40. A single study was from Asia22, 13 from Europe13,16,18,19,20,21,37,38,39,45,50,52,53 and 5 from North America12,17,40,48,56. A single study comprised 21 countries in 5 continents51. The age at entry ranged from 38 to 72 years and the length of follow-up time from 5 to 22 years. All studies except for two of them22,51 were classified as being at low risk of bias (see Supplementary Table S4 online).

Table 1 Characteristics of the included cohort studies on intake of dairy products and CHD.

Table 2 shows the characteristics of the 12 studies on ischemic stroke13,14,15,23,36,40,41,42,43,44,47,55. Among the 12 studies, 1 study presented gender-specific results55, 3 comprised only men13,15,23 and 3 comprised only women14,36,40. One study was from Asia41, five from Europe13,15,42,43,44 and six from North America14,23,36,40,47,55. The age at entry ranged from 46 to 65 years and the length of follow-up time from 8 to 23 years. All studies except for three of them23,41,43 were classified as being at low risk of bias (see Supplementary Table S4 online).

Table 2 Characteristics of the included cohort studies on intake of dairy products and ischemic stroke.

Supplementary Table S5 online shows the definition of dairy products as described in the studies, and Supplementary Table S6 online comprises an overview of which of the studies contributed to each of the meta-analyses. Supplementary Table S7 online shows the funding sources of the studies.

Total dairy

Overall, the studies indicated no association between total intake of dairy, intake of low-fat dairy or intake of high-fat dairy and risk of CHD or ischemic stroke (see Supplementary Figs. S2–S4 online).

Milk

For milk, 6 studies with 619,460 participants and 16,478 cases were included in the high versus low intake meta-analysis for CHD (overall intake range: 0–710 g/day). Comparing the highest with the lowest category of milk intake, no association was observed for CHD (RR 1.02 (95% CI 0.92–1.13); I2 = 67%; Pheterogeneity (Phet) < 0.01) (see Supplementary Fig. S5 online). Comparing the highest with the lowest category of low-fat milk intake, no association with CHD was observed (RR 1.05 (95% CI 0.92–1.20); I2 = 59%; Phet = 0.03; n studies = 5), whereas comparing the highest with the lowest category of high-fat milk intake, a higher risk of CHD was observed (RR 1.16 (95% CI 1.01–1.33); I2 = 53%; Phet = 0.04; n studies = 6) (see Supplementary Fig. S5 online). The observed heterogeneity between studies was not explained by pre-specified subgroup analyses (see Supplementary Table S8 online). In linear dose–response meta-analysis, no associations were observed between milk intake or low-fat milk intake and CHD (Fig. 1). However, each additional daily 200 g of high-fat milk was associated with an 8% higher risk of CHD (RR 1.08 (95% CI 1.00–1.16); I2 = 0%; Phet = 0.94; n studies = 4) (Fig. 1). Supplementary Table S9 online shows subgroup dose–response meta-analysis by study characteristics. No evidence of non-linear dose–response associations was observed (data not shown).
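Under the log-linear trend assumed in linear dose–response meta-analysis, a summary RR reported per one increment can be re-expressed per another increment by rescaling on the log scale. The Python sketch below illustrates this with the high-fat milk estimate; the function name is hypothetical, and such rescaling is only meaningful within the intake range covered by the contributing studies.

```python
import math


def rescale_rr(rr, per, new_per):
    """Re-express an RR given per `per` g/day as an RR per `new_per` g/day,
    assuming a log-linear dose-response relation."""
    return math.exp(math.log(rr) * new_per / per)


# RR 1.08 per 200 g/day of high-fat milk, re-expressed per 100 g/day.
rr_per_100 = rescale_rr(1.08, per=200, new_per=100)   # about 1.04
percent_higher = (rr_per_100 - 1) * 100               # about 4% higher risk
```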

Figure 1

Linear dose–response meta-analysis. Summary RR of CHD (panel a) and ischemic stroke (panel b) per 200 g higher intake of milk/day. P = 0.12 for the association between milk and CHD. P = 0.04 for the association between high-fat milk and CHD. CHD coronary heart disease; CI confidence interval; M males; RR risk ratio; W women.

Three studies with 163,128 participants and 3,691 cases were included in the high versus low intake meta-analysis for milk and ischemic stroke (overall intake range: 0–710 g/day). Comparing the highest with the lowest category of milk intake, a lower risk of ischemic stroke was observed (RR 0.88 (95% CI 0.79–0.98); I2 = 0%; Phet = 0.52) (see Supplementary Fig. S5 online). In linear dose–response meta-analysis, no association between milk intake and ischemic stroke was observed (Fig. 1). Supplementary Tables S8 and S9 online show subgroup meta-analysis by study characteristics. No evidence of a non-linear dose–response association was observed (data not shown).

Yogurt

For yogurt, 6 studies with 552,342 participants and 14,226 cases were included in the high versus low intake meta-analysis for CHD (overall intake range: 0–440 g/day). Comparing the highest with the lowest category of yogurt intake, no association with CHD was observed (RR 0.99 (95% CI 0.91–1.08); I2 = 49%; Phet = 0.06) (see Supplementary Fig. S6 online). In pre-specified subgroup analysis by continent, however, heterogeneity between continents was observed (P < 0.05) but no association between yogurt and CHD was observed in studies from Europe (RR 0.96 (95% CI 0.89–1.04); I2 = 42%; Phet = 0.12; n studies = 5) or in studies from North America (RR 1.25 (95% CI 0.97–1.61); I2 = 0%; Phet = 0.71; n studies = 1) (see Supplementary Table S10 online). No association between low-fat or high-fat yogurt and CHD was observed (see Supplementary Fig. S6 online). In linear dose–response meta-analysis, no association between yogurt and CHD was observed (RR 0.98 (95% CI 0.93–1.03) per 100 g higher intake per day; I2 = 42%; Phet = 0.11; n studies = 5) (Fig. 2). However, heterogeneity between continents was observed (P < 0.05) but no association between yogurt and CHD was observed in studies from Europe (RR 0.97 (95% CI 0.93–1.01) per 100 g higher intake per day; I2 = 35%; Phet = 0.19; n studies = 4) or in studies from North America (RR 1.20 (95% CI 0.98–1.48) per 100 g higher intake per day; I2 = 0%; Phet = 0.71; n studies = 1) (see Supplementary Table S11 online). No evidence of a non-linear dose–response association was observed (data not shown).

Figure 2

Linear dose–response meta-analysis. Summary RR of CHD (panel a) and ischemic stroke (panel b) per 100 g higher intake of yogurt/day. Yogurt defined as yogurt/other soured milk products. CHD coronary heart disease; CI confidence interval; M males; RR risk ratio; W women.

Three studies with 187,281 participants and 6,208 cases were included in the high versus low intake meta-analysis for yogurt and ischemic stroke (overall intake range: 0–400 g/day). Comparing the highest with the lowest category of yogurt intake, no association for ischemic stroke was observed (see Supplementary Fig. S6 online). Supplementary Table S10 online shows subgroup high versus low intake meta-analysis by study characteristics. Also in linear dose–response meta-analysis, no association between yogurt intake and ischemic stroke was observed (Fig. 2).

Cheese

For cheese, 7 studies with 554,323 participants and 14,698 cases were included in the high versus low intake meta-analysis for CHD (overall intake range: 0–120 g/day). Comparing the highest with the lowest category of cheese intake, a lower risk of CHD was observed (RR 0.91 (95% CI 0.84–0.99); I2 = 37%; Phet = 0.12) (see Supplementary Fig. S7 online). In pre-specified subgroup analysis by gender, however, heterogeneity between genders was observed (P = 0.03) (see Supplementary Table S12 online). In studies among men, comparing the highest with the lowest category of cheese intake, no association with CHD was observed (RR 1.03 (95% CI 0.93–1.13); I2 = 0%; Phet = 0.74; n studies = 3). In studies among women, comparing the highest with the lowest category of cheese intake, a lower risk of CHD was observed (RR 0.82 (95% CI 0.69–0.97); I2 = 33%; Phet = 0.23; n studies = 3). No association between low-fat or high-fat cheese intake and CHD was observed but heterogeneity for low-fat cheese intake was observed (see Supplementary Fig. S7 online). The observed heterogeneity between studies was not explained by subgroup analyses (see Supplementary Table S12 online). In linear dose–response meta-analysis, each additional daily 20 g of cheese was associated with a 4% lower risk of CHD (RR 0.96 (95% CI 0.93–0.98); I2 = 3%; Phet = 0.41; n studies = 6) (Fig. 3) and no heterogeneity between genders was observed (P = 0.55) (see Supplementary Table S13 online). In studies among men, the summary RR estimate for cheese intake of 20 g/day was 0.96 (95% CI 0.92–1.00) and in studies among women, the summary RR estimate for cheese intake of 20 g/day was 0.94 (95% CI 0.90–0.98). No association between low-fat cheese intake and CHD was observed but heterogeneity was observed (Fig. 3). The observed heterogeneity between studies was not explained by subgroup analyses (see Supplementary Table S13 online). No evidence of a non-linear dose–response association was observed (data not shown).

Figure 3

Linear dose–response meta-analysis. Summary RR of CHD (panel a) and ischemic stroke (panel b) per 20 g higher intake of cheese/day. P < 0.05 for heterogeneity for the association between cheese and ischemic stroke. CHD coronary heart disease; CI confidence interval; M males; RR risk ratio; W women.

Three studies with 187,281 participants and 6,208 cases were included in the high versus low intake meta-analysis for cheese and ischemic stroke (overall intake range: 0–400 g/day). Comparing the highest with the lowest category of cheese intake, no association with ischemic stroke was observed (see Supplementary Fig. S7 online). Supplementary Table S12 online shows subgroup high versus low intake meta-analysis by study characteristics. Also in linear dose–response meta-analysis, no association between cheese intake and ischemic stroke was observed but heterogeneity was observed (Fig. 3). The observed heterogeneity between studies was not explained by subgroup analyses (see Supplementary Table S13 online). No evidence of a non-linear dose–response association was observed (data not shown).

Butter

For butter, 4 studies with 128,757 participants and 6,562 cases were included in the high versus low intake meta-analysis for CHD (overall intake range: 0–63 g/day). Comparing the highest with the lowest category of butter intake, no association was observed for CHD (see Supplementary Fig. S8 online). Also in linear dose–response meta-analysis, no association between butter intake and CHD was observed (Fig. 4). Supplementary Tables S14 and S15 online show subgroup meta-analysis by study characteristics.

Figure 4

Linear dose–response meta-analysis. Summary RR of CHD (panel a) and ischemic stroke (panel b) per 6 g higher intake of butter/day. CHD coronary heart disease; CI confidence interval; M males; RR risk ratio; W women.

In linear dose–response meta-analysis, no association between butter intake and ischemic stroke was observed (Fig. 4).

Substitutions between dairy product subgroups

In linear dose–response meta-analysis, no associations between substitutions among dairy product subgroups (low-fat milk, high-fat milk, low-fat yogurt, high-fat yogurt, cheese and butter) and risk of ischemic stroke were observed (Fig. 5).

Figure 5

Linear dose–response meta-analysis. Summary RR of ischemic stroke for substitutions between dairy product subgroups (per 1 serving/day). For example, substitution of low-fat milk for high-fat milk means that low-fat milk takes the place of high-fat milk. For milk and yogurt, the serving size was 200 g, for cheese 20 g, and for butter 6 g. Yogurt defined as yogurt/other soured milk products. CI confidence interval; M males; RR risk ratio; W women.

Sensitivity analyses

One study22, investigating high-fat milk intake and risk of CHD, was excluded in low risk of bias sensitivity analysis. The reanalyses produced small changes in the summary RR estimates and corresponding 95% CIs (see Supplementary Fig. S9 online). Using a fixed-effects model to calculate summary RR estimates and corresponding 95% CIs produced narrower CIs, and eight of the summary RR estimates became statistically significant (see Supplementary Table S16 online). Among these analyses, heterogeneity was observed in four analyses; thus, the narrower CIs from the fixed-effects model were not considered reliable. In four other analyses, no heterogeneity was observed. In these analyses, except for one, the fixed-effects model produced borderline statistical significance with CIs including 0.99. The exception was the linear dose–response meta-analysis of substitution of low-fat yogurt for high-fat yogurt where the fixed-effects model produced statistical significance with a CI of 1.03–1.50 while Cochran’s Q-test for heterogeneity was borderline statistically significant with P = 0.08 (I2 = 67%). In that case, we deem the analysis inconclusive. There was no evidence of publication bias with the exception of high versus low intake meta-analysis of yogurt and risk of CHD (Egger’s test: P < 0.01). Visual inspection of funnel plots suggested moderate small study effects towards the null for yogurt intake and risk of CHD in both high versus low intake and dose–response meta-analysis (see Supplementary Fig. S10 online).

Quality of meta-evidence

Table 3 provides an overview of the findings from linear dose–response meta-analysis on dairy product subgroups and risk of CHD, and NutriGrade meta-evidence grading. The grading was rated moderate for high-fat milk and cheese, and low for milk, low-fat milk, yogurt, low-fat cheese and butter.

Table 3 Linear dose–response meta-analysis on dairy product subgroups and risk of coronary heart disease, and NutriGrade meta-evidence grading.

Table 4 provides an overview of the findings from linear dose–response meta-analysis on dairy product subgroups and risk of ischemic stroke, and NutriGrade meta-evidence grading. The grading was rated low for milk, yogurt, cheese and butter, and very low for all substitutions between dairy product subgroups.

Table 4 Linear dose–response meta-analysis on dairy product subgroups and risk of ischemic stroke, and NutriGrade meta-evidence grading.

Discussion

Associations between intake of dairy product subgroups and major atherosclerotic CVDs in the general adult population were investigated in this systematic review and meta-analysis of cohort studies through comparison of the highest with the lowest intake categories and through dose–response (linear and non-linear) analyses. Intake of high-fat milk was positively associated with the risk of CHD, with heterogeneity present in the high versus low intake meta-analysis that could not be explained by pre-specified subgroup analyses. However, no heterogeneity was observed in the linear dose–response meta-analysis. For total intake of milk and intake of low-fat milk, no associations were observed. Intake of cheese was inversely associated with the risk of CHD, with heterogeneity between genders present in the high versus low intake meta-analysis: high cheese intake compared with low cheese intake was associated with lower CHD risk in studies among women but not in studies among men. However, no heterogeneity between genders was observed in the linear dose–response meta-analysis. For intake of low-fat cheese, yogurt and butter, no associations with CHD were observed. Milk intake was inversely associated with the risk of ischemic stroke in the high versus low intake meta-analysis but not in the dose–response meta-analysis. For intake of cheese, yogurt and butter, no associations with ischemic stroke were observed. The NutriGrade tool for evaluating the quality of meta-evidence suggested moderate confidence in the summary RR estimates for the associations between high-fat milk and cheese and CHD; further research could add evidence on the confidence and may change the effect estimates. For all other summary RR estimates, the tool suggested low confidence in the effect estimates (further research will provide important evidence on the confidence and likely change the estimates) or very low confidence in the effect estimates (meta-evidence is very limited and uncertain).

For total intake of dairy, intake of low-fat dairy and intake of high-fat dairy, data synthesis was constrained to qualitative data synthesis without a summary RR estimate. Overall, these studies indicated no association between total intake of dairy, intake of low-fat dairy or intake of high-fat dairy and CHD or ischemic stroke.

The strengths of our systematic review are the direct evidence, the focus on atherosclerotic CVDs and the inclusion of only studies considering total CHD and total ischemic stroke as outcomes. Previous meta-analyses on dairy intake and CHD and stroke combined results from studies on total and fatal outcomes6,7,8,9. Moreover, our data synthesis was differentiated according to the fat content of the dairy product subgroups. Whole-fat dairy products have a high saturated fat content, and nutrient trials have shown that intake of saturated fatty acids increases serum LDL-cholesterol relative to intake of unsaturated fatty acids and carbohydrates2. Elevated LDL-cholesterol is an important risk factor of atherosclerosis, which is the primary cause of atherosclerotic CVDs1. In addition, we carefully considered dairy product subgroups and contacted authors of the included studies for further information on exposure definition. The quality of a systematic review and meta-analysis depends on the quality of the included studies. The NOS24 is a commonly used tool for quality ranking of cohort studies. However, the NOS tool does not focus on internal validity alone, as was also emphasized by Stang57. We therefore employed a modified instrument based on the NOS tool. We assessed the quality of the included studies on eight items concerning internal validity, categorized into three domains (selection bias, information bias and comparability) similar to the domains in the NOS tool. Studies were classified as being at low risk of bias overall if ≤ 2 of the 8 quality items were rated “high risk” or “unclear risk” of bias. In sensitivity analyses, we repeated the meta-analyses including only studies deemed to be at low risk of bias. The reanalyses produced small changes in the CIs; thus, the confidence limits appeared to be insensitive to judgements about study quality.
We did not include selective reporting of outcome and analyses in our assessment of the quality of the cohort studies because most cohort studies are not registered before being conducted. We performed comparative fixed-effects modelling to evaluate the robustness of the results, which provided evidence for the necessity of random-effects modelling. Finally, we also evaluated the meta-evidence for each dairy subgroup using a comprehensive approach; namely the NutriGrade35 scoring system.
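The classification rule described above (low risk of bias overall if ≤ 2 of the 8 quality items are rated “high risk” or “unclear risk”) can be sketched as follows. This is an illustration only: the item names and the label "not low" for studies failing the rule are hypothetical placeholders, not the wording of the instrument we used.

```python
from typing import Dict

FLAGGED = {"high", "unclear"}          # ratings that count against a study
ALLOWED = FLAGGED | {"low"}

def overall_risk_of_bias(items: Dict[str, str], max_flagged: int = 2) -> str:
    """Return 'low' if at most `max_flagged` of the 8 items are rated high or unclear."""
    if len(items) != 8:
        raise ValueError("expected ratings for exactly 8 quality items")
    unknown = set(items.values()) - ALLOWED
    if unknown:
        raise ValueError(f"unknown rating(s): {sorted(unknown)}")
    flagged = sum(1 for rating in items.values() if rating in FLAGGED)
    # "not low" is a hypothetical label for studies that fail the rule
    return "low" if flagged <= max_flagged else "not low"

# Hypothetical ratings; item names are placeholders for the eight items
study_a = {f"item_{i}": "low" for i in range(1, 9)}
study_a["item_3"] = "unclear"
study_a["item_7"] = "high"             # two flagged items: still low risk

study_b = dict(study_a, item_1="high")  # three flagged items: not low risk

label_a = overall_risk_of_bias(study_a)
label_b = overall_risk_of_bias(study_b)
print(label_a, label_b)
```

In the sensitivity analyses, only studies for which such a rule returns the low-risk label would be retained before the meta-analysis is repeated.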

A limitation of our systematic review is that the number of included studies was low, which limited the possibility to detect heterogeneity for several exposures and to identify potential sources of heterogeneity in subgroup analyses by study characteristics. The qualitative data synthesis revealed that the age at entry of the study populations (≥ 50 years in all studies, except for 2 studies) and length of follow-up (≥ 10 years, except for 1 study) were homogeneous across studies, but sources of bias were present and varied across the studies, as demonstrated in our risk of bias judgement. For example, socioeconomic status is a risk factor of CVD58,59 and low-fat dairy products are more likely to be consumed by groups of higher socioeconomic status60. However, most studies adjusted for education and other relevant risk factors of atherosclerotic CVD such as gender (as appropriate), age, total energy intake, smoking, physical activity, alcohol consumption, body mass index and comorbidity. We observed no evidence of publication bias, with the exception of the high versus low intake meta-analysis of yogurt and risk of CHD (Egger’s test: P < 0.01). However, large P-values do not indicate that publication bias can be safely ignored. In both the high versus low intake and the dose–response meta-analysis, visual inspection of funnel plots suggested moderate small study effects towards the null for yogurt and risk of CHD, based on the right-skewed form of the funnel plots. We considered the two bibliographic databases PubMed and Embase to be the most important information sources to search for studies. We did not supplement our search by consulting other experts in the field, as unpublished studies may tend to be of lower quality and because only a biased sample of such studies can be identified25. On the other hand, it can be argued that inclusion of only published studies may introduce publication bias25.
Furthermore, not searching non-English databases may be a source of publication bias61. We did not search grey literature (such as reports), because grey literature may not have been subject to peer review and therefore may be of lower quality. Searching study registries was not considered relevant. Until recently, specified food substitutions have not been addressed in cohort studies on dairy and CVDs. Thus, summarizing findings on substitutions between dairy products was possible from only two studies. The lack of specified substitutions in most previous cohort studies has implications for the interpretation of the results of our systematic review and meta-analysis, as the effect of a specific dairy product on CVD depends on the replaced foods62.
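For readers unfamiliar with Egger's test, referred to above in connection with publication bias, the following sketch shows the underlying regression: each study's standardized effect (log RR divided by its standard error) is regressed on its precision (1 divided by the standard error), and an intercept far from zero indicates funnel plot asymmetry. The data are hypothetical, and the sketch stops at the t statistic; the P-value would be obtained from a t distribution with n − 2 degrees of freedom.

```python
import math

def egger_regression(log_rrs, ses):
    """Ordinary least squares of (log RR / SE) on (1 / SE); returns intercept statistics."""
    n = len(log_rrs)
    if n < 3 or n != len(ses):
        raise ValueError("need at least three studies with matching standard errors")
    x = [1.0 / se for se in ses]                       # precision
    z = [y / se for y, se in zip(log_rrs, ses)]        # standardized effect
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x                # asymmetry estimate

    # Residual variance and standard error of the intercept
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    s2 = sse / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat, "df": n - 2}

# Hypothetical data in which smaller studies (larger SE) report larger effects
example_log_rrs = [0.05, 0.10, 0.20, 0.35, 0.50]
example_ses = [0.05, 0.10, 0.15, 0.25, 0.35]
egger = egger_regression(example_log_rrs, example_ses)
print(egger)
```

Because the test relies on few data points when the number of studies is low, a non-significant intercept does not rule out publication bias, which is why funnel plots were also inspected visually.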

In their systematic review from 2017, Guo et al.9 summarized findings of cohort studies on the associations between total intake of milk (11 studies), total intake of yogurt (3 studies) and cheese (9 studies) and the risk of CHD, and observed no associations. The meta-analysis on milk intake and CHD risk was subsequently updated with one study by Soedamah-Muthu & de Goede63, and again no association between total intake of milk and CHD was observed. In agreement with these findings, we did not observe any association for total intake of milk or yogurt with CHD. However, we observed an inverse association between intake of cheese and risk of CHD, in agreement with the findings by Chen et al.8 in their meta-analysis of eight prospective observational studies on cheese intake and CHD risk from 2017. The effect of cheese intake on blood lipids was reviewed and summarized in 2015 by de Goede et al.64. The authors found that intake of hard cheese decreased total cholesterol, LDL-cholesterol and high-density lipoprotein (HDL)-cholesterol when compared with intake of butter of a similar ratio of polyunsaturated fatty acids to saturated fatty acids, and speculated that calcium, specific types of saturated fatty acids or the food matrix may explain the findings. However, in a cohort study (published after we finished our literature search), no association with substitution of cheese for butter was observed65. In line with the meta-analysis of three prospective observational studies on butter intake and CHD risk from 2016 by Pimpin et al.7, we observed no association between butter intake and CHD risk. Associations for low- and high-fat milk intake with major atherosclerotic CVDs have not been considered in previous systematic reviews. We observed that higher intake of high-fat milk was associated with higher risk of CHD. For intake of low-fat milk, no association was observed.
High-fat milk has a high saturated fat content that may partly explain the findings of a higher risk of CHD associated with a higher intake of high-fat milk. Results from dietary trials have shown that whole-fat milk increases total cholesterol and LDL-cholesterol more than low-fat milk such as skim milk66. However, the effects of whole-fat milk on HDL-cholesterol and the ratio of total cholesterol to HDL-cholesterol were less clear66.

In their systematic review from 2016, de Goede et al.6 summarized findings of cohort studies on the association between intake of dairy products and risk of stroke (ischemic and hemorrhagic combined). In subgroup analyses by stroke type (ischemic and hemorrhagic), associations with total intake of milk were investigated. No association between total intake of milk and risk of ischemic stroke (5 studies) was observed, but the 95% CI was compatible with meaningful benefits, in line with our findings. In addition, no association between yogurt, cheese or butter and ischemic stroke was observed in our meta-analysis.

In conclusion, this systematic review and meta-analysis of cohort studies indicates a positive association between high-fat milk and the risk of CHD, with heterogeneity present in high versus low intake meta-analysis but not in linear dose–response meta-analysis. The observed heterogeneity in high versus low intake meta-analysis could not be explained by pre-specified subgroup analyses. In addition, this systematic review and meta-analysis of cohort studies indicates an inverse association between intake of cheese and the risk of CHD, with heterogeneity present in high versus low intake meta-analysis but not in linear dose–response meta-analysis. The observed heterogeneity in high versus low intake meta-analysis could partly be explained by gender. The NutriGrade meta-evidence grading was rated moderate for the associations between high-fat milk and cheese and the risk of CHD. For all other associations between dairy product subgroups and risk of CHD or risk of ischemic stroke, the meta-evidence grading was rated low or very low. No studies on peripheral artery disease were identified. Studies with more details about types of dairy products, including fat content, are warranted. Furthermore, future studies should investigate substitutions between dairy product subgroups. Findings from dairy food substitution analyses are important in deriving food-based dietary guidelines. Finally, future studies should describe their methods and data in as much detail as feasible, to facilitate later systematic reviews and meta-analyses.