Breast cancer (BC) is the most common cancer among women worldwide and constitutes the leading cause of cancer death among women in medium- and high-income countries (Ferlay et al, 2013).

Diet is a key modifiable risk factor for many chronic diseases but, except for alcohol consumption (WCRF/AICR, 2007), the evidence on the effect of individual dietary factors on BC risk is inconclusive (Romieu, 2011). Some authors argue that focussing on overall dietary patterns, instead of on individual foods or nutrients, may better capture the variability of the population’s diet while allowing the evaluation of interactions between dietary factors (Jacques and Tucker, 2001; Hu, 2002; Barkoukis, 2007). However, the evidence of a possible association between dietary patterns and BC risk remains weak (Edefonti et al, 2009).

The primary objective of this study was to evaluate the association between dietary patterns and BC risk in Spanish women according to menopausal status and intrinsic tumour subtype. The secondary objective was to compare the results of these patterns with those of two dietary quality scores: the Alternate Healthy Eating Index (AHEI) and the Alternate Mediterranean Diet Score (aMED) (Fung et al, 2005).

Materials and methods

GEICAM case–control study

Data came from a Spanish case–control study on female BC. We recruited 1017 incident cases of BC diagnosed in the Oncology Departments of 23 hospital members of the Spanish Breast Cancer Research Group, GEICAM. The hospitals are located in 9 of the 17 Spanish regions and account for 78% of the Spanish population. Each case was matched with a healthy control of similar age (±5 years), selected from cases’ in-law relatives, friends, neighbours, or work colleagues residing in the same town.

Cases were subclassified into the following intrinsic subtypes based on local pathology reports (Goldhirsch et al, 2011): (1) luminal human epidermal growth factor receptor 2 (HER2)-negative tumours (oestrogen receptor (ER)+ or progesterone receptor (PR)+ with HER2−); (2) HER2+ tumours (HER2+ irrespective of ER or PR results); and (3) triple-negative tumours (ER−, PR− and HER2−). ER, PR and HER2 positivity was defined according to ASCO/CAP guidelines (Hammond et al, 2010; Wolff et al, 2013).

The EpiGEICAM study was approved by the Ethics Committees of all 23 participating hospitals. All participants signed an informed consent form.


Cases and controls completed a structured questionnaire collecting information on demographic and anthropometric characteristics, personal and family history, past physical activity and diet. Postmenopausal status was defined as absence of menstruation in the past 12 months.

Dietary intake in the past 5 years was estimated using a 117-item semiquantitative food frequency questionnaire (FFQ) (Willett et al, 1985) adapted to and validated in different Spanish adult populations (Vioque et al, 2007, 2013). The responses for each food item were converted to mean daily intake in grams and grouped into 26 food groups (Supplementary Appendix 1), excluding noncaloric and alcoholic beverages. Alcohol was excluded from the food groups because it is an already established risk factor for breast cancer (WCRF/AICR, 2007, 2010; IARC, 2012); it was instead included in the analyses as a possible confounder.

Major existing dietary patterns were identified in the control population by applying principal components analysis (PCA) without rotation to the variance–covariance matrix of food group intakes. The PCA reduces a set of intercorrelated variables (here, the 26 food groups) to a smaller set of principal components, that is, dietary patterns. The components are linear combinations of food groups, optimally weighted so that successive components account for decreasing proportions of the total variation in food group intake. Principal components were identified among controls because they represent the general population. The first k components that together explained 70% of the total variability in dietary intake were selected for initial exploration. We considered food groups with absolute component loadings ≥0.3 to contribute strongly to a dietary pattern. From the initial k components, we retained those that were intuitively meaningful. For each retained principal component, a score was then calculated for cases and controls by summing the intakes of food groups (centred to have zero means) weighted by their component loadings.
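The pattern extraction described above can be illustrated with a minimal sketch. It is not the study's code: the function name `dietary_patterns`, the simulated intakes and the use of only six food groups are hypothetical, and two points the text leaves open are treated as assumptions (loadings are taken to be unit-norm eigenvectors, and intakes are centred on the control means).

```python
import numpy as np

def dietary_patterns(controls, everyone, var_target=0.70, loading_cut=0.3):
    """Unrotated PCA on the variance-covariance matrix of control intakes.

    controls : (n_controls, n_groups) array of daily intakes in grams
    everyone : (n_subjects, n_groups) array to be scored (cases and controls)
    """
    mean = controls.mean(axis=0)                 # assumption: centre on control means
    cov = np.cov(controls, rowvar=False)         # variance-covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)         # eigh returns ascending eigenvalues
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]
    explained = eigval / eigval.sum()
    # smallest k whose cumulative explained variance reaches the target
    k = int(np.searchsorted(np.cumsum(explained), var_target)) + 1
    k = min(k, eigval.size)
    loadings = eigvec[:, :k]                     # assumption: unit-norm eigenvectors
    strong = np.abs(loadings) >= loading_cut     # food groups that define a pattern
    scores = (everyone - mean) @ loadings        # loading-weighted sum of centred intakes
    return explained[:k], loadings, strong, scores

# Hypothetical data: 200 controls and 200 cases, 6 food groups
rng = np.random.default_rng(0)
controls = rng.gamma(shape=2.0, scale=50.0, size=(200, 6))
cases = rng.gamma(shape=2.0, scale=50.0, size=(200, 6))
explained, loadings, strong, scores = dietary_patterns(
    controls, np.vstack([cases, controls])
)
```

Because the components are derived from controls only, the control scores have zero mean by construction, whereas case scores are free to shift. Deciding which of the k components are "intuitively meaningful" remains a judgement that the code cannot make.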

The Alternate Healthy Eating Index (AHEI) measures adherence to the food guide pyramid (USDA, 1992) and the dietary guidelines for Americans (USDA, 1995), whereas the aMED measures adherence to a Mediterranean diet (Fung et al, 2005). The AHEI score is based on fixed standards of intake, whereas aMED uses the median of consumption for each item as an internal standard. Calculations of AHEI and aMED were based on the criteria described by Fung et al (2005), with the only difference being that we did not take into account the contribution of alcohol consumption. Briefly, AHEI starts with a score of 0 and reaches a maximum of 75 (80 when alcohol is included). Points are earned according to the agreement with the general recommendations regarding consumption of fruits, vegetables, nuts and soy, ratio of white to red meat, cereal fibre, trans fats, ratio of polyunsaturated to saturated fat and long-term multivitamin use (Supplementary Appendix 2). After excluding alcohol, the aMED ranges from 0 to 8: subjects start with a score of 0 and receive one point for consumption above the median intake of each of the following seven food and nutrient groups: vegetables, legumes, fruit, nuts, whole grains, fish and ratio of monounsaturated to saturated fat. An additional point is added for consumption of red and processed meats below the median intake (Supplementary Appendix 3).
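As an illustration of the median-based scoring, the sketch below computes an alcohol-free aMED for a toy sample. The item names and the four hypothetical subjects are invented for the example, and the medians are taken from the sample passed in, which is an assumption: the text does not state the population in which the medians were computed.

```python
import numpy as np

# Hypothetical item names; one point each for intake ABOVE the sample median
BENEFICIAL = ["vegetables", "legumes", "fruit", "nuts",
              "whole_grains", "fish", "mufa_sfa_ratio"]
# One point for intake BELOW the sample median
DETRIMENTAL = ["red_processed_meat"]

def amed_scores(intakes):
    """Return the alcohol-free aMED (0 to 8) for each subject.

    intakes : dict mapping item name -> sequence of intakes, one per subject
    """
    n = len(next(iter(intakes.values())))
    score = np.zeros(n, dtype=int)
    for item in BENEFICIAL:
        x = np.asarray(intakes[item], dtype=float)
        score += (x > np.median(x)).astype(int)
    for item in DETRIMENTAL:
        x = np.asarray(intakes[item], dtype=float)
        score += (x < np.median(x)).astype(int)
    return score

# Four hypothetical subjects: the last two eat more of every beneficial item
# and less red and processed meat than the first two.
toy = {item: [1, 2, 3, 4] for item in BENEFICIAL}
toy["red_processed_meat"] = [4, 3, 2, 1]
amed = amed_scores(toy)
```

In this toy sample every median is 2.5, so the first two subjects score 0 and the last two score the maximum of 8.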

Associations of individual food groups with AHEI and aMED were evaluated using Pearson’s correlation coefficients. To be consistent with the cut point used for the PCA-derived dietary patterns and to allow for comparison, we considered meaningful all correlations with an absolute value ≥0.3.

Statistical analysis

Body mass index (BMI; 10%), physical activity in the past year (8%), age at first delivery (5%), smoking habit (<1%), education (<1%) and age at menarche (<1%) contained missing values. In order to obtain unbiased estimates of the effect of each dietary pattern using the information provided by all case–control pairs, missing values were imputed using multiple imputation with chained equations (Lee and Carlin, 2010; White et al, 2011). As explained in Royston and White (2011), the chained equations method imputes missing values in several steps. Initially, all missing values are filled in at random. The first variable with at least one missing value, say smoking, is then regressed on the other variables, including those whose missing values were filled in at random in the initial step (BMI, physical activity, age at first delivery, education and age at menarche) and another set of potential explanatory variables that contain no missing values (menopausal status, age, number of children, hip and waist circumferences, bra size, calories, alcohol consumption and case–control status). The estimation is restricted to individuals with observed values for smoking, and the missing values are replaced by simulated draws from the posterior predictive distribution of smoking. The next variable with missing values, say age at menarche, is regressed on all the other variables, including the imputed values of smoking, again restricting estimation to individuals with observed values for the variable being imputed. Again, missing values for age at menarche are replaced by draws from the posterior predictive distribution. This process is repeated until a stable imputation is found for all values of all variables. Following this process, we created five imputed data sets that were used for subsequent analyses. The final estimate of association is a weighted average of the effects found in these five data sets.
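The combination of results across the five imputed data sets can be sketched with Rubin's rules, the usual way of pooling multiply imputed analyses. This is an assumption about the pooling step rather than a description of the software routine actually used: under Rubin's rules the pooled point estimate is the plain mean of the per-data-set estimates, and the total variance adds a within-imputation and a between-imputation component. The log odds ratios and variances below are hypothetical.

```python
import math

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates (e.g. log odds ratios) with Rubin's rules.

    estimates : one point estimate per imputed data set
    variances : the matching squared standard errors
    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                        # pooled point estimate
    within = sum(variances) / m                       # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between            # total variance
    return q_bar, math.sqrt(total)

# Hypothetical log odds ratios from m = 5 imputed data sets
log_or = [0.36, 0.38, 0.40, 0.37, 0.39]
var = [0.026] * 5
pooled, pooled_se = pool_rubin(log_or, var)
pooled_or = math.exp(pooled)                          # back to the odds ratio scale
```

Pooling is done on the log odds scale and exponentiated afterwards, because the sampling distribution of a log odds ratio is closer to normal than that of the odds ratio itself.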

The associations of dietary patterns and quality scores with BC risk were evaluated using separate conditional logistic regression models, both in quartiles and as a continuous term (per s.d. increment). All models included the following potential confounders: total calories, alcohol consumption, BMI from self-reported weight and height (BMI=kg m−2), physical activity in the past year, smoking, education, history of breast disease other than cancer, family history of BC, age at menarche, age at first delivery and menopausal status. The same models were then refitted with an interaction term between menopausal status and the corresponding dietary pattern or score to evaluate potential effect modification by menopausal status.
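For 1:1 matched pairs, the conditional likelihood depends only on the within-pair difference between the case's and the control's covariates, so the model can be fitted as a logistic regression on those differences with no intercept. The sketch below implements that special case with Newton–Raphson; the function name and the toy exposure data are hypothetical, and it is meant to show the principle rather than to reproduce the adjusted models fitted in the study.

```python
import numpy as np

def fit_matched_pairs(x_case, x_control, n_iter=25, tol=1e-10):
    """Conditional logistic regression for 1:1 matched case-control pairs.

    The conditional log-likelihood is sum_i log sigmoid(beta . d_i), with
    d_i = x_case_i - x_control_i. Returns the log odds ratios and their SEs.
    """
    d = np.asarray(x_case, dtype=float) - np.asarray(x_control, dtype=float)
    if d.ndim == 1:
        d = d[:, None]                                # single covariate
    beta = np.zeros(d.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-d @ beta))           # P(case is the observed case)
        grad = d.T @ (1.0 - p)                        # score vector
        hess = -(d * (p * (1.0 - p))[:, None]).T @ d  # Hessian of the log-likelihood
        step = np.linalg.solve(hess, grad)
        beta = beta - step                            # Newton-Raphson update
        if np.max(np.abs(step)) < tol:
            break
    se = np.sqrt(np.diag(np.linalg.inv(-hess)))       # from the observed information
    return beta, se

# Hypothetical binary exposure in four pairs: three pairs with only the case
# exposed, one pair with only the control exposed.
beta, se = fit_matched_pairs([1, 1, 1, 0], [0, 0, 0, 1])
odds_ratio = np.exp(beta)
```

With a single binary exposure the estimate reduces to the ratio of discordant pairs, here 3 to 1, which gives a quick check on the implementation. Pairs in which case and control share the same exposure contribute nothing to the likelihood.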

Multinomial logistic regression models were used to evaluate the association of patterns with each of the aforementioned intrinsic BC subtypes. These models were adjusted for age, hospital and the same set of potential confounders described above. The Wald test was used to compare the dose–response effect across tumour subtypes.

Quartile categories of patterns/scores were merged with the immediately preceding category when they contained fewer than five cases.

For comparison purposes, the analyses for AHEI and aMED indexes were repeated considering the original score, including alcohol intake. Finally, the validity of the imputation was checked by comparing the results obtained with the effects resulting from the analyses of the data with complete information for all variables (complete case analysis).

Analyses were performed using Stata/MP 12.0 (StataCorp, College Station, TX, USA).

Results

After excluding 44 case–control pairs (n=88) because of incomplete data on diet or implausible reported energy intakes (<750 or >4500 kcal per day) in either the case or the control, final analyses were based on 973 case–control pairs.
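The pair-level exclusion rule can be written down in a few lines. This is a minimal sketch under stated assumptions: the function names and the toy energy intakes are hypothetical, and a missing intake is represented by `None`. The key point is that a pair is dropped as a whole whenever either member fails the check, which keeps the matching intact.

```python
def plausible(kcal, low=750, high=4500):
    """True when an energy intake is reported and lies within 750-4500 kcal per day."""
    return kcal is not None and low <= kcal <= high

def keep_pairs(pairs):
    """Keep a matched pair only if both the case and the control pass the check."""
    return [(case, control) for case, control in pairs
            if plausible(case) and plausible(control)]

# Hypothetical (case, control) energy intakes in kcal per day
toy_pairs = [(2000, 1800), (700, 2100), (2500, 4600), (750, 4500), (None, 2000)]
kept = keep_pairs(toy_pairs)
```

Applied to the study numbers, the same rule leaves 1017 − 44 = 973 pairs.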

Figure 1 depicts the correlation between food consumption and ‘a posteriori’ Western, Prudent and Mediterranean and ‘a priori’ AHEI and aMED scores in the control population. The PCA led to three selected dietary patterns that respectively explained 16%, 13% and 8% of the variation in intake of the 26 food groups. The first component – labelled the Western pattern – was characterised by high intakes of high-fat dairy products, processed meat, refined grains, sweets, caloric drinks and other convenience food and sauces and by low intakes of low-fat dairy products and whole grains. The second component – named the Prudent pattern – denoted high intakes of low-fat dairy products, vegetables, fruits, whole grains and juices. The third component – the Mediterranean pattern – loaded high in fish, vegetables, legumes, boiled potatoes, fruits, olives and vegetable oil, and low in juices. All three dietary patterns correlated positively with energy intake. The ‘a priori’ scores, AHEI and aMED, seemed to be similar to our Mediterranean pattern in terms of their correlations with specific foods (fish, fruit, vegetables and olive oil, and even legumes in the case of aMED).

Figure 1 Linear correlation between food consumption and ‘a posteriori’ Western, Prudent and Mediterranean and ‘a priori’ AHEI and aMED scores.

Compared with controls, BC cases showed greater adherence to the Western pattern and were more likely to report higher energy intakes, lower levels of physical activity and formal education, a higher age at first delivery, a history of breast problems and a family history of BC (Table 1).

Table 1 Distribution of scores from ‘a posteriori’ and ‘a priori’ developed diet patterns and other baseline characteristics for cases and controls (the EpiGEICAM study)

A higher Western pattern score was associated with higher odds of BC, with the OR for the top vs the bottom quartile being 1.46 (95% CI 1.06–2.01). This association was stronger in premenopausal women (OR=1.75; 95% CI 1.14–2.67). Conversely, a higher Mediterranean pattern score was associated with a lower BC risk, with an OR comparing the top with the bottom quartile of 0.56 (95% CI 0.40–0.79). No differences were observed between pre- and postmenopausal women for this pattern. For both patterns, the linear dose–response trend was significant. No association was found for the Prudent pattern. Higher scores of AHEI and aMED were also inversely associated with BC risk, although the effect sizes were smaller than that of the Mediterranean pattern (Table 2).

Table 2 Adjusted odds ratios for the association between breast cancer incidence and scores of adherence to ‘a posteriori’ and ‘a priori’ diet patterns, by menopausal status

Even though coefficients for the Western pattern point to a possible direct association with BC risk in women with HER2− tumours independently of ER/PR status, no statistically significant differences were observed among BC subtypes (P-value of heterogeneity=0.87). Interestingly, the protective effect of the Mediterranean pattern was stronger for triple-negative tumours (OR for the fourth quartile=0.32; 95% CI 0.15–0.66), with a steeper dose–response trend compared with other subtypes (P-value of heterogeneity=0.04). Results for AHEI and aMED also showed a protective effect, especially in the case of triple-negative BC, even if the effect size was again smaller (Table 3).

Table 3 Adjusted OR of BC risk related to adherence to ‘a posteriori’ and ‘a priori’ developed patterns by type of tumour

The complete case sensitivity analyses led to very similar results (data not shown).

Discussion

According to our results, a Western dietary pattern was positively associated with BC risk. This relationship seems stronger among premenopausal women (more likely to report this type of diet in our sample). In contrast, our Mediterranean pattern was associated with a reduced BC risk. Whereas a protective effect was observed for all tumour subtypes, the size of the effect was larger for triple-negative tumours. Compared with ‘a priori’ indices, ‘a posteriori’ patterns identified actual dietary patterns that better capture specific habits of the population.

Limitations and strengths

First, recall bias is always a concern in case–control studies. Some recent reviews have suggested that recall bias might be responsible for the association found between a Western pattern and BC risk in case–control studies, as the evidence for such an effect is less clear in cohort studies (Brennan et al, 2010; Schwingshackl and Hoffmann, 2014). However, most of the studies included in these reviews were conducted in Western countries, with the consequent limitations in the variability of the diet. The number of studies carried out in middle-income countries, where the variability in food intake is wider and food supplementation less prevalent, is still insufficient to reach firm conclusions in this regard (Romieu, 2011). On the other hand, cohort studies rarely update the information provided by the participants at baseline, and this implies that, in many instances, diet was assessed many years before breast cancer occurrence. If the diet changes, this may entail an important degree of misclassification and decrease the probability of observing effects that depend more on current than on past exposure. In our case, the temporal window considered in the FFQ covered the 5 years before diagnosis. Furthermore, the great geographical variability achieved by selecting recruiting centres all over the country ensured the representation of the different diets coexisting within Spain. The fact that some regions exhibit a higher adherence to a Mediterranean dietary pattern increased the variability and, thus, the power to detect an effect and to differentiate the Mediterranean pattern from other patterns that might look similar but show no beneficial effect in terms of BC prevention (Prudent pattern). In addition, the validity and reproducibility of the FFQ used in this study were satisfactory (Vioque et al, 2007, 2013), and the strength of the associations makes it unlikely that our findings are a result of recall bias.

Second, statistical power was limited in the stratified and subgroup analyses by tumour type. In addition, the matching design resulted in closely related cases and controls, which would bias the OR towards the null. In spite of these limitations, we were able to detect a consistent dose–response gradient for some dietary patterns, even in the stratified and subgroup analyses.

Finally, a few studies have explored the association between dietary patterns and BC risk by oestrogen and progesterone receptor status (Baglietto et al, 2011; Woo et al, 2012) but, to our knowledge, none has reported results by HER2 status. Our paper fills in this important gap by exploring the link between different diet patterns and BC risk by tumour subtype, including HER2 status.

Results in relation with other studies

Previous research on BC risk and dietary patterns developed with PCA supports the dichotomy of a Western/Unhealthy vs a Prudent/Healthy pattern (Edefonti et al, 2009). Most studies report a harmful effect of a Western/Unhealthy diet (Cui et al, 2007; Cottet et al, 2009; De Stefani et al, 2009) and a protective effect of a Prudent/Healthy diet (Cottet et al, 2009; De Stefani et al, 2009; Wu et al, 2009; Demetriou et al, 2012) on BC risk. However, past studies have not been able to differentiate between a Prudent and a Mediterranean pattern in the same population. The former pattern captures a low-calorie and low-fat eating profile, whereas the latter is characterised by a high consumption of fish, legumes, vegetable oils and whole fruits. Supporting our findings, recent BC prevention intervention studies revealed that a reduction in fat consumption is not sufficient for reducing BC incidence (Prentice et al, 2006; Martin et al, 2011), whereas some prospective studies confirm the potential primary preventive effect of a Mediterranean diet on BC risk (Trichopoulou et al, 2010; Buckland et al, 2013). The evidence regarding the association between BC risk and n-3/n-6 polyunsaturated fatty acids (PUFAs) also supports this Western/Mediterranean dichotomy. The n-3 and n-6 PUFAs can influence breast tumour cell growth by simultaneously competing for the same metabolic pathway (COX and LOX pathway) to change the balance of tissue eicosanoids, the transcription mediated by nuclear factor-κB (NF-κB) and signal transduction mediated by the mammalian target of rapamycin (mTOR) (Yang et al, 2014). An increase in the intake of n-3 (present in fish, nuts and other vegetables commonly used in the Mediterranean areas) while decreasing the intake of n-6 (present mainly in refined vegetable oils used in cookies, crackers, sweets and fast food, typically included in the Western diet) might reduce BC risk (Zheng et al, 2013; Yang et al, 2014). In fact, a recently published commentary recommends the modernised Mediterranean diet as an effective strategy to achieve an optimal balance between n-3 and n-6 PUFAs, reducing the overall cancer risk and specifically BC risk (de Lorgeril and Salen, 2014).

Further biological explanation can be found in a recent review that summarises the possible mechanisms explaining the effect of the Mediterranean diet on cancer risk (Grosso et al, 2013). On the one hand, fruits and vegetables are rich in antioxidants that seem to inhibit the growth of several tumours by blocking multiple cancer-related biological pathways, such as carcinogen bioactivation, cell signalling, cell cycle regulation, angiogenesis and inflammation. On the other hand, whole grains (Mediterranean pattern) contain carbohydrates with a lower glycaemic index (GI) than refined grains (Western pattern). Products with a high GI are more insulin demanding, and the insulin–IGF axis has been directly related to cancer promotion. Olive oil counts tyrosol and hydroxytyrosol among its potentially health-promoting components; these have been shown to decrease glutathione (GSH), the activation of the transcription factor NF-κB and cell death, all of which may be implicated in carcinogenic processes. Some of these pathways seem to be particularly important in the development of triple-negative tumours (Davis and Kaklamani, 2012; Paul et al, 2014). According to a recent review, olive oil consumption seems to reduce the risk of several tumours, including breast cancer (Pelucchi et al, 2011). Moreover, a negative association between olive oil and breast density has also been reported in our country (Garcia-Arenzana et al, 2014).

Regarding the AHEI and aMED indices, both were negatively associated with breast cancer in our study. These two scores are based on similar recommendations and both capture diets high in long-chain n-3 fatty acids (Fung et al, 2005). The already-mentioned beneficial effect of olive oil consumption may explain the lower OR obtained with the aMED index, with oleic acid being a monounsaturated fatty acid (Pelucchi et al, 2011). The original AHEI and aMED were based on recommendations addressing cardiovascular diseases and therefore positively scored moderate consumption of alcohol (Fung et al, 2005). Given the evidence of a detrimental effect of alcohol consumption on BC risk (WCRF/AICR, 2007, 2010; IARC, 2012), we decided to consider alcohol as an important confounder and excluded this item from the scores, following the example of a previous study that explored the association between Mediterranean diet and BC risk (Buckland et al, 2013). As was the case in that study, the association between the original aMED and BC was attenuated, and the same was true for the AHEI index (see Supplementary Appendices 4 and 5). It should be noted that alcohol consumption was very moderate in our women, as one-third of them did not drink, and 75% of our drinkers had less than 1 cup per day (mean=0.66 cup per day), mainly consuming wine (mean=0.27 cups per day) or beer (mean=0.35 cups per day). Interestingly, alcohol intake was not correlated with any of the ‘a posteriori’ patterns identified here (Pearson’s correlation coefficients: Western=0.020, Prudent=−0.020 and Mediterranean=0.033, respectively), nor did it modify the effect of these two patterns (data not shown).

Our results support past evidence regarding the stronger association between the Western pattern and BC risk in premenopausal than in postmenopausal women (Murtaugh et al, 2008; Agurs-Collins et al, 2009). These differences may be related to the greater adherence to the Western pattern in younger women, as reported by Garcia-Arenzana et al (2012).

Finally, among the few studies comparing the results obtained using ‘a priori’ and ‘a posteriori’ methods, two were carried out mainly in Asian women (Wu et al, 2009; Butler et al, 2010) and failed to identify a Mediterranean pattern using PCA, making a clear comparison impossible. The third study, conducted in a Mediterranean population, did obtain results similar to ours (Demetriou et al, 2012). Regarding our differential results by tumour subtype, our findings imply a stronger protective effect against triple-negative tumours. This is supported by other studies that also suggest a protective effect of a diet characterised by high consumption of vegetables, fruits and/or legumes in ER−/PR− tumours (Baglietto et al, 2011; Buckland et al, 2013) and HER2− tumours (Woo et al, 2012).


As diet is a modifiable risk factor, the identification of harmful and beneficial dietary habits, as well as the characterisation of the population most susceptible to such habits, is essential for the design of BC prevention policies. Our results provide novel information on these two fronts. Although the potential harmful effect of a Western diet on BC risk is widely known, the benefit of a diet rich in fruits, vegetables, legumes, oily fish and vegetable oils over a diet that is merely low in calories and fat is still disregarded, even by the scientific community. We also identified a stronger detrimental effect of Western dietary habits in younger women, pinpointing them as a main target for future preventive policies. Younger women, at least in Spain, exhibit an unhealthier lifestyle profile than their older counterparts, including a clear departure from the traditional Mediterranean diet (Garcia-Arenzana et al, 2012).

Finally, according to our results, adherence to a Mediterranean diet is particularly beneficial against triple-negative tumours. As these tumours are more aggressive, this protective effect should be further explored.