In 2012, colorectal cancer (CRC) was the third most common cancer worldwide, with 1 360 000 cases diagnosed (Ferlay et al, 2015). Dietary and other lifestyle choices play a significant role in CRC development (Bingham, 2000; Willett, 2005; World Cancer Research Fund/American Institute for Cancer Research, 2007). The panel of the World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR) in the Continuous Update Project (CUP) judged as ‘convincing’ the evidence that foods high in dietary fibre protect against CRC, and that consumption of red meat, processed meat and alcohol (especially in men) increases the risk of developing CRC (World Cancer Research Fund/American Institute for Cancer Research, 2011). Milk and calcium have been related to a probable reduction in CRC risk, whereas evidence is limited regarding a possible protective effect of folate, selenium and vitamin D (World Cancer Research Fund/American Institute for Cancer Research, 2011).

In order to further advance our understanding of the relationship between diet and cancer development, the CUP panel recommended the use of new methods of investigation, such as patterns of diets in contrast to individual foods or nutrients (World Cancer Research Fund/American Institute for Cancer Research, 2007, 2011). Dietary patterns summarise a large number of correlated items (e.g., foods, ‘food patterns’, or nutrients, ‘nutrient patterns’) into fewer independent components capturing a large proportion of the dietary variability in a population (van Dam, 2005; O’Sullivan et al, 2011). This approach may be particularly relevant for multifactorial diseases such as CRC, where the aetiology possibly depends on more than a restricted list of dietary items (Hu, 2002; Jacobs and Steffen, 2003).

Most of the observational studies that have investigated the association between dietary patterns and CRC risk have focussed on food patterns, summarised in four literature reviews (Randi et al, 2010; Miller et al, 2010; Yusof et al, 2012; Fung and Brown, 2013). Despite notable differences in population characteristics, study design and methods used across the different studies, a plant-based diet with some dairy and fish was associated with a lower CRC risk, whereas a diet high in meats, refined grains and added sugar appeared to increase risk (Randi et al, 2010; Miller et al, 2010; Yusof et al, 2012; Fung and Brown, 2013). However, no studies published to date have examined associations between dietary patterns at the nutrient level and CRC risk in a large prospective cohort. Although results from pattern analyses conducted on foods are easier to translate into public health recommendations (Jacobs and Tapsell, 2007), nutrient pattern studies have several other advantages. Nutrients are universal exposures, that is, virtually everyone is exposed, and functionally not exchangeable. In contrast to food patterns, nutrient patterns characterise specific dietary habits more easily and in a more comparable way across populations and mirror better a combination of nutrients in complex biological pathways associated with diseases (Bravi et al, 2010; De Stefani et al, 2012; Moskal et al, 2014).

In the European Prospective Investigation into Cancer and Nutrition (EPIC) study, a large prospective cohort study across 23 centres in 10 European countries, we previously identified four main nutrient patterns based on dietary questionnaire data using principal component analysis (PCA) and successfully validated these patterns relative to standardised 24-h dietary recalls (Moskal et al, 2014).

Building on this previous methodological work, the main aim of this study was to investigate the relationship between four main nutrient patterns and the risk of CRC in the EPIC study. In addition, we investigated associations by anatomical subsite (colon, rectum) and location of the tumour within the colon (proximal, distal).

Materials and methods

Study population

The EPIC study is a multicentre prospective cohort designed to investigate the associations between diet, cancer and other chronic diseases across 10 European countries: Denmark, France, Germany, Greece, Italy, the Netherlands, Norway, Spain, Sweden and the United Kingdom. Participants were recruited between 1992 and 1998 and included 521 330 men and women aged 35–70 years. Details on recruitment and study design have been published previously (Riboli and Kaaks, 1997; Riboli et al, 2002). All participants gave written informed consent, and the project was approved by ethical review boards of the International Agency for Research on Cancer and local participating centres.

Dietary data and lifestyle questionnaires

Usual diet was assessed at study baseline using validated country/centre-specific dietary questionnaires (DQs) (Riboli et al, 2002). In most centres, DQs were self-administered, with the exception of Greece, Ragusa (Italy), Naples (Italy) and Spain, where face-to-face interviews were performed. Extensive quantitative DQs were used in northern Italy, the Netherlands, Germany and Greece; in Spain, France and Ragusa, the DQs were structured by meals. Semiquantitative food-frequency questionnaires (FFQs) were used in Denmark, Norway, Naples and Umeå (Sweden). In the United Kingdom, both a semiquantitative FFQ and a 7-day record were used, whereas a method combining a short nonquantitative FFQ with a 7-day record on hot meals was used in Malmö (Sweden) (Riboli et al, 2002).

Individual intakes of 23 nutrients, alcohol and total energy were estimated from the baseline country-specific DQs using a common harmonised food composition database (EPIC Nutrient Database, ENDB) (Slimani et al, 2007; Nicolas et al, 2016). Supplement use (vitamins/minerals) was not included in the calculation of nutrient intakes.

Information on physical activity, history of tobacco smoking, alcohol consumption and education was collected at baseline by questionnaires. Weight and height were measured at the baseline examination in all centres except in France, Norway and part of Oxford, where weight and height were self-reported (Riboli et al, 2002).

Cohort follow-up and identification of CRC cases

Population cancer registries were used in Denmark, Italy, the Netherlands, Norway, Spain, Sweden and the United Kingdom to identify participants with incident cancer. In France, Germany and Greece, cancer cases were identified through active follow-up, directly through study participants or next of kin, and confirmed by a combination of methods including health insurance records, cancer and pathology registries. All self-reported CRC cases were systematically verified using clinical and pathological records. Mortality data were obtained from mortality registries at the regional or national level. The participants were followed up from the date of enrollment (1992–1998) until the first date of diagnosis of cancer, death or until end of the follow-up period.

Cancer follow-up censoring dates varied among centres, ranging between 2005 and 2010, but completeness was >98.5% across all centres. For the current study, the end point of interest was the first occurrence of primary CRC. Cancer incidence data were coded in accordance with the 10th Revision of the International Classification of Diseases (ICD-10) and the second revision of the International Classification of Diseases for Oncology (ICDO-2). All incident cases of colon (C18) and rectal cancer (C19–C20) were included. Proximal colon cancer included tumours of the caecum, appendix, ascending colon, hepatic flexure, transverse colon and splenic flexure (C18.0–18.5). Distal colon cancer included those in the descending and sigmoid colon (C18.6–18.7). Overlapping (C18.8) and unspecified (C18.9) lesions of the colon were grouped among colon cancers only. Cancer of the rectum included cancer occurring at the rectosigmoid junction (C19) and rectum (C20). Anal canal tumours were excluded.
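The subsite grouping described above can be written as a small lookup. The following is a minimal sketch only; the function name `classify_crc_site` is hypothetical, and treating a bare `C18` code (no sub-code) as unspecified is an assumption that is not stated in the text.

```python
def classify_crc_site(icd10_code):
    """Map an ICD-10 code to (site, colon_location) following the grouping in the text.

    Returns None for codes outside the end point (for example anal canal, C21).
    """
    code = icd10_code.strip().upper()
    if code.startswith("C18"):
        # Sub-code after the dot, e.g. "C18.6" -> 6.
        # Assumption: a bare "C18" is handled like C18.9 (unspecified).
        sub = int(code.split(".")[1]) if "." in code else 9
        if 0 <= sub <= 5:
            return ("colon", "proximal")
        if sub in (6, 7):
            return ("colon", "distal")
        return ("colon", "overlapping or unspecified")
    if code.startswith("C19") or code.startswith("C20"):
        # Rectosigmoid junction and rectum are both counted as rectal cancer.
        return ("rectum", None)
    return None


for example in ("C18.2", "C18.7", "C18.8", "C19", "C21"):
    print(example, classify_crc_site(example))
```

Cases in the third colon group count towards colon cancer overall but towards neither the proximal nor the distal analysis, which mirrors the statement that such lesions were grouped among colon cancers only.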

Exclusion criteria in analysis

Among the 521 330 EPIC participants, we excluded 11 345 subjects with missing dietary or other lifestyle information and 10 241 subjects in the lowest or highest 1% of the ratio of reported total energy intake to energy requirements (estimated from age, sex and body weight). In addition, 22 432 participants were excluded because they had a prevalent cancer at any site at baseline other than non-melanoma skin cancer or were lost during the follow-up. Statistical pattern analyses were therefore conducted on 477 312 participants.
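As a check on the arithmetic, the exclusions reported above can be applied in sequence. This is a simple illustration using only the counts given in the text; the variable and function names are hypothetical.

```python
recruited = 521_330
exclusions = {
    "missing dietary or other lifestyle information": 11_345,
    "lowest or highest 1% of energy intake to requirement ratio": 10_241,
    "prevalent cancer at baseline or lost during follow-up": 22_432,
}


def analytic_sample(total, excluded):
    """Subtract each exclusion in turn and report the running total."""
    remaining = total
    for reason, n in excluded.items():
        remaining -= n
        print(f"after excluding {reason}: {remaining}")
    return remaining


final_n = analytic_sample(recruited, exclusions)
```

The running total ends at the 477 312 participants on whom the pattern analyses were conducted.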

Nutrient patterns

The identification, validation and interpretation of the patterns have been described previously (Moskal et al, 2014). Briefly, nutrient pattern analyses were performed on individual intakes of 23 harmonised nutrients as available in the ENDB estimated from the baseline DQs (Supplementary Table S1). Alcohol consumption was considered as a lifestyle factor and was thus not included in the PCA (Moskal et al, 2014). Independence of scale of the variances and co-variances was achieved by taking the natural log of the input variables. In order to capture variability of nutrient intakes independently from variation in energy intake, nutrients (log variables) were adjusted for alcohol-free energy using the nutrient density method (Willett et al, 1997) before applying PCA. Nutrient patterns were then identified by PCA using the covariance matrix. The data were sufficiently similar across centres and hence the first four PCs from the combined data captured 67% of the total variation (Moskal et al, 2014). PCA was conducted on both sexes combined (Moskal et al, 2014). Based on the interpretability of the patterns, the percentage of total variance explained and the scree plots of eigenvalues, four PCs or nutrient patterns were retained (Johnson and Wichern, 2007). The PC loadings represent how much a variable contributes to a pattern (Supplementary Table S1). Nutrients from plant food sources such as β-carotene, vitamin C, folate and dietary fibre loaded positively on PC1, whereas nutrients typically found in animal foods such as retinol, vitamin D, vitamin B12, cholesterol and saturated fatty acids loaded negatively. A variety of vitamins and minerals contributed to PC2; PC3 was characterised mainly by vitamin D; and PC4 by vitamin B12, riboflavin, calcium, cholesterol, total proteins and phosphorus (Supplementary Table S1). Individual PC scores were computed for each retained PC as the sum of products of the observed variables and their loadings (that is, each nutrient intake multiplied by its loading; Moskal et al, 2014).
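The steps summarised above (log transformation, energy adjustment, PCA on the covariance matrix, score computation) can be illustrated with a small sketch. It is not the procedure of Moskal et al (2014): the per-1000-kcal form of the nutrient density, the power-iteration routine (which extracts only the first component) and the toy intakes are all assumptions made for illustration, and every function name is hypothetical.

```python
import math


def log_density(intake, energy_kcal):
    """Natural log of intake per 1000 kcal of alcohol-free energy (assumed form)."""
    return math.log(intake / energy_kcal * 1000.0)


def centre_and_covariance(rows):
    """Column-centre the data and return (centred rows, sample covariance matrix)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    return centred, cov


def first_component(cov, iterations=500):
    """Leading eigenvector (loadings of the first PC) by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pattern_scores(centred, loadings):
    """Score = sum over nutrients of (centred log-density x loading)."""
    return [sum(x * l for x, l in zip(row, loadings)) for row in centred]


# Invented toy data: fibre (g), vitamin C (mg), folate (ug), alcohol-free energy (kcal).
toy = [
    (18.0, 80.0, 250.0, 2000.0),
    (25.0, 140.0, 340.0, 2100.0),
    (12.0, 50.0, 180.0, 2400.0),
    (30.0, 160.0, 400.0, 1900.0),
    (20.0, 95.0, 280.0, 2200.0),
]
logged = [[log_density(x, kcal) for x in nutrients] for *nutrients, kcal in toy]
centred, cov = centre_and_covariance(logged)
loadings = first_component(cov)
scores = pattern_scores(centred, loadings)
print("loadings:", [round(l, 3) for l in loadings])
print("scores:", [round(s, 3) for s in scores])
```

Because the covariance matrix (rather than the correlation matrix) is used, nutrients with larger variance on the log scale carry more weight in the components. The sign of a component is arbitrary, so loadings are interpreted by their relative direction and size.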

Statistical analyses

The association of each nutrient pattern score with CRC risk was investigated as the primary objective, and then by anatomical subsite (colon, rectum) and location of the tumour within the colon (proximal, distal). Hazard ratios (HRs) and 95% confidence intervals (CIs) were estimated using Cox proportional hazards models. Age was the primary time variable in all models; time at entry was age at recruitment and exit time was the age at whichever of the following came first: CRC diagnosis, diagnosis of a cancer other than CRC, death, emigration or the date at which follow-up was considered complete in each centre. To control for differences in questionnaires, follow-up procedures and other centre effects, all analyses were stratified by study centre. Models were also stratified by sex and age at recruitment (1-year intervals) (Ferrari et al, 2008). Nutrient patterns were modeled as continuous variables (ln (HR) per 1 s.d. increase) and as quintiles defined across the whole cohort. To test for linear trends, the medians of each quintile were entered as continuous terms in the Cox model. Graphs based on Schoenfeld residuals were used to assess the proportional hazards assumption, which was satisfied.
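The three codings of the pattern scores described above (per 1 s.d., cohort-wide quintiles, and quintile medians for the trend test) can be sketched as follows. This covers only the construction of the exposure variables, not the Cox model itself; the function names are hypothetical and the cut-point rule is the default of Python's `statistics.quantiles`, which may differ slightly from the rule used in the original analysis.

```python
import bisect
import statistics


def assign_quintiles(scores):
    """Quintile (1-5) of each score, with cut-points taken across the whole cohort."""
    cuts = statistics.quantiles(scores, n=5)  # four cut-points
    return [bisect.bisect_right(cuts, s) + 1 for s in scores]


def trend_variable(scores):
    """Replace each score by the median score of its quintile (linear trend term)."""
    quintiles = assign_quintiles(scores)
    medians = {
        q: statistics.median(s for s, k in zip(scores, quintiles) if k == q)
        for q in set(quintiles)
    }
    return [medians[q] for q in quintiles]


def per_sd(scores):
    """Rescale scores so that a one-unit increase corresponds to 1 s.d."""
    sd = statistics.stdev(scores)
    return [s / sd for s in scores]


toy_scores = [float(i) for i in range(1, 11)]
print(assign_quintiles(toy_scores))
print(trend_variable(toy_scores))
```

Entering the quintile medians as a single continuous term gives a one-degree-of-freedom test of a monotonic dose-response, whereas the quintile indicators allow for a non-linear shape.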

All analyses were conducted for men and women combined as no significant interaction by sex was observed. First, models were adjusted for log-transformed (to improve normality) total energy intake (continuous, kcal per day) and mutually for each retained pattern score (minimally adjusted model). Then, in our fully adjusted model, we further adjusted for log-transformed alcohol intake (continuous, g per day, +1 was added before log transformation to take zero consumers into account), smoking status at baseline (never, former, smoker, unknown n=9733, 2%), height (sex- and centre-specific tertiles), weight (sex- and centre-specific tertiles) and physical activity level (inactive, moderately inactive, moderately active, active, missing n=66 265, 13.9%). Further adjustment for education (none/primary school; technical/secondary school; longer education incl. university degree; unknown n=10 707, 2.2%) and for BMI (sex- and centre-specific tertiles) instead of weight were also assessed, but led to virtually the same risk estimates (Supplementary Table S2).

Sensitivity analyses were performed, first, by excluding CRC cases diagnosed within 2 or 5 years of follow-up and, second, by excluding underreporters of dietary energy intake identified according to Goldberg, an approach that has been shown to partially account for BMI-related biases (Freisling et al, 2012).

Likelihood ratio tests were used in nested models with and without multiplicative interaction terms to evaluate potential effect modification. A priori selected interaction terms entered into the statistical model were pattern scores as continuous variables with sex, total energy intake (log-continuous), alcohol intake ((+1)log-continuous), height (continuous), weight (continuous), BMI (continuous), smoking status (never, former, smoker, unknown) and physical activity (inactive, moderately inactive, moderately active, active, missing).
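For a single added interaction term (for example, a continuous pattern score multiplied by a continuous covariate), the likelihood ratio test can be sketched as below. The log-likelihood values are invented and the function name is hypothetical. The closed form used for the P-value holds only for 1 degree of freedom; interactions with categorical variables such as smoking status add several terms and need the chi-square distribution with the corresponding degrees of freedom.

```python
import math


def lr_test_1df(loglik_without, loglik_with):
    """Likelihood ratio test for one added interaction term (1 degree of freedom)."""
    statistic = max(2.0 * (loglik_with - loglik_without), 0.0)
    # For 1 df, the chi-square upper tail probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical partial log-likelihoods of nested Cox models.
stat, p = lr_test_1df(loglik_without=-41250.8, loglik_with=-41249.6)
print(f"LR statistic = {stat:.2f}, P = {p:.3f}")
```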

The heterogeneity across countries/centres was explored by using random-effect meta-analysis (Greenland and Longnecker, 1992) and quantified by I2 scores (Higgins and Thompson, 2002).
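A minimal sketch of how country-specific estimates could be pooled and their heterogeneity quantified is given below. The text does not name the between-study variance estimator, so the DerSimonian-Laird moment estimator is used here as an assumption; the function name and the input values are hypothetical. The I2 statistic follows the usual definition as the percentage of variation beyond that expected by chance, (Q − df)/Q.

```python
import math


def random_effects_meta(log_hrs, std_errors):
    """Pool log hazard ratios; return (pooled HR, tau^2, I^2 in per cent)."""
    if len(log_hrs) < 2:
        raise ValueError("at least two estimates are needed")
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, log_hrs)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hrs))  # Cochran's Q
    df = len(log_hrs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-country variance (assumed DL estimator)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * y for wi, y in zip(w_star, log_hrs)) / sum(w_star)
    return math.exp(pooled), tau2, i2


# Invented country-specific log(HR) values and standard errors.
hr, tau2, i2 = random_effects_meta([-0.05, -0.10, 0.02, -0.07], [0.04, 0.05, 0.06, 0.03])
print(f"pooled HR = {hr:.3f}, tau^2 = {tau2:.4f}, I^2 = {i2:.0f}%")
```

When Q does not exceed its degrees of freedom, both the between-country variance and I2 are set to zero and the pooled estimate coincides with the fixed-effect estimate.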

Statistical tests were two sided and P-values of <0.05 were considered statistically significant. Statistical analyses were performed with SAS version 9.4 (Cary, NC, USA) and Stata version 12.1 (StataCorp, College Station, TX, USA).


Results

After a mean follow-up of 11.3 (s.d. 2.5) years, 4517 incident CRC cases were identified among the 477 312 participants and 3 763 676 person-years. This included 2869 colon cancer cases (1266 distal, 1298 proximal and 305 overlapping or unspecified) and 1648 rectal cancer cases. The distributions of CRC cases and person-years by country/centre are shown in Table 1.

Table 1 Characteristics of study subjects and CRC cases (first tumour only) in the EPIC cohort by country

Characteristics of study population

Table 2 shows the main characteristics of the study population by quintiles of the four PCs. A higher proportion of participants with higher scores on PC1 and PC2 were women, had a university degree and were never smokers (all P<0.001). A higher proportion of participants with higher scores on PC3 and PC4 were men and had less than a university degree (all P<0.001). Food consumption across quintiles of the four nutrient patterns differed considerably for some food groups (Table 2). For example, mean fruit and vegetable consumption more than doubled comparing the highest with the lowest quintile of PC1. Subjects with high scores on PC2 showed higher mean consumption of fruits and vegetables, fish, and milk and yoghurt. The PC3 was characterised by a high consumption of fish, whereas PC4 was characterised by a high consumption of fish, and in particular of milk and yoghurt (Table 2).

Table 2 Characteristics of study subjects and their food consumption by lowest, middle and highest quintiles of the four main nutrient patterns (PC1–PC4)

Nutrient patterns and CRC risk

The PC1 was not associated with the risk of overall CRC or subsites, with the exception of a suggestive inverse association with cancer of the distal colon, when considered as a continuous score (Table 3). The HR for cancer of the distal colon for a 1 s.d. increase in PC1 was 0.93 (95% CI: 0.86–1.00). However, there was no indication of a linear trend when we compared the quintiles of PC1 scores (Ptrend=0.23). The PC2 was inversely associated with CRC risk (HR per 1 s.d. increase 0.94, 95% CI: 0.92–0.98) (Table 4). This association was confirmed when we compared the highest with the lowest quintile of PC2 (HR Q5 vs Q1 0.88, 95% CI: 0.80–0.98, Ptrend=0.02). The PC2 score was inversely associated with cancer of the colon and proximal colon, but not with distal colon and rectal cancers (Table 4). Significant associations of PC3 with risk of CRC and with subsites were largely absent (Table 5). The PC4 was inversely associated with CRC risk (HR per 1s.d. increase 0.96, 95% CI: 0.93–0.99) (Table 6). When comparing the highest with the lowest quintile of PC4, we observed a risk reduction by 10% (HR Q5 vs Q1 0.90, 95% CI: 0.81–0.99, Ptrend=0.02). Similar to PC2, the inverse association for PC4 was significant for colon and proximal colon but not for distal colon and rectum cancers (Table 6).
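The way the reported HRs translate into percentage changes in risk (for example, an HR of 0.90 read as a risk reduction of 10%) can be sketched as follows. The coefficient and standard error in the usage lines are invented for illustration and are not estimates from this study; the function names are hypothetical.

```python
import math


def hazard_ratio(beta, se, z=1.96):
    """Convert a Cox coefficient and its standard error to (HR, lower, upper)."""
    return (math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se))


def percent_change(hr):
    """Relative change in hazard implied by a hazard ratio, in per cent."""
    return (hr - 1.0) * 100.0


# Invented coefficient for a 1 s.d. increase in a pattern score.
hr, lower, upper = hazard_ratio(beta=-0.05, se=0.02)
print(f"HR = {hr:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
print(f"change in hazard for HR 0.90: {percent_change(0.90):.0f}%")
```

Because the interval is symmetric on the log scale, it is slightly asymmetric around the HR itself.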

Table 3 HRs (95% CI) for CRC and subsites by quintiles (Q) of PC1 score and for 1 s.d. increase in PC1 score
Table 4 HRs (95% CI) for CRC and subsites by quintiles (Q) of PC2 score and for 1 s.d. increase in PC2 score
Table 5 HRs (95% CI) for CRC and subsites by quintiles (Q) of PC3 score and for 1 s.d. increase in PC3 score
Table 6 HRs (95% CI) for CRC and subsites by quintiles (Q) of PC4 score and for 1 s.d. increase in PC4 score

Sensitivity analyses, effect modification and heterogeneity by country/centre

After excluding cancer cases diagnosed within the first 5 years of follow-up, the association between PC1 and cancer of the distal colon became stronger (HR per 1 s.d. increase 0.89, 95% CI: 0.81–0.98); results for associations between PC2–PC4 and CRC were largely unchanged (Supplementary Table S2).

Exclusion of participants with implausibly high or low total energy intake (n=163 084) led to virtually the same risk estimates (Supplementary Table S2).

No significant statistical interactions for the association of pattern scores and CRC risk, and the two subsites colon and rectum, were observed with total energy intake, alcohol intake, height, weight, BMI, smoking status and physical activity (all P>0.06) (data not shown).

Although there were no significant interactions for associations of pattern scores and CRC risk, and the two subsites colon and rectum, with sex (all P>0.07), results from analyses conducted separately for men and women are provided in Supplementary Table S2. In men, we observed a stronger association between PC2 and CRC (HR Q5 vs Q1 0.84, 95% CI: 0.71–0.99, Ptrend=0.02) as compared with the combined estimates, whereas associations were weaker in women and lost statistical significance (HR Q5 vs Q1 0.95, 95% CI: 0.83–1.11, Ptrend=0.36). Results for PC1, PC3 and PC4 remained virtually the same after stratification by sex.

There was an indication for moderate heterogeneity by country/centres for PC1 (I2=43%, Pheterogeneity 0.07) (Supplementary Figure S1) and PC3 (I2=42%, Pheterogeneity 0.07) (Supplementary Figure S3). The forest plot for PC1 (Supplementary Figure S1) suggested that the HR for Norway differed substantially from that of the other countries. After removing Norway from the analysis, no heterogeneity was observed (I2=22%, Pheterogeneity 0.23) and the estimated HR for a 1s.d. increase in PC1 was 0.97 (95% CI: 0.93–1.00). No heterogeneity by country/centre and CRC risk was observed for PC2 (I2=0%, Pheterogeneity 0.88) (Supplementary Figure S2) and PC4 (I2=0%, Pheterogeneity 0.77) (Supplementary Figure S4).


Discussion

In this prospective analysis, we identified a nutrient pattern characterised by a high variety of vitamins and minerals (PC2) and a nutrient pattern characterised by vitamin B12, riboflavin, calcium, cholesterol, total proteins and phosphorus (PC4) to be independently and inversely associated with CRC risk. Associations for both PC2 and PC4 appear to be more pronounced for cancers of the colon, in particular of the proximal colon, than for distal colon or rectal cancer. The remaining two patterns studied were not significantly associated with the risk of CRC or its anatomical subsites.

To our knowledge, this is the first prospective cohort study relating dietary patterns at the nutrient level to CRC risk. Two case–control studies have identified a pattern labelled as ‘vitamins and fibre’ (Bravi et al, 2010; Turati et al, 2011), similar to the PC2 in our analysis that was found to be associated with a reduced risk of rectal cancer (Bravi et al, 2010). The results of a recent review of case–control and cohort studies on dietary patterns and CRC risk indicate that a plant-based pattern with moderate amounts of dairy, a pattern that is usually characterised by a high variety of vitamins and minerals, was associated with a lower CRC risk (Fung and Brown, 2013). Possible explanations for the apparent protective effect of PC2 may come from the antioxidant properties of nutrients with high loadings on PC2 including β-carotene and vitamin C that may reduce cell proliferation (World Cancer Research Fund/American Institute for Cancer Research, 2007; Stone et al, 2014). One of the most striking differences between PC1 and PC2 is the differential loading of vitamin B12. This would support the relevance of the one-carbon metabolism pathway contributing to colon carcinogenesis potentially through modification of DNA methylation (Kim et al, 2004; Davis and Uthus, 2004; Ulrey et al, 2005; Ulrich et al, 2008; Zhang et al, 2013). Interestingly, the patterns associated with lower CRC risk (i.e., PC2 and PC4) are those that are concomitantly associated with folate and vitamin B12. Furthermore, PC4, but also PC2, loaded on calcium (loadings on PC2 and PC4 were 13% and 28% respectively) that may have an antitumourigenic effect in the colon through reductions in cellular proliferation, by binding secondary bile acids and ionised fatty acids (Aune et al, 2012), and promotion of differentiation and apoptosis in both normal and tumour colorectal cells (Lamprecht and Lipkin, 2001). 
Riboflavin (vitamin B2), another nutrient with a high loading on PC4, is a micronutrient that plays a pivotal role in one-carbon metabolism and has already been implicated in a reduced risk of CRC in EPIC (Eussen et al, 2010) and in a large cohort of women (Zschabitz et al, 2013).

We did not find a significant association between PC1 and overall CRC risk. However, there was a small suggestive inverse association with cancer of the distal colon and also with overall CRC risk after excluding subjects from Norway. In a subanalysis stratified by country, we found that Norway contributed to most of the heterogeneity observed for associations between PC1 and CRC risk. A previous study within EPIC using a single 24-h recall has shown that intakes of nutrients with high loadings on PC1 such as dietary fibre, folate and vitamin C were consistently below the overall EPIC mean in Norway (Freisling et al, 2010). These nutrients therefore appear to be underrepresented in the Norwegian diet in comparison with the EPIC overall mean. As this was not apparent in the centre-wide PCA based on long-term dietary instruments, we decided to keep Norway in the overall PCA.

The PC1 shares some of the nutrient characteristics of PC2, with the exception of being poor in nutrients from animal food sources such as retinol and vitamin B12, and also in calcium, phosphorus and riboflavin. As reported previously, in terms of contributing foods, individuals with a high adherence to PC1, compared with individuals with high adherence to PC2, had slightly higher intakes of vegetables and fruits, cereals and cereal products, but substantially lower intakes of milk and yoghurt, and also of fish (Moskal et al, 2014). A high consumption of fruits and vegetables may not be sufficient to markedly reduce the risk of CRC. A pooled analysis of 14 prospective studies concluded that fruit and vegetable intake is not strongly associated with a decreased risk of colon cancer overall, but may be associated with a lower risk of distal colon cancer (Koushik et al, 2007), and this is in line with our finding of a 7% reduced risk of distal colon cancer per 1 s.d. increase in PC1. These findings deserve further investigation in future studies.

In a case–control study, a nutrient pattern similar to our PC3 was labelled ‘unsaturated fats (animal source)’, and was positively related to the risk of CRC (Bravi et al, 2010), but in our study no significant associations were found. This is also in line with a meta-analysis of dietary vitamin D, where no significant associations with CRC risk were found (Huncharek et al, 2009).

None of our nutrient patterns reflected a nutrient profile associated with a strikingly high consumption of red and/or processed meat (Table 2), and this is probably a main reason why we did not observe a higher CRC risk associated with any of the four patterns. For example, mean consumption of red meat in the highest quintile of PC4 was <50 g per day, and this is at or below the current dietary guidelines for cancer prevention, that is, <70 at population level or <40 g per day at individual level (Norat et al, 2015).

The main strengths of our study include its prospective design with a long follow-up, and the very large sample size combined with a large number of incident cancer cases. The analysis benefitted from these unique features of the whole EPIC cohort allowing us to explore differences in CRC risk by anatomical subsites and countries. We believe that reverse causation (changes in dietary habits because of early symptoms of an undetected cancer) is unlikely because our risk estimates remained virtually unchanged after excluding cases identified within the first 5 years of follow-up. On the other hand, we were not able to account for potential changes in diet during follow-up, because diet was assessed only at baseline. However, previous studies have demonstrated a reasonable stability of dietary patterns over time (Hu et al, 1999).

We used common nutrient patterns across 23 centres from 10 countries, accounting for a wide heterogeneity in the diet (Slimani et al, 2002; Freisling et al, 2010), allowing generalisation across these diverse populations. In this regard, we would like to highlight the consistency of findings, in particular for PC2 and PC4, across countries/centres.

Finally, by exploring macro- and micronutrients, the present study aimed to provide a holistic representation of individuals’ diet. It is likely that the combined intake of multiple dietary factors acts synergistically on a given health outcome and such cumulative effects may be easier to observe in a population setting (Hu, 2002; Jacobs et al, 2009). Individuals also vary considerably in their susceptibility to events that either inhibit or contribute to cancer development in response to the very same dietary component (World Cancer Research Fund/American Institute for Cancer Research, 2007). Therefore, the net ‘synergistic’ effect of a dietary pattern may also be that a greater proportion of individuals of a population are susceptible to at least one dietary component of a given pattern. One of the main advantages of dietary patterns at the nutrient level is the potential to identify combinations of nutrients that could reflect underlying biological mechanisms. Nutrient patterns can therefore be seen as complementary to food patterns and ideally confirm observed associations between foods and a health outcome. Furthermore, because different foods contributed to the same nutrient patterns, it is unlikely that results are confounded by other dietary compounds not captured by a given pattern, adding further strength to our findings. This appears to be particularly the case for nutrient patterns, where many nutrients contribute to a given pattern (i.e., PC2 and PC4). In contrast, PC3 seems to be mainly driven by vitamin D and PC1 shows sizeable positive associations only with β-carotene and vitamin C. Indeed, for these two patterns (PC1 and PC3), confounding by ‘healthier’ foods across countries cannot entirely be excluded and this may be one explanation why heterogeneity in associations with CRC across countries was larger for PC1 and PC3 as compared with PC2 and PC4.

Some caution is warranted in the interpretation of our findings. First, nutrient patterns identified by exploratory techniques are meant to reflect dietary habits independently from any a priori hypotheses regarding known or unknown dietary effects on health outcomes. Indeed, PCA aims at maximising the fraction of variance explained by a weighted linear combination of the original nutrients. However, the nutritional components that are most variable might not be those that are most strongly associated with disease (McCann et al, 2001). As a consequence, the dietary patterns identified and used in diet–disease association studies are not necessarily relevant for cancer risk, and this could explain, at least in part, the lack of association reported in some studies (McCann et al, 2001; Randi et al, 2010). In future analyses, statistical methods of dimension reduction that take the outcome into account, such as partial least squares (PLS) or latent class analysis, could be of interest to link patterns to cancer outcomes.

Second, although several statistical tests were performed in this work, no Bonferroni correction was applied. As individuals do not belong exclusively to one pattern, but rather their dietary habits can span different patterns, there is an inherent degree of intercorrelation between some of the components of a given pattern (although the patterns themselves are uncorrelated with each other). This is likely to make any multiple testing correction too restrictive.

Third, this study relied on dietary questionnaires to assess nutrient intakes that are prone to measurement errors. In addition, questionnaires used to estimate food intakes were country specific and might have introduced systematic between-country differences in dietary assessment. However, our models were stratified by country/centre to account for such potential differences and adjusted for total energy intakes partly accounting for such differences and measurement errors (Ferrari et al, 2008). In addition, a harmonised nutrient database was used to convert food consumption into nutrient intakes (Slimani et al, 2007) that should improve the comparability of nutrient intakes across the countries participating in EPIC. A specific limitation of the ENDB is that not all essential nutrients are available because of the lack of available and reliable data. For example, in our nutrient patterns we could not separate the essential fatty acids linoleic and linolenic acids from total polyunsaturated fatty acids or account for sodium (salt) intake that are all key nutrients in dietary recommendations or cancer prevention guidelines (World Cancer Research Fund/American Institute for Cancer Research, 2007). However, we have good reasons to believe that the lack of sodium in our nutrient patterns did not affect our observed associations because it appears that salt intake is not associated with CRC risk (but with stomach cancer risk) (World Cancer Research Fund/American Institute for Cancer Research, 2007). Similarly, the evidence for an association of n-3 and n-6 fatty acids with CRC development is not very strong (Song et al, 2014; Chen et al, 2015) and total PUFAs did not explain a large part of the variation in nutrient intakes in EPIC.

Fourth, although we were able to adjust for most of the established CRC risk factors, we cannot exclude the possibility of residual confounding; for example, we were not able to adjust for use of nonsteroidal anti-inflammatory drugs (NSAIDs) (Rothwell et al, 2010) as this information was not collected in EPIC. Furthermore, genetic factors, which may play a role in CRC development, were not accounted for in this analysis. In future analysis, it would be worthwhile to investigate how genetic factors may have interacted with nutrient patterns on CRC risk.

Finally, a disadvantage of a nutrient-based approach is that nutrients are less directly related to dietary recommendations because, ultimately, nutrient intakes are largely determined by the choice of food sources. As many food sources exist for the same nutrient, it is challenging to make food-based dietary recommendations. However, there is accumulating evidence that health-conscious consumers are increasingly using nutrient-based information to make healthier food choices (Hersey et al, 2013). In addition, future studies may also look to integrating biological marker information with nutrient pattern analyses and further improve our understanding of the role of dietary factors in chronic disease aetiology, where nutrient patterns act as an interface between food patterns and the food metabolome (O’Sullivan et al, 2011). Nonetheless, in previous descriptive analyses (and briefly in Table 2), we were able to identify the related food sources that characterised each of the nutrient patterns in EPIC (Moskal et al, 2014).

Even if the inverse associations of PC2 and PC4 with CRC risk were causal, a shift of, for example, one quintile upwards in the distribution of nutrient pattern scores would be unlikely to have a major effect on cancer incidence. However, the results contribute to an improved understanding of how overall dietary patterns affect CRC development.


In summary, our study provides the first assessment of the relation between nutrient patterns and CRC risk in a large prospective multicentre cohort setting. Of the four nutrient patterns examined, the results were not statistically significant for PC1, a pattern characterised by nutrients from plant food sources such as vitamin C or β-carotene, and PC3, a pattern mainly driven by dietary vitamin D. Results for PC2, a nutrient pattern characterised by a high variety of vitamins and minerals, and PC4, a pattern driven by vitamin B12, riboflavin, calcium, phosphorus, cholesterol and total proteins, were both significantly and independently associated with decreased CRC risk. These findings suggest that analysing nutrient patterns may improve our understanding of how groups of nutrients consumed together might relate to CRC.