Introduction

Colorectal cancer (CRC) is the third most commonly diagnosed cancer in males and the second in females1. Over the past few decades, CRC incidence has been rapidly increasing, especially in developed countries2. The considerable geographic variation in CRC incidence suggests that lifestyle, especially dietary factors, may play a vital role in the development of CRC3,4,5. Various dietary factors have been related to the etiology of CRC; however, so far only the effects of alcohol and of processed and red meat consumption have been established6,7,8,9,10,11.

Legumes are a diverse group of foods, including soybeans, peas, beans, lentils, peanuts and other podded plants, which are widely cultivated and consumed. Soybeans are unique among the legumes because they are a concentrated source of isoflavones, which are structurally similar to endogenous estrogen and can bind to estrogen receptors. Previous studies suggested isoflavones might impact cancer initiation and progression through estrogenic and antiestrogenic activities12. Besides isoflavones, legumes are good sources of dietary protein, vitamin E, vitamin B, selenium and lignans, which may also have potential cancer-preventive effects13.

Despite such biological plausibility14, epidemiological studies investigating the association between legume intake and risk of CRC have generated conflicting results. A recent meta-analysis of four cohort and seven case-control studies found that consumption of soy foods might be associated with a reduced risk of CRC among women but not among men15; however, case-control studies are prone to recall and selection bias. A more recent meta-analysis of cohort studies did not find a significant association between intake of legume fiber and CRC16. That analysis focused solely on legume fiber and included only four cohort studies, so it might not have had sufficient power to detect modest associations. Therefore, we conducted a meta-analysis of currently available prospective cohort studies covering all kinds of legume foods, aiming to reach a consistent conclusion regarding the association between higher legume consumption and CRC risk.

Results

Study characteristics

We identified 21 potentially relevant full-text publications17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37. Four duplicate publications24,26,34,36 and three publications on colorectal adenoma or polyps17,18,19 were excluded. Thus, fourteen cohort studies20,21,22,23,25,27,28,29,30,31,32,33,35,37 were included in the meta-analysis, comprising a total of 1,903,459 participants and 12,261 CRC cases, who contributed 11,628,960 person-years of follow-up. The flow chart of search and selection is presented in Figure 1. A food frequency questionnaire was used for dietary assessment in all of these studies. Seven of the fourteen studies involved US populations25,28,29,31,32,35,37, five were from Asia20,21,22,27,33 (three from Japan21,22,27 and two from China20,33) and the other two were from Europe23,30. Of the fourteen studies analyzed, nine provided data on women20,21,22,25,28,30,31,32,35 and six on men21,22,28,30,32,33; five studies presented separate data for men and women21,22,28,30,32, one study provided data for men only33 and four were conducted with women only20,25,31,35. Most studies provided relative risk estimates adjusted for smoking (n = 11), BMI (n = 10), red or processed meat (n = 10) and family history of CRC (n = 9); a few studies adjusted for fruit or vegetable intake (n = 3). Only five studies found a statistically significant inverse relationship between legume intake and CRC risk20,21,25,27,35. More detailed characteristics of the included studies are summarized in Table 1.

Table 1 Characteristics of included studies of the association between legume intake and CRC risk
Figure 1

The flow chart of search and selection.

Overall association between legume intake and CRC risk

Fourteen cohort studies were included in the analysis of the highest versus lowest legume intake and risk of colorectal cancer. The summary relative risk (RR) was 0.91 (95% CI = 0.84–0.98; P = 0.01), with heterogeneity of I2 = 40.2% (P = 0.01) (Figure 2), indicating an inverse association between legume intake and CRC risk.

Figure 2

Forest plot of legume consumption and risk of colorectal cancer.

Meta-regression

We conducted a meta-regression to comprehensively explore the source of heterogeneity. Eleven factors, including country, gender, cancer site, study size, follow-up period, number of cases and whether estimates were adjusted for factors such as energy, BMI, smoking, fruit and red/processed meat, were included in the meta-regression model. In this model, the adjusted R-squared was 100.00% and Prob > F was 0.02, indicating that the model was significant. After 100 permutations, legume species, follow-up duration and adjustment for red/processed meat intake remained significant explanatory factors for the between-study heterogeneity.

Subgroup analyses

To identify underlying sources of heterogeneity among these studies, we performed subgroup analyses. In subgroup analyses defined by population, gender, cancer type, number of participants, number of cases and duration of follow-up, dietary legume consumption was not significantly associated with risk of CRC in most subgroups, except in Asian populations (RR = 0.82, 95% CI = 0.74–0.91) (Table 2). We further carried out subgroup analyses according to adjustment for confounders. Inverse associations were significant in the subgroups of studies that adjusted for age, BMI, or red or processed meat: the RRs were 0.88 (95% CI = 0.81–0.96), 0.86 (95% CI = 0.78–0.95) and 0.89 (95% CI = 0.81–0.98), respectively. More detailed results of the subgroup analyses are summarized in Table 2.

Table 2 Results of subgroup analyses

Legume species

When the analysis was stratified according to legume species, we found an inverse association between soybean intake and CRC risk (RR = 0.85, 95% CI = 0.73–0.99). Legume fiber intake was marginally associated with a decreased risk of CRC (RR = 0.85, 95% CI = 0.72–1.00); however, we did not observe this inverse association in the subgroup of beans (RR = 1.00, 95% CI = 0.89–1.13) (Table 3).

Table 3 Stratified analysis according to legume species

Sensitivity analysis

When each study was excluded from the meta-analysis in turn, the pooled RRs did not change materially, indicating that our results could not be attributed solely to the effect of a single study. The RR ranged from 0.89 (95% CI = 0.82–0.97) when the NIH-AARP Diet and Health Study36 was excluded to 0.92 (95% CI = 0.85–0.99) when the Women's Health Study (WHS)29 was excluded.
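The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration in Python, not the Stata routine used for the published analysis; the study names, log RRs and standard errors are hypothetical placeholders rather than values from the included cohorts, and the function names `pool_random_effects` and `leave_one_out` are introduced here only for illustration.

```python
import math


def pool_random_effects(log_rr, se):
    """Return the DerSimonian-Laird pooled log RR and its standard error."""
    w = [1 / s ** 2 for s in se]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - (len(w) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (s ** 2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    return pooled, math.sqrt(1 / sum(w_re))


def leave_one_out(names, log_rr, se):
    """Re-pool after dropping each study in turn; returns {name: (rr, lo, hi)}."""
    out = {}
    for i, name in enumerate(names):
        est, est_se = pool_random_effects(log_rr[:i] + log_rr[i + 1:],
                                          se[:i] + se[i + 1:])
        out[name] = (math.exp(est),
                     math.exp(est - 1.96 * est_se),
                     math.exp(est + 1.96 * est_se))
    return out


# Hypothetical inputs for illustration only (not data from the included studies)
names = ["study_a", "study_b", "study_c", "study_d"]
log_rr = [math.log(v) for v in (0.80, 1.05, 0.90, 0.85)]
se = [0.10, 0.08, 0.09, 0.12]
loo = leave_one_out(names, log_rr, se)
```

If the pooled RRs in `loo` stay within a narrow range, as they did in our analysis, no single study is driving the summary estimate.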

Publication bias

The results of Egger's test (P = 0.16) and Begg's test (P = 0.31) indicated no evidence of substantial publication bias (Figure 3).

Figure 3

Funnel plot of publication bias.
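For readers unfamiliar with Egger's test, the regression behind it can be sketched as below: the standardized effect (log RR divided by its standard error) is regressed on precision (1 / standard error), and an intercept far from zero suggests funnel plot asymmetry. This is a minimal pure-Python sketch under stated assumptions, not the Stata procedure used in this study; the function name `egger_test` and the input values are hypothetical. The P value would be obtained by comparing the returned t statistic with a t distribution on the returned degrees of freedom, which is left out here to keep the sketch dependency-free.

```python
import math


def egger_test(log_rr, se):
    """Egger's regression of standardized effect on precision.

    Returns (intercept, slope, intercept_se, t_statistic, df).
    """
    n = len(log_rr)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1 / s for s in se]                    # precision
    y = [e / s for e, s in zip(log_rr, se)]    # standardized effect
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance of the ordinary least-squares fit
    s2 = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    intercept_se = math.sqrt(s2 * (1 / n + mx ** 2 / sxx))
    t_stat = intercept / intercept_se if intercept_se > 0 else float("nan")
    return intercept, slope, intercept_se, t_stat, n - 2


# Hypothetical inputs for illustration only (not data from the included studies)
log_rr = [math.log(v) for v in (0.70, 0.85, 0.95, 0.90, 1.02)]
se = [0.30, 0.20, 0.10, 0.15, 0.05]
intercept, slope, intercept_se, t_stat, df = egger_test(log_rr, se)
```

When every study estimates the same effect, the standardized effects fall on a line through the origin, so the intercept is zero and the slope equals the common log RR.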

Discussion

We systematically reviewed fourteen published prospective cohort studies on the relationship between legume consumption and CRC incidence. Our meta-analysis supports an inverse association between higher legume intake and risk of colorectal cancer. Among the legume species, soybeans and legume fiber were associated with a decreased risk of CRC. The finding that higher legume consumption reduced the risk of CRC among Asians needs further validation.

The mechanism underlying a possible protective effect of legume intake on CRC risk might be complex because legumes contain a great variety of anti-carcinogens. The most important anticancer components of legume foods are flavonoids, especially isoflavones. Flavonoids from legume foods not only inhibit the growth of tumor cells but also induce cell differentiation38. The inhibitory effects of flavonoids on the growth of malignant cells might be a consequence of their interference with the protein kinase activities involved in the regulation of cellular proliferation and apoptosis39. In addition, legumes are rich in dietary fiber, which may increase stool bulk, decrease transit time and dilute potential carcinogens in the gastrointestinal tract. Further, fiber from legumes stimulates bacterial anaerobic fermentation, which results in the production of short-chain fatty acids such as butyrate, which inhibits growth, induces apoptosis and cell cycle arrest and promotes differentiation in CRC cells40. Furthermore, legumes are good sources of dietary protein, vitamin E, vitamin B, selenium and lignans with potential cancer-preventive effects. Legumes have a high vitamin B6 content41, and vitamin B6 intake was reported to reduce the risk of colorectal cancer42. In addition to its direct cancer-preventive effects, legume intake may affect disease risk indirectly as well. For example, a higher intake of legumes may replace other sources of protein in the diet, such as meat43.

Based on the results of the meta-regression analysis, we think legume species, follow-up duration and adjustment for red/processed meat are the major sources of between-study heterogeneity. In subgroup analyses, we found an inverse association between legume intake and CRC risk among Asians. A possible reason for this result is that dietary patterns in Asian populations contain higher levels of legumes. Subgroup analyses according to legume species revealed that higher intake of soybeans was associated with a reduced risk of colorectal cancer. Soybeans are unique among the legumes because they are a concentrated source of isoflavones, such as genistein and daidzein, which may have cancer-preventive properties. These compounds may compete with estrogens by binding to the estrogen receptor and thereby reduce cancer risk. More importantly, when stratified according to the confounders controlled for, we found that combining the studies that adjusted for BMI, vegetables and red meat intake revealed an inverse association between higher legume consumption and risk of colorectal cancer. These three factors have been previously related to the risk of CRC44,45,46, and failure to adjust for them might bias the associations. Regarding the discrepancy in the subgroup analyses according to number of cases and duration of follow-up, small sample sizes (<500) usually generate less stable results, so it is difficult to exclude the possibility that the positive association is due to chance. As for the lack of a significant association with longer follow-up (≥10 years), we speculate that it might be due to a small sample size without enough power to detect the association, or that with longer follow-up the population becomes older and other aging-related factors contribute more to cancer incidence, thereby diluting the associations with the exposures tested.

We found that legume fiber consumption was marginally associated with a decreased risk of colorectal cancer, which is inconsistent with a previous meta-analysis16. This discrepancy may be partly due to the larger sample size of our study and our exclusion of studies without adjustment for potential confounders. Regarding gender, we did not find that legume consumption was associated with a reduced risk of CRC among women, whereas it was marginally associated with a decreased risk of CRC among men, which is inconsistent with another previous meta-analysis15. The explanation for this disagreement might be that the previous meta-analysis included both case-control and cohort studies.

Our meta-analysis has several strengths. First, it is based on prospective cohort studies, which are unlikely to be influenced by recall bias and selection bias. Second, combining a large number of studies gives us sufficient power to detect potential modest associations. In addition, the sensitivity analyses and publication bias tests indicated that our findings were generally robust and reliable.

Several limitations of our study should also be acknowledged. First, we did not have sufficient data to conduct a dose-response meta-analysis, so we were unable to evaluate the precise relationship. Second, our results may have been affected by unmeasured or residual confounding by other dietary or lifestyle factors. Third, because these studies were conducted in different countries and populations, the items used to measure legume consumption varied. Our findings may therefore be influenced by misclassification of legume consumption, and the lack of accurate intake measurement also limits the impact of our study. In summary, our meta-analysis suggests that a higher intake of legumes is associated with a reduced risk of colorectal cancer. Further studies with better dietary assessment tools and adjustment for appropriate confounding factors are warranted to confirm the associations.

Methods

Identification of studies

To identify all eligible studies on legume consumption and risk of colorectal cancer, we conducted a systematic search of the Medline and Embase databases up to December 2014. We used the following terms as key words in combination for the literature search: legume, soy, beans, peas, soybeans, tofu, soymilk, vegetable, diet and colorectal cancer, restricted to English. In addition, reference lists of retrieved articles and current review articles were scanned manually for additional relevant studies. When multiple studies pertained to the same or a partially overlapping population, we used the results with the longest follow-up time or largest sample size.

Inclusion criteria

We systematically examined the identified studies. Studies that met the following criteria were included: 1) a prospective cohort design; 2) the exposure was legume consumption, including tofu or soybeans, peas, beans, lentils and other podded plants and all products made of them; 3) the outcome was risk or incidence of colorectal cancer; 4) provided or allowed calculation of RR with 95% CI. Studies were excluded if they 1) had a retrospective design; 2) were non-human or in vitro research, or case reports; 3) focused on cancer recurrence or growth; 4) focused on adenoma; or 5) did not adjust for confounders.

Data extraction

All data were extracted independently and cross-checked by two authors (YS and BBZ). For the eligible studies, the following data were extracted: first author, year of publication, geographic region, study name, follow-up period, number of participants/person-years of follow-up, number of cases, demographics of participants, cancer sites, species and amount of legumes consumed, relative risks and 95% CIs for the highest versus the lowest intake and the confounders adjusted for in the analysis. Any results stratified by sex or tumor site were treated as separate reports.

Statistical analysis

We extracted the maximally adjusted RR (95% CI) in order to control for confounding factors. We quantified the relationship between legume consumption and CRC risk by pooling the RRs for the highest category compared with the lowest category. The Q statistic was applied to assess between-study heterogeneity47, and the degree of heterogeneity was further quantified using the I2 statistic48. I2 values of 25, 50 and 75% corresponded to low, moderate and high degrees of heterogeneity, respectively48. Heterogeneity was considered statistically significant when P < 0.05. We pooled the RRs using the random effects model described by DerSimonian and Laird49, which takes into account both within- and between-study variability. We conducted a meta-regression to comprehensively explore the source of heterogeneity. Eleven factors, including country, gender, cancer site, study size, follow-up period, number of cases and whether estimates were adjusted for factors such as energy, BMI, smoking, fruit and red/processed meat, were included in the meta-regression model. Subgroup analyses were further performed, if feasible, according to legume species, sex and site, geographic region, number of cases, duration of follow-up and confounders adjusted for. Sensitivity analyses were conducted by excluding each study in turn to evaluate the stability of the results. Publication bias was assessed using the funnel plot and Egger's test. Any observed asymmetry or P < 0.05 indicated potential publication bias. All analyses were performed with comprehensive meta-analysis50 and Stata version 10.0 (STATA Corp, College Station, TX).
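The pooling step can be illustrated with a minimal Python sketch of the DerSimonian and Laird random effects estimator together with Cochran's Q and the I2 statistic. It is offered only as an illustration of the calculations named above; the published analyses were run in Stata, and the function names (`log_rr_and_se`, `dersimonian_laird`) and the three input studies are hypothetical. The sketch assumes each study reports an RR with a symmetric 95% CI on the log scale, from which the standard error is recovered.

```python
import math


def log_rr_and_se(rr, lo, hi, z=1.96):
    """Convert an RR and its 95% CI into a log RR and its standard error."""
    return math.log(rr), (math.log(hi) - math.log(lo)) / (2 * z)


def dersimonian_laird(studies):
    """Pool (rr, ci_low, ci_high) tuples with a DerSimonian-Laird model."""
    y, se = zip(*(log_rr_and_se(*s) for s in studies))
    w = [1 / s ** 2 for s in se]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I2: percentage of total variation attributable to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (s ** 2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical inputs for illustration only (not data from the included studies)
example = [(0.80, 0.65, 0.98), (1.05, 0.90, 1.22), (0.90, 0.75, 1.08)]
result = dersimonian_laird(example)
```

When the estimated between-study variance is zero, the random effects weights reduce to the fixed-effect weights, so the two models give the same pooled RR.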