Different dietary patterns and reduction of lung cancer risk: A large case-control study in the U.S.

Reducing lung cancer risk by modifying diet is highly desirable. We investigated whether different U.S. dietary patterns were associated with lung cancer risk. Dietary patterns were derived using exploratory factor analysis for 2139 non-small cell lung cancer (NSCLC) cases and 2163 frequency-matched controls. Logistic regression was used to estimate odds ratios (ORs) and 95% confidence intervals (95% CIs). Highest adherence (highest vs. lowest quintile) to the “Tex-Mex”, “fruits and vegetables”, and “American/Western” patterns was associated with a 55% reduced (OR = 0.45; 95% CI = 0.37–0.56; P < 0.001), 32% reduced (OR = 0.68; 95% CI = 0.55–0.85; P = 0.001), and 45% increased (OR = 1.45; 95% CI = 1.18–1.78; P < 0.001) risk of lung cancer, respectively. The effects were stronger for squamous cell carcinoma and ever smokers for the “fruits and vegetables” pattern, and stronger for other NSCLC and never smokers for the “American/Western” pattern. Among six lung cancer susceptibility loci identified in genome-wide association (GWA) studies, a variant (rs2808630) of the C-reactive protein gene modified the associations for the “fruits and vegetables” (P for interaction = 0.03) and “American/Western” (P for interaction = 0.02) patterns. Our study is the first to show that the “Tex-Mex” dietary pattern was associated with a reduced lung cancer risk. Also, the “fruits and vegetables” and “American/Western” patterns affected lung cancer risk, and the effects were further modified by host genetic background.

role in host inflammatory response which is closely linked to lung cancer development 17 . Since dietary factors are known to have a major effect on host inflammatory response 18,19 , it is likely that the associations of dietary patterns with lung cancer risk will be modified by these inflammation-related loci.
In the present study, we investigated the associations between three dietary patterns derived by factor analysis ("fruits and vegetables", "American/Western", and "Tex-Mex") and non-small cell lung cancer (NSCLC) risk using data from a large ongoing Texas-based case-control study. Further, we investigated whether the associations differed by major histological types of NSCLC and whether the associations could be modified by host smoking status and lung cancer susceptibility loci identified in previous GWA studies.

Results
Identified dietary patterns and host characteristics. Exploratory factor analysis identified three dietary patterns that together accounted for 26% of the total variance, and the top 10 food items/groups contributing to each factor are listed in Table 1. The three dietary patterns were named "fruits and vegetables", "American/Western", and "Tex-Mex" based on the food items/groups that were strongly correlated with each dietary pattern. The Spearman correlation coefficients of the factor scores for the three dietary patterns with nutrient intake are shown in Supplementary Table 1.
Selected characteristics of the 2139 NSCLC cases and 2163 controls are presented in Table 2. Cases and controls differed in education, smoking pack-year, family history of lung cancer among first degree relatives, body mass index (BMI), physical activity, and daily intake of protein.
Associations of dietary patterns with non-small cell lung cancer risk. Age- and sex-adjusted and multivariable-adjusted associations of dietary patterns with lung cancer risk are presented in Table 3. In age- and sex-adjusted models, all three dietary patterns were associated with lung cancer risk (P for trend < 0.001). In multivariable-adjusted models, compared to the lowest quintile of the score on the "fruits and vegetables" pattern, the highest quintile was associated with a 32% decreased risk (OR Q5 vs. Q1 = 0.68; 95% CI = 0.55-0.85; P for trend = 0.001). Higher adherence to the "American/Western" dietary pattern was associated with an increased risk

Stratified associations by histological types of non-small cell lung cancer and smoking status.
The stratified associations by major histological type of NSCLC and smoking status are summarized in Tables 4 and 5, respectively. The three dietary patterns were associated with risks of all three major histological types. The protective effects of the "fruits and vegetables" pattern were more evident for squamous cell carcinoma. The harmful effects of the "American/Western" pattern were more pronounced for other NSCLC. The effects of the "Tex-Mex" pattern were similar across all histological types. The inverse association of the "fruits and vegetables" pattern with lung cancer risk was present among current or former smokers but not among never smokers (P for interaction = 0.03). The "American/Western" pattern was associated with an increased risk of lung cancer irrespective of smoking status; the association was stronger among never smokers, although the interaction was not statistically significant (P for interaction = 0.44). The association for the "Tex-Mex" pattern did not differ by smoking status (P for interaction = 0.87).
Stratified associations by GWA studies-identified susceptibility loci. The overall associations between dietary patterns and lung cancer risk in this subset of the study population were similar to those in the total study population (Table 3). Among the six selected SNPs, four (rs1051730, rs2808630, rs7626795, rs6495309) were associated with lung cancer risk in this sample (P < 0.05). The stratified associations of dietary patterns with lung cancer risk by genotype at rs2808630 of the CRP gene are summarized in Table 6. The "fruits and vegetables" pattern was associated with a reduced risk of lung cancer only among those without a copy of the minor allele (OR Q5 vs. Q1 = 0.42; 95% CI = 0.26-0.69; P for trend = 0.001; P for interaction = 0.03). In contrast, the "American/Western" pattern was associated with an increased risk of lung cancer only among those with at least one copy of the minor allele (OR Q5 vs. Q1 = 1.93; 95% CI = 1.27-2.93; P for trend = 0.001; P for interaction = 0.02). The "Tex-Mex" pattern was associated with a reduced risk of lung cancer irrespective of the genotype at CRP rs2808630 (P for interaction = 0.27). No statistically significant interactions (P for interaction > 0.05) were found between dietary patterns and the other five (rs1051730, rs3117582, rs7626795, rs402710, rs6495309) selected variants (Supplementary Table 2).

Discussion
In this large Texas-based case-control study, we identified three dietary patterns using factor analysis: "fruits and vegetables", "American/Western", and "Tex-Mex". Our study is the first to show that the "Tex-Mex" pattern was associated with a substantially reduced lung cancer risk. In addition, we found that the "fruits and vegetables" pattern was associated with a reduced risk and the protective effects were more evident for squamous cell carcinoma and among ever smokers. In contrast, the "American/Western" pattern was associated with an increased risk and the harmful effects were more pronounced for other NSCLC and among never smokers. Finally, for the first time, we found that the effects of the "fruits and vegetables" and "American/Western" patterns were further modified by a variant (rs2808630) of the CRP gene.
Our study is the first report to show that the "Tex-Mex" dietary pattern is associated with substantially reduced lung cancer risk, and the effects are consistent and stable across different sub-groups. Apart from our previous study with a smaller sample size, which reported a non-statistically significant protective effect on renal cell carcinoma 20 , there have been no other studies on the "Tex-Mex" dietary pattern and cancer. The mechanism(s) linking high adherence to the "Tex-Mex" pattern with a decreased risk of lung cancer are unclear. "Tex-Mex" cuisine is characterized by its heavy use of legumes, spices, and shredded cheese. Legumes are rich sources of dietary fiber, a variety of micronutrients, and phytoestrogens with potential cancer-preventive effects 21 . In particular, our previous study showed that higher intake of phytoestrogens was associated with a decreased risk of lung cancer 22 . Also, spices or their bioactive components may prevent cancer through their anti-microbial and anti-oxidant effects and their inhibition of carcinogen bioactivation 23 . Finally, high levels of cheese intake were found to be associated with a reduction in lung cancer risk in multiple studies 24-26 , and menaquinones and conjugated dienoic derivatives of linoleic acid in cheese were suspected to mediate the protective effects 27,28 .
Our findings on the "fruits and vegetables" and "American/Western" dietary patterns are consistent with findings from previous studies on dietary pattern and lung cancer. Previous studies using factor analysis 4-8 found that a "healthy" diet characterized by high vegetable intake was associated with a decreased risk of lung cancer, while a "Western" diet characterized by high fat and red meat intake was associated with increased risk. In addition to dietary patterns derived from factor analysis, several studies also investigated index-based dietary patterns. Three studies found that diet quality index was inversely associated with subsequent lung cancer risk 10,29,30 , and diet quality was assessed by the recommended foods score or dietary guideline index, which reflects compliance with the current dietary guidance of increasing consumption of fruits, vegetables, whole grains, lean meats or meat alternatives, and low-fat dairy. Additionally, a Mediterranean dietary pattern was found to be inversely associated with lung cancer risk 9 . In our study, we found that the inverse associations of the "fruits and vegetables" pattern with lung cancer risk were only present among current or former smokers but not present among never smokers. This observation was consistent with the findings from previous dietary pattern studies on lung cancer that the beneficial effects of dietary patterns characterized by high consumption of fruits or vegetables were only evident in current or former smokers 4-6,9,10 . Smoking causes lung cancer in part through its pro-oxidant properties 31 . It is believed that the protective effects of fruits and vegetables are due to their rich collection of various antioxidants 32-34 , and this may explain why the protective effects were only seen in ever smokers. In addition, this is in line with our observation that the protective effects of the "fruits and vegetables" pattern were more evident for squamous cell carcinoma, which is most strongly associated with smoking among the main histological types of NSCLC 35 .
Table 5. Associations between dietary patterns (quintile) and non-small cell lung cancer risk by smoking status. a Adjusted for age, sex, education, family history of lung cancer among 1° relatives, body mass index, physical activity, and total energy intake. b Adjusted for age, sex, education, pack-years, family history of lung cancer among 1° relatives, body mass index, physical activity, and total energy intake. Abbreviations: OR, odds ratio; CI, confidence interval.
For the first time, we found that the "fruits and vegetables" and "American/Western" patterns interacted with a lung cancer susceptibility locus (rs2808630). The rs2808630 locus is mapped to the 3′ untranslated region of the CRP gene, which is a key gene in host inflammatory response 17 . Previous studies showed that higher fruit and vegetable intake was associated with lower circulating CRP levels 36,37 . Also, it is known that the "American/Western" diet leads to increased systemic inflammation 19,38-40 . According to a recent meta-analysis of 10 prospective studies 41 , circulating CRP levels, a marker for systemic inflammation, were positively associated with lung cancer risk. The CRP rs2808630 polymorphism has been shown to affect circulating CRP levels 42,43 . More importantly, one study showed that the GG genotype at CRP rs2808630, compared with the AA or AG genotype, was associated with a larger CRP increase in response to pro-inflammatory stimuli, and this may explain our observation that the harmful effects of the "American/Western" pattern were only evident among those with the GG or AG genotype.
Our study has several limitations. First, the dietary data were collected for the year before diagnosis (cases) or enrollment (controls), and it may not represent the time window of interest (e.g., many years prior to lung cancer diagnosis when lung cancer has not been initiated yet). However, longitudinal studies showed that a single food frequency questionnaire measurement at one time point could characterize dietary habits for a period of at least 5-10 years 44 , and dietary patterns assessed with a food-frequency questionnaire were stable over time 45 . Second, our study is subject to recall bias because cases and controls may recall dietary intakes differently. Nevertheless, the direction and magnitude of the associations for the "fruits and vegetables" and "American/Western" patterns in our study were consistent with and comparable to those found in a prospective study where dietary data were collected around ten years prior to diagnosis 5 . Since the impact of the "Tex-Mex" dietary pattern on human health was not reported and therefore not publicized before, the findings for the "Tex-Mex" dietary pattern are less likely to be biased due to differential recall. Third, our analysis was limited to non-Hispanic whites, and caution should be taken when generalizing our results to other populations.
Table 6. Associations between dietary patterns (quintile) and non-small cell lung cancer risk by genotype at a lung cancer susceptibility locus (CRP rs2808630). a Adjusted for age, sex, education, smoking status, pack-years, family history of lung cancer among 1° relatives, body mass index, physical activity, and total energy intake. Abbreviations: CRP, C-reactive protein; OR, odds ratio; CI, confidence interval.
Scientific Reports | 6:26760 | DOI: 10.1038/srep26760
Despite these limitations, our study has several strengths. First, to our knowledge, this study is the first to show a protective effect of the "Tex-Mex" dietary pattern on lung cancer and the first to assess the interaction between dietary patterns and genetic variations on lung cancer risk. Second, our study had the largest sample size in terms of number of cases among the studies on dietary pattern and lung cancer risk using factor analysis. Third, the FFQ used in this study was previously validated. Finally, the two dietary patterns that explained the most variation in our study have been consistently identified in previous studies 4-8 , and we were able to assess a third dietary pattern (the "Tex-Mex" pattern) because consumption of "Tex-Mex" foods is relatively common in this Texas-based case-control study.
In summary, our study adds to the growing evidence that diet plays an important role in lung carcinogenesis, which is thought by many to be caused solely by smoking. In particular, our study suggests that "Tex-Mex" cuisine may reduce lung cancer risk, and more studies are needed to confirm this novel finding and to explore the underlying mechanism(s). Also, our study, together with previous studies, supports the view that maintaining a "healthy" diet (increasing consumption of fruits and vegetables and limiting energy-dense and processed foods) may prevent lung cancer, and that the beneficial effects are further modified by genetic background.

Methods
Study population. Cases and frequency-matched controls were accrued from a large ongoing case-control study of lung cancer. Cases were newly diagnosed and histologically confirmed NSCLC patients from The University of Texas MD Anderson Cancer Center. There were no restrictions on age, sex, race/ethnicity, or stage. Healthy controls without a history of cancer (except for nonmelanoma skin cancer) were recruited from the Kelsey-Seybold Clinics, the largest private multispecialty physician group in the Houston metropolitan area with 18 clinics, more than 325 physicians and over 400,000 patients. The rationale of recruiting controls from the Kelsey-Seybold Clinic has been previously discussed 46 . When potential control participants visited the Kelsey-Seybold Clinic for annual physical exams, Kelsey-Seybold Clinic staff distributed a brief questionnaire to elicit the patients' willingness to be contacted by staff at MD Anderson and to collect preliminary demographic data for frequency-matching. For those who were willing to participate, staff at MD Anderson then contacted them by telephone to confirm their willingness to participate and to schedule an in-person interview at a Kelsey-Seybold Clinic convenient to the participant. Controls were frequency-matched on age (±5 years), sex, race/ethnicity, and smoking status (current, former, never). To date, the response rate among both cases and controls has been approximately 80%. All participants provided written informed consent prior to participation in the study. This study was approved by The University of Texas MD Anderson Cancer Center and Kelsey-Seybold institutional review boards, and all methods and analyses were conducted in accordance with this approval.
Data collection. MD Anderson staff interviewers conducted interviews to collect epidemiological data on demographics, education, smoking, family history of cancer, height, weight, and physical activity. Additionally, dietary intake during the year prior to diagnosis (cases) or study enrollment (controls) was assessed with a previously validated food frequency questionnaire (FFQ, a modified version of the National Cancer Institute's Health Habits and History Questionnaire) 47 . The questionnaire has been shown to be a valid and reliable food frequency survey tool across various populations 48,49 . This questionnaire covers the frequency and portion size of food and beverage items, including ethnic foods commonly consumed in the Houston area, and contains an open-ended section and other dietary behavior questions regarding such factors as dining at restaurants and food preparation methods. From the dietary information obtained in the FFQ, total energy intake and amount consumed (g/day) for each food or beverage item were calculated based on the United States Department of Agriculture (USDA) National Nutrient Database for Standard Reference 50 . For multi-ingredient food items not included in the Standard Reference, calculations were based on the USDA Food and Nutrient Database for Dietary Studies 51 . Blood samples (40 mL each) were collected from the study participants for genotyping.
Exclusions and eligibility. The exclusions and eligibility criteria for this study were previously reported 46 . For the present study, we limited the analysis to non-Hispanic whites because of the existing dietary variations between non-Hispanic whites and other racial/ethnic groups 52,53 as well as insufficient statistical power for other racial/ethnic groups. Also, we further excluded those (n = 138) with outlying total energy intake. The final analysis in this study included 2139 cases and 2163 controls.
Dietary pattern analysis. The details of dietary pattern analysis were reported in our previous study 20 . Briefly, 117 out of 159 food and beverage items in the FFQ were included in the final dietary pattern analysis. We excluded foods with low frequency of consumption (<5%) in our study population. Also, several dietary items were grouped into predefined food groups according to current US Department of Agriculture food-group guidelines. For the remaining 117 dietary items, the daily intake was log-transformed and then energy-adjusted using the residual method 54 .
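The log-transformation and residual-method energy adjustment described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name `energy_adjust` is hypothetical, the use of log(x + 1) to accommodate zero intake is an assumption (the text does not state how zeros were handled), and regressing on untransformed total energy is likewise an assumption.

```python
import math


def energy_adjust(intakes, energies):
    """Return energy-adjusted residuals of log-transformed daily intake.

    intakes  -- daily intake (g/day) of one dietary item, one value per participant
    energies -- total energy intake (kcal/day) for the same participants
    """
    # Assumption: log(x + 1) so that zero intake stays defined.
    logged = [math.log1p(x) for x in intakes]
    n = len(logged)
    mean_e = sum(energies) / n
    mean_y = sum(logged) / n

    # Ordinary least squares of log-intake on total energy (one predictor).
    sxx = sum((e - mean_e) ** 2 for e in energies)
    sxy = sum((e - mean_e) * (y - mean_y) for e, y in zip(energies, logged))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_e

    # Residual = observed log-intake minus the value predicted from energy.
    return [y - (intercept + slope * e) for e, y in zip(energies, logged)]
```

By construction the residuals are uncorrelated with total energy intake, so a factor analysis run on them reflects dietary composition rather than the overall amount eaten. Some applications of the residual method add back the predicted intake at mean energy; that constant shift would not alter a correlation-based factor analysis.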
We conducted an exploratory factor analysis with the FACTOR command in Stata 13.0 (StataCorp LP, College Station, Texas) to reduce the number of dietary items into a small number of factors, and the factors were then rotated using a varimax rotation, an orthogonal rotation procedure. The Kaiser-Meyer-Olkin statistic was equal to 0.85, suggesting a "meritorious" sampling adequacy (relative to the number of dietary items) for conducting dietary pattern analysis. The number of factors that best represented the data was chosen on the basis of eigenvalues greater than one, identification of a break point in the scree plot, interpretability, and our previous dietary pattern analysis on renal cell carcinoma 20 . We identified three factors that best represented the dietary input data, and these three dietary patterns were identical to the three patterns identified in our previous study of renal cell carcinoma 20 . The final analysis was restricted only to the three chosen factors. For each participant, a factor score was computed for each of the three identified factors to indicate the level of adherence to that dietary pattern, with higher scores indicating higher adherence. Factor scores were categorized into quintiles based on the sex-specific distribution in the control group.
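The final step above, categorizing factor scores into quintiles using the sex-specific distribution among controls, might be sketched as follows. This is an illustrative sketch only: the record layout (`sex`, `is_case`, `score`) and both function names are hypothetical, and the percentile interpolation rule is whatever Python's `statistics.quantiles` uses by default, which may differ from the rule applied in Stata.

```python
import bisect
import statistics


def control_cutpoints(participants):
    """Return {sex: [four quintile cut points]} computed from controls only."""
    scores_by_sex = {}
    for p in participants:
        if not p["is_case"]:  # cases do not contribute to the cut points
            scores_by_sex.setdefault(p["sex"], []).append(p["score"])
    return {
        sex: statistics.quantiles(scores, n=5)
        for sex, scores in scores_by_sex.items()
    }


def assign_quintile(score, cuts):
    """Map a factor score to quintile 1..5 given four ascending cut points."""
    return bisect.bisect_right(cuts, score) + 1
```

Cases and controls are then both classified with the cut points for their own sex, for example `assign_quintile(p["score"], cuts[p["sex"]])`. Deriving cut points from controls only keeps the reference distribution close to that of the source population.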
Statistical Analysis. Baseline characteristics of cases and controls were compared using the Student's t-test for continuous variables and Chi-squared test for categorical variables. Unconditional multivariable logistic regression was used to calculate odds ratios (OR) with 95% confidence intervals (95% CI) after adjustment for potential confounders selected on the basis of a priori knowledge. Participants with missing covariates were not included in the multivariable analyses. Trend tests were conducted by including the quintiles of the dietary pattern factor score as an ordinal variable. We also assessed whether smoking status and GWAS-identified lung cancer susceptibility loci modified the associations between dietary patterns and lung cancer risk. Multiplicative interaction was assessed by the likelihood ratio test. All statistical analyses were performed in Stata 13.0 (StataCorp LP, College Station, Texas). A P value < 0.05 (two-sided) was considered statistically significant.
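To make the reported quantities concrete, the sketch below shows how an odds ratio with a 95% confidence interval follows from a logistic regression coefficient, how an OR maps to the percent change quoted in the text, and how a likelihood ratio test P value is obtained from two fitted models. It is a hedged illustration, not the study's Stata code: the function names are hypothetical, the Wald-type interval is an assumption (the text does not state the interval type), and the likelihood ratio helper assumes a single interaction term (1 degree of freedom), as when an ordinal quintile score is multiplied by a binary stratum indicator.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(beta, se, level=0.95):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def percent_change(odds_ratio):
    """Percent change in odds relative to the reference group (negative = reduction)."""
    return (odds_ratio - 1.0) * 100.0


def lrt_pvalue_1df(loglik_reduced, loglik_full):
    """P value of a 1-df likelihood ratio test comparing nested models.

    loglik_reduced -- log-likelihood of the model without the interaction term
    loglik_full    -- log-likelihood of the model with the interaction term
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Survival function of a chi-squared variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))
```

With these helpers, an OR of 0.68 corresponds to `percent_change(0.68)` of about -32, matching the "32% decreased risk" wording used for the "fruits and vegetables" pattern. An interaction tested with more than one product term would need a chi-squared distribution with the corresponding degrees of freedom instead of the 1-df shortcut used here.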