Cancer is a leading cause of death, and considerable research has focused on factors that may prevent it. Diet and related nutrients are modifiable, and their associations with cancer have been widely investigated.

The association between vitamin D and total and site-specific cancer incidence/mortality has been investigated in many studies, including randomized controlled trials (RCTs), observational studies, and meta-analyses, as vitamin D may play a role in carcinogenesis1,2,3,4,5,6,7,8. However, the results of these studies are controversial. A recent meta-analysis of high-dose vitamin D supplementation RCTs did not find a significant reduction in cancer incidence (relative risk [RR] 0.98; 95% confidence interval [CI] 0.93–1.03; p = 0.42)7, while a meta-analysis of eight prospective studies showed a marginal association between 25(OH)D (a biomarker of vitamin D status in humans) and a lower risk of cancer when the highest and lowest categories of 25(OH)D were compared (summary RR = 0.86; 95% CI 0.73–1.02)8. In an umbrella review of diet and cancer, only four of 18 meta-analyses of observational studies on vitamin D and cancer showed statistically significant results (p < 0.05), and nine of the 18 showed heterogeneity of > 50%, the highest proportion among the nutrients included in the review9.

Colorectal cancer is the third most common incident cancer in men and the second most common in women worldwide10. This ranking is similar in Japan, where an increasing trend has been observed11, while a decline in the incidence of colorectal cancer has been found in some countries, including the US12. Since the incidence of colorectal cancer is high and an anti-tumor effect of vitamin D against colorectal cancer has been shown13,14, the association between vitamin D concentration and colorectal cancer has been investigated, and the present study was conducted under the same hypothesis. However, previous results remain controversial15,16,17. One meta-analysis showed a significant reduction in colorectal cancer risk in Asians16, whereas the latest meta-analysis, which was based on prospective studies, did not15. Therefore, in addition to classical epidemiological methods, a new approach is required to investigate the association between vitamin D and cancer.

Mendelian randomization (MR) is an analytical method in which genetic variants are used as instrumental variables, and the random allocation of genotypes at conception is likened to randomization in a trial18. Because observational studies have yielded conflicting results on the association between vitamin D and cancer, the MR method, which may reduce the effect of confounding factors, has been applied to examine these associations. A study from a global network found no significant association between genetically determined 25(OH)D and colorectal cancer (odds ratio [OR] per 25 nmol/L increment 0.92; 95% CI 0.76–1.10)19, and another study from the UK Biobank also showed that genetically low 25(OH)D levels were not significantly associated with overall cancer risk20. Considering the differing results in the meta-analysis of Asians and ethnic differences in genetic variants, MR studies in Asian populations are required.

To examine the effect of genetically predicted vitamin D concentrations on total and colorectal cancer risk, we conducted an MR analysis using a large-scale Japanese case-cohort with 4543 total cancer cases and a mean follow-up of 15 years, together with 7936 colorectal cancer cases from a Japanese consortium and several other studies in Japan.


The basic characteristics of the included studies are presented in Table 1. For the SNP-vitamin D concentration association, 3978 individuals from two Japanese cohorts were included. For the SNP-total cancer association, 4543 cancer cases (i.e., individuals with a newly diagnosed cancer) and 14,224 controls were included, and for the SNP-colorectal cancer association, 7936 colorectal cancer cases and 38,042 controls were included. The mean age of participants in the included studies was 52–60 years. The percentage of men varied among studies and between case and control groups, from approximately 35% in the Japan Public Health Centre–based Prospective (JPHC) Study control group to 64% in the BioBank Japan (BBJ) case group. The mean (standard deviation [SD]) plasma vitamin D concentration was 22.0 (7.2) ng/mL in the JPHC Study and 18.0 (5.1) ng/mL in the Japan Multi-Institutional Collaborative Cohort (J-MICC) Study.

Table 1 Baseline characteristics of participants.

Power calculations were conducted assuming a power of 80% and a type I error rate of 5%. Given the variance explained by our instruments and the numbers of cases and controls, the minimum detectable ORs per 1 SD increment were 0.81 for total cancer and 0.86 for colorectal cancer.

A total of 110 single nucleotide polymorphisms (SNPs) were selected from previous studies; their associations with vitamin D concentrations are shown in Supplemental Table 1. In our dataset, two SNPs (rs3755967 and rs10832254) reached genome-wide significance (p-value < 5.0 × 10⁻⁸), and 14 SNPs were statistically significant (p-value < 0.05) (Table 2). Among these significant SNPs, the nearby genes included GC (chromosome 4, rs3755967) and CYP24A1 (chromosome 20, rs8121940), both of which are related to vitamin D metabolism. Two selected SNPs, rs10832254 (chromosome 11) and rs12803256 (chromosome 11), had D′ values over 0.9 with SNPs near CYP2R1 and DHCR7/NADSYN1, respectively. Regardless of significance or type of nearby gene, we included all 110 selected SNPs in the MR analyses. The 110 SNPs explained 7.0% of the variance in 25(OH)D levels. Associations between the SNPs and total cancer or colorectal cancer are shown in Supplemental Table 2.

Table 2 Summary statistics of the significant SNPs (p-value < 0.05) in SNP-exposure association.

The MR results for vitamin D and total or colorectal cancer are shown in Table 3, and the corresponding scatter plots with 110 SNPs are shown in Fig. 1. There were no significant associations between genetically predicted plasma vitamin D levels and total or colorectal cancer with any of the MR methods. The ORs (95% CI) per 1-unit increase in log2-transformed vitamin D concentration were 0.83 (0.63–1.09) for total cancer and 1.00 (0.80–1.24) for colorectal cancer with the random-effects inverse-variance weighted (IVW) method. Results were also non-significant with the MR-Egger method (0.83 [0.57–1.19] for total cancer and 1.01 [0.75–1.37] for colorectal cancer) and the weighted median method (0.91 [0.62–1.34] for total cancer and 1.08 [0.79–1.48] for colorectal cancer). When we included only the 14 significant SNPs in Table 2, the MR results were similarly non-significant. MR-Egger intercepts were not significant in either the total cancer or the colorectal cancer model. Because the p-value of the Q statistic in the heterogeneity test was not significant for total cancer but marginal (p = 0.05) for colorectal cancer, the random-effects IVW method was used. The MR-PRESSO method was used to detect horizontal pleiotropic outliers; however, no outliers were detected for total or colorectal cancer. No single SNP changed the result in the leave-one-out analysis. When we included only strong instruments (p-value < 5 × 10⁻⁶ or approximate F-statistic > 10) in the MR analysis to avoid weak instrument bias, heterogeneity became non-significant, but the results remained non-significant (Supplemental Table 3). Moreover, the JPHC-base sample used for the SNP-exposure association overlapped with the JPHC-base sample used for the SNP-total cancer and SNP-colorectal cancer associations. We therefore conducted a sensitivity analysis that excluded JPHC-base samples from the SNP-outcome association before performing the MR analysis. The results for total and colorectal cancer did not change substantially and remained non-significant (Supplemental Table 4).

Table 3 Mendelian Randomization estimates between plasma vitamin D concentrations and total or colorectal cancer.
Figure 1 Analyses of vitamin D and total cancer risk (a) and colorectal cancer risk (b) with 110 SNPs.


In an MR analysis using a meta-analysis of large cohorts or case–control studies with genetic information in Japan, no significant association between genetically predicted plasma vitamin D and total or colorectal cancer was found. Although an observational study from a Japanese cohort showed a significant association between plasma vitamin D levels and a lower risk of total cancer4, we did not find a significant association between vitamin D and total cancer or colorectal cancer through the MR framework.

Although conflicting results have been reported in observational studies and in some previous MR studies of vitamin D and cancer1,2,3,4,5,6,7,8,19,20,21, the primary reason for conducting this MR analysis was that vitamin D concentration was significantly associated with a lower risk of total cancer in an observational study in Japan4. Our null findings were, however, consistent with a previous MR study in populations of European ancestry20, although we focused on Asian populations, who may show a different association because of their different genetic backgrounds. Regarding overall cancer incidence, Ong et al. showed non-significant results (combined OR [95% CI] 0.97 [0.90–1.04]) from the UK Biobank, including 46,155 cancer cases22. Although that study used six SNPs that explained 3.5% of the variation in vitamin D concentration, the authors conducted a reassessment for various types of cancer using 74 SNPs, which explained up to 4% of the variation. Nevertheless, they did not find a significant association for most types of cancer other than ovarian cancer21. Although we selected 110 SNPs, these were based on studies mainly of European ancestry, and no East Asian-only genome-wide association study (GWAS) has been published so far. As a result, only a few instrumental variables were significantly associated with vitamin D concentration. The explained variance was 7.0% in our samples; however, the analyses with all 110 SNPs and with the significant SNPs only both showed null associations. Considering the null results for most types of cancer in the large MR studies21, our null findings for total cancer are plausible, although our sample size was relatively small compared with European ancestry studies, as the power calculation indicates.

Regarding colorectal cancer, an MR analysis from a large consortium with 11,488 colorectal cancer cases showed an OR of 0.92 (95% CI 0.76–1.10)19, and a bidirectional MR analysis with 26,397 cases also did not show significant results for vitamin D and colorectal cancer risk23. Our results are consistent with these findings. Among observational studies, the Japanese study that showed a significant reduction in overall cancer showed null results for colorectal cancer (OR in Q4 vs. Q1: 0.95 [95% CI 0.73–1.23], p for trend = 0.48)4. In contrast, a pooled analysis of 17 cohorts showed a significant risk reduction in the high vitamin D concentration (87.5 to < 100 nmol/L) group24, and the meta-analyses of Asians showed a significant dose–response reduction14. We thus observed a discrepancy between our MR analysis and observational studies among Asians. Potential reasons include unknown or unmeasured confounders in the models used in observational studies, or the fact that our results were from a linear MR. Further MR studies are required to confirm these results.

In vitro studies have shown that high concentrations of vitamin D inhibit tumor cell proliferation and induce differentiation13. The anti-tumor effects of vitamin D include pro-apoptotic, anti-proliferative, and pro-differentiation effects25. Based on these mechanisms, epidemiological studies have suggested the potential of vitamin D in cancer prevention. Studies of genetic variation in vitamin D status, however, have not shown significant results, similar to MR studies26,27. As in our results, SNPs detected in GWAS explain only a small percentage of the variance in vitamin D concentrations, so it may be difficult to detect a reduction in cancer risk caused by higher genetically predicted vitamin D levels. Other potential reasons that we did not find significant results are as follows: (i) a causal relationship between vitamin D level and cancer does not truly exist, and (ii) the power was limited to detect an existing but small effect. Because of the relatively low explained variance of vitamin D, a GWAS of vitamin D including more Japanese participants is required to allow MR with a larger sample size, and large randomized controlled trials are needed to investigate the causal relationship between vitamin D and cancer in Japanese.

A strength of this study is the application of a two-sample MR framework in a relatively large-scale Asian population to examine the association between vitamin D and cancer. The MR method can reduce the potential bias in observational findings. However, our study has several limitations. First, we assumed a linear association between vitamin D and colorectal cancer, and we could not investigate a nonlinear effect because we could not include individual-level data. Second, the sample size was small. In the SNP-exposure analyses, vitamin D was measured in a limited number of cohort studies because measuring vitamin D concentration in many samples was not feasible in prospective cohorts. Based on the power calculation, a relatively weak association may not be detectable with this sample size. In addition, vitamin D was measured only once, and measurement error could not be excluded. In the SNP-outcome analysis, site-specific cancers other than colorectal cancer were not included because of the limited number of cases. Third, the selected SNPs were based on previously published studies that did not include East Asian populations. For this reason, we included both significant and non-significant instrumental variables in the MR analysis, which may cause weak instrument bias. Although the sensitivity analysis with significant instruments showed results similar to those of the main analyses, GWAS results for vitamin D in a Japanese population are required. Also, this study was conducted in Japan, so generalizability might be limited to Japanese populations. Fourth, because we could not collect individual genetic data and used summary GWAS results as estimates, we did not assess the association with colorectal cancer by site. Nevertheless, since the association between vitamin D concentration and cancer is controversial and evidence from Asian populations is scarce, this study is worth reporting. Further studies with a larger sample size are required to confirm these findings.

In conclusion, consistent with MR studies in populations of European ancestry, our MR analysis in a Japanese population found no statistically significant association between genetically predicted vitamin D concentrations and total or colorectal cancer risk.


We performed a two-sample MR analysis, in which two types of estimates from two separate datasets are used to evaluate the target association (vitamin D concentration and total or colorectal cancer in this study). One is the estimate of the association between SNPs and the exposure (i.e., vitamin D in this study), and the other is the estimate of the association between SNPs and the outcome (total or colorectal cancer). MR analysis rests on the following three assumptions: (i) SNPs (as instrumental variables) are associated with the exposure, (ii) SNPs (as instrumental variables) are not associated with confounding factors of the association between the exposure and the outcome, and (iii) SNPs (as instrumental variables) are not directly associated with the outcome and are related to the outcome only through the exposure18,22.

Methods for selecting SNPs associated with vitamin D

SNPs used as instrumental variables in the MR analysis were selected according to previously published papers. We used the GWAS Catalog, a database of the National Human Genome Research Institute-European Bioinformatics Institute (NHGRI-EBI), to select SNPs. In September 2022, 713 SNPs, including duplicates, were listed as associated with “vitamin D measurements” in the GWAS Catalog. We systematically chose SNPs as instrumental variables using the following criteria: (a) the phenotype was vitamin D measurement (647 SNPs remained), (b) the p-value of the SNP was < 5 × 10⁻⁸ (614 SNPs remained), (c) participants in the original papers were adults (609 SNPs remained), (d) duplicate SNPs were removed (475 SNPs remained), (e) the minor allele frequency (MAF) was > 0.01 in East Asian populations based on the 1000 Genomes Project (325 SNPs remained), and (f) clumping was performed using “clump_data” (clumping window of 10,000 kb and R² > 0.001) in the “TwoSampleMR” (version 0.5.6) library of R software. Finally, 110 SNPs were used for this analysis (Supplemental Table 1).
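The filtering criteria (a) to (e) above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the pipeline used in this study: the record type `CatalogHit`, its field names, the trait label string, and the rule of keeping the smallest p-value among duplicates are all hypothetical, and LD clumping in step (f) is left out because it requires an LD reference panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogHit:
    """One association row from a GWAS catalog export (hypothetical fields)."""
    rsid: str
    trait: str
    p_value: float
    adult_cohort: bool   # True if the original study enrolled adults
    eas_maf: float       # minor allele frequency in East Asian populations


def select_candidates(hits):
    """Apply criteria (a)-(e) in order; LD clumping (f) is not reproduced."""
    kept = [h for h in hits if h.trait == "vitamin D measurement"]   # (a)
    kept = [h for h in kept if h.p_value < 5e-8]                     # (b)
    kept = [h for h in kept if h.adult_cohort]                       # (c)
    best = {}
    for h in kept:                                                   # (d)
        # Assumption: among duplicates, keep the row with the smallest p-value.
        if h.rsid not in best or h.p_value < best[h.rsid].p_value:
            best[h.rsid] = h
    return [h for h in best.values() if h.eas_maf > 0.01]            # (e)


# Toy input with invented identifiers and values, used only to run the filters.
toy_hits = [
    CatalogHit("rsA", "vitamin D measurement", 1e-12, True, 0.25),
    CatalogHit("rsA", "vitamin D measurement", 3e-9, True, 0.25),    # duplicate
    CatalogHit("rsB", "vitamin D measurement", 2e-6, True, 0.30),    # fails (b)
    CatalogHit("rsC", "vitamin D measurement", 1e-10, True, 0.005),  # fails (e)
    CatalogHit("rsD", "calcium measurement", 1e-20, True, 0.40),     # fails (a)
    CatalogHit("rsE", "vitamin D measurement", 4e-9, False, 0.15),   # fails (c)
]
selected = select_candidates(toy_hits)
print([h.rsid for h in selected])
```

Applying the filters in the stated order makes it possible to report how many SNPs remain after each step, as in the counts given in the text.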

Data sources for the MR analysis in Japanese

We calculated the estimates of the SNP-vitamin D association from two Japanese cohorts: the JPHC Study and the J-MICC Study. The details of each study are provided in Supplemental Table 5. Outliers of vitamin D measurement (70 ng/mL) were excluded, and vitamin D was log2-transformed to approximate a normal distribution. Estimates were calculated using a linear regression model among 3739 JPHC participants and 239 J-MICC participants. The models were adjusted for age, sex, season, principal components (from PCA), and area (JPHC only). A meta-analysis was performed for each target SNP using a fixed-effects model.
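As a rough illustration of the pooling step, the sketch below combines per-cohort SNP-exposure estimates with a standard inverse-variance weighted fixed-effects formula. It assumes that this common formulation matches the fixed-effects meta-analysis described above, which the text does not spell out; the function name and the input numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effects pooling of cohort estimates."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# On the log2 scale, doubling the concentration adds one unit,
# so a regression coefficient is an effect "per doubling" of vitamin D.
doubling_shift = math.log2(40.0) - math.log2(20.0)

# Invented per-allele effects of one SNP in two cohorts (not study data).
pooled_beta, pooled_se = fixed_effect_meta(betas=[0.10, 0.20], ses=[0.10, 0.10])
print(round(doubling_shift, 6), round(pooled_beta, 4), round(pooled_se, 4))
```

With equal standard errors the pooled estimate is simply the mean of the two cohort estimates; in practice the larger cohort would carry most of the weight.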

For the SNP-outcome association, we examined two types of cancer (total and colorectal cancer). We selected colorectal cancer as a site-specific cancer because it is one of the most common cancers and we could include a sufficient number of cases. For total cancer, 3541 cases and 10,536 controls were identified from the participants who answered the baseline questionnaire of the JPHC Study (JPHC-base) and provided blood samples. An additional 1002 cases and 3688 controls were identified from participants who answered the 5-year questionnaire of the JPHC Study, provided blood samples, and were not included in the baseline analysis (JPHC-5 year). For colorectal cancer, cases and controls from the JPHC Study, the NAGANO Study, the Hospital-based Epidemiologic Research Program at Aichi Cancer Centre (HERPACC Study), and the J-MICC Study, together with non-restricted published genome-wide association study (GWAS) analysis data from BioBank Japan (BBJ), were gathered and meta-analyzed for the target SNPs. The descriptions of these studies are shown in Supplemental Table 5, and the genotyping, imputation method, and details of the association studies in each study are described in Supplemental Table 6.

We performed power calculations with mRnd, setting the type I error rate to 5% and the power to 80%.

This study was approved by the review board of the National Cancer Center, Japan. All participating studies were approved by their respective institutional review boards, and informed consent was obtained from all participants. All methods were performed in accordance with the Declaration of Helsinki and Japanese ethical guidelines. Details are shown in Supplemental Table 5. Our study protocol was agreed upon by the researchers in the participating studies before the analyses, and this study is reported following the “Strengthening the Reporting of Observational Studies in Epidemiology Using Mendelian Randomization” (STROBE-MR) guidelines28. The STROBE-MR statement is provided in Supplemental Table 7.

Statistical analysis

In this MR analysis, estimates (β coefficients and 95% CIs) for the SNPs selected from previous studies were calculated in our own dataset (both SNP-exposure and SNP-outcome associations) and used in the MR analysis to avoid overestimation of the association (called the Beavis effect, or the winner’s curse)29,30. Estimates and other related information on the targeted SNPs were collected and used for calculation. In the SNP-exposure analysis, the explained variance and F-statistic for each SNP were calculated. Approximate F-statistics were calculated as (beta/standard error)². After calculating the β coefficients for the SNP-exposure and SNP-outcome associations in each study, we combined them with IVW in a fixed- or random-effects model depending on the heterogeneity test. The 110 SNPs identified from previous studies were regarded as instrumental variables. We selected them irrespective of their statistical significance in our data because overfitting or insufficient power in our data might bias SNP selection.
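The approximate F-statistic described above is simple enough to write out directly. The following is a minimal sketch; the function names and the example values are hypothetical, and the threshold of 10 is the one used for "strong" instruments in the sensitivity analysis.

```python
def approx_f_statistic(beta, se):
    """Approximate per-SNP F-statistic: (beta / standard error) squared."""
    return (beta / se) ** 2


def is_strong_instrument(beta, se, threshold=10.0):
    """Flag a SNP whose approximate F-statistic exceeds the threshold."""
    return approx_f_statistic(beta, se) > threshold


# Invented SNP-exposure estimates (not study data).
f_strong = approx_f_statistic(beta=0.04, se=0.01)   # about 16
f_weak = approx_f_statistic(beta=0.02, se=0.01)     # about 4
print(is_strong_instrument(0.04, 0.01), is_strong_instrument(0.02, 0.01))
```

Because the statistic depends only on the ratio of the estimate to its standard error, it can be computed from summary statistics without individual-level data.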

We performed the MR analysis with the 110 SNPs and with the 14 significant (p-value < 0.05) SNPs using the R package “TwoSampleMR” (version 0.5.6). All procedures were conducted using the statistical software R version 3.5.0 (R Foundation for Statistical Computing, Vienna, Austria). To examine the association, the IVW, MR-Egger regression, and weighted-median methods were used. Because the IVW method is likely to be affected by horizontal pleiotropy, MR-Egger regression, in which the intercept reflects pleiotropy, was conducted, and the MR-PRESSO method was used to detect and exclude horizontal pleiotropic outliers. The weighted-median method was used because it yields consistent estimates even when some invalid instrumental variables are included. As a sensitivity analysis, significant (p < 5 × 10⁻⁶) or strong (F-statistic > 10) instruments were selected and included in the MR analysis. Moreover, we conducted a further MR analysis without the JPHC-base results in the SNP-outcome analysis to avoid sample overlap. The significance level for the MR association was set at p < 0.05.
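To make the estimators concrete, the sketch below computes an IVW estimate with a multiplicative random-effects standard error and the MR-Egger intercept and slope from summary statistics, then converts the log odds ratio to an OR per 1-unit increase in log2 vitamin D. It follows common textbook formulations and is not claimed to reproduce the internals of the “TwoSampleMR” package; the weighted-median and MR-PRESSO methods are omitted, all function names are hypothetical, and the input numbers are invented.

```python
import math


def ivw_random_effects(beta_x, beta_y, se_y):
    """IVW estimate with a multiplicative random-effects standard error."""
    # Assumes every beta_x is non-zero and at least two SNPs are supplied.
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]  # per-SNP Wald ratios
    total_w = sum(weights)
    theta = sum(w * r for w, r in zip(weights, ratios)) / total_w
    q_stat = sum(w * (r - theta) ** 2 for w, r in zip(weights, ratios))
    # Inflate the fixed-effect SE only when Q exceeds its degrees of freedom.
    inflation = math.sqrt(max(1.0, q_stat / (len(ratios) - 1)))
    return theta, math.sqrt(1.0 / total_w) * inflation, q_stat


def mr_egger(beta_x, beta_y, se_y):
    """Weighted regression of beta_y on beta_x with an intercept term."""
    # Orient each SNP so that its effect on the exposure is positive.
    signs = [1.0 if bx >= 0 else -1.0 for bx in beta_x]
    x = [s * bx for s, bx in zip(signs, beta_x)]
    y = [s * by for s, by in zip(signs, beta_y)]
    w = [1.0 / sy ** 2 for sy in se_y]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope  # (intercept, slope)


def odds_ratio_with_ci(theta, se, z=1.96):
    """Convert a log odds ratio and its SE to an OR with a 95% CI."""
    return math.exp(theta), math.exp(theta - z * se), math.exp(theta + z * se)


# Invented summary statistics for three SNPs (not study data):
# beta_x = SNP effect on log2 vitamin D, beta_y = SNP effect on log odds.
beta_x = [0.10, -0.20, 0.30]
beta_y = [0.05, -0.10, 0.15]
se_y = [0.02, 0.02, 0.02]

theta, se, q_stat = ivw_random_effects(beta_x, beta_y, se_y)
intercept, slope = mr_egger(beta_x, beta_y, se_y)
or_est, ci_low, ci_high = odds_ratio_with_ci(theta, se)
print(round(theta, 3), round(intercept, 3), round(slope, 3))
```

In this toy input every SNP has the same Wald ratio, so the heterogeneity statistic is essentially zero and the Egger intercept is essentially zero; a non-zero intercept in real data would suggest directional pleiotropy.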