Main

Oxidative damage to DNA occurs with cell proliferation and increases with age. Certain organs such as the gut are heavily exposed to oxidising agents, which affects their carcinogenic potential. Dysfunction of base excision repair, the major pathway for repairing oxidative damage, has been implicated as a risk factor for the development of multiple colorectal adenomas and colorectal cancer (CRC; Al-Tassan et al, 2002; Croitoru et al, 2004; Farrington et al, 2005). Bi-allelic mutations of the MUTYH gene seem to be responsible for a high proportion of the multiple adenoma phenotype families (termed MUTYH-associated polyposis (MAP)) unaccounted for by germline APC mutations (Al-Tassan et al, 2002; Sampson et al, 2003; Sieber et al, 2003; Gismondi et al, 2004; Venesio et al, 2004; Nielsen et al, 2009) and predispose to CRC per se (Enholm et al, 2003; Croitoru et al, 2004; Fleischmann et al, 2004; Kambara et al, 2004; Wang et al, 2004; Farrington et al, 2005; Peterlongo et al, 2005; Zhou et al, 2005; Moreno et al, 2006; Tenesa et al, 2006; Webb et al, 2006; Küry et al, 2007; Cleary et al, 2009; Lubbe et al, 2009). Although an increased CRC risk associated with bi-allelic MUTYH mutations is incontrovertible, the risk associated with one MUTYH mutant allele is controversial (Croitoru et al, 2004; Farrington et al, 2005; Jenkins et al, 2006; Tenesa et al, 2006; Webb et al, 2006; Cleary et al, 2009; Jones et al, 2009; Lubbe et al, 2009). A statistically significant or near-significant mono-allelic MUTYH effect has been reported in different studies, with possible age-specific effects, but the rarity of the alleles, together with the small size of the increased risk for CRC, has made it difficult to replicate study findings.
A recent risk analysis of MAP family members agreed with previous family-based findings (Jenkins et al, 2006) that mono-allelic carriers have a two-fold increased risk of CRC (Jones et al, 2009), providing further evidence of a mono-allelic effect of the gene. However, family-based studies can be subject to ascertainment bias, and any mono-allelic effect could potentially be modified by other inherited factors, including alleles at other loci. Furthermore, environmental risk factors also show familial aggregation and hence studies in which cases have been selected on the basis of family history may be confounded. For example, bi-allelic carriers may develop CRC because of the predominant effect of MUTYH, whereas in affected siblings with mono-allelic mutations the environmental effect may be greater and yet the risk is ascribed to the MUTYH allele. Thus, further work is required to resolve the mono-allelic carrier risk question.

To clarify the role of MUTYH in disease risk, we initiated a multi-centre collaboration allowing large-scale meta-analysis of the individual MUTYH variants, with special interest in determining if there were age and sex-specific effects on CRC association with MUTYH variants (Farrington et al, 2006). In this study, we present the results of this collaborative meta-analysis.

Subjects and methods

Participating studies

Relevant case–control studies to be invited for inclusion in the meta-analysis of the effect of MUTYH on CRC risk were identified by a literature search in the ISI Web of Science (http://wok.mimas.ac.uk) and PUBMED bibliographic databases (http://www.ncbi.nlm.nih.gov/pubmed/), using the search terms ‘MYH or MUTYH and CRC’. In the initial search, 55 studies were identified and eight of these were considered for our study (Enholm et al, 2003; Croitoru et al, 2004; Fleischmann et al, 2004; Kambara et al, 2004; Wang et al, 2004; Farrington et al, 2005; Peterlongo et al, 2005; Zhou et al, 2005). The inclusion criteria were as follows: the patients had to be diagnosed with CRC and the studies had to have genotype data for both cases and controls. Ten additional studies were identified during the course of the project: Webb et al (2006), Moreno et al (2006), Küry et al (2007), Cleary et al (2009) and Lubbe et al (2009), together with unpublished data from Koessler T and Pharoah PD and from Tomlinson (personal communication). Colebatch et al (2006), Balaguer et al (2007) and Avezzù et al (2008) were used in the pooled meta-analysis of all available published and unpublished datasets.

The principal investigators (PIs) of the selected studies were contacted and asked to participate by providing a minimum dataset including the variables necessary for the analysis (Supplementary Box 1: Study questionnaire; Supplementary Table 1: Data extraction table). Where PIs failed to respond to our invitation to participate, reminder letters were despatched. It was not possible to include data from the following studies in the logistic regression analyses because (i) data were only available for cases that were heterozygous or homozygous for a MUTYH mutation (Enholm et al, 2003); (ii) co-variate data were only available for cases, as controls were anonymous blood donors (Zhou et al, 2005; Tomlinson, unpublished data); or (iii) the investigators did not respond to our invitation (Kambara et al, 2004 and Wang et al, 2004). The studies by Fleischmann et al (2004) and Webb et al (2006) were not used because they had been superseded by a later study (Lubbe et al, 2009), which was included.

Statistical analysis

Data from all collaborating centres were checked for completeness, coded and merged to form a core database. MUTYH defects were considered pathogenic only if there was published evidence of their pathogenicity. Individuals reported to have two defects of MUTYH in the original report were classified as mutated/mutated (MM), those with one defect as wild type/mutant (WM) and those with no mutation as wild type/wild type (WW). Descriptive statistics were produced on all subject characteristics, risk factors and event data. All populations described in the case–control studies were tested for Hardy–Weinberg equilibrium in controls and the genotype distributions between all groups were compared by χ² test.
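The Hardy–Weinberg check described above can be illustrated with a minimal sketch. It assumes a χ² goodness-of-fit test with 1 degree of freedom on the three genotype classes (WW, WM, MM); the function name `hwe_chi2` and all genotype counts are hypothetical and are not taken from the study data. For very rare alleles the expected MM count is far below 1, so the χ² approximation is poor and an exact test may be preferable.

```python
import math

def hwe_chi2(n_ww, n_wm, n_mm):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Returns (mutant allele frequency, chi-square statistic, P-value).
    Assumes at least one mutant allele is observed (q > 0).
    """
    n = n_ww + n_wm + n_mm
    q = (2 * n_mm + n_wm) / (2 * n)      # mutant allele frequency
    p = 1.0 - q                          # wild-type allele frequency
    observed = (n_ww, n_wm, n_mm)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return q, chi2, p_value

# Hypothetical control counts: rare allele, no homozygous carriers
q_a, chi2_a, p_a = hwe_chi2(n_ww=9860, n_wm=140, n_mm=0)

# Hypothetical control counts: rarer allele, two homozygous carriers
q_b, chi2_b, p_b = hwe_chi2(n_ww=9958, n_wm=40, n_mm=2)

print(f"no homozygotes:  q={q_a:.4f}, P={p_a:.3g}")
print(f"two homozygotes: q={q_b:.4f}, P={p_b:.3g}")
```

With these hypothetical counts, even two homozygous controls for a very rare allele are enough to produce a strong apparent deviation from equilibrium, because the expected number of homozygotes is far below one.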

Three logistic regression models were applied to the combined datasets to address confounding co-variates (model I: crude; model II: including co-variates for age and sex; model III: including co-variates for age, sex and study). The models investigated the effect of MUTYH defects (WM vs WW and MM vs WW) as well as of the individual mutations Y179C (c.536A>G/p.Tyr179Cys; AA=WW, GG=MM) and G396D (c.1187G>A/p.Gly396Asp; GG=WW, AA=MM), previously known as Y165C and G382D, to identify any variant-specific associations. The three models were also applied after stratification by sex and age (over 55 years, and under or equal to 55 years) as previously described (Farrington et al, 2005), to assess the effect of age and sex on the association of the variants with disease risk. Across all the studies, interactions between the MUTYH variants and study code (for each individual study) were estimated, and similarly between MUTYH variants and hormone replacement therapy (HRT) among female participants in three studies (the Scottish SOCCS studies and the studies of Croitoru et al, 2004 and Cleary et al, 2009). Association between the genetic factor (i.e., one of the MUTYH mutations), the study code or environmental factor (i.e., HRT) and disease was assessed, and interaction was tested by fitting interactive and nested multiplicative models. To assess for any small-study effects, we performed funnel plot analysis and tested for significance using the Harbord test.

Finally, the relationship between the genotype and CRC was analysed by meta-analysis, combining the effect estimates of all published and unpublished datasets.
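Combining effect estimates across datasets can be sketched as follows. This is a minimal illustration that assumes inverse-variance fixed-effect pooling on the log odds ratio scale, with each study's standard error recovered from its 95% CI; the text does not state the weighting scheme, and the function name `pool_fixed_effect` and the per-study values are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling of odds ratios.

    estimates: list of (odds_ratio, ci_low, ci_high) with 95% CIs.
    Returns (pooled_or, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for odds_ratio, ci_low, ci_high in estimates:
        log_or = math.log(odds_ratio)
        # Standard error recovered from the width of the CI on the log scale
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_or
        total_weight += weight
    pooled_log_or = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_or),
        math.exp(pooled_log_or - Z95 * pooled_se),
        math.exp(pooled_log_or + Z95 * pooled_se),
    )

# Hypothetical per-study odds ratios with 95% CIs (not the study data)
studies = [(1.10, 0.85, 1.42), (1.30, 0.90, 1.88), (1.05, 0.70, 1.58)]
pooled_or, pooled_low, pooled_high = pool_fixed_effect(studies)
print(f"pooled OR={pooled_or:.2f} (95% CI: {pooled_low:.2f}-{pooled_high:.2f})")
```

Larger studies have narrower CIs and therefore larger weights, so the pooled estimate is pulled towards them and its CI is narrower than that of any single contributing study.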

All statistical analyses were conducted using Intercooled STATA version 10.0 (Stata Corp, College Station, TX, USA). For the logistic regression analyses, it is necessary to add a whole number to any fields containing 0 (see model Ia in Table 2), which reduces the final OR value. However, by using the META command in the STATA meta-analysis programme, a lower value can be added (0.5, as indicated by model Ib in Table 2), thereby giving a more accurate assessment of risk. This is, however, a grouped analysis and therefore cannot be adjusted for confounding co-variates such as age, sex and study, as in models II and III. To account for multiple testing we applied the Bonferroni correction method, and the P-value threshold for significance was estimated to be 0.003.
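The effect of the value added to empty cells can be illustrated with a small sketch. It assumes a crude 2×2 odds ratio with a Woolf (log-scale) 95% CI and assumes that the correction is added to all four cells whenever any cell is zero; the text does not specify which cells are adjusted. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def odds_ratio_ci(a, b, c, d, correction=0.5):
    """Crude odds ratio with a Woolf 95% CI from a 2x2 table.

    a: carrier cases       b: carrier controls
    c: non-carrier cases   d: non-carrier controls
    If any cell is zero, `correction` (which must be > 0) is added to
    every cell; this is an assumed convention, not the study's code.
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - Z95 * se)
    high = math.exp(math.log(odds_ratio) + Z95 * se)
    return odds_ratio, low, high

# Hypothetical counts: 30 bi-allelic cases and no bi-allelic controls
or_whole, low_whole, high_whole = odds_ratio_ci(30, 0, 9970, 10000, correction=1.0)
or_half, low_half, high_half = odds_ratio_ci(30, 0, 9970, 10000, correction=0.5)

print(f"adding 1.0: OR={or_whole:.1f}")
print(f"adding 0.5: OR={or_half:.1f}")
```

With an empty control cell, the corrected count sits in the denominator of the OR, so halving the added value roughly doubles the estimate. This is in line with the statement that adding a whole number reduces the final OR value relative to adding 0.5.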

Results

Table 1 details summary data from the studies included in our combined analysis (comprising a total of 20 565 cases and 15 524 controls). The two variant alleles are rare, with the G396D variant allele having a frequency of 0.007 in controls and the Y179C variant allele a frequency of 0.002. Tests for deviation from Hardy–Weinberg equilibrium in controls gave P=0.99 and P<0.00005 for the G396D and Y179C variants, respectively.

Table 1 Summary table

Bi-allelic effect of MUTYH

All three models of the logistic regression analysis gave consistent results and so the results of the crude analysis (model I) are described below and presented in Table 2; results of the other two models can be found in Supplementary Table 2. The MM genotype of the combined MUTYH defects, bi-allelic G396D carriage and Y179C/G396D compound heterozygosity were each associated with a significant increase in CRC risk (odds ratio (OR)=28.3 (95% confidence interval (CI): 6.95–115), 23.1 (95% CI: 3.15–169) and 21.6 (95% CI: 2.94–159), respectively). These estimates are conservative because they concentrate on the significant logistic regression results; the model Ib results presented in Table 2 are likely a better reflection of risk and tend to be two-fold higher. The CRC risk for the MM genotype was greater in the younger age group than in the older age group (OR=36.2 (95% CI: 4.98–263) for ≤55 years compared with 11.6 (95% CI: 2.77–48.2) for >55 years). However, the CIs overlapped and the difference was not statistically significant. Analysis of variance (ANOVA) of the mean age of carriers demonstrated significant age differences between cases and controls for MM genotype carriers and Y179C bi-allelic carriers but not for G396D carriers (P<0.0005, P<0.0005 and P=0.27, respectively; Supplementary Table 3). Indeed, there is a significant difference in mean age between bi-allelic Y179C and G396D carriers (48.9 vs 56.7, respectively, P=0.003 based on t-test; Supplementary Table 4).

Table 2 Logistic regression analysis of the combined datasets; G396D analysis was conducted for individuals that were Y179C AA; Y179C analysis was conducted for individuals that were G396D GG; combined genotype analysis was conducted for individuals with data for both Y179C and G396D

Colorectal cancer risk associated with mono-allelic MUTYH mutations

The results of the combined analysis demonstrate that there are no significant mono-allelic effects for either G396D or the combined MUTYH variants (Table 2). However, the specific Y179C variant was shown to increase risk of disease in the heterozygous state in the whole sample set (OR=1.34; 95% CI: 1.00–1.80) and, when stratified by sex, in males (OR=1.70; 95% CI: 1.06–2.73). After Bonferroni correction, these mono-allelic effects did not remain significant.
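The loss of significance after correction can be checked approximately from the reported intervals. The sketch below assumes a normal approximation on the log OR scale and recovers the standard error from the rounded 95% CI, so the resulting P-values are only approximations of those obtained from the underlying data; the function name `p_from_or_ci` is hypothetical, and the threshold of 0.003 is the one stated in the Statistical analysis section.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval
BONFERRONI_THRESHOLD = 0.003  # threshold stated in the Statistical analysis section

def p_from_or_ci(odds_ratio, ci_low, ci_high):
    """Approximate two-sided P-value from an odds ratio and its 95% CI.

    Assumes the CI is symmetric on the log scale (Wald-type interval).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))

# Reported mono-allelic Y179C estimates (whole sample set and males)
p_all = p_from_or_ci(1.34, 1.00, 1.80)
p_male = p_from_or_ci(1.70, 1.06, 2.73)

for label, p in (("whole sample", p_all), ("males", p_male)):
    verdict = "significant" if p < BONFERRONI_THRESHOLD else "not significant"
    print(f"{label}: approximate P={p:.3f}, {verdict} at {BONFERRONI_THRESHOLD}")
```

Under this approximation both P-values lie well above 0.003, which is consistent with the mono-allelic effects not surviving the Bonferroni correction.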

The role of study population and HRT in modulating CRC risk

We hypothesised that the origin of the data, that is, the study population, might modify the association between the genotype and CRC risk. However, there was no evidence for an interaction between study code and MUTYH genotypes (Supplementary Table 5). Similarly, the effect of HRT intake, a factor known to influence CRC risk (Chan et al, 2006; Theodoratou et al, 2008), might be influenced by genotype and therefore modulate risk in females. Both the Scottish and Canadian datasets had recorded data on HRT intake and these were used to test for an interaction between HRT and MUTYH genotype. Across both datasets there was no evidence of any interaction between HRT and MUTYH genotype (Supplementary Table 6).

Meta-analysis of published and unpublished datasets

A meta-analysis of the published and unpublished datasets submitted to us, estimating the effect of the MUTYH whole-gene defects, demonstrated a pooled fixed-effects bi-allelic OR of 10.8 (95% CI: 5.02–23.2) for the MM genotype and a pooled fixed-effects mono-allelic OR of 1.16 (95% CI: 1.00–1.34) for the WM genotype (Table 3; Figures 1 and 2). Analysis of the specific variants by pooled meta-analysis demonstrated bi-allelic effects for both G396D and Y179C (OR=6.47 (95% CI: 2.33–18.0) and OR=3.35 (95% CI: 1.14–9.89), respectively). In agreement with the logistic regression analysis results, the Y179C variant also demonstrated a very similar pooled fixed-effects mono-allelic OR of 1.34 (95% CI: 1.01–1.77; Tables 4 and 5; Supplementary Figures 1–4).

Table 3 Meta-analysis of studies
Figure 1

Meta-analysis of studies comparing MUTYH MM vs WW. SOCCS data include Farrington et al (2005), Tenesa et al (2006) and unpublished data from the SOCCS study obtained in 2008; Cleary data include Croitoru et al (2004); Lubbe data include Webb et al (2006) and Fleischmann et al (2004). Unpublished studies included are Tomlinson I (2006) and Koessler T (2007).

Figure 2

Meta-analysis of studies comparing MUTYH WM vs WW. SOCCS data include Farrington et al (2005), Tenesa et al (2006) and unpublished data from the SOCCS study obtained in 2008; Cleary data include Croitoru et al (2004); Lubbe data include Webb et al (2006) and Fleischmann et al (2004). Unpublished studies included are Tomlinson I (2006) and Koessler T (2007).

Table 4 Meta-analysis of studies for the G396D genotypes
Table 5 Meta-analysis of studies for the Y179C genotypes

Assessment of study publication bias

Funnel plots for both the mono- and bi-allelic effects were created to assess whether study size was significantly influencing the results. These plots appeared asymmetric, but Harbord's test for small-study effects demonstrated that the asymmetry was not statistically significant (Supplementary Figures 5 and 6).
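A small-study test of this kind can be sketched as follows. This is a minimal illustration that assumes the usual formulation of Harbord's test, in which Z/√V is regressed on √V by ordinary least squares (Z being the efficient score for the log OR and V its variance under the null) and the intercept is tested against zero. The function name `harbord_regression` and the 2×2 counts are hypothetical; the t statistic would be referred to a t distribution with k−2 degrees of freedom, which is not computed here.

```python
import math

def harbord_regression(tables):
    """Regression underlying Harbord's small-study test (assumed formulation).

    tables: list of (a, b, c, d) per study, where
      a: carrier cases       b: carrier controls
      c: non-carrier cases   d: non-carrier controls
    Returns (intercept, standard error of intercept, t statistic).
    """
    xs, ys = [], []
    for a, b, c, d in tables:
        n = a + b + c + d
        # Efficient score and its variance for the log OR under the null
        z = a - (a + b) * (a + c) / n
        v = (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        xs.append(math.sqrt(v))
        ys.append(z / math.sqrt(v))

    k = len(xs)
    if k < 3:
        raise ValueError("at least three studies are needed")

    # Ordinary least squares of y on x with an intercept
    x_bar = sum(xs) / k
    y_bar = sum(ys) / k
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    residual_var = sse / (k - 2)
    se_intercept = math.sqrt(residual_var * (1 / k + x_bar ** 2 / sxx))
    return intercept, se_intercept, intercept / se_intercept

# Hypothetical 2x2 tables for four studies (not the study data)
example_tables = [
    (12, 8, 988, 992),
    (30, 25, 2970, 2975),
    (5, 2, 495, 498),
    (60, 50, 5940, 5950),
]
intercept, se_intercept, t_stat = harbord_regression(example_tables)
print(f"intercept={intercept:.3f}, SE={se_intercept:.3f}, t={t_stat:.3f}")
```

An intercept close to zero relative to its standard error indicates little evidence that smaller studies report systematically different effects from larger ones.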

Discussion

This large meta-analysis refines the estimates of CRC risk associated with mutations in the MUTYH gene. In the logistic regression analysis, bi-allelic carriage of the combined MUTYH mutations (MM) was associated with a 28-fold (95% CI: 6.95–115) increase in CRC risk. Bi-allelic carriers of the G396D variant and Y179C/G396D compound heterozygotes were also significantly associated with a similar increase in CRC risk (OR=23.1 (95% CI: 3.15–169) and 21.6 (95% CI: 2.94–159), respectively). Although the risk estimate was lower in the overall larger pooled meta-analysis of published and unpublished datasets (OR=10.8 (95% CI: 5.02–23.2)), both G396D and Y179C variants demonstrated bi-allelic effects in this pooled analysis (OR=6.47 (95% CI: 2.33–18.0) and OR=3.35 (95% CI: 1.14–9.89), respectively). A marginally significant mono-allelic effect was demonstrated for the specific variant Y179C (OR=1.34 (95% CI: 1.00–1.80)), and a marginally significant result was also observed in the pooled meta-analysis for MUTYH WM (OR=1.16 (95% CI: 1.00–1.34)) and for the Y179C variant alone (OR=1.34 (95% CI: 1.01–1.77)). The increased bi-allelic risk of CRC varied when stratified for age and sex but none of the differences were significant, although when stratified by sex, males showed a marginally significant mono-allelic effect for Y179C (OR=1.70; 95% CI: 1.06–2.73). The results from this large dataset indicate that the two variants may act through different mechanisms; G396D appears to be a true example of recessive Mendelian disease, whereas the results for Y179C are more complex, and there is therefore some argument against combining the two variants.
However, the results from the Y179C/G396D compound heterozygote analysis demonstrate an increase in risk similar to that of G396D bi-allelic carriers, suggesting that the two variants are complementary and that analysis of combined MUTYH mutations, as historically performed, appears appropriate for assessing risk for the whole gene. The rarity of the Y179C allele has made it difficult to truly assess its effect on disease risk; however, the large numbers analysed in this report have made it possible to demonstrate that both bi-allelic and mono-allelic Y179C variants are associated with disease risk.

The study population did not appear to modulate disease risk and although the study replicated the reported decrease in disease risk in MUTYH wild-type females associated with HRT intake (Chan et al, 2006; Theodoratou et al, 2008), we found no interaction with the MUTYH gene and its variants. Therefore, it is unlikely that HRT intake is an explanation for any sex variation in risk and other genetic factors may be involved in modifying CRC risk.

Evidence of a mono-allelic MUTYH effect on CRC has been reported in several case–control studies (Croitoru et al, 2004; Wang et al, 2004; Farrington et al, 2005; Zhou et al, 2005; Tenesa et al, 2006; Cleary et al, 2009) and family-based studies (Jenkins et al, 2006; Jones et al, 2009), but not in other studies (Kambara et al, 2004; Webb et al, 2006; Balaguer et al, 2007; Lubbe et al, 2009). Our large meta-analysis has demonstrated a marginally significant association for the specific variant Y179C, highlighting the possible increased phenotypic severity of this allele. This is in agreement with other studies, including biochemical and model organism studies, which indicate that this variant has a more detrimental effect on protein function (Al-Tassan et al, 2002; Parker et al, 2005; Lubbe et al, 2009; Nielsen et al, 2009; D’Agostino et al, 2010). The pooled analysis of published studies and unpublished datasets submitted to us also indicated a marginally significant mono-allelic MUTYH effect, as well as a mono-allelic Y179C effect.

However, there are a number of caveats that need to be considered. First, if any of the studied datasets contain cases recruited because of familial clustering of disease, there may be ascertainment bias, artificially inflating the number of MUTYH WM variant allele carriers. Second, screening of the MUTYH gene has predominantly been performed on the two most common pathogenic variants, Y179C and G396D. In some studies, the rest of the gene may be explored in cases with a heterozygous allele for these variants but not usually in the controls; hence there is an overall screening bias, and bi-allelic carriers may well have been missed in both cases and controls.

The demonstration of a mono-allelic effect specifically for Y179C should be considered with further caution, as analysis of the control datasets for the Y179C allele demonstrated that it was not in Hardy–Weinberg equilibrium. This may be because of several factors: the rarity of the allele and the fact that both female control subjects with bi-allelic mutations carry Y179C variants. One of these control subjects was shown to have polyps on colonoscopy (Cleary et al, 2009) and may therefore be considered a case. The other is relatively young, less than 60 years old (Lubbe et al, 2009), and so may potentially develop cancer over the next few years. However, in this large dataset, we have also shown that bi-allelic carriage of Y179C predisposes to an earlier onset of disease than G396D, consistent with previous reports (Lubbe et al, 2009; Nielsen et al, 2009) and highlighting a more severe disease phenotype for this variant.

In conclusion, inactivation of the MUTYH gene is a recessive risk factor for CRC, with possible modifying effects indicated by increased risk in cases with early age of onset, although the difference was not significant in the current dataset. An increased risk associated with mono-allelic MUTYH mutation is indicated, albeit small and not currently clinically relevant, and likely specific for the variant Y179C. Despite the size of this study, it has not been possible to definitively establish whether there are significant age and sex effects on disease risk for G396D and Y179C carriers. The evidence presented raises the possibility of a mono-allelic effect for Y179C, but the effect is small (OR 1.34; 95% CI: 1.00–1.80) and is sensitive to variations in population allele frequency because of the rarity of the variant (allele frequency 0.002), as well as to potential issues of subgroup analysis and multiple testing (indeed the overall significance is lost after Bonferroni correction). Nonetheless, it does appear that this study is the first to demonstrate that the Y179C variant imparts an increased risk of CRC.