Comprehensive analysis of common mitochondrial DNA variants and colorectal cancer risk

Several lines of evidence implicate mitochondrial dysfunction in the development of cancer. To test the hypothesis that common mtDNA variation influences the risk of colorectal cancer (CRC), we genotyped 132 tagging mtDNA variants in a sample of 2854 CRC cases and 2822 controls. The variants examined capture ∼80% of mtDNA common variation (excluding the hypervariable D-loop). We first tested for single marker associations; the strongest association detected was with A5657G (P=0.06). Overall the distribution of association P-values was consistent with a null distribution. Next, we classified individuals into the nine common European haplogroups and compared their distribution in cases and controls. This analysis also provided no evidence of an association between mitochondrial variation and CRC risk. In conclusion, our results provide little evidence that mitochondrial genetic background plays a role in modifying an individual's risk of developing CRC.

Approximately 35% of colorectal cancer (CRC) can be ascribed to inherited susceptibility (Lichtenstein et al, 2000). Mendelian predisposition syndromes associated with mutations in known genes (APC, DNA mismatch repair (MMR) genes, MYH, SMAD4, ALK3 and STK11/LKB1), however, account for <6% of the overall incidence of the disease (Aaltonen et al, 2007). The recent advent of genome-wide association studies has led to the discovery of several common, low-penetrance susceptibility loci for CRC (Tomlinson et al, 2007, 2008; Tenesa et al, 2008), thereby providing incontrovertible evidence for common genetic variation as a basis for CRC susceptibility.
There is increasing evidence that common variation in mitochondrial DNA (mtDNA) may be functionally relevant to the development of a range of common diseases. Notably, mtDNA polymorphisms have been implicated in a variety of late-onset diseases, including type 2 diabetes (Lowell and Shulman, 2005), Alzheimer's disease and Parkinson's disease (Schapira, 1999).
Mitochondria play an essential role in energy metabolism, the generation of reactive oxygen species (ROS) and the regulation of apoptosis (Wallace, 2005), all of which have been implicated in the development of a number of different cancers (Benhar et al, 2002). Low levels of ROS regulate cellular signalling and are essential for normal cell proliferation; ROS production is increased in tumour cells causing oxidative stress and DNA damage, which can lead to genetic instability (Burdon, 1995). Thus, ROS are thought to play multiple roles in the initiation, progression, and maintenance of tumours.
Somatic mtDNA mutations can be identified in a wide variety of malignancies, including CRC (Chatterjee et al, 2006), although it is unclear whether these are causal or a consequence of the neoplastic process. Given the essential role of mitochondria in ROS generation and regulation of apoptosis, it is however plausible that variant mitochondrial function may directly contribute to an individual's risk of developing cancer. Such an assertion is supported by a recent report implicating polymorphic mtDNA variants in susceptibility to breast cancer (Bai et al, 2007).
To date, no comprehensive evaluation of the hypothesis that common mtDNA variants influence the risk of developing CRC has been conducted. To address this, we genotyped 132 tagging mtDNA variants, which capture ∼80% of all common mitochondrial variation, and compared their frequencies in 2854 CRC cases and 2822 controls.

Subjects and samples
A total of 2863 CRC cases (1196 men, 1667 women; mean age at diagnosis 59.3 years; s.d. ± 8.7) were ascertained through the National Study of Colorectal Cancer Genetics (NSCCG). A total of 2838 healthy individuals were recruited as part of ongoing National Cancer Research Network genetic epidemiological studies: NSCCG (n = 1219), the Genetic Lung Cancer Predisposition Study (GELCAPS) (1999-2004; n = 911), and the Royal Marsden Hospital Trust/Institute of Cancer Research Family History and DNA Registry (1999-2004; n = 708). These controls (1136 men, 1702 women; mean age 59.8 years; s.d. ± 10.8) were the spouses or unrelated friends of patients with malignancies. None had a personal history of malignancy at the time of ascertainment. All cases and controls were British and of European descent, and there were no obvious differences in the demography of cases and controls in terms of place of residence within the United Kingdom. Collection of blood samples and clinico-pathological information from patients and controls was undertaken with informed consent and ethical review board approval in accordance with the tenets of the Declaration of Helsinki.

Variant selection and genotyping
DNA was extracted from EDTA venous blood samples using conventional methodologies and quantified using PicoGreen (Invitrogen, Paisley, UK). We excluded the ∼0.8 kb of the hypervariable mtDNA D-loop promoter region/control region from the study, as variation in this region can only realistically be addressed by sequencing because of its high mutation rate. A recent study identified 144 variants with frequency >1% in Europeans and defined a set of 64 single nucleotide polymorphisms (SNPs), which tag all common variants with r² > 0.8 (Saxena et al, 2006). On the basis of these data and designability scores for the genotyping platform, we selected 132 tag SNPs, which maximally capture common mtDNA variation.
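The tagging criterion (r² > 0.8) can be illustrated with a minimal sketch. Because mtDNA is haploid, each individual carries a single allele per site, so r² reduces to the squared correlation between two 0/1 allele vectors. The function names and the toy data in the usage note are hypothetical and are not taken from the study or from Tagger.

```python
def r_squared(a, b):
    """Squared correlation between two biallelic haploid variants coded 0/1."""
    if len(a) != len(b) or not a:
        raise ValueError("allele lists must be non-empty and of equal length")
    n = len(a)
    p_a = sum(a) / n                      # frequency of allele 1 at site A
    p_b = sum(b) / n                      # frequency of allele 1 at site B
    p_ab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                        # monomorphic site: treat as untagged
    d = p_ab - p_a * p_b                  # linkage disequilibrium coefficient D
    return d * d / denom


def is_tagged(target, tags, threshold=0.8):
    """True if any genotyped tag variant captures the target with r^2 > threshold."""
    return any(r_squared(target, t) > threshold for t in tags)
```

For example, with hypothetical vectors `a = [0,0,1,1,0,1,0,0,0,1]` and `c = [0,0,1,1,0,1,0,0,0,0]`, `r_squared(a, c)` is about 0.64, so `c` alone would not tag `a` at the 0.8 threshold.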
Genotyping was conducted using Illumina Infinium Bead Arrays according to the manufacturer's protocols. A DNA sample was deemed to have failed if it generated genotypes at fewer than 95% of loci. An SNP was deemed to have failed if fewer than 95% of DNA samples generated a genotype at the locus. To ensure quality of genotyping, a series of duplicate samples were genotyped.
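The two call-rate thresholds can be sketched as a simple filter. This is an illustration only: the data layout (a dict of per-sample call lists with `None` for a failed call), the function name, and the order of the two steps (samples first, then SNPs among retained samples) are assumptions, as the text does not specify them.

```python
def call_rate_filter(genotypes, min_rate=0.95):
    """genotypes: dict mapping sample id -> list of calls (None = no call).

    Returns (retained samples, indices of retained SNPs).
    """
    if not genotypes:
        return {}, []
    # Step 1 (assumed order): drop samples called at fewer than min_rate of loci.
    kept_samples = {
        sample: calls
        for sample, calls in genotypes.items()
        if sum(c is not None for c in calls) / len(calls) >= min_rate
    }
    # Step 2: drop SNPs called in fewer than min_rate of the retained samples.
    n_snps = len(next(iter(genotypes.values())))
    kept_snps = []
    for j in range(n_snps):
        called = sum(calls[j] is not None for calls in kept_samples.values())
        if kept_samples and called / len(kept_samples) >= min_rate:
            kept_snps.append(j)
    return kept_samples, kept_snps
```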
The nucleotide positions presented are taken from the NC_001807 mitochondrial reference sequence in dbSNP. The mapping between this sequence and the revised Cambridge reference sequence for each of the 132 variants tested is detailed in Supplementary Table 1. European mtDNA haplogroups H, I, J, K, T, U, V, W and X were classified according to the published references and the Mitomap database (Torroni et al, 1996; Macaulay et al, 1999; Herrnstadt et al, 2002) (Table 1).
Microsatellite instability (MSI) in CRCs was determined using the following methodology: 10 μm sections were cut from formalin-fixed paraffin-embedded tumours, lightly stained with toluidine blue, and regions containing at least 60% tumour micro-dissected. Tumour DNA was extracted using the QIAamp DNA Mini kit (Qiagen, Crawley, UK) according to the manufacturer's instructions and genotyped for the mononucleotide microsatellite loci BAT25 and BAT26, which are highly sensitive markers of MSI. Samples showing novel alleles at BAT25, BAT26 or both markers were assigned as MSI (corresponding to a high level of instability, MSI-H (Boland et al, 1998)).

Statistical and bioinformatic methods
For several of the SNPs, the rare variant was observed in less than 1% of samples. These variants were excluded from further analysis. We employed the program Tagger (de Bakker et al, 2005) to estimate the approximate proportion of common mitochondrial variation defined by the 144 variants described by Saxena et al (2006), which was captured by the variants genotyped in our study.
For each individual SNP and haplogroup, comparison of genotype frequencies (or presence/absence of haplogroup) in cases and controls was initially undertaken using a χ² test with one degree of freedom, and unadjusted odds ratios (ORs) were calculated. We used logistic regression to calculate ORs adjusted for age and gender, and their associated 95% confidence intervals. For each SNP, a one-degree-of-freedom likelihood ratio test was performed, comparing the model including covariates age and gender with the model including covariates age, gender and SNP genotype.
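The unadjusted single-marker test can be sketched for a 2×2 table of case status against variant carriage. The sketch uses the Pearson χ² statistic without continuity correction (whether a correction was applied in the study is not stated) and obtains the one-degree-of-freedom P-value from the complementary error function. Function names and the counts in the usage note are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]] = [[cases with variant, cases without],
                        [controls with variant, controls without]].
    Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("table has an empty margin")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def odds_ratio(a, b, c, d):
    """Unadjusted odds ratio for the same 2x2 layout."""
    return (a * d) / (b * c)
```

With hypothetical counts `chi_square_2x2(30, 70, 20, 80)`, the statistic is about 2.67 and the P-value about 0.10; `odds_ratio(30, 70, 20, 80)` is about 1.71.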
Correction for multiple testing in association studies using a simple Bonferroni correction may be conservative due to the assumption of independence between tests. We therefore adopted an empirical simulation approach based on 10 000 permutations, thus allowing for correlations between mtDNA variants. At each iteration case and control labels were permuted at random and the maximum likelihood ratio test statistic calculated. The significance level for each SNP was estimated as the proportion of permutation samples for which this maximum was larger than the observed value.
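The permutation procedure can be sketched as follows. For brevity the sketch substitutes a Pearson χ² statistic for the likelihood ratio statistic from logistic regression used in the study; the max-statistic logic is otherwise as described. Function names, the seed, and the data layout (0/1 case labels and 0/1 allele vectors) are assumptions.

```python
import random


def chi2_stat(labels, alleles):
    """Pearson chi-square for case status (0/1) against allele (0/1)."""
    n = len(labels)
    n_case = sum(labels)
    n_var = sum(alleles)
    a = sum(1 for y, g in zip(labels, alleles) if y == 1 and g == 1)
    b = n_case - a                 # cases without the variant
    c = n_var - a                  # controls with the variant
    d = n - n_case - n_var + a     # controls without the variant
    denom = n_case * (n - n_case) * n_var * (n - n_var)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def max_stat_permutation(labels, snps, n_perm=10000, seed=0):
    """Adjusted P-value per SNP: the proportion of permutations in which the
    maximum statistic over all SNPs is larger than that SNP's observed value."""
    rng = random.Random(seed)
    observed = [chi2_stat(labels, g) for g in snps]
    exceed = [0] * len(snps)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)      # permute case/control labels only
        max_stat = max(chi2_stat(shuffled, g) for g in snps)
        for i, obs in enumerate(observed):
            if max_stat > obs:
                exceed[i] += 1
    return [e / n_perm for e in exceed]
```

Shuffling the labels while leaving the allele vectors untouched preserves the correlation between variants, which is why this approach is less conservative than a Bonferroni correction.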
We assessed the possibility of interactive effects between each pair of SNPs that displayed some evidence of association (P<0.1) by computing the likelihood ratio test statistic for the saturated model against the main effects model. We also assessed the possibility that the effect of each SNP on CRC risk was modified by age by computing the likelihood ratio test statistic for the model with a genotype-age interaction against the model with genotype and age terms only.
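For a pair of SNPs, the comparison of the saturated and main effects models can be written out as below. The notation is an assumption introduced for illustration: g_1 and g_2 are 0/1 indicators for the variant allele at each site, \ell denotes the maximised log-likelihood, and adjustment covariates are omitted because the text does not state whether they were included in these models. With binary haploid genotypes the saturated model adds a single parameter, so the statistic is referred asymptotically to a χ² distribution with one degree of freedom.

```latex
\begin{align*}
\text{main effects:}\quad & \operatorname{logit} P(\mathrm{CRC}) = \beta_0 + \beta_1 g_1 + \beta_2 g_2 \\
\text{saturated:}\quad & \operatorname{logit} P(\mathrm{CRC}) = \beta_0 + \beta_1 g_1 + \beta_2 g_2 + \beta_{12}\, g_1 g_2 \\
\text{test statistic:}\quad & \Lambda = 2\,\bigl(\ell_{\text{sat}} - \ell_{\text{main}}\bigr) \;\sim\; \chi^2_1 \quad \text{under } H_0\colon \beta_{12} = 0
\end{align*}
```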
A number of additional covariates were available for the CRC cases, including family history of CRC (at least one first-degree relative with CRC), site of tumour (colon/rectum) and MSI status. For each SNP and haplogroup, we assessed the association with CRC risk restricted to case subgroups defined by these covariates. For each subgroup, logistic regression was used to estimate ORs adjusted for age and gender and likelihood ratio test statistics were calculated. All statistical analyses were undertaken in R v.2.4.

RESULTS
Out of the 5701 DNA samples submitted for genotyping, 5676 samples were successfully processed. Genotyping failed in 25 individuals, leaving genotype data for 2854 cases and 2822 controls.
Of the 132 variants for which genotyping was attempted, 125 were satisfactorily genotyped (94.7%), with mean SNP call rates of 99.9 and 99.8% in cases and controls, respectively. Of these 125 SNPs, eight were monomorphic and an additional 54 had the minor variant observed in less than 1% of samples and were excluded from further analysis, leaving 63 polymorphic variants. Only one SNP (A15925G) was triallelic in the samples analysed, with one heterozygote observed among cases. This genotype was treated as missing for the analysis. Genotypes from duplicate samples displayed 100% concordance. One variant (G10590A) that was polymorphic in our samples was not observed by Saxena et al (2006), and nine variants observed to be polymorphic in their study were either monomorphic or had very low frequency in our samples (Supplementary Table 2). Given these caveats, our data indicated that 79.3% of polymorphic variants were captured with r² > 0.8, whereas 92.2% of variants with MAF>5% were captured with r² > 0.8.
Table 1 Variants used to classify European mtDNA haplogroups (columns: Haplogroup, G1721A, T4217C, G4581A, T10035C, G10399A, A12309G, T14471C, T14767C)
Four SNPs showed nominal levels of association with CRC risk (P<0.1; Table 2). The most strongly associated was A5657G, with a P-value of 0.06, which was non-significant after adjustment for multiple testing by permutation. All nine common European haplogroups (H, I, J, K, T, U, V, W and X) were observed in both cases and controls. Haplogroup J was slightly over-represented in cases, whereas haplogroup K was slightly under-represented, although these observations were statistically non-significant (Table 3). Adjustment for age and gender did not impact on the findings.
Interactions between the four SNPs that showed an association with CRC risk at the 10% level of significance were examined by fitting full logistic regression models for each pair, generating six models, and comparing with the main effects model for each pair. Owing to small MAFs, it was only possible to evaluate the interaction for three of the pairs. For each of these there was no significant evidence of interactive effects. Furthermore, there was no evidence of any differential effect of genotype by either age or gender.
For all 2854 genotyped cases, information was available on site of CRC (1743 colonic, 1111 rectal tumours) and family history (398 individuals with at least one first-degree relative affected by CRC, 2456 with no recorded family history), and 1222 of the cases had been evaluated for MSI status (151 MSI, 1071 MSS cases). Subgroup analysis by site indicated stronger evidence of association between mtDNA variants and colon cancer, with five variants showing nominally significant association (P<0.05), whereas there was no evidence for an association between any variant and rectal cancer (P>0.1 for all variants). The variant A5657G was most strongly associated with the risk of colonic tumour (P = 0.02), albeit non-significant after adjustment for multiple testing. Stratification by MSI status showed that three variants were associated with risk of CRC for MSI cases, with the strongest association for T4562C (P = 4.6 × 10⁻³), which was non-significant after adjustment for multiple testing. There was no evidence for association between any SNP and CRC in MSS cases (P>0.05 for all variants). Stratification by family history status did not alter the overall findings.

DISCUSSION
It is entirely plausible that genetic variation in the mitochondrial genome might influence cancer risk, given the increasing evidence implicating hypoxia in the development of cancer and the pivotal role of mitochondrial function in cellular energy metabolism.
Previous studies have tested small numbers of mtDNA variants for an association with a variety of traits, typically focusing on the nine canonical haplogroups, with limited tagging coverage generally capturing <40% of common variation (r² > 0.8). To generate a more comprehensive analysis of the relationship between mitochondrial variation and CRC risk, we have analysed variants that capture 79% of all polymorphic variants (MAF>1%) and 92% of variants with MAF>5% (r² > 0.8). A further strength of our study is that our analysis has been based on a large case-control series. We genotyped 132 mtDNA variants and analysed data from the 63 variants with frequencies >1%. Under the assumption that the 63 tests were independent, our study therefore had 70% power to detect a variant with a frequency of 5% conferring a 1.5-fold increase in risk of CRC. Moreover, for variants with MAFs of 10% or greater, our study had >80% power to identify variants conferring a 1.3-fold increase in risk.
Despite our study being a well-powered evaluation capturing the majority of common variation in mtDNA, our findings do not support the hypothesis that common mtDNA variants play a significant role in inherited CRC. Specifically, results from our association tests of all common mtDNA variants and the risk of CRC show that there is no single common coding-region mtDNA variant or haplogroup that strongly influences risk of developing CRC. It is, however, entirely possible that any genetic variation in mitochondria influencing CRC risk may be in the form of low-frequency variants, although we have no evidence from our data that this is the case. Alternatively, the impact of variants may be restricted to a subset of CRC, as there are differences in the biological basis of CRC according to site.
Observations based on post hoc analyses are inherently prone to generating spurious associations. Accepting such caveats, it is nevertheless noteworthy that we found a stronger relationship between A5657G and colonic rather than rectal disease. There was also evidence for an association between risk of MSI CRC and T4562C. Tumour hypoxia has been reported to cause a functional loss of the DNA mismatch repair system as a result of downregulation of MMR genes, principally MLH1 (Mihaylova et al, 2003; Bindra et al, 2007; Nakamura et al, 2008), which is in keeping with this observation. Although attractive, such a postulate requires validation in additional independent datasets. As A5657G is non-coding and T4562C is a synonymous change, any effect is likely to be indirect, possibly mediated through an untyped SNP.
A limitation of our study is that it does not address the role of mtDNA heteroplasmy in CRC. Typically, blood DNA exhibits much less heteroplasmy than non-dividing tissues. Indeed, in the 5676 DNA samples genotyped, only one heterozygote call was observed, although it is possible that this is because of analytical limitations of the platform employed. However, as the known rare mitochondrial diseases exhibit pronounced heteroplasmy, it is unlikely that mtDNA heteroplasmy for such variants will have significantly influenced our findings.
In conclusion, our results provide no support for the hypothesis that common mtDNA variation plays a role in inherited predisposition to CRC. It is, however, possible that mitochondria may be involved in gene-gene and gene-environment interactions that may affect disease risk. Addressing such hypotheses requires studies based on very large sample sizes that incorporate data on non-genetic covariates.