Introduction

Breast cancer, prostate cancer, and colorectal cancer are very common cancer types that together account for more than 40% of all new cancer cases in Finland (Finnish Cancer Registry. Cancer Statistics at www.cancerregistry.fi). Most of the cases are sporadic, but in an estimated 5–10% of the patients the disease is mainly caused by inherited predisposition. In addition, hereditary factors are involved to a lesser extent in a much higher proportion of cancer cases, and a large twin study from the Nordic countries showed that there is an inherited component in up to 42% of the prostate, 35% of the colorectal, and 27% of the breast cancer cases, respectively.1 As mutations in the known high-penetrance cancer predisposition genes explain only a small fraction of cancer cases, a polygenic model has been proposed where several low-penetrance alleles may have an additive effect and account for a substantial proportion of the familial aggregation of cancer. Low-penetrance alleles that predispose to breast, prostate, and colorectal cancer have been identified for example in the CHEK2 gene.2, 3, 4

ADP-ribosylation factor-like tumour suppressor gene 1 (ARLTS1), also known as ADP-ribosylation factor-like protein 11 (ARL11), is a member of the ADP-ribosylation factor (ARF) family. ARFs are guanine-nucleotide-binding proteins which are critical components of several different eukaryotic vesicle trafficking pathways. As with other members of the Ras superfamily, ARFs function as molecular switches by cycling between inactive GDP- and active GTP-bound conformations.

Recently, the frequency of a nonsense variation Trp149Stop (G446A, rs34301344) in the ARLTS1 gene was found to be higher among patients with a family history of cancer than among sporadic cancer patients and healthy controls.5 Functional analyses of the truncated protein indicated that the Trp149Stop variant may affect apoptosis and tumour suppression.5 In a subsequent case/control study a possible association of the variant with high-risk familial breast cancer was also reported.6 Another ARLTS1 variant, the missense alteration Cys148Arg (T442C, rs3803185), and especially its CC homozygous genotype, was also found to be associated with high-risk familial breast cancer.7 The same variant was reported to predispose to melanoma,8 and possibly also to colorectal cancer.9, 10 The role of ARLTS1 variants has also been studied in chronic lymphocytic leukaemia, but no associations have been detected.11 Downregulation of ARLTS1 has been observed in a large proportion of ovarian carcinomas.12

The objective of this study was to screen the whole ARLTS1 gene for germline alterations in large sets of both familial (n=855) and sporadic (n=1169) breast, prostate, and colorectal cancer patients, and to analyse whether any of the observed variants associate with the corresponding cancer risk. Two of the analysed single nucleotide polymorphisms (SNPs), Pro131Leu (C392T) and Cys148Arg (T442C), were analysed for the first time in prostate cancer. In addition, we found a novel Gly65Val (G194T) variant and examined its possible association with cancer for the first time. Here, we also present results in familial prostate cancer cases for which detailed diagnostic factors were taken into account.

Subjects and methods

Study population

The whole coding region of ARLTS1 was screened in a total of 855 Finnish familial cancer patients who had at least one first- or second-degree relative affected with the same tumour type (1 patient per family; 598 familial breast cancer patients, 164 familial prostate cancer patients, and 93 familial colorectal cancer patients). Recruitment of the families, verification of the cancer diagnoses, and exclusion of the BRCA1 and BRCA2 mutations in breast cancer families and MLH1 and MSH2 mutations in colorectal cancer families have been previously described.13, 14, 15, 16, 17, 18

The frequencies of all the observed ARLTS1 variants were further studied in 1169 sporadic/unselected cancer patients (644 breast cancer [ref. 13], 377 prostate cancer [ref. 14], and 148 colorectal cancer patients [refs 15, 16]) as well as in 809 Finnish blood donors who served as healthy population controls. The control samples were collected at The Finnish Red Cross Blood Transfusion Service in Tampere (n=381) and Helsinki (n=428). The two control sets were compared with Pearson's χ2-test for independence and deemed homogeneous enough to be combined into one control set (data not shown).

Patient information and samples were obtained with full informed consent. The study has been performed under appropriate research permissions from the Ethics Committees of the Tampere University Hospital and Helsinki University Central Hospital, Finland, as well as the Ministry of Social Affairs and Health in Finland and the Institutional Review Board of the National Human Genome Research Institute, National Institutes of Health, USA.

Mutation analysis

Mutation analyses were performed by direct sequencing. The whole coding region and exon–intron boundaries of ARLTS1 were sequenced on genomic DNA from 855 index patients. As all the observed variants reside in one exon, new primers were designed to allow the examination of all these variants in a single amplicon in sporadic cancer patients and healthy controls. All primers and PCR conditions are available on request.

Statistical analysis

In the preliminary analyses, the possible association between the ARLTS1 genetic alterations and cancer risk was calculated by the χ2-test or by Fisher's exact test. As no association was observed between any of the ARLTS1 variants and colorectal cancer risk, these samples were excluded from further analyses.

Prostate cancer cases and controls, breast cancer cases and controls, as well as breast cancer and prostate cancer cases together versus controls were analysed at each variant using the Haploview program (www.broad.mit.edu/mpg/haploview). First, a χ2-test was performed to test each SNP locus for deviations from Hardy–Weinberg equilibrium in the control data and also in the case/control combined datasets. Then the method of Gabriel et al19 was used to test for significant linkage disequilibrium (LD) between the marker loci and to detect haplotype blocks in the data. Haplotype frequencies were then estimated using an accelerated expectation-maximization algorithm similar to the partition/ligation method described in Qin et al.20 This provides accurate population frequency estimates of the phased haplotypes in the cases and controls based on the maximum likelihood as determined from the unphased input. Finally, Haploview was used to test for differences between cases and controls in the frequencies of the single locus alleles and haplotypes, which is a simple test for association under a model with no dominance effects. This test compares frequencies of each allele or haplotype between cases and controls using a χ2-test, and 10 000 permutations were then used to obtain empirical P-values (adjusting for both multiple testing and possible deviations of the test from the χ2 distribution).
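Two of the steps described above, the Hardy–Weinberg check and the permutation-based empirical P-value, can be illustrated with a short Python sketch. This is not Haploview's implementation: the function names and the toy inputs are assumptions made for illustration, and the permutation routine handles a single marker only, whereas the empirical P-values reported here also adjust across all markers and haplotypes tested.

```python
import math
import random


def chi2_p_1df(stat):
    """Upper-tail P-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test for deviation from Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele a
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return stat, chi2_p_1df(stat)


def allelic_chi2(genotypes, labels):
    """Allelic 2x2 chi-square statistic.

    genotypes: minor-allele count (0, 1 or 2) per person
    labels:    1 for a case, 0 for a control
    """
    a = sum(g for g, y in zip(genotypes, labels) if y == 1)      # minor, cases
    b = sum(2 - g for g, y in zip(genotypes, labels) if y == 1)  # major, cases
    c = sum(g for g, y in zip(genotypes, labels) if y == 0)      # minor, controls
    d = sum(2 - g for g, y in zip(genotypes, labels) if y == 0)  # major, controls
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return (a + b + c + d) * (a * d - b * c) ** 2 / denom


def permutation_p(genotypes, labels, n_perm=10000, seed=0):
    """Empirical P-value: share of label shuffles with a statistic >= observed."""
    rng = random.Random(seed)
    observed = allelic_chi2(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if allelic_chi2(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one form avoids reporting P = 0
```

Shuffling the case/control labels keeps the genotype data intact while breaking any link to disease status, so the resulting P-value does not rely on the χ2 approximation.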

Some of the initial allelic/haplotypic frequency analyses yielded nominally significant P-values, but none of the empirical P-values were significant at the P=0.05 level. Therefore, in an attempt to increase power and to ensure that our failure to replicate previously published associations was not simply due to testing the wrong models, we performed additional analyses assuming other genetic models for the risk alleles and examining the effects of these alleles when incorporating clinically important covariates into the analyses. The case/control data were thus analysed using STATA by conducting Pearson's χ2-test and Fisher's exact test for independence under several conditions: (1) three distinct genotypes 1/1, 2/1, and 2/2; (2) pooling genotypes with 1 as the dominant allele; and (3) pooling genotypes with 2 as the dominant allele. Multinomial logistic regression was used to further analyse the prostate cancer data incorporating WHO (World Health Organization) score in three categories (WHO: 1, 2, or 3), Gleason's score in two categories (Gleason ≥2 and <7 versus Gleason ≥7), prostate-specific antigen (PSA) level in two categories (PSA at diagnosis <20 versus PSA at diagnosis ≥20), and t-score in two categories (t≤2 versus t>2). Data pertaining to only one of the types of cancer, such as the WHO grade for prostate cancer, were not included in the combined analysis, as there was no comparable statistic for the other cancer type. In addition, logistic regression was used to calculate odds ratios (ORs) and corresponding 95% confidence intervals (CIs).
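The genotype poolings in conditions (2) and (3) can be illustrated with a short Python sketch. The genotype counts below are invented for illustration and are not data from this study; the function and model names are likewise assumptions, and the actual analyses were run in STATA.

```python
import math


def collapse(counts, model):
    """Pool genotype counts (n_11, n_12, n_22) into (non-risk, risk) groups.

    'allele2_dominant': one copy of allele 2 suffices, risk group = 1/2 + 2/2
    'allele1_dominant': allele 2 acts recessively,     risk group = 2/2 only
    """
    n11, n12, n22 = counts
    if model == "allele2_dominant":
        return (n11, n12 + n22)
    if model == "allele1_dominant":
        return (n11 + n12, n22)
    raise ValueError("unknown model: " + model)


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]].

    Assumes no row or column total is zero.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2.0))  # exact for 1 df
    return stat, p_value


# Hypothetical genotype counts (1/1, 2/1, 2/2), for illustration only.
cases = (40, 45, 15)
controls = (50, 42, 8)

for model in ("allele2_dominant", "allele1_dominant"):
    case_row = collapse(cases, model)
    control_row = collapse(controls, model)
    stat, p_value = chi2_2x2(*case_row, *control_row)
    print(model, case_row, control_row, round(stat, 3), round(p_value, 3))
```

Condition (1) keeps all three genotypes separate and therefore uses a 2×3 table with two degrees of freedom instead of the pooled 2×2 table shown here.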

Results

Altogether, five germline variants, Gly65Val, Ser99Ser, Pro131Leu, Cys148Arg, and Trp149Stop, were observed in the initial mutation screening of the familial cases (Figure 1). The frequencies of each genotype in breast, prostate, and colorectal cancer cases as well as in controls are summarized in Table 1 (data for the synonymous variant Ser99Ser (rs3803186), for which no statistical analyses were performed, are not shown). The observed genotype frequencies at the four tested loci did not differ significantly from Hardy–Weinberg equilibrium in the controls or in the complete data set, and all the samples were successfully genotyped for all SNPs (no genotyping failures). Since there was no significant association of any of the SNPs with colorectal cancer risk, and as the small number of colorectal cancer cases limited the power of these analyses, the colorectal cancer cases were not included in further statistical calculations.

Figure 1

Schematic representation of the ARLTS1 gene and the locations of the variants observed in the Finnish data.

Table 1 Observed germline ARLTS1 variants and their frequencies in prostate, breast, and colorectal cancer patients and in healthy population controls

Initial allelic/haplotypic frequency analyses were performed using the Haploview program. Single marker associations were calculated for each of the SNPs (Supplementary Table 1). The Gly65Val (G194T) variant showed a nominally significant association with familial prostate cancer (P=0.014) and with combined familial and sporadic prostate cancer cases (P=0.039). The Cys148Arg (T442C) variant was nominally associated with all prostate cancer cases (P=0.032) and all breast cancer cases (P=0.035), and showed a borderline association with sporadic breast cancer cases (P=0.052). When familial and sporadic breast and prostate cancer cases were considered as a combined data set, the Cys148Arg (T442C) variant showed a nominally significant association (P=0.016). However, after performing 10 000 permutations, none of these associations were significant.

Haploview analysis revealed one haplotype block (Pro131Leu (C392T) and Cys148Arg (T442C)) which exhibited significant intermarker LD (Supplementary Figure 1, Supplementary Table 2). There was no evidence of significant LD of the other two loci with each other or with their neighbouring SNPs. The CC haplotype (consisting of the common C allele at the Pro131Leu SNP and the rare C allele at the Cys148Arg SNP) was nominally significantly associated with breast and prostate cancer (familial and sporadic together, P=0.032 and P=0.037, respectively). When breast and prostate cancer cases were considered as a combined data set, the CC haplotype showed a nominally significant association with familial (P=0.040), sporadic (P=0.034), and all cancer cases (P=0.017). Since the Pro131Leu SNP did not show evidence of association with cancer status in the single locus analysis, this haplotypic association is most likely due to the Cys148Arg C allele. However, again after performing 10 000 permutations, these results were shown not to be statistically significant. Since none of the simple allelic or haplotypic association tests had significant empirical P-values after performing the permutation tests, we performed additional, more complex tests in an attempt to increase power. The results of these are given below.
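Haplotype frequencies such as that of the CC haplotype have to be estimated from unphased genotypes. The Python sketch below shows a plain expectation-maximization estimate for two biallelic SNPs. It is a simplified illustration, not the accelerated partition/ligation variant used by Haploview; the function name, the 0/1 allele coding, and the toy genotype counts are assumptions.

```python
def em_two_snp_haplotypes(genotype_counts, n_iter=100):
    """Estimate two-SNP haplotype frequencies from unphased genotype counts.

    genotype_counts maps (g_a, g_b) to a number of individuals, where g_a and
    g_b are the copies (0, 1 or 2) of the allele coded '1' at SNP A and SNP B.
    Returns a dict mapping haplotype (a, b), with a, b in {0, 1}, to frequency.
    """
    freqs = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
    for _ in range(n_iter):
        hap = {h: 0.0 for h in freqs}
        for (g_a, g_b), n in genotype_counts.items():
            if g_a == 1 and g_b == 1:
                # E-step: double heterozygotes are phase-ambiguous, either
                # 00/11 or 01/10; split them by current frequency estimates.
                cis = freqs[(0, 0)] * freqs[(1, 1)]
                trans = freqs[(0, 1)] * freqs[(1, 0)]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                hap[(0, 0)] += n * w
                hap[(1, 1)] += n * w
                hap[(0, 1)] += n * (1 - w)
                hap[(1, 0)] += n * (1 - w)
            else:
                # At least one SNP is homozygous, so the phase is known.
                a1, a2 = (1 if g_a >= 1 else 0), (1 if g_a == 2 else 0)
                b1, b2 = (1 if g_b >= 1 else 0), (1 if g_b == 2 else 0)
                hap[(a1, b1)] += n
                hap[(a2, b2)] += n
        # M-step: renormalise expected haplotype counts to frequencies.
        total = sum(hap.values())
        freqs = {h: count / total for h, count in hap.items()}
    return freqs


# Toy data: two homozygous groups plus phase-ambiguous double heterozygotes.
toy = {(0, 0): 10, (2, 2): 10, (1, 1): 10}
print(em_two_snp_haplotypes(toy))
```

Only double heterozygotes carry phase uncertainty with two SNPs, which is why the E-step touches a single genotype class; with more markers the number of ambiguous classes grows quickly, motivating accelerated schemes.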

The previously cancer-associated Trp149Stop (G446A) alteration was seen in 6 out of 598 (1.0%) familial breast cancer, in 3 out of 164 (1.8%) familial prostate cancer, and in 1 out of 93 (1.1%) familial colorectal cancer cases, respectively (Table 1). It was also present at low frequencies (0.5–1.1%) among all the sporadic cancer patient series (Table 1). The highest frequency, 1.6% (13 of 809), was however seen among healthy population controls. No association of the Trp149Stop variant with any cancer type was found in any of the analyses (Table 1, Supplementary Table 3). Additional samples were analysed from all three prostate cancer families carrying the variant, revealing currently unaffected elderly relatives who were hetero- or homozygous for the protein truncating alteration (Figure 2a). Thus, the Trp149Stop allele does not co-segregate with disease status in these families. Taken together, our data suggest that the Trp149Stop is not a predisposition allele in breast, prostate, or colorectal cancer in the Finnish population.
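The carrier frequencies quoted above can be recomputed directly, and a two-sided Fisher's exact test on one of the comparisons illustrates why no association is seen. The Python sketch below is an illustration only: the function names are assumptions, and it treats the reported counts as numbers of carriers versus non-carriers, which may differ from the exact genotype tables analysed in the study.

```python
from math import comb


def carrier_percent(carriers, total):
    """Carrier frequency as a percentage, rounded to one decimal place."""
    return round(100.0 * carriers / total, 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts taken from the text: Trp149Stop in familial cases and in controls.
print(carrier_percent(6, 598), carrier_percent(3, 164),
      carrier_percent(1, 93), carrier_percent(13, 809))

# Familial breast cancer (6 carriers of 598) versus controls (13 of 809).
print(fisher_exact_two_sided(6, 598 - 6, 13, 809 - 13))
```

Because the variant is slightly more frequent in controls than in familial breast cancer cases, the test gives no support for a risk-increasing effect in this comparison.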

Figure 2

Examples of segregation patterns in one ARLTS1 Trp149Stop variant positive (a) and two ARLTS1 Gly65Val variant positive (b and c) families with prostate cancer. Squares represent males; circles represent females. Open symbols indicate no neoplasm, and filled symbols denote prostate cancer cases. Symbols (+ or −) indicate the presence or absence of the variant in the DNA sample of the family members, respectively, followed by the actual genotype. An arrow indicates the individual initially screened for ARLTS1 mutations. Current age of the unaffected members or age at diagnosis for prostate cancer patients (in years) is indicated below the symbol for each family member. An asterisk (*) denotes the persons with no sample available.

Another ARLTS1 variant that has been associated with a possible cancer risk, the Cys148Arg (T442C), shows nominally significant association with prostate cancer risk (allelic OR, 1.19; 95% CI, 1.02–1.39; χ2 P=0.020), breast cancer risk (allelic OR, 1.15; 95% CI, 1.01–1.30; χ2 P=0.004) and pooled cancer risk (allelic OR, 1.16; 95% CI, 1.03–1.30; χ2 P=0.002), when no dominance is assumed (Table 2). For prostate cancer cases, this significance does not hold when the data are subdivided into familial and sporadic cases. However, for breast cancer cases and the pooled cancer data, there is nominally significant association for both the familial and sporadic data when using the χ2-test with two degrees of freedom, but the results of the logistic regression analyses of the breast cancer subsets were not significant. When the T allele is assumed to be dominant (C allele recessive), all data (familial and sporadic, prostate, breast, and pooled) show significant association of the CC genotype with cancer risk, and with higher ORs and lower χ2 P-values (χ2 P=0.005, 0.001, and 0.001 for the all prostate cancer cases, all breast cancer cases, and pooled cases versus controls, respectively) than when an additive mode of inheritance is assumed. When the C allele was assumed to be dominant to the T allele, there was no evidence of association of the CC/CT genotypes with breast or prostate cancer risk. Thus, it seems that the recessive genotype CC confers modest increased risk for cancer in these data. The possible co-segregation of the risk genotype (CC) with the disease has not been investigated within the families since only the index cases were analysed.

Table 2 Association of the ARLTS1 Cys148Arg (T442C) variant with prostate cancer, breast cancer, and pooled breast and prostate cancer

A novel ARLTS1 variant that showed a small effect on cancer risk is the Gly65Val (G194T) substitution. This variant was seen in 8 out of 164 familial prostate cancer patients (4.9%, Tables 1 and 3). This is somewhat higher than the frequency among healthy controls (13 of 809, 1.6%). When the T allele is assumed to be dominant to the G allele, a nominally significant association of the GT/TT genotypes is observed in the familial prostate cancer cases compared to the controls (OR, 3.14; 95% CI, 1.28–7.70; χ2 P=0.016) and in all prostate cancer cases compared to controls (OR, 2.23; 95% CI, 1.09–4.55; χ2 P=0.028). When no dominance is assumed this variant still shows nominally significant association with prostate cancer risk (OR, 1.99; 95% CI, 1.00–3.94; χ2 P=0.024). The T allele also shows borderline significance for the sporadic breast cancer risk and for pooled breast and prostate cancer risk under both the dominant and additive models. This association is driven by the familial cases for prostate cancer, and the sporadic cases for breast cancer. The frequency of the Gly65Val variant did not differ between the unselected prostate cancer patients and healthy controls or between any other cancer patient cohort and healthy controls (i.e. none of the tests of association in these subsets of the data were even nominally significant). In addition, no clear co-segregation of the variant with the cancer phenotype was seen in any of the seven prostate cancer families that carried the variant and for which additional familial DNA samples were available (examples of such families are shown in Figures 2b and 2c).
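The dominant-model odds ratio for familial prostate cancer can be reproduced from the counts given above. The Python sketch below assumes that the 8 of 164 cases and 13 of 809 controls are carriers of at least one T allele, and it uses the standard log-odds (Woolf) interval, which coincides with the Wald interval from a logistic regression with a single binary predictor. The function name is an assumption for illustration.

```python
import math


def odds_ratio_ci(case_carriers, case_total, control_carriers, control_total,
                  z=1.959964):
    """Odds ratio with a Woolf (log-scale Wald) confidence interval.

    z = 1.959964 gives a 95% interval. All four cells must be non-zero.
    """
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Gly65Val carriers: 8 of 164 familial prostate cancer cases, 13 of 809 controls.
or_, lower, upper = odds_ratio_ci(8, 164, 13, 809)
print(round(or_, 2), round(lower, 2), round(upper, 2))  # 3.14 1.28 7.7
```

Under these assumptions the result agrees with the reported OR of 3.14 (95% CI, 1.28–7.70). The wide interval reflects the small number of carriers.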

Table 3 Association of the ARLTS1 Gly65Val (G194T) variant with prostate cancer, breast cancer, and pooled breast and prostate cancer

The other germline variants Pro131Leu (C392T; Supplementary Table 4) and Ser99Ser (G297A; data not shown) that were observed in the mutation screening were found at similar frequencies in various cancer patient cohorts and healthy controls, and none of the tests for association were significant. These alterations are most likely neutral polymorphisms as suggested also in a previous study.5

When WHO, Gleason, PSA, and t-scores were included in the analyses of the prostate cancer data, the CC genotype at the Cys148Arg (T442C) locus was nominally associated with high Gleason's score (P=0.01) and with high PSA at diagnosis (P=0.05), but none of the other analyses showed significant associations.

After Bonferroni's correction for the total number of tests performed, none of the results presented above are significant. The number of association tests performed was slightly over 300, which would imply that a P-value of 0.00016 or less would be required to declare significance, and none of the observed P-values were this small. However, Bonferroni's correction is known to be conservative when tests are treated as independent but are in fact correlated, as is the case here. Therefore, even if we only adjust for three independent markers because two markers are in strong LD (ignoring all haplotype tests), three models for each marker (additive, dominant, and recessive), four logistic regression analyses (WHO, Gleason, PSA, and t-scores), and two datasets (breast cancer and prostate cancer, thus ignoring the seven subsets that were actually analysed), we have a total of only 56 tests. Using this as the number of tests requires a P-value of 0.00089 or smaller to achieve significance. However, even using this much reduced estimate of the number of independent tests, we did not observe any significant results.
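The two significance thresholds follow from dividing the nominal level of 0.05 by the number of tests. A minimal Python sketch of this arithmetic is given below; the function name is an assumption, 300 is used as a round figure for "slightly over 300", and the smallest nominal P-value quoted in the text (0.001) is used for comparison.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


ALPHA = 0.05
SMALLEST_NOMINAL_P = 0.001  # smallest chi-square P-value quoted in the text

for n_tests in (300, 56):
    threshold = bonferroni_threshold(ALPHA, n_tests)
    print(n_tests, threshold, SMALLEST_NOMINAL_P <= threshold)
```

With 300 tests the threshold is about 0.00017, and with 56 tests it is about 0.00089; the smallest nominal P-value of 0.001 lies above both.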

Discussion

ARLTS1 was recently identified as a possible new tumour suppressor gene in human cancer when variants such as the nonsense change Trp149Stop and the missense alteration Cys148Arg were observed at higher frequency in cancer patients than in healthy controls.5, 7 The mechanism by which ARLTS1 suppresses tumour formation might be through apoptosis.5, 12 In the present study, we have studied the effect of ARLTS1 variants on breast, prostate, and colorectal cancer risk in the Finnish population.

The first ARLTS1 alteration that was associated with a possible cancer risk was a nonsense change Trp149Stop.5 In this study, no association of the variant was observed with any of the cancer types studied, but instead the highest frequency of this variant was observed among healthy population controls. Analyses of the segregation pattern of the Trp149Stop allele in families where at least one affected person carried this allele also revealed that there were unaffected elderly relatives who were hetero- or homozygous for the protein truncating alteration. Taken together, our data suggest that the Trp149Stop is not a predisposition allele in breast, prostate, or colorectal cancer in the Finnish population. One possible explanation for the conflicting results between this and the previous study is that whereas all the cases and controls in our study are from the genetically homogeneous Finnish population, the cases and controls were not stratified by population in the study by Calin et al,5 but instead samples from five heterogeneous populations were pooled, possibly inflating the type I error rate.

During the course of replication studies of ARLTS1, the Cys148Arg (T442C) alteration was associated with an increased breast cancer risk.7 The variant was reported to associate with high-risk familial breast cancer in a dose-dependent manner (OR, 1.47; 95% CI, 1.04–2.06; P=0.03; Ptrend=0.007). Here, a trend towards higher frequency of the CC homozygotes was found both among familial and non-familial breast and prostate cancer cases when compared to healthy controls. As pointed out by Frank et al,7 in silico analyses indicate that the substitution of cysteine with arginine may change the secondary structure of the protein and affect solvent accessibility, and it is predicted to be probably damaging. A nearby Asp151Gly substitution in a yeast homologue ARL1 that corresponds to Asp146 in human ARLTS1 has also been shown to inhibit apoptosis progression.21 The results obtained here as well as in a study by Frank et al7 suggest that the Cys148Arg alteration may modestly increase the risk for both breast and prostate cancer. In a recent study, the variant was also suggested to associate with an increased colorectal cancer risk both in the familial and sporadic setting.10 A similar trend, although not statistically significant, was also reported by Frank et al.9 No such association was observed in this study. The discrepancy between the results may be due to the limited sample size in our study. Power analyses, calculated by the QUANTO program (http://hydra.usc.edu/gxe),22 using the actual sample sizes, the observed allele frequencies, and the observed population lifetime risks of the three cancers in Finland, suggest that power was inadequate for colorectal cancer and reached 80% at a P-value of 0.0001 only for effect sizes (ORs) ranging from 1.5 to 5 (depending on the SNP) for prostate cancer, 1.4–3.8 for breast cancer, and 1.35–3.6 for breast and prostate cancer combined (Supplementary Table 5). Power would be lower for these same effect sizes when analysing the familial or sporadic subsets of each cancer type shown above. Only one of the analysed SNPs had a common minor allele, which limits the power of association studies. Further analyses with larger sample sets as well as functional analyses are still needed to confirm whether the variant is a true cancer susceptibility allele.

A novel ARLTS1 variant that may have an effect on cancer risk is a Gly65Val substitution. The alteration was seen at higher frequency in familial prostate cancer patients as well as in sporadic breast cancer cases when compared to healthy controls. The change in significance from familial cases in prostate cancer risk to sporadic cases in breast cancer (and pooled cancer) risk is intriguing. It may be due to the small numbers of cases and controls with one or two T alleles; alternatively, it could reflect different effects of the variant in the two genders, or it could be a false positive result since it is only marginally significant. Further study of this association is necessary to validate the result. This residue is highly conserved both among species and within the gene family (www.ensembl.org), and functional analyses have indicated that it may be important for the communication of the N terminus of the protein with the nucleotide-binding site.23 Overall, these results suggest that the Gly65Val variant may be a prostate cancer predisposition allele in the Finnish population, although it would be interesting to see whether its frequency is increased in an independent, larger series of familial prostate cancer patients. It is important to note, however, that none of the P-values above have been adjusted for multiple testing. If Bonferroni's correction is made for the number of tests performed, then none of these results would remain significant. However, these were all highly correlated tests, and Bonferroni's correction is overly conservative when tests are correlated. Thus, it is likely that correction for a smaller number of tests would be more appropriate. While none of the results were significant even after using a fairly small subset of the actual tests (56) to make the correction, it is possible that even this correction is too stringent.

Loss of heterozygosity (LOH) is a frequent event in breast, prostate, and colorectal cancer. The locus 13q14, where ARLTS1 also resides, is among the most frequently deleted chromosomal regions especially in prostate cancer, where LOH in 13q14 has been detected in both the sporadic24, 25 and hereditary settings.26 Interestingly, the same area was also seen in a recent multi-center genome-wide linkage search for new prostate cancer susceptibility genes in families with at least five affected members, suggesting that this locus may be linked to aggressive disease.27 Here, however, the variant-positive prostate cancer families had only two or three affected members and none of the variant-carrying unselected prostate cancer patients could be classified as having high-grade disease. Of interest, suggestive evidence for a novel breast cancer predisposition gene has also been reported in a nearby chromosomal region of 13q21 in Nordic breast cancer families.28

In conclusion, our findings suggest that the novel Gly65Val (G194T) and previously identified Cys148Arg (T442C) variants of ARLTS1 may be associated with prostate or breast cancer risks in the Finnish population. However, neither variant alone seems to explain the familial clustering of prostate or breast cancer, and neither explains any of the clustering seen in colorectal cancer. Furthermore, contrary to previous results, the Pro131Leu (C392T) and especially the Trp149Stop (G446A) variants do not show associations with any of the cancer types studied here. Accordingly, the present results warrant further studies of the role of the variants in unselected and familial cancer cases in other populations and with much larger sample series.