Introduction

Breast cancer (BC) is the most prevalent cancer in women. Several polymorphic loci were recently identified by two independent genome-wide association (GWA) studies as being associated with BC risk.1, 2 One of the strongest associations was found in the FGFR2 gene. Both GWA and replication3, 4 efforts suggest that the associated locus is limited to a linkage disequilibrium (LD) block extending over 20 kb in intron 2. However, the question of refining the boundaries of association and finding the causative variant within intron 2 of FGFR2 remains open. In the study by Easton et al.,1 the SNP rs2981582, located at the upstream boundary of intron 2, was the most significantly associated. In an attempt to identify the functional variant, 21 additional SNPs in LD with rs2981582 were genotyped in European (4426 cases, 4444 controls) and Asian (3918 cases, 3100 controls) datasets. The most significant association was observed for rs7895676, located at the downstream boundary of the LD block; this SNP provided a much better (odds>100) fit to the data than the original upstream SNP rs2981582.

The aim of this work was to study the association between FGFR2 polymorphisms and BC in Russian women from West Siberia, Russian Federation. Specifically, we aimed to replicate previously described associations and to refine the boundaries of association using an ethnic group that has not previously been investigated for this association.

Materials and methods

The case group included 766 women and the control group included 665 women. All the women signed an informed consent to participate in the study. The study protocol was approved by the Medical Ethics Committee of Altai Oncological Centre. The case group included patients with sporadic BC. The control group included women randomly selected from blood bank donors. All women were ethnic Russian residents of the Altai Krai, Russian Federation.

For this study, we chose seven SNPs. Three of these (rs3135718, rs2981582 and rs7895676) were in strong LD and showed an association with BC in the study by Easton et al.1 To study whether the LD block in the Russian population is the same as in other European populations, the other four SNPs were chosen from those flanking the European LD block at the 5′-end (rs4647913 and rs3135715) and the 3′-end (rs2981428 and rs10510097).

Two SNPs, rs7895676 and rs2981428, were genotyped in part by allele-specific real-time PCR and in part by direct sequencing. Other SNPs were genotyped by real-time PCR using TaqMan probes (ICBFM, Novosibirsk, Russia).

All statistical analyses were carried out using the free R-2.6.0 software (http://www.r-project.org) and associated libraries GenABEL,5 haplo.stats,6 genetics and rmeta.

For further details, see Supplementary Materials and methods.

Results

We identified genotypes of rs4647913, rs3135715, rs3135718, rs2981582, rs7895676, rs2981428 and rs10510097 in Siberian cases and controls (Table 1). Of the seven typed SNPs, four are present in the HapMap database (http://www.hapmap.org/). For rs2981428, the allelic frequencies observed in the West Siberian population were close to those observed in the HapMap Asian samples (Table 1), and rs2981582 showed a frequency intermediate between those of the European and Asian populations. The frequency of rs10510097[T] was significantly higher in the Siberian population than in the Asian (P=0.02 with HCB, P=10⁻⁵ with JPT) and European (P=10⁻⁸) populations. The frequency of rs3135715[C] was significantly lower in the Siberian population than in the Chinese (P=0.003) and European (P=0.01) populations, and marginally lower (P=0.06) than in the Japanese population.

Table 1 Association between breast cancer and SNPs from the FGFR2 region tested in the population of Western Siberia

We estimated pairwise LD for the seven SNPs using our data (Figure 1, Supplementary Table 1a and b). Similar to the HapMap results, in the Russian population, SNPs rs4647913, rs2981428 and rs10510097 were not contained in the intron 2 LD block that was previously reported to be associated with BC risk. According to HapMap, rs3135715 and rs2981582 were in LD (r²=0.44 for Caucasians and r²=0.77 for Asians) and, therefore, rs3135715 might be located inside the same LD block. In our data, the LD between SNPs rs3135715 and rs2981582 was lower (r²=0.24).
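As an illustration of the r² statistic quoted above, the following minimal Python sketch computes r² between two biallelic SNPs from the four haplotype counts. It is not the procedure used in this study (the analyses were run in R, and haplotype frequencies for unphased genotypes would first have to be estimated). The function name `ld_r2` and the counts in the usage lines are hypothetical.

```python
def ld_r2(n_ab, n_aB, n_Ab, n_AB):
    """Return r^2 between two biallelic loci from haplotype counts.

    Naming: the first letter is the allele at locus 1 (a/A), the second
    letter is the allele at locus 2 (b/B). Counts are assumed to come from
    phased or statistically inferred haplotypes (an assumption of this sketch).
    """
    total = n_ab + n_aB + n_Ab + n_AB
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total              # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total      # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total      # frequency of allele B at locus 2
    d = p_AB - p_A * p_B             # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Hypothetical counts: complete LD, then no LD.
complete = ld_r2(50, 0, 0, 50)
independent = ld_r2(25, 25, 25, 25)
```

With these made-up inputs, the first call corresponds to two SNPs whose alleles always co-occur and the second to two SNPs that segregate independently.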

Figure 1

Association analysis of SNPs across intron 2 of FGFR2 locus. (a) Association between breast cancer and SNPs in the FGFR2 region in the West Siberian population. (b) Linkage disequilibrium in the FGFR2 region in the HapMap CEU population and (c) the West Siberian control population.

Table 1 and Figure 1 show that a significant association was observed between the risk of BC and three SNPs: rs3135718 (G allele odds ratio (OR)=1.43, P=6 × 10⁻⁶), rs2981582 (T allele OR=1.46, P=2 × 10⁻⁶) and rs7895676 (C allele OR=1.28, P=1.7 × 10⁻³); this association was still highly significant when multiple testing corrections for the number of SNPs studied were made. On the basis of Akaike's information criterion, the additive model was the best for all three SNPs (Supplementary Table 2). The other four SNPs (rs4647913, rs3135715, rs2981428, rs10510097) did not show a significant association with BC (Table 1, Figure 1).
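To make the quantities in this paragraph concrete, the sketch below computes an allelic odds ratio with a Woolf (log-scale) confidence interval and a two-sided Wald P-value from a 2×2 table of allele counts, and applies a Bonferroni correction. This is a simplification: the reported estimates may come from a regression model, and the text does not say which multiple testing correction was used, so Bonferroni is an assumption here. The function names and the allele counts in the usage line are hypothetical.

```python
import math
from statistics import NormalDist


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, level=0.95):
    """Odds ratio, Woolf confidence interval and Wald P-value from allele counts."""
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) (Woolf's method)
    se = math.sqrt(1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    log_or = math.log(odds_ratio)
    ci = (math.exp(log_or - z_crit * se), math.exp(log_or + z_crit * se))
    # Two-sided P-value for H0: log(OR) = 0
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    return odds_ratio, ci, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical allele counts (not taken from Table 1).
odds_ratio, ci, p = allelic_odds_ratio(600, 400, 500, 500)
adjusted = bonferroni(p, 7)  # seven SNPs were tested in this study
```

Under a Bonferroni correction for seven SNPs, a P-value of 2 × 10⁻⁶ would become 1.4 × 10⁻⁵, which is consistent with the statement that the associations remain highly significant after correction.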

We estimated the frequencies of haplotypes comprising rs3135718, rs2981582 and rs7895676 in the case and control groups (Table 2). A statistically significant association was shown only for the GTT haplotype, which consists of the risk alleles of each locus identified in the single-locus analysis (Table 2). We then tested whether all three SNPs are necessary to explain the observed association. To do this, we contrasted models including each one of the significant SNPs, and each pair of SNPs, with the general model including all three SNPs (rs7895676, rs2981582 and rs3135718). Supplementary Table 3 shows that the model including only rs7895676 was significantly (P=0.003) worse than the model including all three SNPs, whereas the models including rs2981582 and rs3135718 (individually or together) did not show a significant deviation from the general model.
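One common way to contrast nested models of this kind is a likelihood ratio test; the sketch below shows the arithmetic, given the maximized log-likelihoods of a reduced and a general model. Whether the study used exactly this test is an assumption, as is the additive (one parameter per SNP) coding implied by the degrees of freedom in the usage lines. The function names and log-likelihood values are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch only implements df = 1 and df = 2")


def likelihood_ratio_test(loglik_reduced, loglik_general, df):
    """Return (statistic, P-value) for a reduced model nested in a general one."""
    # Clamp at zero to guard against tiny negative values from rounding.
    stat = max(2.0 * (loglik_general - loglik_reduced), 0.0)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods. With one parameter per SNP, a single-SNP
# model differs from the three-SNP model by df = 2, a two-SNP model by df = 1.
stat_single, p_single = likelihood_ratio_test(-980.0, -974.0, df=2)
stat_pair, p_pair = likelihood_ratio_test(-974.5, -974.0, df=1)
```

A small P-value indicates that the reduced model fits significantly worse than the general model, which is the pattern reported above for the model containing only rs7895676.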

Table 2 Frequency of FGFR2 intron 1 and intron 2 haplotypes in cases and controls and association with breast cancer risk

Discussion

In this work, we studied the association between seven FGFR2 SNPs, including rs2981582 and rs7895676, and BC risk in Russian cases and controls from the Altai region of Siberia, Russian Federation. The study was sufficiently powered to carry out a replication study; for example, the power to detect the association with rs2981582 at a P-value of 0.05 was 93%. Of the seven SNPs typed, four were present in HapMap. In our population, allelic frequencies were different from those observed in the European HapMap population, although the minor allele was always the same (Table 1). These differences may reflect the history of the Russian population, which geographically lies between Western Europe and Asia and is known to form the “eastern end” of the European geographic and genetic distributions.7 Although the LD pattern observed in our sample is similar to that observed in HapMap and other studies, the magnitude of LD in FGFR2 intron 2 generally seems to be lower in our population than in the HapMap European population.

Three SNPs, rs7895676[C] (OR=1.28 (1.12–1.43), P=1.7 × 10⁻³), rs2981582[T] (OR=1.46 (1.30–1.62), P=2 × 10⁻⁶) and rs3135718[G] (OR=1.43 (1.27–1.58), P=6 × 10⁻⁶), were significantly associated with BC in our study. In the study by Easton et al.,1 the SNP rs7895676, located at the downstream boundary of intron 2, had a stronger association with BC risk than rs2981582 (odds>100), which is located at the upstream boundary. In contrast, we found that in our sample rs2981582 explained the association much better than rs7895676 (odds>1000). If we hypothesize that the actual causative variant lies in the LD block that includes rs7895676 and rs2981582/rs3135718, the apparent contradiction may be explained by different LD structures in different populations. Furthermore, for rs2981582, the OR observed in the Russian population was significantly higher than that observed for Asians (P=0.02), but did not differ significantly (P=0.1) from the estimates obtained from European samples (see Supplementary Materials and methods). This finding may be explained by assuming that the LD structure in the Russian population resembles that of Europeans more closely than that of Asians, which is also consistent with recent studies of gene-geography.7
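A comparison of odds ratios between populations, such as the one described above, can be sketched as a z-test on the difference of log odds ratios. The actual procedure is described in the Supplementary Materials and methods and may differ; the approach, the function names and the numbers in the usage lines below are assumptions made for illustration only.

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Recover the standard error of log(OR) from a log-symmetric confidence interval."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def compare_odds_ratios(or_1, se_1, or_2, se_2):
    """Two-sided z-test for equality of two independent odds ratios.

    se_1 and se_2 are standard errors on the log(OR) scale.
    """
    z = (math.log(or_1) - math.log(or_2)) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical estimates for two populations (not values from this study).
se_a = se_from_ci(1.25, 1.70)
se_b = se_from_ci(1.10, 1.30)
z, p = compare_odds_ratios(1.46, se_a, 1.20, se_b)
```

The test assumes that the two estimates come from independent samples and that each log odds ratio is approximately normally distributed.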

Meyer et al.8 have recently shown that the haplotype marked by the minor allele of rs2981582 is associated with a higher level of FGFR2 transcription both in cell lines and in tumors. It was concluded that two SNPs, rs7895676 and rs2981578, located in the transcription factor binding sites (C/EBPb and Oct-1/Runx2), could affect the binding efficiency of these factors. The authors hypothesized that this could alter the transcription level of the corresponding allele and proposed that rs7895676 and rs2981578 are the functionally significant SNPs. Our study suggests that rs7895676 cannot explain all of the observed association. However, our results, although strong, are based on statistical evidence, and the functionality of rs7895676 cannot be excluded. Further functional studies are required to elucidate the actual causative variant located in intron 2 of the FGFR2 gene.

In summary, we have shown that polymorphisms within the LD block in intron 2 of the FGFR2 gene are associated with BC risk in Russian female residents of West Siberia, Russia. SNPs rs2981582 and rs3135718, located at the upstream boundary of intron 2, had a stronger (odds>1000) association with BC risk than rs7895676, which is located at the downstream boundary of the block. We conclude that rs7895676 is not likely to be the causative variant responsible for the association in the region. Our study underlines the utility of different populations, including the Russian population, for fine-mapping causative variants underlying the variation of complex traits.