Introduction

Breast cancer (BC), a substantial global public health concern, is one of the most common malignancies diagnosed in women, with an estimated 1.4 million new cases and over 450 000 deaths annually worldwide.1 BC is primarily a hormone-dependent disease that can be regulated by the status of steroid hormones such as estrogen and progesterone.2 BC can be divided into five subtypes, which vary in their treatment options and survival outcomes based on gene expression profiles.3, 4, 5 Among the five subtypes of BC, estrogen receptor (ER)-positive and progesterone receptor (PR)-positive tumors account for ~70% of all cases.6, 7

Genetic factors have an important role in the etiology of both sporadic and familial BC.8, 9 High-penetrance BC susceptibility genes, such as BRCA1 and BRCA2, account for only a small proportion of BC in the general population because of their low mutation rates.10 To date, genome-wide association studies (GWAS) in populations of European, African-American and East-Asian descent have identified common variants associated with BC risk at multiple genetic loci.11, 12, 13, 14, 15, 16, 17, 18 Moreover, accumulated epidemiologic data suggest that associations with gene polymorphisms are substantially heterogeneous across breast tumor subtypes defined by hormone receptor status.12, 15, 17, 19, 20, 21, 22 Therefore, detailed stratification of tumors may deepen our understanding of BC etiology, facilitate the discovery of novel risk factors and potentially enable risk prediction for specific tumor types. Although these observations indicate that inherited risk variants may differ among BC subtypes, the subtype-specific susceptibility conferred by some variants has not been well investigated in the Chinese population.

In this study, we selected 23 GWAS-identified single-nucleotide polymorphisms (SNPs) to investigate and verify their putative association with BC and with specific tumor subtypes defined by ER and PR status in Chinese women. We conducted an association analysis in a case–control study of women from northwest China (Shaanxi province). Our data provided considerable evidence for associations between common SNPs and the overall risk of BC as well as tumor subtypes. These results may eventually help to improve the prevention, early detection and treatment of BC.

Subjects and methods

Subjects

In this case–control study, all participants were Chinese women. A total of 551 unrelated subjects with BC (mean age, 49±11 years) were recruited from the First Affiliated Hospital of the Medical College of Xi'an Jiaotong University from January 2011 to November 2014. Among the BC cases, there were 292 patients with ER-positive tumors and 136 patients with ER-negative tumors; additionally, there were 247 patients with PR-positive tumors and 180 patients with PR-negative tumors. BC was defined according to the patient’s surgical and pathological findings, and disease information was obtained from medical files.

In total, 577 healthy blood donors (mean age, 49±8 years) were recruited from among Han Chinese women living in the city of Xi’an and its surrounding areas. The controls had no history of cancer and were matched to the patients for age and ethnicity. Additionally, we selected participants with a body mass index (BMI = weight (kg)/height² (m²)) in the normal range of 18.5–24.9 in both the case and control groups.

Body size is an important modifiable risk factor for BC. The participants were not genetically related within three generations. Written informed consent was obtained from all participants, and the study protocol was approved by the Ethical Committee of the Medical College, Xi’an Jiaotong University.

SNP selection and genotyping

In all, 23 SNPs with minor allele frequencies greater than 0.05 were selected from GWAS based on a review of published literature and a search of HapMap and dbSNP (Han Chinese population).11, 12, 16, 23, 24, 25, 26, 27, 28, 29 Table 2 lists the 23 SNPs that were selected and outlines other relevant characteristics. Genomic DNA was extracted from peripheral blood using the Qiagen Blood Kit (Qiagen, Chatsworth, CA, USA) according to the manufacturer’s protocol. Amplification and extension primers were designed using MassARRAY® Assay Design 3.0 software (Sequenom, San Diego, CA, USA). SNP genotyping was performed using matrix-assisted laser desorption ionization-time of flight (MassARRAY system, Sequenom Inc.) mass spectrometry. Genotype calling was performed in real time with the MassARRAY RT software version 3.0.0.4 and analyzed using the MassARRAY Typer software version 3.4 (Sequenom). The experimenters were blinded to the case/control status of the samples.

Statistical analysis

Differences between cases and controls in demographic characteristics, including age and BMI, were evaluated by Student’s t-test. The allele and genotype frequencies for each SNP were compared, and Hardy–Weinberg equilibrium was evaluated among the controls using the Chi-squared (χ2) test. Associations between SNPs and BC were assessed by the Pearson χ2 test or Fisher’s exact test. Four genetic models (codominant, dominant, recessive and additive) were applied using PLINK software (http://pngu.mgh.harvard.edu/purcell/plink/) to assess the association of each locus with the risk of BC. Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were used to determine the best-fitting model for each SNP. These measures weigh the complexity of a model against its goodness of fit to the data; the model with the smallest AIC or BIC value is preferred. In stratified analyses, case patients were classified into subgroups of ER-positive, ER-negative, PR-positive and PR-negative tumor types. We further assessed the association of the genotypes of these SNPs with the risk of tumor subtypes under the same four genetic models. Unconditional logistic regression with adjustments for age and BMI was used to calculate the odds ratio (OR) and 95% confidence interval (CI) for the independent association between each SNP and overall BC risk as well as subtype risk. All P-values were two-sided, and P<0.05 was the threshold for statistical significance. Statistical analyses were performed using Microsoft Excel and SPSS 17.0 software (SPSS Inc., Chicago, IL, USA).
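As an illustration of two of the tests named above, the following minimal Python sketch computes a Hardy–Weinberg χ2 test (1 degree of freedom) from genotype counts and an allelic OR with a Woolf-type 95% CI from a 2×2 table of allele counts. It is not the authors' PLINK/SPSS workflow; the function names and all counts are hypothetical and chosen only for demonstration, and the sketch assumes every count is non-zero.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts (AA, AB, BB), e.g. among controls.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-squared variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def allelic_odds_ratio(case_a, case_b, ctrl_a, ctrl_b, z=1.96):
    """Allelic odds ratio with a Woolf (log-scale) confidence interval.

    case_a/ctrl_a: counts of the tested allele; case_b/ctrl_b: the other allele.
    """
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    se_log_or = math.sqrt(1 / case_a + 1 / case_b + 1 / ctrl_a + 1 / ctrl_b)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study
chi2, p_hwe = hwe_chi_square(300, 230, 47)
or_, lo, hi = allelic_odds_ratio(400, 700, 500, 650)
print(f"HWE chi2={chi2:.3f}, P={p_hwe:.3f}")
print(f"OR={or_:.3f}, 95% CI={lo:.3f}-{hi:.3f}")
```

An OR below 1 with a CI that excludes 1 would correspond to a protective allele, and an OR above 1 to a risk allele. Unlike this unadjusted calculation, the logistic regression estimates in the study were adjusted for age and BMI.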

Results

Subject characteristics

Basic characteristics of the 551 case patients and 577 healthy controls are presented in Table 1. Age (age at diagnosis for cases and age at recruitment for controls) was similarly distributed among BC cases (overall BC and subtypes defined by ER/PR status) and controls. However, compared with controls, overall cases, ER-positive cases and PR-positive cases each showed a significantly different BMI distribution (Student’s t-test), whereas no significant differences in BMI were found between the remaining case groups and controls.

Table 1 Basic characteristics of the cases and controls

Association between SNPs and overall BC risk

In the current study, 23 SNPs were genotyped, and the average SNP call rate was 99.78% (range, 98.16–100%) in both cases and controls. Among them, SNPs rs13281615 and rs2380205 deviated from Hardy–Weinberg equilibrium (P<0.05; Table 2). We analyzed the association between the SNPs and overall susceptibility to BC using the χ2 test and found four significant SNPs (rs616488, rs6678914, rs17530068 and rs6001930) at the 5% level. The frequencies of the C allele in rs616488 (P=0.0003, odds ratio (OR)=0.722, 95% confidence interval (CI)=0.605–0.861) and the A allele in rs6678914 (P=0.029, OR=0.796, 95% CI=0.648–0.977) were significantly lower in BC cases than in controls (Table 2). The rs17530068 C allele frequency (P=0.009, OR=1.289, 95% CI=1.064–1.561) and the rs6001930 C allele frequency (P=0.041, OR=1.209, 95% CI=1.008–1.450) were significantly higher in BC cases than in controls (Table 2). Seventeen additional SNPs yielded negative results (Table 2). To reduce the potential for spurious findings due to multiple testing, a strict Bonferroni correction was applied; only one SNP (rs616488) remained significant, whereas SNPs rs6678914, rs6001930 and rs17530068 did not (Table 2).
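The Bonferroni step can be sketched as follows. This is a minimal illustration that assumes the correction was made for all 23 genotyped SNPs (the exact number of tests used is not stated in the text); the P-values are the allelic P-values reported above, and the function name is hypothetical.

```python
def bonferroni_significant(p_values, n_tests, alpha=0.05):
    """Return, for each SNP, whether its P-value passes the Bonferroni threshold."""
    threshold = alpha / n_tests  # per-test significance level
    return {snp: p < threshold for snp, p in p_values.items()}


# Allelic P-values reported in the text for the four nominally significant SNPs
allelic_p = {
    "rs616488": 0.0003,
    "rs6678914": 0.029,
    "rs17530068": 0.009,
    "rs6001930": 0.041,
}

# Assumption: 23 tests, one per genotyped SNP
result = bonferroni_significant(allelic_p, n_tests=23)
for snp, passed in result.items():
    print(snp, "significant" if passed else "not significant")
```

With 23 tests the per-test threshold is 0.05/23, or about 0.0022, so only rs616488 (P=0.0003) passes, in agreement with the result stated above.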

Table 2 Basic information of candidate SNPs in this study

We then analyzed the genotype effects of these SNPs by unconditional logistic regression with adjustments for age and BMI under four different genetic models (codominant, dominant, recessive and additive). The four significant SNPs (rs616488, rs6678914, rs17530068 and rs6001930) are shown in Table 3. The SNP rs616488 C allele had a protective effect against BC (P=0.0015 for the codominant model, P=0.0016 for the dominant model, P=0.0077 for the recessive model and P=3 × 10−4 for the additive model). For SNP rs6678914, the A allele also conferred a protective effect against BC (P=0.028 for the additive model). In contrast, the SNP rs17530068 C allele increased BC risk (P=0.026 for the codominant model, P=0.028 for the dominant model, P=0.035 for the recessive model and P=0.009 for the additive model). Likewise, the SNP rs6001930 C allele increased disease risk (P=0.037 for the dominant model and P=0.043 for the additive model).
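For readers unfamiliar with the four genetic models, the sketch below shows one conventional way to code a genotype as a regression predictor under each model. It is an illustrative assumption about the coding, not a reproduction of the PLINK or SPSS implementation used in the study, and the function name is hypothetical.

```python
def encode_genotype(genotype, effect_allele, model):
    """Code a two-allele genotype string (e.g. "CT") under a genetic model.

    additive   -> number of effect alleles (0, 1 or 2)
    dominant   -> 1 if at least one effect allele is present, else 0
    recessive  -> 1 only if both alleles are the effect allele, else 0
    codominant -> two indicator variables: (heterozygote, effect-allele homozygote)
    """
    count = genotype.count(effect_allele)
    if model == "additive":
        return count
    if model == "dominant":
        return int(count >= 1)
    if model == "recessive":
        return int(count == 2)
    if model == "codominant":
        return (int(count == 1), int(count == 2))
    raise ValueError(f"unknown model: {model}")


# Example: coding three genotypes with C as the effect allele
for gt in ("TT", "CT", "CC"):
    codes = {m: encode_genotype(gt, "C", m)
             for m in ("additive", "dominant", "recessive", "codominant")}
    print(gt, codes)
```

The coded values would then enter a logistic regression alongside age and BMI; competing models for the same SNP can be compared by AIC or BIC as described in the Statistical analysis section.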

Table 3 Relationship between significant SNPs and overall breast cancer risk (adjusted by age+BMI)

Associations between SNPs and the risk of tumors defined by ER status

As listed in Table 4, 292 patients had ER-positive tumors and 136 patients had ER-negative tumors. Three SNPs (rs616488, rs17530068 and rs6001930) exhibited significant associations with ER-positive tumor risk. The SNP rs616488 C allele (P=0.014 for the codominant model, P=0.007 for the dominant model and P=0.003 for the additive model) decreased the risk of ER-positive tumors, whereas the SNP rs17530068 C allele (P=0.039 for the additive model) and the rs6001930 C allele (P=0.016 for the dominant model and P=0.019 for the additive model) increased it. In contrast, only one SNP (rs3817198) exhibited a significant association with ER-negative tumor risk: the rs3817198 C allele (P=0.024 for the recessive model) increased the risk of ER-negative tumors.

Table 4 Relationship between significant SNPs and ER status tumor risk (adjusted by age + BMI)

Associations between SNPs and the risk of tumors defined by PR status

Table 5 shows that 247 patients had PR-positive tumors and 177 patients had PR-negative tumors. Three SNPs (rs616488, rs4784227 and rs6001930) had significant associations with PR-positive tumor risk. The SNP rs616488 C allele (P=0.015 for the codominant model, P=0.013 for the dominant model, P=0.026 for the recessive model and P=0.004 for the additive model) decreased risk, whereas the SNP rs4784227 T allele (P=0.037 for the dominant model) and the rs6001930 C allele (P=0.011 for the codominant model, P=0.003 for the dominant model and P=0.004 for the additive model) increased risk. In contrast, only one SNP (rs17530068) exhibited a significant association with PR-negative tumor risk: the rs17530068 C allele (P=0.024 for the dominant model and P=0.018 for the additive model) increased the risk.

Table 5 Relationship between significant SNPs and PR status tumor risk (adjusted by age + BMI)

Discussion

In this study, we analyzed associations between 23 GWAS-identified SNPs and BC, as well as its subtypes, in Chinese women using matrix-assisted laser desorption ionization-time of flight mass spectrometry. Among the 23 SNPs, 4 (rs616488, rs6678914, rs17530068 and rs6001930) were significantly associated with overall BC risk. Stratified analyses showed that rs616488 and rs6001930 were associated with ER-positive and PR-positive tumors, rs17530068 with ER-positive and PR-negative tumors, rs3817198 with ER-negative tumors and rs4784227 with PR-positive tumors. Overall, these findings provide evidence of genetic susceptibility to overall BC and to BC subtypes.

The C allele of rs616488 (1p36.22/PEX14) and the A allele of rs6678914 (1q32.1/LGR6) were protective factors for BC in the current study. PEX14 is one of ~15–20 genes implicated in the biogenesis of mammalian peroxisomes. Peroxisomes are part of several metabolic pathways; they perform oxidative reactions catalyzed by amino-acid oxidases and catalases, most notably the β-oxidation of very long chain fatty acids, which are then fully degraded in mitochondria.30, 31 The PEX gene fragment codes for the degradation of the C-terminus of matrix metalloproteinase-2, and studies have shown that this fragment can inhibit matrix metalloproteinase-2 extracellular matrix degradation and tumor angiogenesis.32 Extracellular matrix degradation and tumor angiogenesis have a vital role in tumor cell growth, invasion and metastasis. In vivo, the PEX gene can inhibit tumor cell growth. The SNP rs6678914, on chromosome 1q32.1, is located in intron 1 of the LGR6 gene, which is expressed in breast tumors along with several other genes in this region, including UBE2T and PTPN7.33 Previous studies have reported that the SNP rs616488 has a statistically significant association with BC risk in women of European12 and East-Asian18 descent but not in African-American women,14 which suggests that there are large differences in genetic architecture between the African-ancestry genome and the genomes of Asians and Europeans. Interestingly, our subtype analyses showed that the C allele of this locus was a protective factor for both ER-positive and PR-positive cancers. However, Zheng et al.18 found that this SNP had no statistically significant association with BC defined by ER status in women of East-Asian descent. This contradiction requires further investigation and validation in a larger population. As for rs6678914, Garcia-Closas et al.12 confirmed that this SNP was associated with ER-negative but not ER-positive BC in populations of European ancestry. Subsequently, Sawyer et al.34 suggested that rs6678914 was more strongly associated with lobular carcinoma in situ than with invasive lobular breast cancer. Together with our findings, these reports support an association between LGR6 rs6678914 and BC susceptibility; however, our stratified analysis of this SNP was not significant, possibly because the sample size was relatively small.

In this study, the C alleles of the SNPs rs17530068 (6q14/unknown) and rs6001930 (22q13.1/MKL1) were risk factors for BC. The SNP rs17530068 at 6q14 is located in a gene desert with no evidence of an open/active regulatory region in human microvascular endothelial cells (HMEC). The closest gene (262 kb), family with sequence similarity 46, member A (FAM46A/C6orf37), encodes a protein of unknown function. The FAM46A gene is located at chromosome 6q14.1 and was first identified and cloned from human retina tissue as a retinal disease candidate gene.35 Studies showed that the FAM46A gene is expressed in ameloblast nuclei of developing teeth and hypothesized that it might act together with morphogenetic factors involved in cell proliferation, apoptosis and differentiation activities in tooth buds, and perhaps in enamel production.36 Our data suggest that carriers of the rs17530068 C allele are at increased risk of BC. On the basis of these observations, we hypothesize that the FAM46A gene may be involved in BC. However, there might be other biological pathways that have not yet been reported, and further functional studies will be necessary to elucidate the precise role of the FAM46A gene. A previous meta-analysis of GWAS in women of Japanese, Latino and European descent showed that rs17530068 was associated with BC, and with both ER-positive and ER-negative disease.16 In East-Asian women (Chinese, Korean and Japanese), Zheng et al.18 found that SNP rs6001930 had no statistically significant association with BC, but subgroup analysis revealed that rs6001930 had a stronger association with ER-positive than with ER-negative cancer. These results were partially consistent with ours. To our knowledge, our data are the first to demonstrate a relationship between rs17530068 and PR-negative tumor risk and between rs6001930 and PR-positive tumor risk in the Chinese population.

Most intriguingly, although no overall association with BC was found for rs3817198 (11p15.5/LSP1) and rs4784227 (16q12.1/CASC16), analyses by ER/PR status revealed a statistically significant association between rs3817198 and ER-negative tumors, as well as between rs4784227 and PR-positive tumors. In contrast to our results, rs3817198 showed stronger associations with ER-positive than with ER-negative tumors in populations of European ancestry.19 Located at 16q12.1,28 rs4784227 has been predicted to interfere with the affinity of FOXA1, an essential component of ESRα signaling,37 for its binding site.38 Its position in a regulatory region that interacts with the TOX3 promoter enables it to disrupt the expression of this gene, which in turn alters chromatin structure and DNA-protein binding patterns essential for cell survival.28 Although rs4784227 had a statistically significant association with BC risk in European and Asian women, Lin et al.39 did not find a correlation between rs4784227 and PR-positive tumors in a stratified interaction analysis of the Chinese population. Because the evidence for these associations was not very strong, additional analyses involving a much larger number of BC patients are needed to independently confirm them and to assess whether the risks vary by tumor subtype.

Population stratification is an important factor to consider when conducting human genetic surveys.40 In the current study, cases and controls were matched for ethnicity by enrolling subjects from a homogeneous population. In addition, our results were in line with the hypothesis that ‘different biological mechanisms underlie diverse BC subtypes’, and indicated that tumor stratification might help in the identification of novel susceptibility markers for diverse BC subtypes.41 Our results may eventually help to improve BC prevention, early detection and treatment. However, data on ER and PR status were available for only a portion of the subjects. Therefore, the statistical power of our study was limited in the stratified analyses because of the small sample sizes, and some of the null associations observed in this study could be due to inadequate statistical power.

In summary, our study provided new evidence for the relationship between SNPs and the risk of overall BC as well as of the subtypes defined by ER and PR status in Chinese women. Our results shed light on the heterogeneity of tumor subtypes defined by protein expression of ER and PR, and the SNPs we detected may have potential applications in clinical diagnosis. However, the exact biological mechanisms by which these polymorphisms influence overall BC risk and the risk of specific tumor subtypes require subsequent functional studies.