Introduction

Breast cancer is the most common cancer among women worldwide.1 In the past decade, China’s urban cancer registries have documented increases of between 20 and 30% in breast cancer incidence rates.2, 3 Epidemiological investigation of breast cancer has identified a number of environmental and lifestyle risk factors, such as age at menarche, menopause and first birth, and exogenous hormone use.4, 5, 6 Furthermore, accumulating studies indicate that a substantial proportion of breast cancer is due to inherited susceptibility. Not only the known high-penetrance genes (e.g., BRCA1, BRCA2 and ATM) but also other lower-penetrance genes contribute to breast cancer risk.7

Recently, a genome-wide association study conducted in a European ancestry (EA) population by Stacey et al.8 identified two genetic susceptibility loci at chromosome 5p12, rs4415084 and rs10941679, that were associated with breast cancer risk. Later, another study led by Thomas et al.9 confirmed that the single-nucleotide polymorphism (SNP) rs4415084 might affect breast cancer risk in the EA population. By examining the HapMap data, we found that, in HapMap CEU (Utah residents with ancestry from northern and western Europe) samples, the two SNPs rs4415084 and rs10941679 resided in a linkage disequilibrium block (chr5:44666047...45078551) containing the MRPS30 gene, which is also known as PDCD9 (programmed cell death protein 9) and encodes a component of the mitochondrial ribosome. A few studies have reported that MRPS30 is implicated in apoptosis in mammalian cells and in estrogen receptor (ER)-positive breast tumors.10, 11, 12 Therefore, it seems plausible that genetic variations, such as SNPs on 5p12, that affect MRPS30 gene expression or protein function would be associated with the risk of breast cancer.

These results encouraged us to test whether SNPs on chromosome 5p12 also have a role in susceptibility to breast cancer in the Chinese population. Through the HapMap database, we found that the risk SNPs implicated in the EA population have different allele frequencies and linkage patterns in Chinese. Thus, we conducted a fine-mapping study in the susceptibility region of 5p12 identified by Stacey et al.8 in their genome-wide association study of breast cancer. We identified 10 SNPs with minor allele frequency ≥0.05 in the Chinese population and genotyped these SNPs in a case–control study with 878 cases and 900 controls.

Materials and methods

Study subjects

The case–control study was approved by the Institutional Review Board of Nanjing Medical University. A total of 878 breast cancer cases and 900 cancer-free controls were included. The cases were incident breast cancer patients and were consecutively recruited from the First Affiliated Hospital of Nanjing Medical University, the Cancer Hospital of Jiangsu Province and the Drum-tower Hospital, Nanjing, China, from January 2004 to April 2010. All the breast cancer cases were histopathologically confirmed without restrictions of age or histological type. Exclusion criteria included self-reported previous cancer history, metastasized cancer from other organs, and previous radiotherapy or chemotherapy. Cancer-free women controls, frequency matched to the cases on age (±5 years) and residential area (urban or rural), were randomly selected from a cohort of >30 000 participants in a community-based screening program for noninfectious diseases conducted in Jiangsu Province, China. All the subjects were genetically unrelated, ethnic Han Chinese women. After providing informed consent, each woman was personally interviewed face-to-face by trained interviewers using a pre-tested standard questionnaire to obtain information on demographic data, menstrual and reproductive history, environmental exposure history and family history of cancer in first-degree relatives (parents, siblings and children). After the interview, each subject provided 5 ml of venous blood. The ER and progesterone receptor status of breast cancer were determined from the results of immunohistochemistry examinations according to the medical records of the patients.

SNP selection and genotyping

Using information from the HapMap database (http://www.hapmap.org/, phase II, Nov 08) and the HaploView 4.2 software (Broad Institute, Cambridge, MA, USA), we applied a block-based tagging strategy to select tagging SNPs (minor allele frequency ≥0.05 in the Chinese population, Hardy–Weinberg equilibrium P≥0.05) on the basis of a pair-wise linkage disequilibrium r² threshold of 0.8 in this region (chr5:44666047...45078551). The two index SNPs from the genome-wide association study, rs4415084 and rs10941679, were forced into the set. Together, nine tagging SNPs (rs13357090, rs4415084, rs10941678, rs11951760, rs9790896, rs7703618, rs2067980, rs11742346 and rs4533894) were chosen. In addition, we added a common nonsynonymous SNP, rs3747479, in exon 1 of MRPS30. As a result, a total of 10 SNPs (rs13357090, rs4415084, rs10941678, rs3747479, rs11951760, rs9790896, rs7703618, rs2067980, rs11742346 and rs4533894) were included in our study.
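The pair-wise tagging idea can be illustrated with a minimal Python sketch. It is not the HaploView implementation: the function names, the haplotype counts and the simple greedy selection rule are assumptions made for illustration only. The first function computes r² between two biallelic SNPs from phased haplotype counts, and the second repeatedly picks the SNP that captures the most still-untagged SNPs at r² ≥ 0.8.

```python
def ld_r2(n_ab, n_aB, n_Ab, n_AB):
    """Pair-wise r^2 between two biallelic SNPs from phased haplotype counts.

    The four arguments are counts of the two-locus haplotypes
    (hypothetical inputs, not data from this study).
    """
    total = n_ab + n_aB + n_Ab + n_AB
    p_ab = n_ab / total
    p_a = (n_ab + n_aB) / total  # frequency of allele a at the first SNP
    p_b = (n_ab + n_Ab) / total  # frequency of allele b at the second SNP
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedy pair-wise tagging (an assumed, simplified selection rule).

    snps: list of SNP identifiers.
    r2: dict mapping frozenset({snp1, snp2}) to their pair-wise r^2;
        missing pairs are treated as r^2 = 0.
    """
    def captured_by(tag, pool):
        return {s for s in pool
                if s == tag or r2.get(frozenset((tag, s)), 0.0) >= threshold}

    remaining = list(snps)  # list order gives deterministic tie-breaking
    tags = []
    while remaining:
        # Pick the SNP that captures the most still-untagged SNPs.
        best = max(remaining, key=lambda t: len(captured_by(t, remaining)))
        tags.append(best)
        covered = captured_by(best, remaining)
        remaining = [s for s in remaining if s not in covered]
    return tags
```

In practice a tagging tool also has to honor forced-in SNPs and minor allele frequency filters, which this sketch leaves out.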

Genotypes were determined using the medium-throughput TaqMan OpenArray Genotyping Platform (Applied Biosystems Inc., Foster City, CA, USA), and genotyping was performed blind to case or control status. Two blank controls in each plate were used for quality control. Samples were analyzed with AutoCaller Software (Applied Biosystems Inc.). In addition, 96 samples were randomly selected from the OpenArray platform and re-genotyped using TaqMan allelic assays for the 10 SNPs, and the results were all consistent.

Statistical analysis

Differences between the cases and controls in demographic characteristics, selected variables and genotype frequencies of the 10 SNPs were evaluated using the Student’s t-test (for continuous variables) or the χ²-test (for categorical variables). Hardy–Weinberg equilibrium was tested by a goodness-of-fit χ²-test comparing the observed genotype frequencies with the expected ones among the control subjects. The associations between the genotypes of the 10 SNPs and the risk of breast cancer were estimated by computing odds ratios (ORs) and their 95% confidence intervals (CIs) from logistic regression analyses, with and without adjustment for age, age at menarche and menopausal status (natural menopause as one status, with unnatural menopause and premenopausal status merged into the other). Potential gene–environment interactions were also evaluated by logistic regression analyses and tested by comparing the change in deviance (−2 log likelihood) between the main-effects models with and without the interaction term. We used the PHASE 2.0 program (Manchester University, Manchester, UK) to infer haplotype frequencies based on the observed 5p12 genotypes. All statistical analyses were performed using the Statistical Analysis System software (9.1.3; SAS Institute, Cary, NC, USA). All tests were two-sided and the significance level was set at P<0.05.
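As an illustration of the Hardy–Weinberg goodness-of-fit test described above, the following Python sketch compares observed genotype counts with the counts expected from the estimated allele frequency. The function name and the example counts are hypothetical, and the analyses in this study were run in SAS, so this is only a sketch of the calculation under the usual assumption of one degree of freedom for a biallelic SNP.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes of a biallelic SNP
    (for example, among control subjects) and returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele a
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("test is undefined for a monomorphic SNP")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts, not taken from Table 2.
chi2_stat, p_val = hwe_chi2(100, 200, 100)
```

A P value above 0.05 from this test is read as no evidence of departure from Hardy–Weinberg equilibrium.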

Results

The characteristics of the 878 cases and 900 controls are summarized in Table 1. There were statistically significant differences between the cases and controls in the distribution of age at menarche, age at first live birth, menopausal status and age at natural menopause (P<0.01).

Table 1 Distributions of select variables in breast cancer cases and cancer-free controls

The genotype distributions of the 10 SNPs in cases and controls are shown in Table 2. The observed genotype frequencies of the 10 SNPs were all in Hardy–Weinberg equilibrium in the controls (P>0.05 for all SNPs). In single-locus analyses, the genotype distribution of SNP rs4533894 was significantly different between cases and controls (P=0.04). Multivariate logistic regression analyses revealed that the variant GG genotype of rs4533894 was associated with a significantly increased risk of breast cancer (adjusted OR=3.13, 95% CI=1.31–7.48), compared with the wild-type AA genotype. However, no overall significant associations with breast cancer risk were observed in our study population for the other nine SNPs, including the two risk SNPs (rs4415084 and rs10941679) previously identified in the EA population. Furthermore, the association between rs4533894 and breast cancer did not remain significant after correction for multiple comparisons.
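To make the multiple-comparison point concrete, the following Python sketch recovers an approximate Wald z statistic and two-sided P value from the reported OR and 95% CI for the rs4533894 GG genotype and compares it with a Bonferroni threshold for 10 SNPs. The correction method is not specified in the text, so the use of Bonferroni here is an assumption, as are the function name and the premise that the CI is symmetric on the log-odds scale.

```python
import math


def wald_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate Wald z and two-sided P value from an OR and its 95% CI.

    Assumes the CI was built as exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Reported estimate for rs4533894 GG versus AA.
z, p = wald_from_or_ci(3.13, 1.31, 7.48)

# Assumed Bonferroni threshold for 10 SNPs tested at alpha = 0.05.
bonferroni_alpha = 0.05 / 10
survives = p < bonferroni_alpha
```

Under these assumptions the recovered P value is approximately 0.01, which is below 0.05 but above the Bonferroni threshold of 0.005, in line with the statement that the association did not remain significant after correction.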

Table 2 Logistic regression analyses on associations between the 10 SNPs in 5p12 and risk of breast cancer

We further evaluated the effect of rs4533894 on breast cancer risk stratified by age, menopausal status (premenopausal and natural menopausal), age at menarche, age at first live birth and ER/progesterone receptor status on the basis of the dominant model. The increased risk associated with rs4533894 AG/GG was more pronounced among women with ER-negative tumors (OR=1.51, 95% CI=1.06–2.16; data not shown). Interestingly, stratified analyses of the other nine SNPs showed that the variant genotypes of rs4415084 (TC/CC) and rs11742346 (CT/TT) were significantly associated with breast cancer risk among subjects with younger age at menarche (rs4415084 OR=0.67, 95% CI=0.46–0.97, P for heterogeneity=0.04; rs11742346 OR=1.52, 95% CI=1.08–2.15, P for heterogeneity=0.02; data not shown). We conducted interaction analyses between all 10 SNPs and four nongenetic factors (age, menopausal status, age at menarche and age at first live birth); however, no significant interaction was observed after correction for multiple comparisons (data not shown).

We also conducted haplotype-based risk assessment for these 10 SNPs. Linkage disequilibrium among the 10 SNPs is shown in Supplement Table 1. However, no association was observed between the haplotypes and risk of breast cancer (Supplement Table 2).

Discussion

In the present case–control study, consisting of 878 breast cancer cases and 900 controls, we examined the effect of 10 SNPs (rs13357090, rs4415084, rs10941678, rs3747479, rs11951760, rs9790896, rs7703618, rs2067980, rs11742346 and rs4533894) at 5p12 on breast cancer risk in the Chinese population. No significant associations were found between these 10 SNPs and breast cancer risk after adjustment for multiple testing.

Recently, Stacey et al.8 identified two low-penetrance loci, rs4415084 (T>C) and rs10941679 (A>G), on 5p12 conferring susceptibility to ER-positive breast cancer in the EA population. Subsequently, the effect of SNP rs4415084 on breast cancer risk in the EA population was confirmed by Thomas et al.,9 but that of SNP rs10941679 was not. In addition, the associations between these two SNPs and breast cancer risk were also investigated in African Americans.13, 14 For example, SNP rs10941679 was evaluated in a recent study of African Americans and showed no association with breast cancer risk in a combined group of 810 cases and 1784 controls from two separate studies conducted in the Southern United States.13 An independent case–control study with 886 cases and 1089 controls conducted in African American women14 reported a borderline significant association of rs4415084 with overall risk of breast cancer (per-allele OR=1.13, 95% CI=0.99–1.28). Furthermore, a fine-mapping analysis of 5p12 from the Shanghai Breast Cancer Study conducted by Long et al.15 reported that rs10941679 showed a null association with breast cancer risk, which was consistent with our finding.

The biological mechanism through which genetic variations in 5p12 influence breast cancer risk remains unclear. We examined the four SNPs (rs4415084, rs10941679, rs11742346 and rs4533894) using the UCSC genome browser database (build 36 assembly, hg18; Figure 1). As shown in Figure 1, the only gene in this region is MRPS30 (also known as PDCD9, programmed cell death protein 9), which encodes a component of the small subunit of the mitochondrial ribosome and has been implicated in apoptosis.10, 11 Some studies have shown that MRPS30 is not expressed in normal breast luminal epithelial cells, but is upregulated in infiltrating ductal carcinomas.16 Moreover, it was also part of a gene expression profile that differentiated ER-positive from ER-negative breast tumors.11 Interestingly, SNP rs3761648, in the 5' flanking region of MRPS30, is highly correlated with SNP rs4415084 (r²=0.83 in HapMap CHB, r²=0.72 in HapMap CEU) and is located at a site of H3K4Me3 histone modification marks. Therefore, we speculate that rs4415084 might be a proxy for potentially functional SNPs, such as rs3761648, that influence the expression of MRPS30 and, in consequence, cancer risk.

Figure 1

Overview of the linkage disequilibrium (LD) block and the four breast cancer susceptibility SNPs (rs4415084, rs10941679, rs11742346 and rs4533894) and rs3761648, which was highly correlated with SNP rs4415084. (a) A view of the genomic region of 5p12 (44 666 047–45 078 551) from the UCSC browser build 36 assembly (hg18). (b) The analyzed LD block, its flanking region, related genes, ENCODE enhancer- and promoter-associated histone mark (H3K4Me1) on eight cell lines (Gm12878, H1ES, HMEC, HSMM, HUVEC, K562, NHEK and NHLF) and ENCODE promoter-associated histone mark (H3K4Me3) on nine cell lines (Gm12878, H1ES, HepG2, HMEC, HSMM, HUVEC, K562, NHEK and NHLF). A full color version of this figure is available at the Journal of Human Genetics journal online.

In conclusion, in this fine-mapping study of the associations between SNPs at 5p12 and breast cancer risk in the Chinese population, none of the genetic variants on 5p12 was independently associated with risk of breast cancer. Larger, well-designed epidemiological studies as well as functional evaluations are warranted to confirm our findings.