CHST9 rs1436904 genetic variant contributes to prognosis of triple-negative breast cancer

Triple-negative breast cancer (TNBC) is an aggressive histological subtype of breast cancer with high heterogeneity and poor prognosis after standard therapy. The lack of a clearly established molecular mechanism driving TNBC progression makes personalized therapy more difficult. Thus, identification of genetic variants associated with TNBC prognosis would have clinical significance for individualized treatment. Our study aimed to evaluate the prognostic value of the genome-wide association study (GWAS)-identified CHST9 rs1436904 and AQP4 rs527616 genetic variants in our established early-stage TNBC sample database. Cox regression was used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). The CHST9 rs1436904 G allele was significantly associated with decreased disease-free survival time (DFS) (8.5 months shorter in GG genotype carriers compared to TT genotype carriers, HR = 1.70, 95% CI = 1.03–2.81, P = 0.038). Stratified analyses showed an increased risk of cancer progression in CHST9 rs1436904 G allele carriers who had larger tumors (tumor size > 2 cm), had no lymph-node metastasis, were premenopausal at diagnosis, or had vascular invasion (P = 0.032, 0.017, 0.008 and 0.003, respectively). Our findings demonstrate that the GWAS-identified 18q11.2 CHST9 rs1436904 polymorphism significantly contributes to the prognosis of early-stage TNBC, suggesting its clinical potential in screening TNBC patients at high risk of recurrence and in supporting patient-tailored therapeutic decisions.

Breast cancer is one of the most common malignancies worldwide, with an estimated 255,180 new cases and 41,070 deaths in the United States in 2017 1 . Notably, the incidence and mortality of breast cancer have increased markedly in developing countries, including China. Breast cancer accounts for 15% of all new cancer cases among Chinese women 2 . As with other solid tumors, the development of breast cancer is a chronic, multistep process involving the accumulation of genetic and epigenetic alterations 3,4 . Risk factors for breast cancer include obesity, lack of physical exercise, alcohol abuse, hormone replacement therapy during menopause, ionizing radiation, early age at first menstruation, having children late or not at all, older age, and family history [3][4][5][6][7][8][9] . Triple-negative breast cancer (TNBC) is a type of breast cancer that does not express estrogen receptor (ER), progesterone receptor (PR) or Her2/neu [10][11][12][13][14] . TNBC patients usually have relatively poor outcomes because of the intrinsically aggressive behavior of the disease, and they require combination therapies instead of common chemotherapies because of the loss of target receptors. Thus, more effective and sensitive prognostic markers are urgently needed to guide the clinical management of TNBC precisely.
Many genetic variants have been reported to contribute to breast cancer risk since the discovery of BRCA1 and BRCA2 in the 1990s [15][16][17][18][19][20][21][22] . These heritable genetic variants are associated with familial breast cancer. Thus, identifying new genetic variants is of great significance for breast cancer prediction and treatment. Recent genome-wide analyses based on large consortia help to avoid false-positive identification of candidate genes 23 . Two 18q11.2 genetic variants (CHST9 rs1436904 and AQP4 rs527616) were identified as novel breast cancer susceptibility loci based on GWAS 24 . A nested case-control study of female Chinese patients within the Singapore Chinese Health Study was also performed to verify the roles of these SNPs. It demonstrated that adding genetic variants to conventional risk factors improved breast cancer risk prediction in Chinese women 25 , but it remains unclear whether CHST9 rs1436904 and AQP4 rs527616 affect the prognosis of TNBC. To test this, we conducted a hospital-based cohort study of early-stage TNBC to further clarify the role of these two genetic variants in breast cancer progression. We found that the CHST9 rs1436904 polymorphism might be a potential prognostic biomarker for early-stage TNBC, especially in patients with larger tumors (tumor size > 2 cm), without lymph-node metastasis, who were premenopausal at diagnosis, or with vascular invasion.

Study subjects. A total of 381 TNBC patients were recruited between January 2008 and December 2015 at Cancer Hospital, Chinese Academy of Medical Sciences (Beijing, China). These patients were followed until May 6, 2016 in order to collect data on clinicopathological characteristics, treatments, and vital status, such as recurrence and death. Disease-free survival (DFS) was defined as the time from the date of diagnosis until the date of the first locoregional recurrence, first distant metastasis, or death due to any cause. Patients known to be alive with no evidence of disease progression were censored at the last follow-up date or on May 6, 2016 (whichever came first). All subjects were ethnic Han Chinese. According to the histopathological data, the majority of recruited TNBC patients had early-stage disease. Informed consent was obtained from each subject at recruitment. This study was approved by the Institutional Review Boards of Cancer Hospital, Chinese Academy of Medical Sciences and Shandong Cancer Hospital affiliated to Shandong University.
Immunohistochemistry (IHC) of formalin-fixed, paraffin-embedded breast cancer tissue samples obtained from the patients was used to evaluate ER and PR status with anti-ER and anti-PR antibodies. A positive ER or PR status was defined by nuclear staining of more than 1%, based on the 2010 guidelines of the American Society of Clinical Oncology (ASCO) and the College of American Pathologists (CAP). HER2 status was determined by IHC or by gene amplification testing using fluorescence in situ hybridization (FISH). Tumors negative for ER, PR, and HER2 were defined as TNBCs.
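The DFS definition and censoring rule given under Study subjects can be illustrated with a minimal Python sketch. The function name `dfs_months`, the conversion of 30.4375 days per month, and the choice to treat an event recorded after the cutoff as censored are assumptions made for illustration; the paper does not specify these details.

```python
from datetime import date
from typing import Optional, Tuple

# End of follow-up stated in the text.
STUDY_CUTOFF = date(2016, 5, 6)

# Assumed conversion factor (average days per month); not given in the paper.
DAYS_PER_MONTH = 30.4375


def dfs_months(diagnosis: date,
               first_event: Optional[date],
               last_follow_up: date,
               cutoff: date = STUDY_CUTOFF) -> Tuple[float, bool]:
    """Return (DFS in months, event_observed) for one patient.

    first_event is the earliest of locoregional recurrence, distant
    metastasis, or death from any cause; None if none was recorded.
    """
    if first_event is not None and first_event <= cutoff:
        # Event observed within the follow-up window.
        end, observed = first_event, True
    else:
        # Censored at last follow-up or study cutoff, whichever came first.
        end, observed = min(last_follow_up, cutoff), False
    return (end - diagnosis).days / DAYS_PER_MONTH, observed


if __name__ == "__main__":
    # Hypothetical patients, for illustration only.
    print(dfs_months(date(2010, 1, 1), date(2012, 1, 1), date(2015, 1, 1)))
    print(dfs_months(date(2014, 1, 1), None, date(2017, 1, 1)))
```

Each patient then contributes a pair (time, event indicator), which is the input that Kaplan-Meier and Cox analyses expect.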
Statistics. Differences in patient clinical characteristics were assessed using Student's t test or the χ² test. DFS was calculated as the time from the date of diagnosis to progression or to death without progression. Survival distributions were estimated with the Kaplan-Meier method and compared using the log-rank test. A multivariate Cox proportional hazards model was applied to estimate the effects of prognostic factors on DFS, adjusting where appropriate for well-established clinical factors, including age of onset, body mass index (BMI), tumor size, lymph-node metastasis, histological type, histological grade, menopausal status, vascular invasion, breast or ovarian cancer history, surgical method, taxane/anthracycline-based chemotherapy and radiotherapy. The reference categories for the multivariate analyses were: no family history of breast or ovarian cancer (breast or ovarian cancer history); postmenopausal at diagnosis (menopausal status); modified radical mastectomy (surgical method); histological grade I (histological grade); no vascular invasion (vascular invasion); tumor size ≤ 2 cm (tumor size); no lymph-node involvement (lymph-node involvement); no chemotherapy (taxane/anthracycline-based chemotherapy); and no radiotherapy (radiotherapy). A P value of less than 0.05 was used as the criterion of statistical significance. All statistical procedures were conducted using SPSS software (version 16.0).
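As an illustration of the Kaplan-Meier estimate and the log-rank comparison named above, the following is a minimal pure-Python sketch. It is not the SPSS procedure used in the study; the function names and the toy data are hypothetical, and the Cox model is not reproduced here.

```python
import math
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(event time, survival probability just after that time)]."""
    curve = []
    surv = 1.0
    # Survival only changes at times where at least one event occurred.
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - failed / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times_a: Sequence[float], events_a: Sequence[bool],
                 times_b: Sequence[float], events_b: Sequence[bool]
                 ) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square with 1 df, P value)."""
    pooled = ([(t, e, 0) for t, e in zip(times_a, events_a)] +
              [(t, e, 1) for t, e in zip(times_b, events_b)])
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e, _ in pooled if e}):
        n = sum(1 for u, _, _ in pooled if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in pooled if e and u == t)           # events, both groups
        d_a = sum(1 for u, e, g in pooled if e and u == t and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Toy data (months, event observed); not taken from the study.
    print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
    print(logrank_test([1, 2, 3], [True] * 3, [4, 5, 6], [True] * 3))
```

Censored patients (event indicator False) stay in the risk set until their censoring time but never reduce the survival estimate themselves, which is why the censoring rule matters for the DFS figures reported below.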

Comparison of survival according to baseline characteristics of TNBC patients. To test whether various clinical characteristics contribute to DFS, patients were grouped according to age of onset, BMI, tumor size, lymph-node metastasis, histological type, histological grade, menopausal status, vascular invasion, breast or ovarian cancer history, surgical method, taxane/anthracycline-based chemotherapy and radiotherapy, respectively. DFS was compared between (or among) the resulting subgroups. As shown in Supplementary Table 2, BMI, histological grade, vascular invasion, lymph-node metastasis and radiotherapy were each significantly associated with DFS (P < 0.05), whereas the other baseline characteristics were not (P > 0.05). After adjustment for other clinicopathologic factors, only BMI and vascular invasion showed statistically significant impacts on patient prognosis (Supplementary Table 2).
Effects of CHST9 rs1436904 and AQP4 rs527616 polymorphisms on TNBC DFS. It has been demonstrated that CHST9 rs1436904 and AQP4 rs527616 are breast cancer susceptibility single nucleotide polymorphisms (SNPs) 24,25 . However, the role of these two genetic variants in the outcome of TNBC patients has not been examined. Genotype frequencies of the CHST9 rs1436904 and AQP4 rs527616 SNPs among patients are summarized in Table 1. Interestingly, only the CHST9 rs1436904 polymorphism was significantly associated with DFS of TNBC patients. The mean DFS of TNBC patients with the CHST9 rs1436904 GG genotype (46.8 months) or the GT genotype (50.1 months) was significantly shorter than that of the TT group (55.3 months). Moreover, both univariate and multivariate Cox proportional hazards models indicated that the CHST9 rs1436904 genetic variant was significantly associated with disease progression in TNBC patients (Table 1 and Fig. 1). After adjustment for multiple clinical factors, the CHST9 rs1436904 GG genotype was still significantly associated with disease progression compared to the TT genotype (HR = 1.70, 95% CI = 1.03-2.81, P = 0.038). Similarly, the risk of early recurrence for TNBC patients carrying the CHST9 rs1436904 G allele (GT and GG genotypes) was about 1.51-fold (95% CI = 1.03-2.22) that of TT genotype patients (P = 0.033).
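The reported hazard ratios, confidence intervals and P values can be checked for internal consistency. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which is the usual output of Cox regression software but is not stated explicitly in the text; the function name `wald_p_from_ci` is hypothetical.

```python
import math


def wald_p_from_ci(hr: float, lower: float, upper: float,
                   z_crit: float = 1.959964) -> float:
    """Approximate two-sided P value implied by an HR and its 95% CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE).
    """
    # Standard error of log(HR), recovered from the width of the interval.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # GG vs. TT: HR = 1.70, 95% CI = 1.03-2.81 (reported P = 0.038).
    print(round(wald_p_from_ci(1.70, 1.03, 2.81), 3))
    # GT/GG vs. TT: HR = 1.51, 95% CI = 1.03-2.22 (reported P = 0.033).
    print(round(wald_p_from_ci(1.51, 1.03, 2.22), 3))
```

For the GG versus TT comparison the implied value is approximately 0.038, in agreement with the reported P value. For the G allele carriers the implied value is close to, but not identical with, the reported 0.033, which is expected because the published HR and CI limits are rounded to two decimals.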
Stratified analyses of the effects of CHST9 rs1436904 on DFS of TNBC patients. The association between the CHST9 rs1436904 polymorphism and DFS of TNBC patients was further examined by stratifying for age of onset, BMI, tumor size, lymph-node metastasis, histological type, histological grade, menopausal status, vascular invasion, breast or ovarian cancer history, surgical method, taxane/anthracycline-based chemotherapy and radiotherapy, respectively (Table 2 and Supplementary Tables 3-7).
In the subgroup of TNBC patients with large tumors (tumor size > 2 cm), the CHST9 rs1436904 G allele (GT and GG genotypes) was associated with a significantly increased risk of disease progression (HR = 1.88, 95% CI = 1.06-3.35; P = 0.032) compared to the TT genotype. The mean DFS of the G allele carriers was significantly shorter than that of patients with the TT genotype (46.4 months vs. 55.9 months; P = 0.015) (Table 2 and Fig. 2). However, such differences were not observed in patients with small tumors (tumor size ≤ 2 cm), suggesting that the CHST9 rs1436904 polymorphism is an independent prognostic marker in TNBC cases with large tumors.
Among the TNBC patients without lymph-node metastasis, the mean DFS of the CHST9 rs1436904 GG genotype patients was significantly shorter than that of the TT genotype patients (48.1 months vs. 62.6 months; P = 0.033) (Table 2 and Fig. 3). However, there was no such association between the polymorphism and DFS in patients with lymph-node metastasis (mean DFS of the TT, GT and GG genotypes: 48 months, 72 months and 60 months, respectively; P = 0.620). HRs calculated from the multivariate Cox proportional hazards model showed that patients without lymph-node metastasis carrying the CHST9 rs1436904 GG or GT/GG genotype had a 2.27-fold or 2.01-fold increased risk of disease progression (P = 0.033 or 0.017, respectively) compared to the TT genotype patients (Table 2).
Table 2. DFS of TNBC associated with CHST9 rs1436904 genotypes by tumor size, lymph-node involvement, menopausal status and vascular invasion. Note: DFS, disease-free survival time; HR, hazard ratio; CI, confidence interval. Hazard ratios (HRs) and 95% confidence intervals (CIs) for the association between the SNP and disease-free survival time (DFS) were estimated by Cox regression adjusted for age of onset, BMI, tumor size, lymph-node involvement, histological type, histological grade, menopausal status, vascular invasion, breast or ovarian cancer history, surgical method, taxane/anthracycline-based chemotherapy and radiotherapy.
Scientific Reports | 7: 11802 | DOI:10.1038/s41598-017-12306-6
Among the patients who were premenopausal at diagnosis, the median DFS of either the rs1436904 GT or GG genotype patients (48.0 months or 36.0 months) was shorter than that of the rs1436904 TT genotype patients (48.0 months). However, there was no such association between the polymorphism and DFS in patients who were postmenopausal at diagnosis (Table 2 and Fig. 4). In the multivariate Cox proportional hazards model, premenopausal patients with the rs1436904 GG genotype showed a 2.18-fold increased risk of disease progression (95% CI = 1.12-4.22, P = 0.021) compared to subjects with the TT genotype (Table 2). Similar results were observed among premenopausal patients with the rs1436904 GT genotype (HR = 2.07, 95% CI = 1.18-3.65, P = 0.011) (Table 2).
In the subgroup of TNBC patients with vascular invasion, the CHST9 rs1436904 G allele was significantly associated with an increased risk of disease progression compared to the TT genotype (HR = 6.51, 95% CI = 1.89-22.36; P = 0.003). In particular, TNBC patients with the GG genotype showed a 39.37-fold increased risk of disease progression (P < 0.001). However, such differences were not observed in patients without vascular invasion (Table 2 and Fig. 5).

Discussion
The development of breast cancer is a multistep consequence of combined genetic and epigenetic changes 3,4 . About five to ten percent of breast cancer cases are believed to be hereditary and associated with certain gene mutations [20][21][22] . Although multiple breast cancer susceptibility genes have been identified, further susceptibility genes remain to be found. TNBC accounts for 12-24% of breast cancers and is associated with early recurrence and poor outcome. Additional efforts should be made to discover specific loci or genetic variants related to TNBC risk, which will expand our understanding of the etiology of this aggressive breast cancer and improve its prevention and clinical diagnosis. Genome-wide analyses using large consortia have been performed to discover and validate genetic variants associated with breast cancer risk [15][16][17][18] . Multiple novel breast cancer genetic susceptibility loci have been identified and validated with this approach 24 . Building on a breast cancer GWAS that identified the AQP4 rs527616 and CHST9 rs1436904 genetic variants, we explored their involvement in early-stage TNBC and concluded that the CHST9 rs1436904 SNP is an independent prognostic genetic variant in Chinese TNBC patients. These results may provide new targets for the prevention, diagnosis and therapy of TNBC.
In the study by Michailidou et al. 24 , the G allele (minor allele) relative to the T allele (major allele) was modestly "protective" against breast cancer (OR = 0.96, 95% CI = 0.94-0.98). In contrast, we observed that the same allele was associated with a worse prognosis of TNBC. There may be several reasons for this. First, the purpose of our study was to identify prognostic markers, whereas the GWAS aimed to identify susceptibility SNPs. Different results may therefore arise from the different study purposes. For example, ERCC1 C118T was associated with lung cancer risk: the OR was 0.90 (95% CI = 0.81-0.99, P = 0.043) in an additive genetic model (C allele vs. T allele) and 0.77 (95% CI = 0.63-0.95, P = 0.013) in a recessive genetic model (CC/CT vs. TT) 30 . However, ERCC1 C118T was reported to be a risk SNP for overall survival after platinum-based chemotherapy in Asian NSCLC patients (CT + TT versus CC: HR = 1.24, 95% CI = 1.01-1.53) 31 .
The two SNPs examined in this study are located in the AQP4 and CHST9 genes. AQP4 belongs to the aquaporin (AQP) family and functions in maintaining water and ion homeostasis 32 . It is located in the membrane and cytoplasmic fractions and is markedly decreased in tumor tissues compared to paired adjacent tissues, suggesting a role in cancer development 33 . CHST9 belongs to the N-acetylgalactosamine 4-sulfotransferase (GalNAc4ST) family, which transfers sulfate to position 4 of nonreducing terminal GalNAc residues 34 . Sulfate groups on carbohydrates play important roles in conferring highly specific functions on glycoproteins, glycolipids, and proteoglycans 35 . CHST9 may play a role in hematologic malignancies, because CHST9 copy number variants (CNVs) are associated with acute myelogenous leukemia (AML) 36 . Additionally, CHST9 CNVs and amplification have been found in the brains of schizophrenia patients and in gastric cancer patients with lymph-node metastasis 37,38 . Nevertheless, the role of CHST9 and its genetic variants in breast cancer, especially TNBC, has not been determined. Our results show that the CHST9 rs1436904 SNP significantly contributes to the risk of progression of early-stage TNBC. Notably, our results also show that the CHST9 rs1436904 G allele is a "risk" genetic variant for the outcome of TNBC patients, significantly associated with shortened DFS in TNBC patients with large tumors (> 2 cm), without lymph-node metastasis, who were premenopausal at diagnosis, or with vascular invasion.
In summary, to the best of our knowledge, our study is the first to identify an inherited variant in CHST9 that is significantly associated with DFS of TNBC patients, especially in patients with large tumors, without lymph-node metastasis, who were premenopausal at diagnosis, or with vascular invasion. Our findings might have clinical implications for the precision treatment of TNBC and may eventually affect therapeutic efficacy.
Ethical approval. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.