Introduction

In 2011, breast cancer was the most common cancer not only in Canadian women, representing 28% of all new cancers and 14.4% of cancer deaths, but also in Western countries.1, 2 In the mid-1990s, the two major genes BRCA1/2 were identified as strongly associated with breast cancer susceptibility in high-risk breast cancer families.3, 4, 5 Variations in several genes of lower penetrance/frequency, such as TP53, PTEN, ATM, CHEK2, PALB2 and BRIP1, are also associated with breast cancer risk, but together with BRCA1/2 these genes would explain only 25% of the familial breast cancer risk.6 A significant portion of the unexplained cancer predisposition could be associated, among others, with variations in BRCA1-interacting partners that reduce BRCA1 activity and thereby lead to the accumulation of mutations and the alteration of genome integrity. In addition to a key role in homologous recombination repair through its interactions with Rad51 and FANCD2,7, 8, 9 BRCA1 is also involved in the cell cycle G2/M checkpoint by acting as a co-repressor of GADD45A (growth arrest and DNA damage gene 45) transcription in association with the zinc-finger protein 350 (ZNF350).10, 11, 12

The ZNF350 protein, also known as ZBRK1 (zinc-finger and BRCA1-interacting protein with a KRAB domain 1), has been shown to regulate the expression of many genes by binding the GGGxxxCAGxxxTTT consensus sequence.12 In particular, ZNF350 represses GADD45A transcription in a BRCA1-dependent manner through a binding site in intron 3 of the GADD45A gene.12 Moreover, given that ZNF350 DNA recognition motif sequences have been found in many BRCA1-targeted genes, a common function of ZNF350 in the cellular DNA damage repair response has been suggested.12
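As an illustration of how such a consensus could be searched for in a sequence, the following minimal sketch scans both strands for GGGxxxCAGxxxTTT. It assumes that each x stands for any of the four bases; the function names and the example sequence are hypothetical and are not taken from the cited work.

```python
import re

# Consensus from the text: GGGxxxCAGxxxTTT. Assumption: x = any base (A, C, G or T).
# The lookahead lets overlapping matches be reported as well.
MOTIF = re.compile(r"(?=(GGG[ACGT]{3}CAG[ACGT]{3}TTT))")
MOTIF_LENGTH = 15
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str):
    """Return (strand, 0-based start on the given strand, matched motif) tuples."""
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in MOTIF.finditer(seq)]
    rc = reverse_complement(seq)
    for m in MOTIF.finditer(rc):
        # Convert the position on the reverse strand back to forward coordinates.
        start = len(seq) - (m.start() + MOTIF_LENGTH)
        hits.append(("-", start, m.group(1)))
    return hits


# Hypothetical test sequence containing one forward-strand site.
example = "AAGGGTCACAGATCTTTCC"
sites = find_sites(example)
```

Scanning the reverse complement matters because a binding site can lie on either strand; real predictions would normally rely on position weight matrices rather than a strict pattern.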

ZNF350 has been shown to be involved in the development of several human cancers. Under-expression of the ZNF350 gene is observed in breast and colon carcinogenesis as well as in cervical tumor cells.13, 14, 15 Moreover, the inhibition of malignant growth, invasion and metastasis in cervical cells is correlated with high levels of ZNF350 expression, suggesting a tumor suppressor role for this gene.15 This upregulation leads to the increased expression of several genes involved in gene expression, cellular growth and proliferation.15

In particular, the co-repressor complex ZNF350/BRCA1/CtIP (CtBP-interacting protein) is implicated in the repression of the high-mobility group AT-hook 2 (HMGA2) and angiopoietin-1 (ANG1) genes, which are involved in increased proliferation, mammary acini formation, anchorage-independent growth and vascular formation in breast tumors.16, 17 In addition, overexpression of the ZNF350 gene led to an increase in ataxin-2 (ATXN2) mRNA levels (spinocerebellar ataxia type 2: SCA2 gene),18 which is involved in RNA metabolism and endocytic processes.19, 20, 21, 22, 23 In breast cancer cells, ZNF350 was also identified as a transcriptional repressor of p21 when associated with the KRAB domain-associated protein 1.24, 25 Furthermore, the expression of the ZNF350 gene may be repressed through an E2F1-recognition sequence in its promoter region, which allows the binding of the RB/E2F1/CtIP/CtBP complex responsible for this repression.26 Deregulation of this repression, leading to an increase in ZNF350 levels, could result in cellular sensitivity to DNA damage and ultimately in carcinogenesis.

Based on the potential association of ZNF350 gene variations with breast cancer susceptibility described previously27 and its role in DNA repair and carcinogenesis development as well as the importance of the fine regulation of ZNF350 gene expression described above, the analysis and characterization of promoter variants regulating the expression of the ZNF350 gene became of great interest. We therefore characterized the sequence variations in the ZNF350 promoter region and evaluated their association with breast cancer risk in the French Canadian population. To our knowledge, this is the first analysis of the ZNF350 gene promoter describing the effect of genomic variants on gene expression.

Materials and methods

Ascertainment of families and genomic DNA extraction

All 96 non-BRCA1/2 individuals from high-risk French Canadian breast and ovarian cancer families (that is, families in which multiple cases of breast/ovarian cancer are present in close relatives—three cases in first- or four cases in second-degree relatives—or with strong evidence of a familial component) participating in this study were originally part of a larger interdisciplinary program termed INHERIT BRCAs.28 All participants were at least 18 years of age, mentally capable and had to sign an informed consent form. Ethics committees reviewed the research project at the participating institutions from which the patients were referred. The details regarding selection criteria of the breast cancer cases as well as the experimental and clinical procedures have been described previously.28, 29, 30 A subset of 97 high-risk French Canadian breast/ovarian cancer families was drawn from the initial study based on the absence of detectable BRCA1/2 mutation (so-called BRCAX) and constituted the cohort used for another study specifically aiming at the identification of other susceptibility loci/genes to breast cancer. One individual affected with breast cancer per family was selected for analysis, with a selection preference for the youngest subject available in the family. In all instances, diagnosis of breast cancer was confirmed by pathology reports. Lymphocytes from breast cancer individuals were isolated and immortalized as previously described.27, 31, 32, 33 Genomic DNA extraction of the 96 breast cancer cases as well as 94 healthy individuals from the same population has been performed as previously described.30 The healthy blood samples were obtained from Dr Damian Labuda at the Centre de Cancérologie Charles Bruneau, Hôpital Ste-Justine, Montreal, QC, Canada. 
The individuals who provided these samples were recruited on a non-nominative basis, in the framework of long-term studies aiming at the characterization of genetic variability in human populations, approved by the Institutional Ethics Review Board.

ZNF350 promoter sequencing, sequence analysis and variant characterization

Based on the genomic position of the human ZNF350 gene (chromosome 19 GRCh37.p2; 52490079-52467593), PCR amplification was performed on breast cancer cases and controls to amplify the 2-kb region upstream of the ZNF350 gene, using the primer pairs listed in Supplementary Table 1. Direct sequencing and sequence analysis were performed as described previously.30 The open-source toolset PLINK was used to test for deviation from Hardy–Weinberg equilibrium (HWE), and to calculate Fisher’s exact test and the odds ratio with 95% confidence interval for each variant. Potential transcription factor (TF)-binding sites were predicted using the MatInspector34 and Transcription Element Search35 software.
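For readers unfamiliar with the association statistics named above, the sketch below computes a two-sided Fisher’s exact P-value and an odds ratio with an approximate 95% confidence interval from a 2x2 table of allele counts. It is not PLINK’s implementation: the Woolf (log-based) interval is an assumption chosen for simplicity, and the counts in the usage lines are hypothetical.

```python
from math import comb, exp, log, sqrt


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P for the table [[a, b], [c, d]].

    Rows could be minor/major allele counts, columns cases/controls.
    The P-value sums all tables as likely or less likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with a Woolf 95% CI; assumes no cell is zero."""
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


# Hypothetical allele counts: minor allele in cases/controls, major allele in cases/controls.
p_value = fisher_exact_two_sided(12, 25, 180, 163)
estimate, lower, upper = odds_ratio_ci(12, 25, 180, 163)
```

An exact test is preferable to a chi-square approximation here because several variants are rare, so some cells of the table contain very small counts.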

Haplotype and linkage disequilibrium (LD) block estimation

To estimate the pattern of LD, all 12 variations identified in our breast cancer case series have been genotyped. The LDA program36 was used to calculate pairwise LD for each SNP pair. Lewontin’s |D’| measures were used to illustrate a graphical overview of LD between variants.36, 37
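The pairwise measure mentioned above can be written out in a few lines. This is a minimal sketch of the standard definition of Lewontin’s |D′| for two biallelic variants, not the LDA program itself; it assumes that haplotype frequencies have already been estimated, and the function name is hypothetical.

```python
def lewontin_d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """|D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one locus is monomorphic, so D' is undefined
    return abs(d) / d_max


# Hypothetical frequencies: the rarer allele always occurs on the same background.
complete_ld = lewontin_d_prime(p_ab=0.10, p_a=0.10, p_b=0.20)
```

Because D is scaled by its maximum possible value given the allele frequencies, |D′| can stay close to 1 even for rare variants, which should be kept in mind when reading LD plots for low-frequency alleles.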

LD block identification was performed with the variants having a minor allele frequency (MAF) >2% using the Haploview software38 based on the confidence interval algorithm. Tagging SNPs (tSNPs) from each LD block were then identified using the same software. The Caucasian HapMap data from the CEPH/CEU cohort were used for comparison with the French Canadian population.

Haplotype analysis was performed using PHASE 2.1.1 software.39 The PHASE program estimates haplotype frequencies with a Bayesian-based algorithm and then uses a permutation test to determine the significance of differences in inferred haplotypes between both sample sets. All association tests were run under default conditions with 100 000 permutations. Haplotype frequencies were estimated using the promoter and gene variants having a MAF >2% (in at least one series).
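The logic of a case-control permutation test can be sketched as follows. This is not the PHASE procedure: it assumes haplotypes are already known rather than inferred, swaps in a simple sum of squared frequency differences as the test statistic, and shuffles individual haplotypes rather than individuals. All names and data are hypothetical.

```python
import random
from collections import Counter


def freq_diff_stat(cases, controls) -> float:
    """Sum of squared differences in haplotype frequencies between two groups."""
    case_counts, control_counts = Counter(cases), Counter(controls)
    haplotypes = set(case_counts) | set(control_counts)
    return sum(
        (case_counts[h] / len(cases) - control_counts[h] / len(controls)) ** 2
        for h in haplotypes
    )


def permutation_p(cases, controls, n_perm: int = 10000, seed: int = 0) -> float:
    """Empirical P-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = freq_diff_stat(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = freq_diff_stat(pooled[:n_cases], pooled[n_cases:])
        if permuted >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimate away from an impossible P = 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical haplotype labels for two small groups.
cases = ["H8"] * 40 + ["H12"] * 8 + ["H4"] * 2
controls = ["H8"] * 42 + ["H12"] * 5 + ["H4"] * 3
p_value = permutation_p(cases, controls, n_perm=2000)
```

The number of permutations bounds the smallest P-value that can be reported, which is one reason a large number such as 100 000 is used in practice.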

Luciferase promoter assays

A 2410-bp fragment of the human ZNF350 promoter region including the untranslated exon 1 was PCR-amplified using genomic DNA from breast cancer individuals carrying the haplotypes H4, H8, H10 and H12 using primers introducing a XhoI or HindIII restriction site. The primers used for PCR were: 5′-GACGACCTCGAGGAGAAGCCCGAGCTAGGAAG-3′ (XhoI) and 5′-GACGACAAGCTTGGCCGTTGATCACTACAGACCC-3′ (HindIII). PCR products were then digested and introduced into the pGL3-basic vector (Promega Corporation, Madison, WI, USA). Single-promoter variant haplotypes (p1, p4, p5, p7, p11 or p12) were generated using the wild-type ZNF350 promoter haplotype H8/luciferase reporter construct as a template and the Quickchange II site-directed mutagenesis kit (Stratagene, La Jolla, CA, USA) according to the manufacturer’s protocol. Following transformation and plasmid extraction, the integrity of the plasmid constructs was confirmed by sequencing. Transient transfection in ZR-75-1 and MCF-7 cells and Dual-Luciferase Reporter assays were performed in five replicates. The human breast adenocarcinoma cell line MCF-7 was grown in DMEM/F12 (Wisent, St-Bruno, QC, Canada) supplemented with 5% FBS, 1% penicillin-streptomycin and 10 nM E2 to enhance cell growth. The human breast adenocarcinoma cell line ZR-75-1 was maintained in RPMI1640 (Wisent) supplemented with 10% FBS, 1% penicillin-streptomycin and 10 nM E2 to enhance cell growth. Cells were seeded in 24-well culture plates at a density of 50–70% and incubated overnight. Using ExGen500 according to the manufacturer’s protocol (Fermentas Canada Inc., Burlington, ON, Canada), each well was transfected with 800 ng of pGL3-promoter haplotype-specific construct (or the empty pGL3 vector) encoding a modified firefly luciferase gene and co-transfected with 200 ng of pRL-null vector (Promega) encoding the renilla luciferase gene as an internal standard. The pGL3-basic vector and pGL3-SV40 control vector were used as negative and positive controls, respectively.
Following a 24-h incubation, cells were assayed for the luciferase reporter gene activities measured with the Dual-Luciferase Reporter Assay System according to the manufacturer’s instructions (Promega) in a MicroLumat Plus luminometer (EG&G Berthold, Bad Wildbad, Germany). Promoter activities were expressed as a ratio of firefly luciferase to renilla luciferase luminescence in each well. The empty pGL3-basic vector was used to measure basal expression levels in each cell line.
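The normalization described above can be summarized in a short sketch: each well’s firefly signal is divided by its renilla signal, and construct activity is then expressed relative to the mean ratio of the empty vector. The function names and luminescence readings below are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean, stdev


def normalized_ratios(firefly, renilla):
    """Per-well firefly/renilla ratio, correcting for transfection efficiency."""
    return [f / r for f, r in zip(firefly, renilla)]


def fold_over_basal(construct_ratios, basal_ratios):
    """Mean and sample s.d. of construct activity relative to the empty vector."""
    basal = mean(basal_ratios)
    folds = [ratio / basal for ratio in construct_ratios]
    return mean(folds), stdev(folds)


# Hypothetical luminescence readings for five replicate wells each.
empty_vector = normalized_ratios([110, 95, 102, 99, 104], [1000, 980, 1010, 990, 1005])
construct = normalized_ratios([520, 480, 505, 530, 495], [1020, 970, 1000, 1015, 985])
mean_fold, sd_fold = fold_over_basal(construct, empty_vector)
```

Dividing by the renilla signal within each well, before averaging, removes well-to-well differences in transfection efficiency that would otherwise inflate the variance between replicates.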

Results

Direct sequencing of the ZNF350 promoter region in 96 BRCA1/2-negative breast cancer subjects from high-risk French Canadian breast/ovarian cancer families and 94 healthy controls led to the identification of 12 variants. Among the promoter genomic variants identified, five, namely c.-1775 T>A, c.-1769 T>A, c.-895delATCA, c.-873C>T and c.-856insAA, are novel variations not reported in databases. As shown in Table 1, five of the 12 variants display a MAF lower than 2% in either of the series. The variants c.-1775 T>A, c.-1769 T>A, c.-872 G>C and c.-856insAA are found exclusively in the control group, whereas c.-895delATCA is observed in one breast cancer case in the heterozygous state. Only the c.-874 G>A variant displays a significant deviation from HWE, due to an excess of homozygotes (HWE P=0.032) among the healthy individuals.
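To make the HWE comparison concrete, the sketch below tests genotype counts against Hardy–Weinberg expectations using a one-degree-of-freedom chi-square test and reports the inbreeding coefficient F, which is positive when homozygotes are in excess. The chi-square approximation is a simpler stand-in chosen for this illustration and may not match PLINK’s output; the counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Return (chi2, P, F) for genotype counts AA, AB, BB.

    Assumes both alleles are present (a polymorphic variant).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of a chi-square with 1 df
    f = 1 - n_ab / expected[1]      # F > 0 indicates a deficit of heterozygotes
    return chi2, p_value, f


# Hypothetical genotype counts with more homozygotes than expected.
chi2, p_value, f = hwe_chi_square(40, 20, 40)
```

With small samples or rare alleles the expected homozygote count becomes very small and the chi-square approximation is unreliable, which is why exact HWE tests are generally preferred.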

Table 1 Sequence variations in ZNF350 gene and genotype frequencies in familial breast cancer cases and controls

To combine genotype data from the promoter region with those from the ZNF350 gene published by Desjardins et al.27 for LD and haplotype analyses, a subset of 67 healthy controls genotyped in both the previous and the current study was selected. In addition, the ZNF350 gene variants from Desjardins et al.27 having a MAF >2% have also been included in Table 1 as reference for LD and haplotype analyses (denoted g2 to g17). Considering all sequence variations, the c.425 T>C, c.466+46 A>T, c.936C>T and c.1900C>T variants located in the gene region displayed a modest but significant protective effect against breast cancer, whereas none of the promoter variants showed any significant difference in MAF between the two series.

A graphical representation of the pairwise LD between the 18 ZNF350 promoter and gene variants having a MAF >2% in at least one series, as measured by Lewontin’s |D′| values, is shown in Supplementary Figure 1. As demonstrated, the majority of the genomic variations are in strong LD with each other. Although the ZNF350 gene is contained within a relatively short genomic region, strong LD was found between the two most distantly separated promoter and gene variants (p1 and g17: inter-marker distance 25 kb, |D′|= 0.95), which suggested that LD at the ZNF350 locus did not decrease significantly with distance. However, a few intragenic variants showed lower LD with other variations, namely g10 and g11, located in the coding region of the gene, whereas a clear breakage of LD seemed to occur between the promoter and gene regions, except for g2, which demonstrated strong LD with all other promoter variants.

PHASE analyses identified 13 different haplotypes in the promoter region of ZNF350 in breast cancer and control individuals (Table 2a). According to PHASE, the promoter haplotype H8 was the major haplotype with a frequency of 80.6% in both series combined. The four most frequent haplotypes (H4, H8, H11 and H12) represent 91.8% of all haplotypes identified. The H1, H3 and H10 haplotypes were found exclusively in breast cancer cases, whereas four haplotypes were unique to the control group (H2, H6, H9 and H13). No significant difference in global haplotype frequencies was identified between the two series (P=0.619). However, PHASE analyses performed with both series and including the variants from both the promoter and the gene regions (with MAF >2%) revealed a highly significant difference, with a P-value of 0.00092 (Table 2b). The haplotypes Hpg1 and Hpg33 were significantly over-represented in breast cancer cases, with P-values of 0.036 and 0.011, respectively.

Table 2 (a) Estimated haplotype frequencies of ZNF350 gene using promoter variants having a frequency>2% in the breast cancer case and control seriesa. (b) Frequencies of estimated haplotypes of ZNF350 gene using promoter and gene variants (Hpg) having a frequency>2% in the breast cancer case and control seriesb

The identification of tSNPs was then carried out in two successive steps, first by determining haplotype blocks and then by identifying tSNPs in each LD block. Based on the algorithm from Gabriel et al.,40 three LD blocks encompassing the ZNF350 gene have been identified in the French Canadians by the Haploview software (expectation maximization algorithm) (Figure 1). To confirm the reliability of our data, HapMap data (from the Caucasian population) were also analyzed and, although a different panel of SNPs was used, three LD blocks were also identified. The composition and regions of recombination of the three blocks were relatively similar between the French Canadian and CEPH/CEU data sets. The promoter region is included in the first LD block, whereas the gene region is divided into two LD blocks. Thereafter, considering haplotypes having a frequency >1%, seven tSNPs were identified in the three LD blocks, namely variants p4, p5 and p11 found in block 1, g2 and g4 in block 2, whereas block 3 consists of variants g10 and g11.
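The idea behind block-by-block tagging can be illustrated with a small brute-force search for the fewest variants that still distinguish every common haplotype within a block. This sketch is not Haploview’s algorithm; the haplotype strings are hypothetical, and the exhaustive search is only practical for the handful of variants found in a single block.

```python
from itertools import combinations


def minimal_tag_snps(haplotypes):
    """Return indices of a smallest SNP subset that separates all haplotypes.

    haplotypes: dict mapping a haplotype name to a string of alleles,
    one character per SNP, all strings of equal length.
    """
    sequences = list(haplotypes.values())
    n_distinct = len(set(sequences))
    n_snps = len(sequences[0])
    for size in range(1, n_snps + 1):
        for subset in combinations(range(n_snps), size):
            # Project each haplotype onto the candidate SNP subset.
            projected = {tuple(seq[i] for i in subset) for seq in sequences}
            if len(projected) == n_distinct:
                return subset
    return tuple(range(n_snps))


# Hypothetical block with three SNPs and three common haplotypes.
block = {"hapA": "AAC", "hapB": "AGC", "hapC": "GGT"}
tags = minimal_tag_snps(block)
```

In this toy block no single SNP separates all three haplotypes, so two tags are needed; restricting the input to haplotypes above a frequency threshold is what keeps the number of tags small.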

Figure 1

Haplotype blocks and tSNPs identified in the ZNF350 gene. (a) Predicted haplotype blocks using promoter (p) and genomic (g) variants identified in the case series showing a MAF higher than 2% (18 variants) in the French Canadian population. (b) Predicted haplotype blocks using HapMap data from the CEPH/CEU cohort. The distances between the variants are similar to those in (a). tSNPs identified on a block-by-block basis are denoted with an asterisk (*) above the variation number and have been selected based on haplotypes showing a frequency higher than 1%. Population haplotype frequencies are displayed on the right of each haplotype combination, while the level of recombination is displayed above the connections between two blocks. Thick connections represent haplotypes with frequencies higher than 10%, whereas frequencies below 10% are represented by thin lines.

As shown in Supplementary Table 2, in silico analysis using MatInspector and Transcription Element Search software indicated that the variant c.-1779 T led to the creation of a new binding site for two octamer-binding proteins (POU5F1, POU3F3), an autoimmune regulatory element binding factor (AIRE) as well as neurofibromin 1 (NF-1). Several new binding sites are generated by the variant c.-1171C, namely for SOX-9, CART-1, Delta factor, GATA-1 and NF-E, whereas binding element sequences for the heat shock factor HSF2 and the zinc-finger transcriptional repressor ZNF217 are abolished. As for the variant c.-922C, binding sites for the glucocorticoid receptor and GKLF are created. Interestingly, the c.-874A variant abolished binding sites for TFs involved in transcription such as E-box binding factors and RNA polymerase II TFIIB and created new binding elements for c-myc myelocytomatosis viral oncogene homolog (c-Myc) and myogenin. In particular, the c.-201 G variant generated a binding site for the MAF and AP1 related factors (AP1R). Finally, c.-85 G resulted in new binding sites for the cellular and viral myb-like transcriptional regulators (MYBL) and abolished an Sp1 site.

To analyze the effect of promoter variants on ZNF350 transcription, the four promoter haplotypes showing the highest frequency, namely H4, H8, H11 and H12, were analyzed using luciferase assays in the ZR-75-1 and MCF-7 breast cancer cell lines. The haplotypes that were present exclusively in one or two breast cancer cases (H1, H3 and H10) were not used in the analysis because they are under-represented in the analyzed population (Table 2a). To discriminate the individual effect of each variant on transcription, single-variant haplotypes were generated by directed mutagenesis. The most common haplotype, H8, was used as reference for statistical analysis. H4 did not induce any significant difference in transcriptional activity, whereas the haplotypes H11 and H12 significantly increased the expression of the luciferase gene in both cell lines (Figure 2). Compared with the common haplotype H8, the only difference in haplotype H11, which showed a significantly higher luciferase activity (by more than 2.5-fold on its own), is the presence of the c.-874 G>A variant (p7).

Figure 2

Luciferase assays. Effect of multiple promoter variants on ZNF350 gene promoter activity using luciferase reporter assay. ZR-75-1 cells were transiently co-transfected with the Renilla reporter plasmid (pRL) as a transfection control. Each data point represents the mean±s.d. of five replicates. Data are shown as relative induction compared with the activity of cells transfected with the empty pGL3-basic luciferase reporter vector. (**P<0.01).

Haplotype H12 is composed of multiple variants not found in H8, namely p1, p4, p5, p8 and p12, and also showed a significant increase in luciferase expression. In addition, each single variant on its own also led to a significant increase in luciferase activity.

Discussion

The vast majority of genes identified so far and showing a high/moderate penetrance in breast cancer susceptibility are directly involved in DNA repair mechanisms and cell cycle control (BRCA1, BRCA2, RAD51C, PALB2, BRIP1, TP53, PTEN, ATM and CHEK2).41, 42, 43, 44, 45 Therefore, genes encoding proteins involved in these mechanisms and/or interacting with BRCA1 or BRCA2, such as ZNF350, represent excellent candidates to be studied regarding their potential implication in breast cancer predisposition. The 96 non-BRCA1/2 breast cancer cases included in our cohort have been selected based on their strong family history and come from 96 high-risk French Canadian breast and ovarian cancer families displaying multiple individuals affected with breast cancer. This study design has been demonstrated to substantially decrease the number of cases and controls needed to achieve the same magnitude of power compared with studies based solely on breast cancer cases unselected for family history.46

In a previous analysis, we identified a potential association of genomic variations located in the 5′-part of the ZNF350 gene with breast cancer predisposition.27 However, as described in the previous section, further analyses of the variants located within the promoter region (2 kb) revealed no significant association of these variants with breast cancer susceptibility based on their MAF observed in breast cancer and control individuals. Moreover, haplotype analyses using exclusively the promoter variants support this observation. Nonetheless, when using a combination of promoter and gene variants for haplotype prediction, the analyses revealed a potentially significant over-representation of Hpg1 and Hpg33 in breast cancer cases. Hpg1 is considered the common allele, whereas Hpg33 is characterized particularly by the presence of several nucleotide changes, such as the p12 (c.-85C>G) and g17 (c.1900C>T) variants (Table 1 and Table 2b). Of interest, p12 is the promoter variant closest to the 5′-UTR region of the ZNF350 gene, which supports the association described previously.27

As seen in Figure 1, determination of the haplotype blocks including promoter and gene variants clearly identified a strong LD breakage between the promoter and gene regions. This LD breakage could explain the absence of breast cancer association for the promoter variants in contrast to previous results regarding the involvement of gene variants located in the 5′ portion of the ZNF350 gene.27 Moreover, the g2 and g4 variants have been identified as tSNPs by the Haploview program. Although the D′ value observed between p12 and g2 was high (Supplementary Figure 1) and did not confirm the first LD breakage observed between the promoter and gene regions as illustrated in Figure 1, the second LD breakage predicted in the vicinity of the g8 and g10 variants is confirmed by the low D′ values associated with both variants.

Considering that the expression of the ZNF350 gene is crucial for cell cycle control and that this expression has been reported to be altered in cancer, we evaluated the impact of promoter variations on gene expression using luciferase assays. Given that variants p4, p5, p8, p11 and p12 are included in H4 and that this haplotype did not trigger any significant modulation of transcriptional activity, we can conclude that p1 and p7 are likely the variants responsible for the upregulation of luciferase expression. As illustrated in Figure 2, each single variant studied (p1, p4, p5, p7, p8, p11 and p12) possesses the capacity on its own to increase the transcriptional activity of the ZNF350 gene promoter, with the c.-874 G>A variant producing the highest increase in expression. However, one has to keep in mind that a complex combination of variants is likely involved in the specific expression associated with each observed haplotype.

Along the same lines, regarding the TFs potentially involved in the modulation of transcriptional activity related to the presence of the p1 (c.-1779C>T) and p7 (c.-874 G>A) variants specifically, the c.-1779 T variation leads to the creation of a new binding site for POU5F1 (OCT3/4), whereas the c.-874A variation abolishes binding elements for MYC-MAX and TFIIB and creates a new binding sequence for c-Myc. The octamer-binding protein POU5F1 is known to control pluripotency of embryonic stem cells and is required for the initial formation of a pluripotent founder cell population in the mammalian embryo.47 POU5F1 is a member of the POU family of transcriptional activators, which control the expression of their target genes through binding of an AGTCAAAT consensus motif sequence.48, 49 Of interest, POU5F1 has been shown to be expressed exclusively in human breast cancer cells, being not detected in normal breast tissue. In addition, the overexpression of this TF has been demonstrated to induce the expression of the endogenous fibroblast growth factor-4 gene in human breast cancer cells.50 The potential activation of ZNF350 gene expression triggered by the binding of the POU5F1 protein to its promoter is in accordance with the predicted cell proliferation following POU5F1 binding, combined with the increased repression (caused by the increased expression of ZNF350) of GADD45A, which is a growth arrest-associated gene.12

As for TFIIB, it is involved in start site selection, promoter binding and promoter bending during initiation. This protein is a component of the set of basal TFs required to allow specific binding of the RNA polymerase II protein on promoter sequences,51, 52 whereas the MYC-MAX dimer has been demonstrated to activate transcription of reporter genes in an E-box-dependent manner.53, 54, 55 Despite the suppression of binding sites for TFs known to activate gene transcription such as MYC-MAX and TFIIB, it seems that the creation of a c-Myc-binding site by the c.-874 G>A variant could compensate for the disruption of the potential promoter-binding elements of MYC-MAX and TFIIB. Indeed, in addition to its heterodimerization with MAX, the MYC protein could also form homodimers to bind DNA.56, 57 Moreover, MYC functions independent of MAX, such as the regulation of Pol III, have recently been demonstrated in a Drosophila model.58 Altogether, this supports the potential action of the MYC protein (without MAX) in the regulation of the transcriptional activity of the ZNF350 gene. Hence the creation of a new binding site for the c-Myc protein could be responsible for the upregulation of transcriptional activity observed in the presence of the c.-874A variation.

This study represents the first description of the influence of genomic variants at the promoter level of the ZNF350 gene, and information regarding the characterization of the ZNF350 gene in relation to breast cancer is still very limited. Low levels of ZNF350 have been observed in tumor tissue, but on the other hand an increase in ZNF350 expression is associated, together with BRCA1, with a repression of GADD45A, and could subsequently lead to an increase in DNA damage and carcinogenesis because low expression of GADD45A could not induce cell cycle arrest. Taking this into account, it is tempting to speculate that the increase in ZNF350 expression triggered by the promoter sequence variations described herein could be involved in tumorigenesis initiation rather than in tumorigenesis development. However, this would have to be further tested in additional studies.