Introduction

In developed countries, breast cancer incidence is estimated as 7.8% at 80 years of age (Collaborative Group on Hormonal Factors in Breast Cancer 2001). Mutations in the two most important breast cancer susceptibility genes, BRCA1 and BRCA2, account for only 25% of families with hereditary breast cancer (Easton 1999; Ponder 2001). A lower proportion of hereditary breast cancer is explained by mutations in TP53 (Lalloo et al. 2003; Gasco et al. 2003), PTEN (Ueda et al. 1998; Ball et al. 2001; Guenard et al. 2007), STK11 (Giardiello et al. 2000), ATM (Stankovic et al. 1998; Chenevix-Trench et al. 2002; Szabo et al. 2004; Renwick et al. 2006), and CHEK2 (CHEK2 Breast Cancer Case-Control Consortium 2004) genes, and more recently, specific FANC genes have been associated with breast cancer susceptibility, namely, FANCD1 subsequently identified as BRCA2, FANCN/PALB2, and FANCJ/BRIP1 (Durocher et al. 2005; Seal et al. 2006; Rahman et al. 2007; Erkko et al. 2007; Tischkowitz et al. 2007).

FANCJ, also called BRIP1/BACH1, has been identified as deficient in Fanconi anemia (FA) complementation group J (Levitus et al. 2005). Many of the proteins encoded by FA genes, including the more recently identified FANCI (Smogorzewska et al. 2007), interact directly with each other to form a multisubunit nuclear complex and are involved in DNA repair processes (Medhurst et al. 2001). An increased risk of developing cancer has been reported for homozygous FA patients (Alter et al. 2003).

The BRIP1/BACH1/FANCJ gene is located at 17q22–q24 and encodes a helicase whose C-terminal part is known to interact with BRCA1 via its BRCA1 C-terminal (BRCT) repeats (Cantor et al. 2001). The corresponding protein contributes to BRCA1-associated DNA repair function, and mutation of an essential catalytic residue (Pro47) found in most members of the DEAH family interferes with normal double-strand break (DSB) repair in a manner dependent on FANCJ binding to BRCA1 (Cantor et al. 2001). In addition, phosphorylation of the Ser990 residue triggers FANCJ interaction with BRCT domains of BRCA1, which is required for establishment of the G2 cell-cycle checkpoint response to DNA damage (Yu et al. 2003). Proper localization of FANCJ to the nucleus is impaired by the c.517C > T variant (p.Arg173Cys) (Lei et al. 2003), which has been suggested to have an effect on breast cancer susceptibility. Furthermore, enzymatic activity of FANCJ was found to be defective in two patients with germline FANCJ coding sequence mutations who experienced early onset breast cancer (Cantor et al. 2004). Germline mutations affecting domain activity or messenger ribonucleic acid (mRNA) expression were also identified in early onset breast cancer patients, suggesting an implication for FANCJ in breast cancer susceptibility (Cantor et al. 2001, 2004). The close relationship between BRCA1 and FANCJ, together with the observation that other helicases belonging to the RecQ family are implicated in cancer-predisposing syndromes, namely, the Bloom (BLM gene), the Werner (WRN gene), and the Rothmund-Thomson syndromes (RecQL4 gene) (Ellis et al. 1995; Yu et al. 1996; Kitao et al. 1998, 1999), reinforces the rationale for a potential implication of FANCJ in breast cancer predisposition.

In recent years, a number of studies have been conducted to evaluate the possible implication of the FANCJ gene with regard to breast cancer susceptibility (Rutter et al. 2003; Sigurdson et al. 2004; Lewis et al. 2005; Garcia-Closas et al. 2006; Vahteristo et al. 2006; Seal et al. 2006; Song et al. 2007; Frank et al. 2007). However, the great majority were performed either on specific variants (such as P919S and −64G > A) (Sigurdson et al. 2004; Frank et al. 2007), on breast cancer cases not selected on the basis of a family history (Garcia-Closas et al. 2006; Song et al. 2007), or on a limited number of cases from BRCA1/2-negative breast cancer families (Rutter et al. 2003; Vahteristo et al. 2006). Recently published results have shown that inactivating truncating mutations of FANCJ confer susceptibility to breast cancer in high-risk non-BRCA1/BRCA2 families (Seal et al. 2006).

In this regard, given that deleterious mutations in BRCA1/BRCA2 were identified in only 24% of 256 high-risk French Canadian breast/ovarian cancer families (Simard et al. 2007), and that germline mutations in other genes are rare and do not seem to contribute substantially to breast cancer susceptibility in high-risk French Canadian breast cancer families (Durocher et al. 2006, 2007; Guenard et al. 2007; Desjardins et al. 2008), evaluation of the FANCJ contribution to breast cancer in these families is imperative. Therefore, the complete coding, promoter, and flanking intronic regions of FANCJ were analyzed in 96 breast cancer cases drawn from nonrelated BRCA1/BRCA2-negative French Canadian families with a high risk of breast cancer.

Materials and methods

Ascertainment of families

Recruitment of high-risk French Canadian breast and/or ovarian cancer families from Canada was part of a large ongoing interdisciplinary research program, the Interdisciplinary Health Research International Team on Breast Cancer (INHERIT BRCAs). More details regarding ascertainment criteria, experimental procedures, and the INHERIT BRCAs research program are described elsewhere (Vezina et al. 2005; Antoniou et al. 2006; Simard et al. 2007). Briefly, all patients were referred by physicians from Québec province, who were also responsible for disclosing BRCA1/2 test results to participants. Ethics committees approved the study, and patients signed informed consent. Patients were screened for mutations and large genomic rearrangements of the BRCA1 and BRCA2 genes (Moisan et al. 2006). Subsequently, another component was designed for the “localization and identification of new breast cancer susceptibility loci/genes.” Ethics approval for this latter study was also obtained from the different institutions participating in this research project, and each participant aware of their inconclusive BRCA1/2 test result had to sign a specific informed consent for participation in this component. All participants had to be at least 18 years of age and mentally capable. A subset of 96 high-risk French Canadian breast/ovarian cancer families was recruited for this study according to the ascertainment criteria described previously (Durocher et al. 2006, 2007; Guenard et al. 2007; Desjardins et al. 2008). The mean age at diagnosis of these 96 individuals affected with breast cancer was 48.5 years (range 32–74). Genomic DNA extraction of the 96 French Canadian breast cancer cases as well as 73 healthy unrelated French Canadian individuals was performed as previously described (Durocher et al. 2006, 2007; Guenard et al. 2007; Desjardins et al. 2008). These control samples were recruited on a nonnominative basis as part of a long-term study aimed at characterizing genetic variability in human populations, approved by the Institutional Ethics Review Board.

DNA/RNA isolation from immortalized cell lines and cDNA synthesis

Lymphocytes were isolated and immortalized from blood samples using the Epstein-Barr virus (EBV) in 15% Roswell Park Memorial Institute (RPMI) media, as previously described (Boukamp et al. 1990, 1998). Total RNA was extracted from EBV-transformed B-lymphoblastoid cell lines using TRI REAGENT® (Molecular Research Center Inc., Cincinnati, OH, USA) according to the manufacturer’s instructions. Following RNA extraction, reverse transcription of 5 μg of RNA was performed as previously described (Durocher et al. 2006). Genomic DNA from 96 high-risk French Canadian breast cancer individuals and 73 controls was extracted from peripheral blood using standard methods, either phenol–chloroform, Gentra kits (Minneapolis, MN, USA), or the QIAamp DNA Blood MAXI Kit (QIAGEN Inc., Mississauga, ON, Canada) following supplier instructions. Genomic DNA from immortalized cell lines was extracted using the cetyl trimethylammonium bromide (CTAB) method (Sambrook et al. 2001).

Polymerase chain reaction amplification, sequence analysis, and variant characterization

Amplification and direct sequencing of complementary DNA (cDNA) and genomic DNA of the FANCJ gene were conducted using primers selected with the Primer Express 2.0 software (Applied Biosystems, Foster City, CA, USA). These primers, listed in Supplemental Table 1, were chosen to amplify exonic, promoter, and flanking intronic regions of the FANCJ gene. A fluorescence-based direct sequencing method using BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) was performed to analyze both cDNA and genomic sequences. Sequence reactions were run on an Applied Biosystems 3730xl DNA analyzer, and sequence data were analyzed using the Staden Pregap4 and Gap4 programs (Bonfield et al. 1998). Each single nucleotide polymorphism (SNP) was tested for departure from Hardy–Weinberg equilibrium (HWE) by means of a χ² test. All P values were two-sided, with 1 degree of freedom. Allelic distribution in both series was tested using a χ² test. P values <0.05 were considered significant.
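The HWE test described above can be illustrated with a minimal sketch. It assumes a biallelic SNP with known genotype counts; the function name `hwe_chi_square` and the example counts are hypothetical and are not taken from the study data.

```python
import math

def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test (1 df) for departure from Hardy-Weinberg equilibrium.

    Arguments are the observed counts of the three genotypes of a biallelic SNP.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic site).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts for 96 individuals (illustration only).
chi2, p_value = hwe_chi_square(60, 30, 6)
```

No continuity correction is applied here; whether the original analysis used one is not stated in the text.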

Computational analysis

To identify potential effects of variants located in the promoter region, a transcription factor search was performed using the program MatInspector from the GenomatixSuite (Quandt et al. 1995; Cartharius et al. 2005). The Exonic Splicing Enhancer (ESE) finder Web-based program (Cartegni et al. 2003) was used to determine the potential effect of variants found in the coding region of the FANCJ gene on pre-mRNA splicing. All tests were run under default threshold values. The NNSPLICE 0.9 program (Reese et al. 1997) was used to evaluate the potential effect of intronic variants identified in this study on the strength of potential splice sites leading to the splicing of pre-mRNA, whereas the SIFT and PolyPhen Web-based softwares were used to predict the effect of amino-acid substitution on protein structure.

Reporter gene assay

FANCJ promoter fragments of either 1,068 bp (long) or 633 bp (short) were amplified from genomic DNA with primers introducing 5′ KpnI and 3′ XhoI sites (Supplemental Table 2), and these amplicons were thereafter subcloned into pGL3 basic vector (Promega, Madison, WI). Variant constructs were produced by site-directed mutagenesis (QuikChange Site-Directed Mutagenesis Kit; Stratagene, La Jolla, CA) or obtained by polymerase chain reaction (PCR) amplification and subcloning. Transient transfection assays of MCF-7 cells and dual-luciferase reporter assays were conducted as previously described (Duguay et al. 2004). Transfections were performed in triplicate, and cells were harvested 48 h after transfection. Reporter-gene activity was corrected both by subtracting background signal and by normalizing firefly activity against renilla activity obtained from the same sample. Results are expressed as mean ± standard deviation (SD) for each triplicate, and luciferase activity was compared between the variant and its respective wild-type (WT) construct, with a two-tailed unpaired Student t test. P values < 0.05 were considered significant.
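The normalization and comparison steps can be sketched as follows. This is an illustration under stated assumptions: background is subtracted from both the firefly and renilla readings (the text does not say which channel was corrected), the t test uses pooled variance, and all readings and names (`normalized_activity`, `student_t`) are hypothetical.

```python
import math
import statistics

# Two-tailed 5% critical value of Student's t for 4 degrees of freedom
# (two triplicates: 3 + 3 - 2 = 4).
T_CRIT_DF4 = 2.776

def normalized_activity(firefly, renilla, firefly_bg=0.0, renilla_bg=0.0):
    """Background-subtracted firefly/renilla ratio for each replicate."""
    return [(f - firefly_bg) / (r - renilla_bg) for f, r in zip(firefly, renilla)]

def student_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical raw luminescence readings (illustration only).
wt = normalized_activity([1000, 1050, 950], [100, 100, 100])
variant = normalized_activity([850, 880, 820], [100, 100, 100])
t_stat, df = student_t(wt, variant)
significant = df == 4 and abs(t_stat) > T_CRIT_DF4
summary = (statistics.mean(variant), statistics.stdev(variant))  # mean and SD
```

Comparing |t| with a tabulated critical value avoids a dependency on a statistics library; an exact P value would require the t distribution's cumulative function.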

Linkage disequilibrium and haplotype analyses

The linkage disequilibrium analysis (LDA) program (Ding et al. 2003) was used to calculate pairwise linkage disequilibrium (LD; Lewontin’s |D′| and r² values) for each SNP pair identified in our breast cancer series (Lewontin 1964; Devlin and Risch 1995). The PHASE 2.1.1 software (Stephens et al. 2001; Stephens and Donnelly 2003) was used to estimate haplotype frequencies. SNPs having a minor allele frequency (MAF) ≥5% identified in both sample sets were used to perform this haplotype analysis. The algorithm was run five times, with a minimum of 100 permutations, under default conditions. Genotyping data from both series and from the Centre d'Etude du Polymorphisme Humain (CEPH) cohort (HapMap database) were used to identify LD blocks within the FANCJ gene using the Haploview software. Identification of tagging SNPs (tSNPs) within each LD block was conducted using the same software.
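For reference, the two LD measures can be computed from haplotype and allele frequencies as in the sketch below. It assumes the two-locus haplotype frequency is already known, whereas the LDA program estimates it from unphased genotypes; the function name `pairwise_ld` and the example frequencies are hypothetical.

```python
def pairwise_ld(p_ab: float, p_a: float, p_b: float):
    """Return (D, |D'|, r2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies (Lewontin 1964).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Hypothetical pair: a rare allele (2%) found only on the background of a
# common allele (40%) at the second SNP.
d, d_prime, r2 = pairwise_ld(p_ab=0.02, p_a=0.02, p_b=0.40)
```

In this hypothetical pair |D′| is 1 while r² is only about 0.03, which illustrates why r² is sensitive to differences in allele frequency between the two SNPs.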

Electronic databases

ESE finder: http://rulai.cshl.edu/cgi-bin/tools/ESE3/esefinder.cgi?process=home

Haploview: http://www.broad.mit.edu/mpg/haploview

HapMap: http://www.hapmap.org

Matinspector: http://www.genomatix.de/matinspector.html

NCBI: http://www.ncbi.nlm.nih.gov

PHASE: http://www.stat.washington.edu/stephens/software.html

PolyPhen: http://genetics.bwh.harvard.edu/pph/

SIFT: http://blocks.fhcrc.org/sift/SIFT.html

Splice Site Prediction Program using Neural Networks (NNSPLICE; SSPNN): http://www.fruitfly.org/seq_tools/splice.html

TFSEARCH: http://www.cbrc.jp/research/db/TFSEARCH.html

Transcription Element Search System (TESS): http://www.cbil.upenn.edu/tess

UCSC Genome Bioinformatics: http://genome.ucsc.edu/

Results

FANCJ sequence analysis

Analysis of the promoter, whole-exonic, and intronic flanking sequences of the FANCJ gene in our 96 breast cancer individuals led to the identification of 42 variants, of which 17 displayed an MAF ≥5% (Table 1). These sequence variations were composed of 37 nucleotide substitutions, three deletions, and two insertions. Among these 42 variants, 13 were exonic, five were located in the promoter region, and 24 were intronic; 22 were novel, with the remaining 20 being reported in the SNP database (dbSNP build 127). The five promoter variants, four of which are not reported in databases, were identified within the approximately 1,400 bp analyzed upstream of exon 1.
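The MAF threshold used to separate common from rare variants can be made explicit with a short sketch. The genotype counts below are hypothetical and the function names are illustrative only.

```python
def minor_allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Minor allele frequency from genotype counts of a biallelic variant."""
    chromosomes = 2 * (n_hom_ref + n_het + n_hom_alt)
    if chromosomes == 0:
        raise ValueError("at least one genotype is required")
    alt_freq = (n_het + 2 * n_hom_alt) / chromosomes
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)

def is_common(maf: float, threshold: float = 0.05) -> bool:
    """True when the variant meets the MAF >= 5% criterion."""
    return maf >= threshold

# Hypothetical counts in 96 individuals (192 chromosomes).
common_example = minor_allele_frequency(80, 15, 1)   # 17/192
rare_example = minor_allele_frequency(94, 2, 0)      # 2/192
```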

Table 1 Observed coding and intronic sequence variants and genotype frequencies in familial breast cancer cases and controls

Sequence variations displaying an MAF ≥5% in breast cancer cases were further assessed in a series of 73 unaffected individuals also of French Canadian origin. Rare changes (MAF <5%) located in the proximal region of these common variants (MAF ≥5%) were also analyzed, yielding a total of 38 nucleotide variants characterized in both sample sets. Of these, 14 were not observed in control individuals. Corresponding frequencies are denoted in Table 1. No deviation from HWE was observed for any of the nucleotide changes identified, nor was there a significant difference in terms of allelic frequency observed between both series based on single-marker analysis. The segregation of the variants identified was examined in families for which DNA material was available from multiple individuals (34 of the 42 variants), but no clear segregation between the identified variants and breast cancer was observed (data not shown).
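A single-marker comparison of allele frequencies between the two series can be sketched as a χ² test on a 2 × 2 table of allele counts. This is an illustration only: the counts are hypothetical, no continuity correction is applied, and the function name `allelic_chi_square` is not from the original analysis.

```python
import math

def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Chi-square test (1 df) on a 2 x 2 table of allele counts.

    Returns (chi2, p_value).
    """
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 df.
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts: 96 cases (192 chromosomes), 73 controls (146 chromosomes).
chi2, p_value = allelic_chi_square(40, 152, 28, 118)
```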

Conservation of human FANCJ residues

Among the five amino-acid substitutions found in our breast cancer cases, the common c.2755T > C variant (p.Pro919Ser) is located within the BRCA1-binding domain (Cantor et al. 2001) and the four remaining variants (c.415T > G, c.517C > T, c.577G > A, c.823A > G) are situated within the Rad3-related DNA helicase domain of FANCJ (residues 1–888) (Cantor et al. 2001). As illustrated in Table 2, comparison of these amino-acid changes was performed across relevant species using data extracted from the University of California Santa Cruz (UCSC) database. Alignment of FANCJ orthologue sequences revealed that p.Ser139, p.Arg173, and p.Ile275 are strongly conserved in higher species, with p.Ile275 also found in more distant species such as Xenopus tropicalis and Tetraodon nigroviridis. The p.Val193 residue is poorly conserved, whereas the p.Pro919 residue is highly variable and not conserved in other species. Taken together, the comparison of these orthologues indicates that three variants, namely, c.415T > G (p.Ser139Ala), c.517C > T (p.Arg173Cys), and c.823A > G (p.Ile275Val), are under strong functional constraint. Of these five amino-acid substitutions, only p.Arg173Cys is predicted to be damaging or not tolerated for protein folding. The remaining four, which include two unreported amino-acid changes, are considered to be tolerated or benign.

Table 2 Sequence variants detected in human FANCJ and residues found in orthologues

In silico analysis

Variants located in the promoter region were evaluated for their potential effect on transcription-factor binding sites (TFBS). As illustrated in Table 3, whereas two variants (c.−141 − 331T > C and c.−141 − 280G > T) seem to have a limited effect, the c.−141 − 785delCTTTT, c.−141 − 64G > A and c.−141 − 55A > G changes demonstrate a stronger potential impact on several promoter binding sites (BS), involving in particular androgen receptor (AR), myogenic factor 4 (MF4), KLF4, ras-responsive element binding 1 (RREB1), and aryl hydrocarbon receptor (AHR) BS.

Table 3 Identification of transcription-factor binding sites (TFBS) potentially affected by variants located in the FANCJ promoter region

Regarding the possible effect of exonic variants on ESE, analysis of the scores obtained revealed that certain exonic variants markedly alter the binding capacity of putative ESE elements, either by reducing their values or by creating new BS for SR proteins. As illustrated in Table 4, BS scores affecting binding of SC35, SRp55, and SRp40 proteins are mainly disturbed by four variants, namely, c.517C > T (SRp55), c.2637G > A (SC35), c.2755T > C (SC35, SRp40 and SRp55), and the 3′-UTR c.3750 + 396delCT variation, which seems to reduce the binding of the SC35 protein. All these variations could result in promotion or repression of splicing, or more particularly, on alternative splicing.

Table 4 Identification of exonic splicing enhancer (ESE) potentially affected by exonic variants in the FANCJ gene

The potential effect of the 24 intronic variants on the strength of splice sites leading to the splicing of pre-mRNA was also evaluated by in silico analyses. The results obtained indicated that few sequence variations could affect pre-mRNA splicing and that the score variations were unlikely to cause any marked changes in alternative splicing (data not shown). To further assess the possible effect of exonic and intronic variants on splicing at the mRNA level, cDNA material from our 96 high-risk breast cancer individuals was also analyzed. This analysis allowed us to confirm the presence of all exonic variants previously identified on genomic DNA with the exception of c.−138T > C, which could not be detected due to 5′-end localization of this variant on mRNA. In agreement with in silico studies, analysis of cDNA material revealed no aberrant splicing, even in carriers of variants predicted to alter splicing. In addition, analysis of cDNA material from immortalized lymphocytes of a number of these breast cancer individuals treated with puromycin, an inhibitor of nonsense-mediated decay, confirmed these results (data not shown).

Luciferase assay

As illustrated in Fig. 1, transient transfection of FANCJ promoter constructs revealed a modest but statistically significant reduction in reporter activity for the c.−141 − 280G > T variant relative to the WT FANCJ promoter construct, showing a 15% reduced ability to drive the firefly luciferase reporter gene. As for the four other promoter variants examined, no significant effect was observed on transcriptional activation.

Fig. 1 Promoter activity of long (1,068-bp) and short (633-bp) constructs carrying variations present in the promoter. An asterisk (*) indicates P < 0.05 (Student t test) relative to its respective wild-type (WT) construct

Linkage disequilibrium and haplotype analyses

Pairwise LD between all 42 SNPs identified in the case series was calculated using the LDA program (data not shown). Complete LD was found between SNPs 1 (c.−141 − 785delCTTTT) and 42 (c.3750 + 483T > C), which are located 181 kb apart (|D′| = 1.0), indicating that LD at the FANCJ locus does not decline significantly with distance. However, regarding specific association involving SNPs 16 (c.508 − 281A > G) and 17 (c.508 − 31C > G), lower LD values were observed, mainly with variants located within the 3′UTR region of the gene. As expected, the r² coefficient calculated for the FANCJ gene region displayed lower values, as this measure was dependent on allelic frequency, which was well represented by the large spectrum of r² values ranging from 0 to 1.0. Indeed, the majority of lowest r² values were observed for 19 of the 25 SNPs displaying an MAF <5%.

PHASE program estimation of FANCJ haplotypes reconstructed from the 17 SNPs having an MAF ≥5% in both sample sets identified 17 haplotypes exhibiting a frequency ≥1% (Fig. 2a). These 17 haplotypes represent >80% of all haplotypes estimated in both series. The permutation test of these 17 SNP haplotypes indicated no significant difference in the estimated haplotype frequency distributions between both groups (P = 0.327, data not shown).
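The idea behind a permutation test of haplotype frequency distributions can be sketched as follows. This is a simplified stand-in and not the PHASE procedure, which relies on its own test statistic: here a χ² statistic is computed on haplotype counts, and case/control labels are shuffled over individual chromosomes rather than over individuals. All names and the example data are hypothetical.

```python
import random
from collections import Counter

def chi_square_stat(cases, controls):
    """Chi-square statistic comparing haplotype counts in two groups."""
    c_case, c_ctrl = Counter(cases), Counter(controls)
    n_case, n_ctrl = len(cases), len(controls)
    total = n_case + n_ctrl
    stat = 0.0
    for hap in set(c_case) | set(c_ctrl):
        col = c_case[hap] + c_ctrl[hap]
        for obs, n in ((c_case[hap], n_case), (c_ctrl[hap], n_ctrl)):
            expected = n * col / total
            stat += (obs - expected) ** 2 / expected
    return stat

def permutation_p_value(cases, controls, n_perm=1000, seed=0):
    """Empirical P value obtained by shuffling group labels n_perm times."""
    if not cases or not controls:
        raise ValueError("both groups must be non-empty")
    rng = random.Random(seed)          # fixed seed for reproducibility
    observed = chi_square_stat(cases, controls)
    pooled = list(cases) + list(controls)
    n_case = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if chi_square_stat(pooled[:n_case], pooled[n_case:]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical haplotype labels (illustration only).
cases = ["H1"] * 60 + ["H2"] * 30 + ["H3"] * 10
controls = ["H1"] * 55 + ["H2"] * 33 + ["H3"] * 12
p_value = permutation_p_value(cases, controls, n_perm=500)
```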

Fig. 2 a Haplotype frequencies and combined frequencies of haplotypes exhibiting a minor allele frequency (MAF) ≥1% in cases and controls, as reconstructed by the PHASE program. Single nucleotide polymorphisms (SNPs) denoted with an asterisk (*) are insertions and are represented by the I letter in haplotypes. b Haplotype blocks and tagging SNPs (tSNPs) identified using SNPs showing an MAF >5% (17 SNPs). Tagging SNPs identified on a block-by-block basis using haplotypes showing a frequency >5% are denoted with an asterisk (*). Haplotype frequencies are displayed on the right of each haplotype combination, whereas the level of recombination is displayed above the connections between two blocks. Thin connections represent haplotypes with frequencies between 1% and 10%, whereas haplotypes with frequencies >10% are represented by thick lines

No significant difference was observed between cases and controls following PHASE analyses, no SNP deviated from HWE in either series, and no significant difference in haplotype frequencies was identified. Together, these observations support the use of individuals from both series as one population to identify LD blocks and tSNPs of the FANCJ gene in the French Canadian population.

LD-block assignment was conducted using the Haploview software and led to the identification of three LD blocks in the FANCJ gene (Fig. 2b). The 5′-block identified (promoter region to intron 5) encompasses approximately 16 kb, whereas LD blocks two (intron 8–16) and three (exon 19–20) covered 59 kb and 3 kb, respectively. Following LD-block partition, tSNPs were identified within each block. When considering haplotypes having a frequency ≥5%, eight tSNPs were identified in the French Canadian population, which represents >95% of all haplotypes. SNPs 4 (rs2048718), 7 (rs4988340), and 12 were identified as tSNPs within the first block, whereas SNPs 22 and 25 (rs11390869, rs2191248) represent the middle block, with the third block being defined by SNPs 34, 35, and 36 (rs4986765, rs4986764, and rs4988357).
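Block-by-block tSNP selection can be illustrated with a brute-force sketch that looks for the smallest set of SNP positions able to distinguish all common haplotypes of a block. This is not Haploview's algorithm, only a stand-in that conveys the idea; the haplotype strings, frequencies, and the function name `tagging_snps` are hypothetical.

```python
from itertools import combinations

def tagging_snps(hap_freqs, min_freq=0.05):
    """Smallest set of SNP indices that distinguishes all common haplotypes.

    hap_freqs: dict mapping a haplotype string (one character per SNP)
               to its estimated frequency within the block
    min_freq:  haplotypes below this frequency are ignored
    """
    common = sorted(h for h, f in hap_freqs.items() if f >= min_freq)
    if not common:
        return []
    n_snps = len(common[0])
    # Try subsets of increasing size; the first that works is minimal.
    for size in range(n_snps + 1):
        for subset in combinations(range(n_snps), size):
            signatures = {tuple(h[i] for i in subset) for h in common}
            if len(signatures) == len(common):
                return list(subset)
    return list(range(n_snps))

# Hypothetical block of four SNPs with three common haplotypes.
block = {"AAGC": 0.50, "AATC": 0.30, "GAGT": 0.17, "GATT": 0.03}
tags = tagging_snps(block)
```

Because several SNPs within a block can carry equivalent information, different minimal sets may exist; this sketch simply returns the first one found in index order.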

Using HapMap data from the CEPH/CEU cohort, three LD blocks were also identified (Fig. 2b). Out of 17 SNPs used in our haplotype block analysis in French Canadian individuals, 11 were part of the CEPH/CEU sample set, with nine present at an MAF ≥5% in the CEPH/CEU sample set. Despite using a different panel of SNPs, composition of the three blocks was relatively the same between French Canadian and CEPH/CEU data sets. Indeed, similar regions of LD breakage were observed following both analyses.

Discussion

To increase the power of our study aiming at finding genetic variants involved in breast cancer susceptibility, we selected individuals from our cohort of French Canadian non-BRCA1/2 high-risk breast cancer families (one individual per family). Given that this population is considered as a founder population, this allowed us to increase the likelihood of potentially identifying genetic variants associated with breast cancer (Antoniou and Easton 2003).

The previous identification of BRCA2/FANCD1 and PALB2/FANCN (Rahman et al. 2007; Erkko et al. 2007) as breast cancer susceptibility genes further strengthens the implication of FA genes in breast cancer susceptibility. Inactivating truncating mutations of FANCJ have been demonstrated to confer susceptibility to breast cancer in high-risk non-BRCA1/BRCA2 families from the UK (Seal et al. 2006). Although no deleterious germline mutation leading to a premature termination of the protein was identified in our study, 42 nucleotide variations were found, including 22 novel variants not previously reported in nucleotide databases (dbSNP and UCSC databases).

A number of case-control studies investigated the potential association of FANCJ germline mutations with breast cancer susceptibility. Among the few polymorphisms/mutations not identified in our study, and for which a significant association has been demonstrated, P47A and M299I were originally described by Cantor et al. (2001). However, these breast cancer associations were not subsequently confirmed in large breast cancer cohorts (Rutter et al. 2003; Lewis et al. 2005; Seal et al. 2006). Two silent variants observed in our French Canadian cohort (c.2637G > A: p.Glu879Glu and c.3411C > T: p.Tyr1137Tyr) have been investigated in several breast cancer case-control studies regarding their implication in breast cancer susceptibility, but none of these studies reported a significant association with breast cancer for either nucleotide alteration (Cantor et al. 2001; Luo et al. 2002; Karppinen et al. 2003; Rutter et al. 2003; Lewis et al. 2005; Vahteristo et al. 2006).

Whereas the c.517C > T (p.Arg173Cys) variant has been reported to impair nuclear translocation (Lei et al. 2003), this nucleotide change, together with another rare missense variant (c.577G > A: p.Val193Ile), does not seem to be associated with breast cancer in other studies (Luo et al. 2002; Rutter et al. 2003; Lewis et al. 2005; Vahteristo et al. 2006; Seal et al. 2006). As for the common c.2755T > C variant (p.Pro919Ser), which is located in the BRCA1-binding domain spanning residues 888–1,063 (Cantor et al. 2001) and has previously been reported to be associated with an increased risk of breast cancer to age 50 (Sigurdson et al. 2004), neither we nor others found any significant association with breast cancer risk (Cantor et al. 2001; Luo et al. 2002; Karppinen et al. 2003; Rutter et al. 2003; Vahteristo et al. 2006; Garcia-Closas et al. 2006; Seal et al. 2006; Frank et al. 2007).

Of the five variations identified in the promoter region, the sole variant assessed in previous studies (c.−141 − 64G > A) demonstrated no association with either breast cancer (Rutter et al. 2003; Sigurdson et al. 2004; Song et al. 2007; Pharoah et al. 2007; Frank et al. 2007) or with bladder cancer risk (Figueroa et al. 2007). In concordance with these findings, we found this sequence variation to be more frequent in the control series, further supporting its nonassociation with breast cancer risk. With the exception of c.205 + 162T > C and c.1281 + 91insT, all reported intronic variants were evaluated in the context of breast cancer (Karppinen et al. 2003; Rutter et al. 2003; Lewis et al. 2005; Vahteristo et al. 2006; Garcia-Closas et al. 2006; Song et al. 2007; Pharoah et al. 2007), and none were shown to be associated with an increased risk.

In agreement with our results, two recent studies performed on large cohorts of breast cancer cases included four variants identified in the present study (c.1340 + 109G > A, c.− 141 − 64G > A, c.508−31C > G, c.508−281A > G) and concluded that none were associated significantly with breast cancer risk (Pharoah et al. 2007; Song et al. 2007), although rs4988344 (c.508 − 31C > G) seemed to be associated with an increased risk of ovarian cancer (Song et al. 2007). Although eight variants could not be evaluated further because material from other family members was unavailable, none of the variations examined clearly displayed segregation with the disease. However, partial or incomplete segregation patterns were recently identified for susceptibility alleles that confer a modest increase in breast cancer risk (Rahman et al. 2007), thus reinforcing the need for extensive analysis of the variants identified in our study.

Three of the five promoter variants reported here are predicted to impact TFBS affinity, which is consistent with other results obtained in 16 cell-cycle checkpoint genes (Belanger et al. 2005). The identification of RREB1 and KLF4 BS in the FANCJ promoter, involved in the regulation of the cell-cycle checkpoint regulator p16, and in G1/S and G2/M transitions of the cell cycle, respectively, points toward a possible cell-cycle-dependent regulation of FANCJ, which is concordant with the known cell-cycle-dependent phosphorylation and interaction of FANCJ with BRCA1 (Yu et al. 2003). The promoter variants identified could also be involved in hormone activation or hormone response. Indeed, the c.−141 − 785delCTTTT variant seems to abolish the sole AR BS present in the promoter region analyzed, whereas the c.−141 − 64G > A variant creates a potential AHR BS, the latter known to be implicated in repression of E2F interaction with promoter of cell-cycle genes (Marlowe et al. 2004) and of estrogen-receptor (ER) binding to estrogen-responsive elements (Kharat and Saatcioglu 1996), resulting in inhibition of proliferation. Moreover, both the c.−141 − 64G > A and c.−141 − 55A > G variants disrupt a BS for the Sp2 transcription factor, which is known to interact with the G1/S phase transition-associated E2F1 transcription factor and to regulate luteinizing hormone (LH) receptor at the transcription level.

With the exception of the c.−141 − 280G > T variant, for which a modest but statistically significant difference in the ability to drive the firefly luciferase reporter gene was identified, no other variant identified in this study seems to cause a change in the basal transcriptional activity of the FANCJ gene. Further investigations are required to determine the potential impact of this promoter variant on binding of transcription factors to the FANCJ promoter region.

Regarding alternative splicing of the FANCJ gene, our findings are consistent with results from Cantor et al. (2001) in which no aberrant spliced mRNA species was identified in individuals from breast cancer families. Altered splicing has, however, been reported in individuals from FA complementation group J (Levran et al. 2005; Levitus et al. 2005) and has also been observed in an individual affected with breast cancer from a BRCA1/2-negative family (Seal et al. 2006).

Haplotype reconstruction from the 17 SNPs having an MAF ≥5% in both sample sets led to identification of 11 haplotypes (frequency ≥2%) (Fig. 2a). The only other haplotype analysis identified nine haplotypes, with a frequency ≥2% (Rutter et al. 2003). Although similar, the greater diversity obtained here is a reflection of the number of SNPs used for haplotype analysis: 17 in this study as opposed to nine (all included in our set) in Rutter et al. (2003).

Analysis conducted with the Haploview software identified three distinct blocks (Fig. 2b). A previous study identified two haplotype blocks located at the 5′ and 3′ ends with a large region of LD breakage comprising exons 1–12 (Rutter et al. 2003). Another investigation performed by Song et al. (2007) also identified two haplotype blocks but with LD breakage between introns 6 and 9. We identified three haplotype blocks, which is concordant with results obtained from HapMap data, in which three haplotype blocks were also identified. Another study including 9-kb upstream and 9-kb downstream of the FANCJ gene identified 12 tSNPs (Song et al. 2007), whereas eight tSNPs were identified among our French Canadian individuals. This number of tSNP is consistent with the number of tSNPs normally required at other gene loci and in other populations (Johnson et al. 2001; Weale et al. 2003), including the French Canadian population (Durocher et al. 2006, 2007). Although only two tSNPs (rs2191248 and rs4986765) have been tagged in both sample sets, this could be explained by the fact that several equivalent SNPs may represent a given LD block.

Analysis of promoter, exonic, and flanking intronic regions of FANCJ in 96 high-risk individuals from non-BRCA1/BRCA2 French Canadian families did not lead to the identification of deleterious truncating mutation, exon deletion, or retention of an intronic portion. However, this analysis led to the identification of 42 variants, 22 of which are novel, including two common variants displaying a frequency of 43% and 38%. Moreover, this study reports four novel variations in the promoter region of FANCJ potentially affecting transcription factor BS, one of these significantly altering the promoter activity in a reporter-gene assay. Further studies are therefore warranted to fully establish the functional contribution of these sequence changes on FANCJ gene expression.