Introduction

A new class of epigenetic regulators of gene expression, microRNAs (miRs), was identified about 20 years ago. miRs are small, non-coding RNAs of 19 to 22 nucleotides (nt) that bind to the 3′-untranslated regions (3′-UTRs), or less frequently to the coding sequence or 5′-UTR, of targeted mRNAs, thereby inducing their degradation or inhibiting their translation, resulting in gene downregulation. Recognition of an mRNA by a targeting miR is mediated by the seed region (nt 2–7 of a miR), around which partial complementarity extends. Alterations of miR/mRNA interactions are likely to impair the control of gene expression, whether they occur through mismatches introduced by SNPs within miR genes or miR-binding target sites, or through variations in miR expression levels.
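The seed-based recognition described above can be illustrated with a minimal sketch. It only looks for perfect Watson-Crick matches to nt 2–7 of a miR, which is a simplification of what target prediction programs do. The sequences `TOY_MIR`, `TOY_UTR_WT` and `TOY_UTR_VAR` are invented for illustration and do not correspond to any real miR or 3′-UTR; the function names are likewise hypothetical.

```python
# Toy sketch: locate perfect seed matches (miR nt 2-7) in a 3'-UTR.
# All sequences below are invented for illustration; they are not real miRs or UTRs.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mir, start=2, end=7):
    """Return the mRNA motif (5'->3') that pairs with the miR seed (1-based, inclusive)."""
    seed = mir[start - 1:end]
    # Pairing is antiparallel, so complement the seed and then reverse it.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mir, utr):
    """Return the 1-based start positions of perfect seed matches in the UTR."""
    site = seed_site(mir)
    return [i + 1 for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


TOY_MIR = "UACGGAUCAAGGCUUAGCUAA"
TOY_UTR_WT = "GGCAUCCGUAAGCAUCCGUCC"
# Hypothetical single-nucleotide variant (C>U at UTR position 6) inside the first site.
TOY_UTR_VAR = TOY_UTR_WT[:5] + "U" + TOY_UTR_WT[6:]

print(seed_site(TOY_MIR))                       # motif searched for in the UTR
print(find_seed_matches(TOY_MIR, TOY_UTR_WT))   # sites in the wild-type toy UTR
print(find_seed_matches(TOY_MIR, TOY_UTR_VAR))  # sites left after the toy variant
```

In this toy case the single substitution removes one of the two seed matches, which is the kind of mismatch a SNP in a miR-binding site could introduce.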

The involvement of miRs in genetic susceptibility to diseases, and particularly to complex disorders such as cancer, has been extensively investigated since the first report, in 2006, of a phenotype-causing variant creating a potential illegitimate miR target site.1 However, the focus is principally on finding genetic association with common SNPs located in miR genes or miR-binding target sites,2 and very few studies have aimed to identify rarer variants in case–control series, even though large-scale resequencing approaches focusing on miR genes and 3′-UTRs are starting to be used.3 This paucity of mutation screening studies could stem from the difficulty in interpreting the rare variants that such an approach would identify: validated miR targets remain rather scarce, and miR genes, not to mention their regulation, are poorly characterized. Most miR genes are transcribed by RNA polymerase II into primary miR transcripts (pri-miRs), which are about 1000 nt long and contain one or more stable stem-loop structures. They are subsequently processed first to shorter pre-miRs (70–100 nt), then to mature miRs. The potential effects of variants falling in pre-miR or mature miR sequences can be predicted with computational algorithms and tested in functional assays, as has been done for the eight rare variants identified in precursor or mature miRs during the screening of 59 miR genes on the X-chromosome in 193 males with schizophrenia spectrum disorders.4 For variants identified outside these sequences, predicting the effect they could have on miR production is nearly impossible, even though evidence indicates that pri-miR processing is regulated during development in order to achieve optimal miRNA expression patterns5 and can be dysregulated in diseases.6 As a result, sorting out the candidate deleterious variants from the more abundant functionally irrelevant ones remains difficult.

BRCA1 (MIM# 113705) and BRCA2 (MIM# 600185) genes are the two major breast cancer susceptibility genes: pathogenic variants in these genes explain nearly 25% of the familial risk for breast cancer.7 Other high-risk susceptibility genes such as TP53, PTEN, CDH1, STK11 and PALB2 have been identified, as well as a number of intermediate-risk genes such as ATM, CHEK2, NBN and NF1, and close to 100 low-risk alleles.8, 9 Despite this, only about 50% of the excess risk is currently explained. Evidence is starting to emerge that some breast cancer genes may harbour different types of variants with inversely correlated cancer risks and allele frequencies: for example, BRCA2 and TP53, on top of a myriad of high-risk rare variants, also harbour more frequent variants associated with lower breast cancer risk, p.K3326X10 and p.R337H,11 respectively. This has prompted scientists to search in BRCA1/2-negative breast cancer families for variants deregulating the expression of the BRCA1 and BRCA2 genes, possibly associated with lower cancer risks than inactivating BRCA1/2 variants. For our part, we decided to focus on miRs, as we showed a few years ago that miR-146a and miR-146b-5p negatively regulate the expression of BRCA1.12 MiR-146a and miR-146b-5p are well-studied miRs with multiple targets, whose genes and processing are rather well characterized. MIR146A (MIM# 610566) has two exons separated by a ~16 kb intron13, 14 and is transcribed into a pri-miR of 2329 nt (GenBank accession number: EU147785.1); exon 2 contains the sequence of the pre-miR, which starts 53 nt from its 5′-end. MIR146B (MIM# 610567) has only one exon and its transcription start site is located ~700 nt upstream of the mature miR-146b-5p sequence,14 but it has not been defined precisely, nor has the polyadenylation site been identified. We reasoned that variants impacting these genes, and hence this regulation, might predispose to breast cancer.
We thus screened the MIR146A and MIR146B genes in a familial breast cancer case–control study, GENESIS, in the hope of finding activating mutations that would lead to enhanced expression and/or enhanced BRCA1 affinity. We also screened BRCA1 and BRCA2 3′-UTRs in GENESIS in order to identify potential 3′-UTR-dependent mechanisms of gene dysregulation.

Subjects and methods

Study subjects

The study was conducted on a subgroup of subjects from the GENESIS (GENE SISters) French national study.15 The index cases are women diagnosed with infiltrating mammary adenocarcinoma who attended a cancer genetics clinic in France and were found negative for BRCA1/2 pathogenic variants after extensive diagnostic screening (screening of BRCA1/2 coding sequences and intron–exon boundaries, and search for large gene rearrangements). The index cases were selected on the basis of a family history of breast cancer consisting of at least one sister with breast cancer. Recruitment was done through the French national network of cancer genetics clinics (‘Genetics and Cancer Group’ of UNICANCER). The controls were female friends or colleagues of index cases, matched by age (±3 years) and unaffected with cancer at the time of ascertainment. Affected and unaffected sisters as well as other family members were included in the study, if available. Inclusions began in April 2007 and ended in December 2012. Information about ethnic origin was self-reported by study subjects. We considered as Caucasians all subjects with two parents reported as of ‘Caucasian origin’. In the present study, we analysed the subgroup encompassing the first 716 index cases and the first 619 controls included for whom blood samples were available and who fulfilled the GENESIS study criteria. Their characteristics are shown in Supplementary Table 1.

Ethics statement

All participants gave written informed consent. The GENESIS project was submitted to the appropriate ethics committee (CCP Ile-de-France III) on 18 September 2006 and was approved on 3 October 2006. The CNIL (French data protection authority) also approved this study (22 May 2006).

DNA extraction

Genomic DNA was extracted from blood samples using the Autopure LS DNA extractor (Qiagen, Courtaboeuf, France), and DNA handling (normalization and aliquoting) was done using a TECAN EVO instrument. DNA quality and purity were evaluated by measuring absorbance on a spectrophotometer. The A260/A280 ratio was calculated for all DNA samples. Good-quality DNA (ratio of 1.7–2.0) was obtained for the vast majority of samples. Only subjects with good-quality DNA were selected for this study.
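As a minimal sketch of the purity criterion described above, the check below keeps a sample only when its A260/A280 ratio falls within 1.7–2.0. The function names and the absorbance readings are hypothetical; the study itself does not describe any software used for this step.

```python
# Sketch of the A260/A280 purity filter (thresholds taken from the text above).
# Function names and example readings are illustrative assumptions.

def a260_a280_ratio(a260, a280):
    """Return the A260/A280 absorbance ratio of a DNA sample."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def passes_purity_qc(a260, a280, low=1.7, high=2.0):
    """True when the ratio lies within the accepted window (inclusive)."""
    return low <= a260_a280_ratio(a260, a280) <= high


# Hypothetical readings for two samples.
print(passes_purity_qc(0.90, 0.50))  # ratio 1.8, within the window
print(passes_purity_qc(0.75, 0.50))  # ratio 1.5, below the window
```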

Mutation screening

MIR146A (99 nt; hg19 coordinates: chr5:159,912,359–159,912,457) and MIR146B (73 nt; hg19 coordinates: chr10:104,196,269–104,196,341) were each amplified as a single fragment containing, respectively, 274 nt upstream and 50 nt downstream, and 233 nt upstream and 59 nt downstream, of the sequence of the primary transcript (pri-miR).

We designed primers for the amplification by PCR of the 3′-UTR of BRCA1 (1382 nt; hg19 coordinates: chr17:41,196,313–41,197,694) and of that of BRCA2 (902 nt; hg19 coordinates: chr13:32,398,771–32,399,672) in 9 and 7 overlapping fragments, respectively.
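The lengths quoted for the four screened regions can be checked against their hg19 coordinates with a short sketch. It assumes that the coordinates are 1-based and inclusive, so that length = end − start + 1, and that MIR146B lies on chr10. The amplicon sizes computed here cover only the genomic sequence (flanks plus target) and ignore the M13 tails added to the primers; the dictionary and function names are illustrative.

```python
# Sketch: verify region lengths from 1-based inclusive hg19 coordinates.
# Names are illustrative; the chr10 assignment for MIR146B is an assumption of this sketch.

def span_length(start, end):
    """Length in nt of a 1-based, inclusive genomic interval."""
    return end - start + 1


# name: (chromosome, start, end, length stated in the text)
REGIONS = {
    "MIR146A": ("chr5", 159_912_359, 159_912_457, 99),
    "MIR146B": ("chr10", 104_196_269, 104_196_341, 73),
    "BRCA1 3'-UTR": ("chr17", 41_196_313, 41_197_694, 1382),
    "BRCA2 3'-UTR": ("chr13", 32_398_771, 32_399_672, 902),
}

# name: (nt upstream, nt downstream) included in the single MIR amplicons
FLANKS = {"MIR146A": (274, 50), "MIR146B": (233, 59)}

for name, (chrom, start, end, stated) in REGIONS.items():
    print(name, chrom, span_length(start, end), "stated:", stated)

for name, (up, down) in FLANKS.items():
    _, start, end, _ = REGIONS[name]
    print(name, "genomic amplicon size:", up + span_length(start, end) + down)
```

Under these assumptions the four computed lengths agree with the stated ones, and the genomic part of the MIR146A and MIR146B amplicons comes to 423 nt and 365 nt.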

All primers were tailed in 5′ with M13 universal sequences for subsequent sequencing reactions. All PCR fragments were screened for genetic variants using high-resolution melting (HRM) analysis, except for fragment BRCA1_4F/R, which encompasses an Alu sequence (nt 580–881) that is difficult to analyse by HRM and was therefore sequenced. The primer sequences, PCR elongation temperatures and fragment sizes are shown in Supplementary Table 2. The reference sequences used for the description of the variants are: NR_029701.1 (MIR146A), NR_030169.1 (MIR146B), NM_007294.3 (BRCA1) and NM_000059.3 (BRCA2).

MIR146A/B rare variants have been submitted to the Leiden Open Variation Database (LOVD 3.0 shared installation; http://www.databases.lovd.nl/shared) and those identified in BRCA1 and BRCA2 3′-UTRs to the UMD-BRCA1 and UMD-BRCA2 databases, respectively, (http://www.umd.be/BRCA1/ and http://www.umd.be/BRCA2/).

Vectors

The Luc-BRCA1 and Luc-BRCA2 3′-UTR wild-type (WT) vectors were constructed by cloning the 3′-UTR of BRCA1, amplified by PCR using forward primer 5′-TCGCGACGTCCTGCAGCCAGCCACAGG-3′ and reverse primer 5′-GGAATTCCATATGGTTTGCTACCAAAGTTTATTTGCAGTG-3′, or the 3′-UTR of BRCA2, amplified by PCR using forward primer 5′-TCGCGACGTCGTCGCATTTGCAAAGGCGAC-3′ and reverse primer 5′-GGAATTCCATATGAATCAGTGCCAATTTGAAAGC-3′, respectively. All primers contained a restriction site (underlined) upstream of the specific sequence: Aat II for forward primers and Nde I for reverse primers. The PCR fragments were each cloned between the Aat II and Nde I restriction sites in the pGL3-spacer vector, directly downstream of the firefly luciferase coding sequence. The integrity of the 3′-UTRs was checked by sequencing. Variants were introduced in the WT vectors using the QuikChange XL Site-Directed Mutagenesis kit (Stratagene, Amsterdam, The Netherlands) according to the manufacturer’s instructions.
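Because the underlining of the restriction sites may not survive in plain text, a small sketch can recover where they sit in each cloning primer. It uses the standard recognition sequences GACGTC (Aat II) and CATATG (Nde I) and the four primer sequences given above; the dictionary keys and the helper name are illustrative.

```python
# Sketch: locate the Aat II / Nde I recognition sites in the cloning primers.
# Primer sequences are copied from the text; names of keys and helpers are illustrative.

SITES = {"AatII": "GACGTC", "NdeI": "CATATG"}

# primer label: (enzyme expected in the primer, sequence 5'->3')
PRIMERS = {
    "BRCA1_fwd": ("AatII", "TCGCGACGTCCTGCAGCCAGCCACAGG"),
    "BRCA1_rev": ("NdeI", "GGAATTCCATATGGTTTGCTACCAAAGTTTATTTGCAGTG"),
    "BRCA2_fwd": ("AatII", "TCGCGACGTCGTCGCATTTGCAAAGGCGAC"),
    "BRCA2_rev": ("NdeI", "GGAATTCCATATGAATCAGTGCCAATTTGAAAGC"),
}


def locate_site(primer, site):
    """Return the 1-based start of the first occurrence of site, or None if absent."""
    idx = primer.find(site)
    return None if idx < 0 else idx + 1


for label, (enzyme, seq) in PRIMERS.items():
    pos = locate_site(seq, SITES[enzyme])
    # Everything 3' of the site is taken here as the gene-specific part of the primer.
    specific = seq[pos - 1 + len(SITES[enzyme]):]
    print(label, enzyme, "site starts at nt", pos, "| specific part:", specific)
```

With these sequences the Aat II site starts at nt 5 of both forward primers and the Nde I site at nt 8 of both reverse primers, consistent with a short 5′ clamp preceding each site.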

The pRL-SV40 Renilla luciferase vector (Promega, Charbonnières-les-Bains, France) was used as a transfection control.

All the plasmids used for transfections were prepared with the Nucleobond Xtra Midi Plus kit (Macherey-Nagel, Hoerdt, France) following the manufacturer’s instructions.

Luciferase assay

HeLa, HBL-100 and MCF7 cells were grown in Dulbecco’s modified Eagle medium supplemented with 10% fetal calf serum and 1% penicillin–streptomycin, and with 1 mM sodium pyruvate and non-essential amino acids for MCF7 (Gibco, Cergy Pontoise, France), in a 5% CO2 incubator at 37 °C. They were seeded at 15 000 cells per well in 96-well plates 17 h before being transfected with the plasmids encoding the firefly and Renilla luciferase proteins, pGL3-3′-UTR-BRCA1 WT, pGL3-3′-UTR-BRCA2 WT or mutant vectors (150 ng of each), using the jetPEI reagent (Polyplus transfection, Illkirch, France) according to the manufacturer’s instructions. Cells were washed 24 h after transfection. Forty-eight hours post-transfection, cells were washed with 1 × PBS and lysed. Firefly and Renilla luciferase activities were measured using the Dual-Glo Luciferase Assay (Promega) according to the manufacturer’s instructions. Firefly luciferase expression was adjusted to Renilla luciferase expression to normalize for transfection efficiency.
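The normalization step can be sketched as follows: each well's firefly signal is divided by its Renilla signal, and the mutant ratios are then expressed relative to the mean WT ratio. Scaling to WT is an assumption made here to obtain a "relative luciferase activity"; the readings and function names are invented for illustration.

```python
# Sketch of dual-luciferase normalization (firefly / Renilla, then relative to WT).
# The WT scaling step, the names and the readings are assumptions for illustration.
from statistics import mean


def normalized_ratio(firefly, renilla):
    """Firefly signal corrected for transfection efficiency via Renilla."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def relative_activity(mutant_wells, wt_wells):
    """Mean mutant firefly/Renilla ratio divided by the mean WT ratio."""
    wt = mean(normalized_ratio(f, r) for f, r in wt_wells)
    mut = mean(normalized_ratio(f, r) for f, r in mutant_wells)
    return mut / wt


# Hypothetical (firefly, Renilla) readings per well, in arbitrary light units.
wt_wells = [(1000.0, 100.0), (1200.0, 100.0)]
mutant_wells = [(2200.0, 100.0)]
print(relative_activity(mutant_wells, wt_wells))
```

A value above 1 would indicate higher reporter expression with the mutant 3′-UTR than with the WT 3′-UTR, and a value below 1 the opposite.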

Bioinformatics prediction

The following online programs were used to assess miRs-binding sites on the 3′-UTR of BRCA1 and BRCA2 and/or the potential effects of variants on miRs binding with the default parameters:

microRNA.org: http://www.microrna.org/microrna/home.do

miRcode: http://www.mircode.org/

miRDB: http://mirdb.org/miRDB/

miRmap: http://mirmap.ezlab.org/

TargetScan: http://www.targetscan.org/

RegRNA 2.0: http://regrna2.mbc.nctu.edu.tw/

The potential effects of variants on mRNA secondary structures were assessed using the following online programs:

RNAfold: http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi

RegRNA 2.0: http://regrna2.mbc.nctu.edu.tw/

Results

The MIR146A and MIR146B genes and the 3′-UTR of the BRCA1 and BRCA2 genes were screened in a total of 1335 individuals of the French national GENESIS study: 716 breast cancer index cases, negative for BRCA1/BRCA2 pathogenic variants in the coding sequence and in intron/exon junctions as well as for large rearrangements, having a sister also diagnosed with breast cancer; and 619 controls without cancer at the time of enrolment. Whenever possible, the presence of the rare variants identified was verified in affected and unaffected relatives of the index cases.

Apart from the well-studied frequent variant, n.60C>G (rs2910164; dbSNP MAF not available), only one rare variant was identified in MIR146A in an index case, n.-143T>A (Table 1 and Supplementary Table 3). We identified four rare variants in MIR146B: two variants, n.-130G>C and n.-9C>T, each found in one index case, were reported for the first time, while the other two, n.-102T>C, present in several cases and several controls, and n.1C>T, present in two controls, had already been reported, albeit at different frequencies (Table 1 and Supplementary Table 3). None of the variants identified only in index cases, in either MIR146A or MIR146B, affected the pre-miR sequences. As shown in Supplementary Figure 1, it was not possible to verify whether any of these variants co-segregated with the disease because no affected family member participated in the study.

Table 1 Rare variants identified during the screening of the MIR146A/B gene coding sequences and of the 3′-UTR of the BRCA1/2 genes in the GENESIS case/control study (grey rows highlight variants found only in screened index cases, whose pedigrees are shown in Supplementary Figure 1)

Three frequent variants were detected in the 3′-UTR of BRCA1: c.*421G>T (rs8176318; dbSNP MAF=0.324); c.*855delA (rs33947868; dbSNP MAF=0.498); c.*1287C>T (rs12516; dbSNP MAF=0.342), and three in the 3′-UTR of BRCA2: c.*105A>C (rs15869; dbSNP MAF=0.161); c.*369A>G (rs7334543; dbSNP MAF=0.222); c.*532A>G (rs11571836; dbSNP MAF=0.197). On top of these, we also detected five rare variants in BRCA1 3′-UTR and one in BRCA2 3′-UTR, all with a MAF<0.001 in our samples (Table 1 and Supplementary Table 3). Among the five rare variants detected in the 3′-UTR of BRCA1, three had been detected in previous targeted screening and/or in the 1000 Genomes Project: c.*291C>T, found in one control, c.*713C>T, present in one index case and c.*750A>G, found in several cases and several controls. One index case (GE0614) carried both BRCA1 c.*750A>G and MIR146B n.-102T>C. The two BRCA1 3′-UTR new variants identified each in one index case were c.*780C>T and c.*1013A>G. No variant was identified within or in close proximity to the miR-146a- and miR-146b-5p-binding site (nt c.*489_507). The only rare variant identified in the 3′-UTR of BRCA2, c.*172G>A, was identified in an index case and has never been reported before (Table 1 and Supplementary Table 3).
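For orientation, the MAF threshold quoted above can be related to carrier counts with a short sketch. It assumes an autosomal variant, that every carrier is heterozygous, and that the frequency is computed over all 1335 screened individuals (2670 chromosomes); the text does not state how the MAF was computed, so these are assumptions, and the function name is hypothetical.

```python
# Sketch: allele frequency from heterozygous carrier counts (autosomal variant).
# Assumes all carriers are heterozygous and pools cases and controls; both are assumptions.

def allele_frequency(n_het_carriers, n_individuals):
    """Variant allele frequency when each carrier has one copy of the variant."""
    return n_het_carriers / (2 * n_individuals)


N_SCREENED = 716 + 619  # index cases + controls screened in this study

# A variant seen in a single individual:
print(allele_frequency(1, N_SCREENED))  # 1 / 2670, roughly 0.00037
```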

To evaluate the potential causality of the variants identified in index cases in BRCA1/BRCA2 3′-UTRs, we performed luciferase reporter assays in three human cell lines: HeLa (cervical carcinoma cells), HBL-100 (normal breast epithelial cells) and MCF7 (breast carcinoma cells). After transient transfection of luciferase reporter vectors containing either WT or mutated UTRs, we observed in some cases a significant effect of the variants analysed on luciferase activity, that is, a slight increase or decrease of luciferase expression in the presence of some variants (Figure 1). However, none of these effects was seen in all the cell lines analysed and it is, therefore, difficult to extrapolate from these data to the situation in carriers, even if the results of the luciferase assays seem to be consistent from one study to another (for example, the absence of effect of BRCA1 c.*750A>G on luciferase expression in MCF7 cells has already been observed16).

Figure 1

Relative luciferase activity after cotransfection of either the Luc-BRCA1 3′-UTR or Luc-BRCA2 3′-UTR reporter vectors into HeLa, HBL-100 or MCF7 cells carrying WT or mutated sequences, as indicated. Error bars represent SEM for four (HeLa) or three (HBL-100 and MCF7) independent experiments. *P<0.05; **P<0.01; ***P<0.001 (Student’s t-test with respect to cells transfected with WT sequences).
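The error bars and the test named in the legend can be sketched with the standard library alone. The snippet computes the SEM of replicate values and the pooled-variance (equal-variance, unpaired) Student's t statistic; whether the original analysis used this exact variant of the test is not stated, so that choice is an assumption. The replicate values are invented, and obtaining a P value additionally requires the t distribution with n1 + n2 − 2 degrees of freedom, for example from a statistics package.

```python
# Sketch: SEM and pooled-variance Student's t statistic for replicate experiments.
# The unpaired, equal-variance form is an assumption; replicate values are invented.
import math
import statistics


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance (df = len(a) + len(b) - 2)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


# Hypothetical relative luciferase activities from three independent experiments.
wt = [1.0, 1.2, 0.8]
mutant = [1.5, 1.6, 1.3]
print(sem(wt), sem(mutant))
print(student_t(mutant, wt))
```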

Potential miR-binding sites were predicted using the microRNA.org, miRcode, miRDB, miRmap and TargetScan programmes available online, as well as the interface provided by the integrated web server RegRNA 2.0 (Supplementary Figure 2). One miR-binding site was identified in the proximity of the BRCA1 variant c.*1013A>G, which precedes the sequence complementary to the seed region of miR-543 (microRNA.org prediction, Supplementary Figure 3). The variant increases miR-543/BRCA1 3′-UTR complementarity, but it should be noted that the other prediction programmes failed to identify a miR-543-binding site on WT BRCA1 3′-UTR (the test of the mutant 3′-UTR sequence was not allowed in the programmes’ design). TargetScan predicted a seed match for miR-323b-5p at position c.*172_178 of BRCA2 3′-UTR; c.*172G>A decreases miR-323-5p/BRCA2 3′-UTR complementarity, but as before, this finding was not confirmed with other algorithms (and again, it was not possible to test the mutant 3′-UTR).

Secondary structures of mRNA are important for post-transcriptional regulation as they affect binding of trans-acting factors. We, therefore, used the RNAfold secondary structure prediction programme available online to try to predict the consequences of the variants identified in the 3′-UTRs (Table 1 and Supplementary Figure 4). The BRCA1 variant c.*291C>T (identified in a control) or c.*750A>G (identified both in breast cancer cases and controls) had no or little effect on BRCA1 3′-UTR predicted conformation, respectively (Supplementary Figure 4A). By contrast, the three variants identified exclusively in cases in our study, c.*713C>T, c.*780C>T and c.*1013A>G induced larger effects, particularly c.*713C>T (Supplementary Figure 4B). The BRCA2 c.*172G>A variant had a small impact on BRCA2 3′-UTR predicted conformation (Supplementary Figure 4C). RNA accessibility, assessed through the integrated web server RegRNA 2.0, displayed no major change in the presence of BRCA1 and BRCA2 variants (Supplementary Figure 2). A small decrease of RNA accessibility was observed, though, for BRCA1 c.*1013A>G.

The pedigrees of the families of the patients carrying the BRCA1/BRCA2 3′-UTR variants identified only in index cases are shown in Supplementary Figure 1. In two out of four cases, an affected sister was available and the variants, BRCA1 c.*780C>T and c.*1013A>G, did not segregate with the disease.

Discussion

Numerous studies aim to identify altered miR expression profiles in pathological versus normal tissues, and to analyse the impact of frequent miR gene variants on susceptibility to diseases. MiR gene screening, on the contrary, is seldom performed, although evidence has been provided that genetic variation in this class of genes could be linked to diseases, as for protein-coding genes.17, 18 Here, we chose to screen, in familial breast cancer cases from the GENESIS study, the coding region of two miR genes, MIR146A and MIR146B, producing miR-146a and miR-146b-5p, respectively, as it has been shown that they regulate the expression of BRCA1.12, 13, 19 A common MIR146A variant, rs2910164, which resides in the region encoding the sequence complementary to the mature miR in the stem-loop structure of the pre-miR, is one of the most extensively studied miR-related genetic variants. In the most recent meta-analysis published to date, no evidence of association between rs2910164 and breast cancer risk was obtained, although association was observed with bladder, cervical, liver and lung cancers, as well as with oral squamous cell carcinoma.20

The only rare MIR146A variant we identified in our screened region, in one index case, has already been found in the 1000 Genomes Project at a comparable frequency. It falls within the intron, 90 nt upstream of the intron/exon boundary, and is, therefore, unlikely to affect MIR146A expression or function. The two novel rare MIR146B variants that we found in index cases only fall within the sequence of pri-miR-146b-5p, but we were unable to predict their potential significance, and neither were we able to study co-segregation. Searching for variation in miR expression in cases versus controls was unfortunately not an option: not only do we have access solely to DNA (and not to RNA), but the only biological specimen available in the GENESIS study is blood. As miR expression is highly tissue specific, expression analyses in lymphocytes would probably not be relevant to breast and ovarian cancers.

In order to identify potential 3′-UTR-dependent mechanisms of gene dysregulation, we also screened the 1382 nt- and 902 nt-long BRCA1 and BRCA2 3′-UTRs, respectively. This led to the identification of six rare variants, of which three are novel. When compiling bioinformatic predictions, the results of a functional assay and genetic data, we did not find evidence that any of these variants could dramatically impact BRCA1 or BRCA2 expression. It should be noted that the strong impact predicted by RNAfold of c.*713C>T on BRCA1 3′-UTR secondary structure is accompanied by an increase of luciferase activity in two out of the three cell lines tested; however, this latter effect is opposite to what would be expected from a variant associated with increased breast cancer risk.

A few studies examined the possibility that variants in the 3′-UTR of the BRCA1 gene might be associated with breast cancer risk and reported the screening of this region in familial breast cancer cases and sometimes in controls as well.16, 21, 22, 23, 24, 25 The first published studies reported two variants: c.*36C>G, identified while screening 211 breast cancer cases,22 and c.*372_387del16 while screening 78 breast cancer cases belonging to high-risk breast and ovarian cancer families.25 Pietschmann et al screened BRCA1/BRCA2 coding sequences and 5′- and 3′-UTRs in 10 Iranian high-risk breast cancer families.23 They identified one rare variant in BRCA1 3′-UTR, c.*381_389del9ins29, which was not considered as causal as it was not present in the other four breast cancer cases in the family of the proband. Lheureux et al identified two novel rare variants, c.*750A>G and c.*1286C>A in 70 BRCA1/2-negative high-risk breast or ovarian cancer cases, with clear evidence of neutrality for the former and little evidence of causality for the latter based on luciferase reporter assays and bioinformatics predictions.16 Brewster et al reported the largest study to date, with the screening of 1612 breast cancer cases and 1554 controls, sourced from five collections, and identified 23 rare variants, of which 15 were novel.21 Using luciferase reporter assays and bioinformatics predictions, they were able to show that c.*1340_1342delTGT introduces a functional miR-103 target site and might be therefore pathogenic.21

Data are much scarcer concerning the 3′-UTR of BRCA2: we are aware of only one small study in 10 Iranian high-risk breast cancer families23 and one study performed on 100 Turkish early-onset or familial breast cancer cases,26 neither of which led to the identification of any rare variant. Our study is thus the first to report the screening of this region in a large series, and it showed that 3′-UTR variants are extremely rare in BRCA2 (only one variant found, c.*172G>A). No evidence of pathogenicity was found for this variant.

In conclusion, although several studies have reported the association of BRCA1 3′-UTR, miR gene or miR-related frequent variants with breast cancer risk,26, 27, 28 a clear involvement of rare variants has not been demonstrated so far.