CYP3A7*1C allele: linking premenopausal oestrone and progesterone levels with risk of hormone receptor-positive breast cancers

Epidemiological studies provide strong evidence for a role of endogenous sex hormones in the aetiology of breast cancer. The aim of this analysis was to identify genetic variants that are associated with urinary sex-hormone levels and breast cancer risk. We carried out a genome-wide association study of urinary oestrone-3-glucuronide and pregnanediol-3-glucuronide levels in 560 premenopausal women, with additional analysis of progesterone levels in 298 premenopausal women. To test for the association with breast cancer risk, we carried out follow-up genotyping in 90,916 cases and 89,893 controls from the Breast Cancer Association Consortium. All women were of European ancestry. For pregnanediol-3-glucuronide, there were no genome-wide significant associations; for oestrone-3-glucuronide, we identified a single peak mapping to the CYP3A locus, annotated by rs45446698. The minor rs45446698-C allele was associated with lower oestrone-3-glucuronide (−49.2%, 95% CI −56.1% to −41.1%, P = 3.1 × 10⁻¹⁸); in follow-up analyses, rs45446698-C was also associated with lower progesterone (−26.7%, 95% CI −39.4% to −11.6%, P = 0.001) and reduced risk of oestrogen and progesterone receptor-positive breast cancer (OR = 0.86, 95% CI 0.82–0.91, P = 6.9 × 10⁻⁸). The CYP3A7*1C allele is associated with reduced risk of hormone receptor-positive breast cancer possibly mediated via an effect on the metabolism of endogenous sex hormones in premenopausal women.


BACKGROUND
Epidemiological studies provide strong evidence for a role of endogenous hormones in the aetiology of breast cancer. 1,2 Pooled analyses of data from prospective studies estimated that a doubling of circulating oestradiol or oestrone was associated with a 30-50% increase in breast cancer risk in postmenopausal women and a 20-30% increase in breast cancer risk in premenopausal women; there was no evidence that premenopausal progesterone levels were associated with breast cancer risk. 2,3 We have previously screened 642 SNPs tagging 42 genes involved in sex steroid synthesis or metabolism, and tested for the association with premenopausal urinary oestrone glucuronide and pregnanediol-3-glucuronide levels, measured in urine samples collected at pre-specified days of the woman's menstrual cycle. 4 Oestrone-3-glucuronide and pregnanediol-3-glucuronide are urinary metabolites of oestrogen and progesterone, respectively, 5,6 that are used in the context of reproductive medicine to monitor ovarian activity. 7 None of the variants that we tested was associated with urinary pregnanediol-3-glucuronide, but a rare haplotype, defined by two SNPs spanning the cytochrome P450 family 3 subfamily A (CYP3A) gene cluster, was associated with a highly significant 32% difference in urinary oestrone-3-glucuronide. 4 Fine-scale mapping analyses identified the SNP rs45446698 as a putative causal variant at this locus; rs45446698 is one of seven highly correlated SNPs that cluster within the CYP3A7 promoter and comprise the CYP3A7*1C allele. 8 A genome-wide association study (GWAS) of postmenopausal plasma oestradiol levels found no association at this locus. 9 A subsequent GWAS of pre- and postmenopausal hormone levels similarly found no association with plasma oestradiol at this locus; they did however find associations at this locus with DHEAS and progesterone. 10
The CYP3A genes (CYP3A5, CYP3A7 and CYP3A4) encode enzymes that metabolise a diverse range of substrates; 11 in addition to a role in the oxidative metabolism of hormones, CYP3A enzymes metabolise ~50% of all clinically used drugs, including many of the agents used in treating cancer. 12 CYP3A4, the major isoform in adults, is predominantly expressed in the liver, where it is the most abundant P450, accounting for 30% of total CYP450 protein. CYP3A7, the major isoform in the foetus, is generally silenced shortly after birth. 13 In CYP3A7*1C carriers, a region within the foetal CYP3A7 promoter has been replaced with the equivalent region from the adult CYP3A4 gene; 14 this results in adult expression of CYP3A7 in CYP3A7*1C carriers and may influence metabolism of endogenous hormones, exogenous hormones used in menopausal hormone treatment and clinically prescribed drugs, including agents used in treating cancer, in these individuals. 12,15
In order to identify additional variants that are associated with premenopausal urinary hormone levels and to further characterise the associations at the CYP3A locus, we carried out a GWAS of urinary oestrone-3-glucuronide and pregnanediol-3-glucuronide levels, using mid-luteal-phase urine samples from women of European ancestry and followed up by testing for an association with breast cancer risk in cases and controls from the Breast Cancer Association Consortium (BCAC). To determine whether the CYP3A7*1C allele influences metabolism of exogenous hormones, we evaluated gene-environment interactions with menopausal hormone treatment for breast cancer risk, and to investigate whether adult expression of CYP3A7 impacts on agents used in treating cancer, we analysed associations with breast cancer-specific survival.

GWAS subjects
Generations Study. Full details of the Generations Study have been published previously. 16 Briefly, the Generations Study is a cohort study of more than 110,000 women from the UK general population, who were recruited beginning in 2003 and from whom detailed questionnaires and blood samples have been collected to investigate risk factors for breast cancer.
British Breast Cancer Study. Full details of the British Breast Cancer Study have been published previously. 17 Briefly, the British Breast Cancer Study is a national case-control study of breast cancer, in which cases of breast cancer were ascertained through the cancer registries of England and Scotland and through the National Cancer Research Network. Cases were asked to invite a healthy female first-degree relative with no history of cancer and a female friend or non-blood relative to participate in the study.
Mammography Oestrogens and Growth Factors study. Full details of the Mammography Oestrogens and Growth Factors study have been published previously. 18 Briefly, this is an observational study nested within a trial of annual mammography screening in young women that was conducted in Britain. 19 Approximately 54,000 women aged 39-41 years were randomly assigned to the intervention arm from 1991 to 1997 and offered annual mammograms until age 48 years. From 2000 to 2003, women in the intervention arm who were still participating in this trial were invited to participate in the Mammography Oestrogens and Growth Factors study; they were asked to provide a blood sample and complete a questionnaire detailing demographic, lifestyle and reproductive factors. More than 8000 women were enrolled in the study.
GWAS subjects were drawn from the Generations Study (N = 184), the British Breast Cancer Study (N = 284) and the Mammography Oestrogens and Growth Factors study (N = 109). To be eligible for the GWAS analysis of oestrone-3-glucuronide and pregnanediol-3-glucuronide levels, women had to be having regular menstrual cycles (i.e., their usual cycle length had to be between 21 and 35 days) and not using menopausal hormone therapy or oral contraceptives. All of the women included in this analysis reported being of European ancestry, and none had been diagnosed with breast cancer at the time of study recruitment.

Measurement of hormone levels
The protocol for collecting timed urine samples has been published previously. 18 Briefly, a woman's predicted date of ovulation was estimated from the date of the first day of her last menstrual period and her usual cycle length; ovulation was predicted to occur 14 days before the date of her next menstrual period. On this basis, women were asked to provide a series of early morning urine samples on pre-specified days of their cycle. For this analysis, the mid-luteal-phase sample, taken 7 days after the predicted day of ovulation, was used. To confirm that ovulation had occurred, consistent with the predicted date of ovulation, pregnanediol-3-glucuronide was measured; to take account of the differences in volume in early morning urine samples from different women, we measured creatinine, a waste product of normal muscle and protein metabolism that is released at a constant rate by the body. Samples in which pregnanediol-3-glucuronide, adjusted for creatinine levels, was >0.3 µmol/mol, were taken forward for measurement of creatinine-adjusted oestrone-3-glucuronide.
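The timing rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the inclusive 21-35 day check (borrowed from the eligibility criterion) and the example dates are assumptions, not the study's actual scheduling procedure.

```python
from datetime import date, timedelta


def predict_sampling_dates(last_period_start: date, usual_cycle_length_days: int) -> dict:
    """Predict ovulation and the mid-luteal sampling date from cycle information.

    Assumes the rule stated in the text: ovulation occurs 14 days before the
    next menstrual period, and the mid-luteal sample is taken 7 days later.
    """
    # Eligibility range quoted in the text; treating it as inclusive is an assumption.
    if not 21 <= usual_cycle_length_days <= 35:
        raise ValueError("usual cycle length outside the 21-35 day range")

    next_period = last_period_start + timedelta(days=usual_cycle_length_days)
    predicted_ovulation = next_period - timedelta(days=14)
    mid_luteal_sample = predicted_ovulation + timedelta(days=7)
    return {
        "next_period": next_period,
        "predicted_ovulation": predicted_ovulation,
        "mid_luteal_sample": mid_luteal_sample,
    }


# Hypothetical example: last period began on 1 March 2024, usual cycle of 28 days.
example = predict_sampling_dates(date(2024, 3, 1), 28)
```

For a 28-day cycle the mid-luteal sample falls on day 21 after the first day of the last period; for longer or shorter cycles the sampling day shifts by the same number of days as the cycle length.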
Pregnanediol-3-glucuronide and oestrone-3-glucuronide were analysed by commercial competitive ELISA Kits (Arbor Assays, Ann Arbor, USA) according to the manufacturer's instructions. For pregnanediol-3-glucuronide, the lower limit of detection was determined as 0.64 nmol/l; intra- and inter-assay coefficients of variation were 3.7% and 5.2%, respectively. For oestrone-3-glucuronide, the lower limit of detection was determined as 19.6 pmol/l; intra- and inter-assay coefficients of variation were 3.5% and 5.9%, respectively. Creatinine was determined using the creatininase/creatinase-specific enzymatic method 20 using a commercial kit (Alpha Laboratories Ltd, Eastleigh, UK) adapted for use on a Cobas Fara centrifugal analyser (Roche Diagnostics Ltd, Welwyn Garden City, UK). For within-run precision, the coefficient of variation was <3%, while for intra-batch precision, the coefficient of variation was <5%.
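As a minimal sketch of the creatinine adjustment and the ovulation check described above, the snippet below divides an analyte concentration by the creatinine concentration and applies the 0.3 µmol/mol threshold quoted in the text. The input units (nmol/l for the analyte, mmol/l for creatinine), the function names and the example values are assumptions made for illustration.

```python
# Threshold quoted in the text for creatinine-adjusted pregnanediol-3-glucuronide.
PDG_THRESHOLD_UMOL_PER_MOL = 0.3


def creatinine_adjusted(analyte_nmol_per_l: float, creatinine_mmol_per_l: float) -> float:
    """Return the analyte/creatinine ratio.

    With the assumed input units, nmol/mmol is numerically equal to µmol/mol.
    """
    if creatinine_mmol_per_l <= 0:
        raise ValueError("creatinine concentration must be positive")
    return analyte_nmol_per_l / creatinine_mmol_per_l


def ovulation_confirmed(pdg_nmol_per_l: float, creatinine_mmol_per_l: float) -> bool:
    """True when adjusted pregnanediol-3-glucuronide exceeds the threshold."""
    ratio = creatinine_adjusted(pdg_nmol_per_l, creatinine_mmol_per_l)
    return ratio > PDG_THRESHOLD_UMOL_PER_MOL


# Hypothetical sample values, not taken from the study.
sample_ratio = creatinine_adjusted(6.0, 10.0)
sample_ok = ovulation_confirmed(6.0, 10.0)
```

Dividing by creatinine expresses the hormone metabolite per unit of creatinine excreted, which is what makes dilute and concentrated early morning samples comparable.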

GWAS genotyping and quality control
DNA from 577 women was genotyped using Illumina Infinium OncoArray 500K BeadChips. We excluded samples for which <95% of SNPs were successfully genotyped. Identity-by-descent analysis was used to identify closely related individuals enabling exclusion of first-degree relatives. We applied SmartPCA 21 to our data and used phase II HapMap samples to identify individuals with non-Caucasian ancestry. The first two principal components for each individual were plotted, and k-means clustering was used to identify samples separated from the main Caucasian cluster. SNPs with call rates <95% were excluded, as were SNPs with minor allele frequency (MAF) <2% and those whose genotype frequencies deviated from Hardy-Weinberg proportions at P < 1 × 10⁻⁵. Following QC, 487,659 SNPs were successfully genotyped in 560 samples (Generations Study: N = 179, British Breast Cancer Study: N = 278 and Mammography Oestrogens and Growth Factors study: N = 103). Genome-wide imputation was performed using 1KGP Phase 3 reference data. Haplotypes were pre-phased using SHAPEIT2. 22 Imputation was performed using IMPUTE2. 23 Imputed SNPs with INFO scores <0.8 and MAFs <2% were excluded from subsequent analyses. After QC, a set of 7,792,694 successfully imputed SNPs were available for association analysis.
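The per-SNP filters listed above (call rate, MAF and Hardy-Weinberg proportions) can be illustrated with a small, dependency-free Python sketch. The thresholds are those quoted in the text; the genotype coding (0, 1 or 2 copies of one allele, `None` for a failed call), the function name and the use of a 1-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium are assumptions, since the text does not say which test was applied.

```python
import math


def snp_qc(genotypes, min_call_rate=0.95, min_maf=0.02, hwe_p_threshold=1e-5):
    """Apply call-rate, MAF and Hardy-Weinberg filters to a single SNP.

    genotypes: list of 0, 1 or 2 (copies of the counted allele); None = no call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes) if genotypes else 0.0
    if n == 0:
        return {"call_rate": call_rate, "maf": 0.0, "hwe_p": 1.0, "passes": False}

    observed = [called.count(0), called.count(1), called.count(2)]
    q = (observed[1] + 2 * observed[2]) / (2 * n)  # frequency of the counted allele
    p = 1.0 - q
    maf = min(p, q)

    # Chi-square goodness-of-fit test against Hardy-Weinberg proportions (1 df).
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df

    passes = (call_rate >= min_call_rate
              and maf >= min_maf
              and hwe_p >= hwe_p_threshold)
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passes": passes}


# Hypothetical SNP in exact Hardy-Weinberg proportions with allele frequency 0.1.
in_hwe = snp_qc([0] * 81 + [1] * 18 + [2] * 1)
# Hypothetical SNP with only heterozygotes, a gross departure from HWE.
all_het = snp_qc([1] * 100)
```

In practice these filters are applied by dedicated genetics software across hundreds of thousands of SNPs; the sketch only shows what each threshold is measuring.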

Genotyping rs45446698 and sequencing of the CYP3A7*1C allele
For the 119 Generations Study women who were not included in the GWAS but for whom progesterone was subsequently measured, rs45446698 was genotyped by TaqMan (Thermo Fisher Scientific Ltd, UK). The call rate was 100% with 100% concordance between 12 duplicates. To confirm that rs45446698 tags the CYP3A7*1C allele, we sequenced this region in 31 women selected on the basis of their rs45446698 genotype (9 common homozygotes and 22 carriers). A 370-bp DNA region (chr7:99,332,745-99,333,114; GRCh37/hg19) was amplified using Phusion High-Fidelity DNA Polymerase (New England Biolabs, UK) and primers CCATAGAGACAAGAGGAGA (forward) and CTGAGTCTTTTTTTCAGCAGC (reverse). The PCR product was purified using QIAquick Gel Extraction Kit (Qiagen) and Sanger sequenced using a commercially available service (Eurofins Genomics, Germany).

Statistical analysis of GWAS data
Tests of association between SNP genotypes and log-transformed creatinine-adjusted oestrone-3-glucuronide and pregnanediol-3-glucuronide adjusted for study were performed using linear regression in SNPTEST v2.5. 24 Test statistic inflation was assessed visually using a QQ plot (Supplementary Fig. S1) and formally by calculating the inflation factor, λ. There was no evidence of systematic test statistic inflation (λ = 1.01 for both oestrone-3-glucuronide and pregnanediol-3-glucuronide). For the single significant association (rs45446698), we used multivariate linear regression to adjust for potential confounders: age at menarche (<12, 12, 13, 14 and >14 years), age at collection of urine samples

Follow-up genotyping of rs45446698
Genotype data for rs45446698 were generated as part of iCOGS 25 and OncoArray. 26 Full details of SNP selection, array design, genotyping and post-genotyping QC have been published. 25,26 Participants genotyped in both collaborations were excluded from the iCOGS data sets with the exception of the GxE interaction analysis of menopausal hormone treatment, for which five studies (CPS-II, PBCS, UKBGS, MCCS and pKARMA) were excluded from OncoArray, rather than iCOGS, in order to maximise the number of studies with sufficient cases and controls for analysis. We excluded cases with breast tumours of unknown invasiveness, or in situ disease, and those for whom age at diagnosis was not known.
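The inflation factor λ used in the GWAS analysis is conventionally computed as the median of the observed 1-degree-of-freedom chi-square statistics divided by the median of the chi-square distribution under the null (about 0.455). The sketch below follows that common definition, starting from two-sided P values; it is an assumption that the authors computed λ in exactly this way, and the function name and example P values are hypothetical.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(p_values):
    """Estimate lambda from two-sided P values (each must lie in (0, 1])."""
    standard_normal = NormalDist()
    # A two-sided P value p corresponds to |z| = -inv_cdf(p / 2); chi2 = z ** 2.
    chi2_stats = [standard_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical, perfectly uniform P values: lambda should be very close to 1.
uniform_p = [(i - 0.5) / 999 for i in range(1, 1000)]
lam = genomic_inflation_factor(uniform_p)
```

Values of λ close to 1, such as the 1.01 reported in the text, indicate that the bulk of the test statistics behave as expected under the null, so population stratification or other systematic bias is unlikely to be driving the results.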

Statistical analysis of rs45446698 and breast cancer risk
Due to the low MAF of rs45446698 (3.7%, 0.03% and 0.4% in individuals of European, Asian and African ancestry, respectively), we restricted our analyses to individuals of European ancestry and excluded studies with <50 cases or controls; there were 35 (iCOGS) and 56 (OncoArray) studies for the current case-control analysis (Supplementary Tables S1 and S2).
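To give a feel for what a MAF of 3.7% implies, the expected genotype frequencies under Hardy-Weinberg equilibrium can be worked out directly. This is a back-of-the-envelope sketch, assuming Hardy-Weinberg proportions hold in the study population; the function name is hypothetical and the resulting figures are expectations, not counts reported by the authors.

```python
def expected_genotype_frequencies(maf: float) -> dict:
    """Expected genotype frequencies under Hardy-Weinberg equilibrium."""
    q = maf          # minor allele frequency
    p = 1.0 - q      # major allele frequency
    return {
        "common_homozygote": p * p,
        "heterozygote": 2.0 * p * q,
        "rare_homozygote": q * q,
    }


# MAF in individuals of European ancestry, as quoted in the text.
freqs = expected_genotype_frequencies(0.037)
carrier_frequency = freqs["heterozygote"] + freqs["rare_homozygote"]
# carrier_frequency is about 0.073 and freqs["rare_homozygote"] about 0.0014,
# i.e. roughly 7 carriers per 100 women but only about 1 rare homozygote per 730.
```

With so few expected rare homozygotes, a per-genotype estimate for that group would be very imprecise, which is consistent with analysing carriers as a single group.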
We combined heterozygote and rare homozygote genotypes and estimated carrier ORs using logistic regression, adjusted for 15 principal components 25,26 and study. Stratum-specific carrier ORs were estimated for a set of pre-specified prognostic variables (oestrogen receptor (ER), progesterone receptor (PR), HER2, grade and stage). We excluded studies with <50 cases or controls in any individual stratum from stratified analyses. Interactions were assessed based on case-only models (ER, PR, HER2, stage and grade). In the subset of studies for which covariate data were available, we used multivariable logistic regression to adjust for reference age (defined as age at diagnosis for cases and age at interview for controls), age at menarche, BMI and parity (as above). Finally, we stratified our analyses on menopausal status at reference age. When menopausal status was missing, the reference age was used as a surrogate (<54 premenopausal and ≥54 postmenopausal). To select the reference age that most accurately captured menopausal status in this group of studies, we generated AUC curves based on women who had reported natural menopause with different reference age cut-offs (50-56 years); on this basis, a reference age of 54 was selected. P values were estimated using likelihood ratio tests with one degree of freedom. All P values reported, for all analyses, are two-sided. Statistical analyses were performed using STATA version 11.0 (StataCorp, College Station, TX, USA).
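Two of the coding rules in the paragraph above are simple enough to write down explicitly: carrier status (heterozygotes and rare homozygotes combined) and the age-based surrogate for missing menopausal status. The sketch below is illustrative only; the function names, the string labels and the use of `None` for a missing status are assumptions.

```python
from typing import Optional


def is_carrier(minor_allele_count: int) -> bool:
    """Combine heterozygotes (1 copy) and rare homozygotes (2 copies) as carriers."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor allele count must be 0, 1 or 2")
    return minor_allele_count >= 1


def menopausal_status(reported_status: Optional[str],
                      reference_age: float,
                      cutoff_age: float = 54.0) -> str:
    """Use the reported status when available, otherwise the age surrogate.

    The 54-year cut-off is the one selected in the text; reference age is age at
    diagnosis for cases and age at interview for controls.
    """
    if reported_status is not None:
        return reported_status
    return "premenopausal" if reference_age < cutoff_age else "postmenopausal"
```

For example, a woman with missing menopausal status and a reference age of 53 would be analysed as premenopausal, while one aged exactly 54 would be analysed as postmenopausal.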

Statistical analysis of gene-environment interaction (GxE) with menopausal hormone treatment
Postmenopausal women from 13 (iCOGS) and 27 (OncoArray) studies provided the data on menopausal hormone treatment. Menopausal status and postmenopausal hormone use were derived as of the reference date (defined as date of diagnosis for cases and interview for controls); women with unknown age at reference date were excluded from this analysis. All analyses were conducted only in postmenopausal women. Carrier ORs for breast cancer risk were estimated using logistic regression stratified by current use of menopausal hormone treatment, oestrogen-progesterone therapy and oestrogen-only therapy, respectively. Analyses were adjusted for study, ten principal components, reference age, age at menarche, parity, BMI, former use of menopausal hormone treatment and use of any menopausal hormone treatment preparation other than the one of interest in analyses of current use of menopausal hormone treatment by type. To account for potential heterogeneity of the main effects of menopausal hormone treatment/oestrogen-progesterone therapy/oestrogen-only therapy by study design, we included an interaction term between the risk factor of interest and an indicator variable for study design (prospective cohorts/population-based case-control studies, non-population-based studies).
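A likelihood ratio test with one degree of freedom, of the kind used for the P values and interaction tests in these analyses, compares the maximised log-likelihoods of two nested models that differ by a single parameter (for example, a model with and without a genotype-by-treatment interaction term). The following sketch shows only the final step of that calculation; the log-likelihood values are hypothetical and would in practice come from the fitted logistic regression models.

```python
import math


def likelihood_ratio_test_1df(loglik_reduced: float, loglik_full: float):
    """Return (test statistic, P value) for two nested models differing by one parameter."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    statistic = max(statistic, 0.0)  # guard against tiny negative values from rounding
    # Upper tail of the chi-square distribution with 1 df, via the error function.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods from models without and with an interaction term.
stat, p = likelihood_ratio_test_1df(loglik_reduced=-10234.50, loglik_full=-10233.46)
```

The test asks whether adding the extra term improves the fit by more than would be expected by chance; a small improvement in log-likelihood, as in this made-up example, gives a non-significant P value.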
Interactions between rs45446698 and current use of menopausal hormone treatment, oestrogen-progesterone therapy and oestrogen-only therapy were assessed using likelihood ratio tests, based on logistic regression models with and without interaction between rs45446698 and current use of menopausal hormone

Statistical analysis of breast cancer-specific survival in cases
In total, 38 (iCOGS) and 63 (OncoArray) studies provided follow-up data for analysis of breast cancer-specific survival. Analysis of outcome was restricted to patients who were at least 18 years old at diagnosis and for whom vital status at, and date of, the last follow-up were known. Patients ascertained for a second tumour were excluded. Time-to-event was calculated from the date of diagnosis. For prevalent cases with study entry after diagnosis, left truncation was applied, i.e., follow-up started at the date of study entry. 27 Follow-up was right-censored at the date of death (death known to be due to breast cancer considered an event), the date the patient was last known to be alive if death did not occur or at 10 years after diagnosis, whichever came first. Follow-up was censored at 10 years due to limited data availability after this time.
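The follow-up rules above (time measured from diagnosis, left truncation at study entry, censoring at 10 years) can be made concrete with a short sketch that turns a patient's dates into the entry time, exit time and event indicator that a Cox model with delayed entry would use. The function name, the conversion of days to years using 365.25 and the handling of patients with no time at risk are assumptions made for illustration.

```python
from datetime import date

DAYS_PER_YEAR = 365.25          # assumed conversion factor
MAX_FOLLOW_UP_YEARS = 10.0      # censoring time stated in the text


def follow_up_interval(diagnosis: date, study_entry: date,
                       last_follow_up: date, died_of_breast_cancer: bool):
    """Return (entry_years, exit_years, event), or None if there is no time at risk.

    last_follow_up is the date of death, or the date last known alive.
    """
    # Left truncation: time at risk starts at study entry if that is after diagnosis.
    entry_years = max((study_entry - diagnosis).days, 0) / DAYS_PER_YEAR
    exit_years = (last_follow_up - diagnosis).days / DAYS_PER_YEAR
    event = died_of_breast_cancer

    # Administrative censoring at 10 years after diagnosis.
    if exit_years > MAX_FOLLOW_UP_YEARS:
        exit_years = MAX_FOLLOW_UP_YEARS
        event = False

    if exit_years <= entry_years:
        return None  # e.g. entered the study more than 10 years after diagnosis
    return entry_years, exit_years, event


# Hypothetical prevalent case: entered 2 years after diagnosis, died after 15 years.
prevalent = follow_up_interval(date(2000, 1, 1), date(2002, 1, 1), date(2015, 1, 1), True)
# Hypothetical incident case: died of breast cancer 5 years after diagnosis.
incident = follow_up_interval(date(2000, 1, 1), date(1999, 6, 1), date(2005, 1, 1), True)
```

In the first example the death falls after the 10-year cut-off, so the patient contributes time at risk from year 2 to year 10 and is censored; in the second, the death counts as an event at about 5 years.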
Hazard ratios (HR) for association of rs45446698 genotype with breast cancer-specific survival were estimated using Cox proportional hazards regression implemented in the R package survival (v. 2.43-3) stratified by country. iCOGS and OncoArray estimates were combined using an inverse-variance-weighted meta-analysis.
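An inverse-variance-weighted meta-analysis of two estimates, as used to combine the iCOGS and OncoArray results, weights each log hazard ratio by the reciprocal of its variance. The sketch below shows a fixed-effect version; the text does not state whether a fixed- or random-effects model was used, so that choice, the function names and the input numbers are assumptions.

```python
import math


def inverse_variance_meta(estimates):
    """Pool (log_hr, standard_error) pairs with inverse-variance weights (fixed effect)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_log_hr = sum(w * log_hr for w, (log_hr, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_log_hr, pooled_se


def hazard_ratio_with_ci(log_hr: float, se: float, z: float = 1.959964):
    """Convert a log hazard ratio and its standard error to an HR with a 95% CI."""
    return (math.exp(log_hr),
            math.exp(log_hr - z * se),
            math.exp(log_hr + z * se))


# Hypothetical log hazard ratios and standard errors for two data sets.
pooled_log_hr, pooled_se = inverse_variance_meta([(0.0, 0.1), (0.2, 0.1)])
hr, ci_low, ci_high = hazard_ratio_with_ci(pooled_log_hr, pooled_se)
```

Because the weights are inversely proportional to the variances, the larger and more precise data set dominates the pooled estimate; with equal standard errors, as in this made-up example, the pooled log hazard ratio is simply the average of the two.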
To test for the association between rs45446698 and breast cancer risk, we combined genotype data from 56 studies (OncoArray; Supplementary Table S1) with imputed data from 35 studies (iCOGS; Supplementary Table S2) in a total of 90,916 cases and 89,893 controls of European ancestry. The rs45446698-C allele was associated with a reduction in breast cancer risk (OR = 0.94, 95% CI 0.91-0.98, P = 0.002, Table 2) with no evidence of heterogeneity between data sets (P_het = 0.58). There was no evidence that the reduction in breast cancer risk associated with being a rs45446698-C carrier differed according to HER2 status, tumour grade or stage (Supplementary Table S4). Stratifying by ER status, the association was limited to ER-positive (ER+) breast cancers (OR = 0.91, 95% CI 0.87-0.96, P = 0.0002 and OR = 1.03, 95% CI 0.95-1.11, P = 0.50 for ER+ and ER− cancers, respectively; P_int = 0.03, Table 2). Stratifying by ER and PR status, the association was limited to ER+/PR+ cancers (ER+/PR+: OR = 0.86, 95% CI 0.82-0.91, P = 6.9 × 10⁻⁸; ER+/PR−: OR = 1.06, 95% CI 0.96-1.16, P = 0.25; P_int = 0.0001, Table 2). Adjusting for demographic and reproductive factors in the subset of studies for which these additional covariates were available did not alter this association (Supplementary Table S5). Defining reference age as age at diagnosis for cases and age at interview for controls and using this as a proxy for menopausal status (<54 or ≥54 years), we further stratified our analysis on menopausal status; there was little evidence that the association with ER+/PR+ breast cancer differed by menopausal status (premenopausal OR = 0.94, 95% CI 0.84-1.06, P = 0.31, postmenopausal OR = 0.86, 95% CI 0.80-0.93, P = 0.0001, P_het = 0.28).
On the assumption that genetic variants that influence metabolism of endogenous hormones 5 may also impact on metabolism of exogenous hormones, we investigated whether menopausal hormone treatment modified the association between rs45446698 genotype and ER+/PR+ breast cancer risk in 17,831 postmenopausal breast cancer cases and 40,437 postmenopausal controls. The rs45446698-C carrier OR was lower (i.e., more protective) in current users of any menopausal hormone treatment but particularly in those who used combined oestrogen-progesterone therapy (current users: OR = 0.68, 95% CI 0.52-0.90, P = 0.007; never users: OR = 0.85, 95% CI 0.76-0.95, P = 0.005, Table 3). This difference was not, however, statistically significant (P_int = 0.15, Table 3).
Finally, to determine whether rs45446698 genotype could affect patient outcome by influencing metabolism of cytotoxic agents that are CYP3A substrates, 15 we tested for the association between rs45446698 genotype and 10-year breast cancer-specific survival in 91,539 breast cancer cases from 71 studies for whom follow-up data were available. There was no overall association between rs45446698 genotype and breast cancer-specific survival (HR = 0.99, 95% CI 0.91-1.09, P = 0.90, Table 4), nor was there any evidence of an association in analyses stratified by tumour characteristics (Supplementary Table S6). Stratifying by treatment regimen, we found no evidence that rs45446698 genotype influenced outcome in cases who were treated with a hormonal agent (i.e., tamoxifen or an aromatase inhibitor, Table 4). There was, however, some evidence that in cases who were treated with a taxane, carriers of the rs45446698-C allele had reduced breast cancer-specific survival compared with non-carriers (HR = 1.46, 95% CI 1.08-1.97, P = 0.01, Table 4).

DISCUSSION
The present GWAS identified a single, highly significant association between the CYP3A7*1C allele (tagged by rs45446698) and premenopausal urinary oestrone-3-glucuronide. This finding alone is not novel; we have previously reported an association between the CYP3A7*1C allele, parent oestrogens and several oestrogen metabolites. 5 What we have demonstrated for the first time is the extent to which this signal dominates the genetic architecture of hormone levels in premenopausal women of Northern European ancestry (Fig. 1; rs45446698 P = 3.1 × 10⁻¹⁸, all other signals P > 1 × 10⁻⁸) and we estimate that 11.5% of the variance in urinary oestrone-3-glucuronide levels is explained by this one allele. Two previous GWAS of circulating oestrogen levels have been published; neither reported an association with the CYP3A locus. 9,10 This lack of replication may be explained by our choice of study population. The first GWAS 9 was conducted in postmenopausal women (N = 1623) participating in the Nurses' Health Study and the Sisters in Breast Screening Study. The second was conducted within the Twins UK study (N = 2913) and included men as well as pre-, peri- and postmenopausal women. A strength of our GWAS is that all of the women were premenopausal and had regular menstrual cycles; circulating levels of oestrogens in premenopausal women are much higher than those in postmenopausal women. 4,28 For each woman, we assayed a single urine sample taken in the mid-luteal phase of her cycle at exactly 7 days after her predicted day of ovulation. Thus, although our study is relatively small (N = 560), we may have had greater power to detect an association at the CYP3A locus than previous studies due to the very homogeneous premenopausal study population that we selected. Our findings also demonstrate the potential significance of the choice of hormone or hormone metabolite; both of the previous GWAS assayed plasma oestradiol.
In a targeted analysis of urinary oestrogen metabolites, we have previously shown that the association between the CYP3A7*1C allele and oestrone (45.3% lower levels in carriers, P = 0.0005) is more pronounced than the association with oestradiol (26.7% lower levels, P = 0.07) with the implication that measuring urinary oestrone-3-glucuronide (rather than plasma oestradiol) may have contributed to our positive findings. Similarly, by measuring pregnanediol-3-glucuronide and progesterone in premenopausal women from the Generations Study, we were able to demonstrate a significant association of rs45446698 with progesterone (27% reduction, P = 0.001) in the absence of an association with pregnanediol-3-glucuronide (6% reduction, P = 0.61).
The fact that we measured a urinary oestrogen metabolite (oestrone-3-glucuronide) rather than serum or plasma oestrogens (oestradiol or oestrone) limits the interpretation of our results in terms of a causal association. Estimates of the association between circulating oestrogens and breast cancer risk are based on measurements of hormone levels in plasma or serum, 3 and in a recent study that measured luteal-phase serum oestrogens and urinary oestrogen metabolites in 249 premenopausal women, 29 serum oestradiol and oestrone were only moderately correlated with urinary oestrone (serum oestradiol: r = 0.39, serum oestrone: r = 0.48). Our analysis of rs45446698 genotypes in 90,916 cases and 89,893 controls from BCAC, however, provides robust evidence of an association of the CYP3A7*1C allele with breast cancer risk overall (OR = 0.94, P = 0.002) and a more pronounced protective effect on ER+/PR+ breast cancers (OR = 0.86, P = 6.9 × 10⁻⁸). The specificity of this association (comparing ER+/PR− with ER+/PR+ cancers, P_het = 0.001) and our replication of Ruth and colleagues' report of a signal at the CYP3A locus in their analysis of circulating progesterone levels 10 raise the possibility that premenopausal progesterone levels might influence risk of ER+/PR+ breast cancers. This would be in contrast to the findings from Key et al. who reported no evidence of an association between premenopausal progesterone levels and breast cancer risk overall and no heterogeneity in estimates stratified by PR status. 3 However, the number of cases of PR+ (N = 158) and PR− (N = 61) breast cancer was small, and this analysis may have lacked power to detect modest associations in subgroups of cancers. Alternatively, the association of rs45446698 genotype with ER+/PR+ breast cancer risk, specifically, may be due to the fact that PR is a marker for an intact oestrogen signalling pathway 30 confirming a direct link between the levels of oestrogen (or oestrogen signalling) and proliferation in this subgroup of cancers.
Our analysis of the CYP3A7*1C allele, menopausal hormone treatment and breast cancer risk was inconclusive; while the carrier ORs were consistent with a greater protective effect of this allele in women taking exogenous hormones, particularly oestrogen-progesterone therapy, none of the interactions was statistically significant. Overall, there were 14,119 ER+/PR+ breast cancer cases and 32,418 controls for this subgroup analysis, but for what was, arguably, the most pertinent subgroup (i.e., current oestrogen-progesterone therapy use), the number of cases who were current users was relatively small (CYP3A7*1C carriers N = 107, non-carriers N = 1498) and power was limited to detect modest interactions. There are limitations to this analysis; we focussed on current menopausal hormone treatment use (adjusted for past use) as it is for current use that the association with breast cancer risk is the strongest, 31 but we did not have information on dose, duration or the formulation that was used.
Table 3 footnotes: MHT, menopausal hormone treatment; EPT, oestrogen-progesterone therapy; ET, oestrogen-only therapy; P_1, test of H_0: no association between rs45446698 and ER+/PR+ breast cancer risk; P_int, test of H_0: no difference between stratum-specific estimates; NK, not known. Studies with fewer than 50 cases in any stratum were excluded from the stratified analyses, leaving 13 studies for analysis in iCOGS data and 27 studies for analysis in OncoArray data. All models are adjusted for reference age, study, ten principal components and former use of MHT. Additionally, when stratified by EPT or ET, models are adjusted for use of any type of MHT other than the one of interest.
Table 4 footnotes: In total, 38 studies from iCOGS and 63 studies from OncoArray provided follow-up data for analysis of breast cancer-specific survival. The results were censored at 10 years after diagnosis. HR for association of rs45446698 genotype with breast cancer-specific survival was estimated using Cox proportional hazards regression stratified by country. *To test for statistical interaction between rs45446698 genotype and treatment with a taxane, we additionally compared the association in cases who received chemotherapy including a taxane to that in cases who received chemotherapy that did not include a taxane (P_int = 0.02; the association in the latter group was in the opposite direction and not significant: HR = 0.88, 95% CI 0.67-1.15, P = 0.34).
CYP3A7*1C allele: linking premenopausal oestrone and. . . N Johnson et al.
Finally, we found no association between CYP3A7*1C carrier status and survival in patients treated with tamoxifen, a known CYP3A substrate. This may reflect the fact that compared to CYP3A4, CYP3A7 is a poor metaboliser of tamoxifen, 32 or that standard doses of tamoxifen achieve high levels of oestrogen receptor saturation. 33 There was some evidence that breast cancer-specific survival was reduced in CYP3A7*1C carriers who were treated with a taxane, compared with non-carriers (P = 0.01); this may, however, be a chance finding given the number of comparisons that were tested.
In conclusion, we present strong evidence that the CYP3A7*1C allele impacts on the metabolism of endogenous hormones, which, in turn, reduces the risk of hormone receptor-positive breast cancer in carriers. Optimal strategies for breast cancer prevention in women at high risk of breast cancer and in the general population are an area of active research. In this context, CYP3A7*1C carriers represent a naturally occurring cohort in which the effects of reduced exposure to endogenous oestrogens and progesterones throughout a woman's premenopausal years can be further investigated. Our results regarding the impact of CYP3A7*1C carrier status on exogenous hormones and chemotherapeutic agents are preliminary but warrant further investigation, preferably in the setting of randomised trials.

NBCS COLLABORATORS
Vigo, Spain. BSUCH thanks Peter Bugert, Medical Faculty Mannheim. CBCS thanks study participants, co-investigators, collaborators and staff of the Canadian Breast Cancer Study, and project coordinators Agnes Lai and Celine Morissette. CCGP thanks Styliani Apostolaki, Anna Margiolaki, Georgios Nintos, Maria Perraki, Georgia Saloustrou, Georgia Sevastaki, Konstantinos Pompodakis. CGPS thanks staff and participants of the Copenhagen General Population Study. For the excellent technical assistance: Dorthe Uldall Andersen, Maria Birna Arnadottir, Anne Bank, Dorthe Kjeldgård Hansen. The Danish Cancer Biobank is acknowledged for providing infrastructure for the collection of blood samples for the cases. CNIO-BCS thanks Guillermo Pita, Charo Alonso, Nuria Álvarez, Pilar Zamora, Primitiva Menendez, the Human Genotyping-CEGEN Unit (CNIO). Investigators from the CPS-II cohort thank the participants and Study Management Group for their invaluable contributions to this research. They also acknowledge the contribution to this study from central Cancer Genetic Test Laboratory. The MCCS was made possible by the contribution of many people, including the original investigators, the teams that recruited the participants and continue working on follow-up, and the many thousands of Melbourne residents who continue to participate in the study. We thank the coordinators, the research staff and especially the MMHS participants for their  NHS2 for their valuable contributions as well as  the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA,  ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN,  TX,

ADDITIONAL INFORMATION
Ethics approval and consent to participate Collection of blood samples, urine samples and questionnaire information was undertaken with written informed consent and relevant ethical review board approval in accordance with the tenets of the Declaration of Helsinki (Supplementary Table S7).

Consent to publish Not applicable.
Data availability GWAS data and the complete dataset for follow-up genotyping will not be made publicly available due to restraints imposed by the ethics committees of individual studies; requests for data can be made to the corresponding author (GWAS data) or the Data Access Coordination Committee (follow-up genotyping data) of BCAC (http://bcac.ccge.medschl.cam.ac.uk/). Summary results for all variants genotyped by BCAC (including rs45446698) are available at http://bcac.ccge.medschl.cam.ac.uk/.