CYP3A7*1C allele is associated with reduced levels of 2-hydroxylation pathway oestrogen metabolites

Background: Endogenous sex hormones are well-established risk factors for breast cancer; the contribution of specific oestrogen metabolites (EMs) and/or ratios of specific EMs is less clear. We have previously identified a CYP3A7*1C allele that is associated with lower urinary oestrone (E1) levels in premenopausal women. The purpose of this analysis was to determine whether this allele was associated with specific pathway EMs. Methods: We successfully measured 12 EMs in mid-follicular phase urine samples from 30 CYP3A7*1C carriers and 30 non-carriers using HPLC-MS/MS. Results: In addition to having lower urinary E1 levels, CYP3A7*1C carriers had significantly lower levels of four of the 2-hydroxylation pathway EMs that we measured (2-hydroxyestrone, P=1.1 × 10−12; 2-hydroxyestradiol, P=2.7 × 10−7; 2-methoxyestrone, P=1.9 × 10−12; and 2-methoxyestradiol, P=0.0009). By contrast, 16α-hydroxylation pathway EMs were slightly higher in carriers and significantly so for 17-epiestriol (P=0.002). Conclusions: The CYP3A7*1C allele is associated with lower urinary E1 levels, a more pronounced reduction in 2-hydroxylation pathway EMs and a lower ratio of 2-hydroxylation:16α-hydroxylation EMs in premenopausal women. To further characterise the association between parent oestrogens, EMs and subsequent risk of breast cancer, characterisation of additional genetic variants that influence oestrogen metabolism and large prospective studies of a broad spectrum of EMs will be required.

Endogenous sex hormones are well-established risk factors for breast cancer. Pooled analyses of data from prospective studies have estimated that a doubling of circulating estradiol (E2), free E2 or E1 is associated with a 20-30% or 30-50% increase in breast cancer risk in pre- and postmenopausal women, respectively (Key et al, 2002, 2013). The contribution of specific oestrogen metabolites (EMs) to breast cancer risk is less clear. Briefly, interconversion between the parent oestrogens, E2 and E1, occurs by reversible oxidation at the 17α-position of the steroid ring; conversion of parent oestrogens to EMs is by irreversible hydroxylation at the 2-, 4- or 16-positions (Figure 1; Badawi et al, 2001; Tsuchiya et al, 2005; Samavat and Kurzer, 2015). A recent review summarising evidence from four prospective studies of oestrogen metabolism and breast cancer risk concluded that there was consistent evidence that enhanced 2-hydroxylation was associated with a reduction in risk of breast cancer that was independent of the strong positive associations of unconjugated parent oestrogens (E2 and E1) with breast cancer risk (Ziegler et al, 2015).
In an analysis of single-nucleotide polymorphisms (SNPs) tagging genes that are involved in oestrogen synthesis and metabolism, we identified one SNP, rs10273424, which was associated with a 22% reduction in levels of urinary E1 glucuronide (E1G) in premenopausal women (Johnson et al, 2012). rs10273424 maps to the cytochrome P450 3A (CYP3A) gene cluster at 7q22.1; the CYP3A genes (CYP3A5, CYP3A7 and CYP3A4) encode enzymes that catalyse the oxidative metabolism of a wide range of exogenous and endogenous substrates, including parent oestrogens (E1 and E2; Figure 1). The metabolic capacity of the CYP3A enzymes differs depending on the substrate (Williams et al, 2002); with respect to oestrogen metabolism specifically, 2-hydroxylation of E1 to 2-OHE1 is catalysed by CYP3A4, 4-hydroxylation to 4-OHE1 is catalysed by CYP3A4 and CYP3A5, and 16α-hydroxylation to 16α-OHE1 is catalysed by CYP3A4, CYP3A5 and CYP3A7 (Figure 1; Badawi et al, 2001; Lee et al, 2003; Tsuchiya et al, 2005). Fine-mapping of the 7q22.1 association signal for urinary E1G levels implicated the CYP3A7*1C allele as the causal allele (Johnson et al, 2016). This allele, which comprises seven highly correlated single base changes mapping to the CYP3A7 promoter, results in expression of the fetal CYP3A7 gene in adult carriers of the CYP3A7*1C allele (Gonzalez, 1988; Kuehl et al, 2001; Burk et al, 2002).
The purpose of this current analysis was to determine whether, in addition to the association between CYP3A7*1C carrier status and lower urinary E1G levels, the CYP3A7*1C allele was associated with a reduction in levels of specific pathway EMs.

MATERIALS AND METHODS
Study population. The study population from which the women we included in this analysis were drawn has been described previously (Johnson et al, 2012). Briefly, they comprised 729 premenopausal women who were first-degree relatives and friends of breast cancer cases participating in the British Breast Cancer study (Johnson et al, 2005) or participants in the intervention arm of a trial of annual mammographic screening in young women conducted in Britain (Mammography Oestrogens and Growth Factors Study; Walker et al, 2009). To be eligible, women had to be having regular menstrual cycles, not using hormone replacement therapy or oral contraceptives and not to have been diagnosed with breast cancer at recruitment to the study. All women were of self-reported White British ancestry. To be included in the original analysis and this subsequent analysis, women had to have provided serial urine samples (six follicular phase and one luteal phase) at pre-specified days of their menstrual cycle for measurement of creatinine-adjusted urinary E1G using an in-house enzyme-linked immunosorbent assay (Johnson et al, 2012). To maximise the statistical efficiency of this analysis of the CYP3A7*1C allele, which has a minor allele frequency (MAF) of just 4% in Northern European populations (Johnson et al, 2016), we selected 60 women on the basis of genotype: a random sample of 30 CYP3A7*1C carriers and 30 CYP3A7*1C non-carriers. We further selected the two periovulatory samples (samples three and four of the six sequential follicular phase samples) on the basis that oestrogen levels would be at their peak in these samples and this would maximise any differences in levels between CYP3A7*1C carriers and non-carriers. To minimise random variation, we used the average of these two sequential samples as our outcome variable.
Figure 1 caption: Enzymes (cytochrome P450 (CYPs) and catechol O-methyltransferase (COMT)) involved in oestrogen metabolism are in red. E1G, measured in our previous analysis (Johnson et al, 2012), is an E1 conjugate present in urine. The E1 measured in this analysis (after hydrolysis of glucuronide and sulphate conjugates in the first step of the LC-MS/MS protocol) is highly correlated with E1G (Spearman's correlation, r = 0.70, P<0.0001).
Ethics. The study was conducted in accordance with the tenets of the Declaration of Helsinki and all participants provided written informed consent.
Genotyping. Genotyping of the tag SNP rs45467892, which is perfectly correlated with the CYP3A7*1C allele, has been described previously (Johnson et al, 2016). Briefly, genotyping was by TaqMan (Life Technologies, Paisley, UK). The call rate was 96.9% and concordance between duplicate samples was 100%.
HPLC-MS/MS analysis. LC-MS/MS analysis was carried out for 14 EMs using the method of Xu et al (2005, 2007). We were unable to measure one of the 15 EMs measured by Xu et al (2005, 2007), 16-epiestriol, as there was no commercially available standard for this EM. Briefly, two aliquots of frozen urine per woman were sent to the Mass Spectroscopy Facility for Quantitative Analysis, Faculty of Natural Science, Imperial College, London, for analysis. Hydrolysis of glucuronide- and sulphate-conjugated EMs to form free EMs was carried out by mixing 500 μl of freshly thawed urine with 20 μl of internal standards (comprising 2 ng of each of five deuterium-labelled oestrogen metabolites (d-EMs): 17β-estradiol-d4, estriol-d3, 2-hydroxy-17β-estradiol-d5, 2-methoxy-17β-estradiol-d5 and 16-epiestriol-d3; Qmx Laboratories Ltd, Dunmow, UK) and 500 μl of freshly prepared enzymatic hydrolysis buffer (100 μl of β-glucuronidase from Helix pomatia (Type H-2; Sigma-Aldrich, St Louis, MO, USA) in 10 ml 0.15 M sodium acetate buffer, pH 4.6, containing 2 mg of ascorbic acid). Samples were incubated at 37 °C overnight before extraction with dichloromethane and dansyl chloride derivatisation. The final derivatised samples (200 μl) were transferred to HPLC vials. Urine samples were randomly allocated to one of six analytical batches. Each batch contained 1 matrix blank, 1 matrix blank spiked with internal standards, 8-point calibration standards, 3 quality control (QC) samples and 20 urine samples. QC samples were prepared using charcoal-stripped human urine (Golden West Biological Inc., Temecula, CA, USA) with no detectable levels of oestrogen metabolites, spiked with all 14 EMs at a concentration of 2 ng ml−1.
Samples (10 μl) were then analysed by HPLC-electrospray ionisation/MS-MS using an Agilent 1100 HPLC coupled to an SCIEX QTRAP 6500 mass spectrometer (AB Sciex LLC, Framingham, MA, USA) running in multiple reaction monitoring (MRM) mode. Chromatographic separation was carried out on a Phenomenex Synergi Hydro-RP 4 μm × 150 mm × 2.0 mm column at 40 °C. The solvent gradient used was 35% A (99.8% H2O: 0.2% CHOOH) to 85% B (99.8% MeOH: 0.2% CHOOH) over 60 min. Solvent B was held at 85% for 4 min, then the solvent returned to 35% A for 10 min equilibration before the next injection. The solvent flow rate was 250 μl min−1. The ESI source (type: Ion Drive Turbo V) parameters were set to the following: TEM 500 °C, Curtain Gas 45 psi, GS1 50 psi, GS2 60 psi; the MS parameters were CAD gas Medium, DP 80, EP 10, CE 45 and CXP 5. A scheduled detection method was used and the MRM detection window was 120 s with a target scan time of 1 s. Transitions and retention times are listed in Supplementary Table 1. Analyst 1.6.2 software (AB Sciex LLC) was used for quantification of the EMs. Peak quantifications were carried out using d-EM internal standards and constructing matrix-matched (charcoal-stripped human urine) eight-point calibration curves for each of the six analytical batches. The calibration curves were evaluated by plotting the peak area ratios of dansyl-EM/d-dansyl-EM against concentration (ng ml−1) of EMs in the standard and using linear regression with 1/X weighting to fit the data. Using this linear function, the amounts of EMs in the urine sample were interpolated.
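The calibration step described above (linear regression of peak-area ratio on standard concentration with 1/X weighting, followed by interpolation) can be sketched as follows. This is a minimal illustration, not the Analyst software's implementation: the function names, the eight standard concentrations and the peak-area ratios are all hypothetical values chosen so that the fit is easy to check by hand.

```python
import numpy as np

def fit_weighted_calibration(conc, ratio):
    """Fit ratio = slope * conc + intercept by least squares with 1/x weights.

    conc  : standard concentrations (ng/ml); must be non-zero for 1/x weighting
    ratio : peak-area ratios (analyte / deuterated internal standard)
    """
    conc = np.asarray(conc, dtype=float)
    ratio = np.asarray(ratio, dtype=float)
    w = 1.0 / conc  # 1/X weighting gives low-concentration standards more influence
    sw = w.sum()
    x_bar = (w * conc).sum() / sw   # weighted mean of concentrations
    y_bar = (w * ratio).sum() / sw  # weighted mean of ratios
    slope = (w * (conc - x_bar) * (ratio - y_bar)).sum() / (w * (conc - x_bar) ** 2).sum()
    intercept = y_bar - slope * x_bar
    return slope, intercept

def interpolate_concentration(ratio, slope, intercept):
    """Invert the calibration line to estimate a sample concentration (ng/ml)."""
    return (ratio - intercept) / slope

# Hypothetical eight-point calibration (values are illustrative only).
standards = [0.08, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
ratios = [0.5 * c + 0.01 for c in standards]  # perfectly linear toy data

slope, intercept = fit_weighted_calibration(standards, ratios)
sample_conc = interpolate_concentration(1.26, slope, intercept)
```

The closed-form weighted fit is used here to make the weighting explicit; in practice one curve would be fitted per EM and per analytical batch.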
The intra- and inter-batch coefficients of variation (CVs), evaluated from three QC samples per analytical batch in six independent consecutive batches (N = 18 QC samples), ranged from 6% to 10% (intra-batch CV) and 6% to 14% (inter-batch CV; Supplementary Table 2). The two highest inter-batch CVs were for 4-methoxyestrone (14%) and 4-methoxyestradiol (13%), the two oestrogen metabolites that had the lowest concentrations and which were subsequently excluded from the analysis. The lower limit of quantification (LLOQ) was estimated as 80 pg ml−1, where the intra- and inter-batch precision of all the EMs was consistently <10% and the intra- and inter-batch accuracies were between 97 and 105%. The limit of detection (LOD) for the entire assay (all 14 EMs) was estimated to be 8 pg ml−1.
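As an illustration of how intra- and inter-batch CVs can be derived from QC replicates, the sketch below uses one common convention: the intra-batch CV is the mean of the per-batch CVs, and the inter-batch CV is the CV of the batch means. The text does not state which definition was used, so this convention, the function names and the example values are assumptions.

```python
import statistics

def cv_percent(values):
    """Coefficient of variation (%) = sample SD / mean * 100."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def intra_inter_batch_cv(batches):
    """Return (intra-batch CV, inter-batch CV) for a list of QC batches.

    Assumed convention: intra = mean of per-batch CVs,
    inter = CV of the batch means.
    """
    intra = statistics.mean([cv_percent(b) for b in batches])
    inter = cv_percent([statistics.mean(b) for b in batches])
    return intra, inter

# Hypothetical QC measurements (ng/ml) for one EM: two batches of three QC samples.
qc_batches = [[1.9, 2.0, 2.1], [1.9, 2.0, 2.1]]
intra_cv, inter_cv = intra_inter_batch_cv(qc_batches)
```

With six batches of three QC samples, as in the study design, `qc_batches` would hold six lists of three values per EM.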
Statistical analysis. EMs were converted from ng ml−1 of urine to pmol mg−1 creatinine using the molecular weight of the unconjugated form of each of the EMs and the amount of creatinine (measured in mg ml−1) for each of the samples. This allowed us to create pathway variables as described by Faupel-Badger et al (2010). Where both of the two samples per woman were measured successfully, we used the mean of the two measurements; where one of two measurements from a woman was missing, we used the one available measurement; where both measurements were missing, we excluded the woman from the analysis of this EM. For the majority of EMs, there were no missing values. For four EMs (2-hydroxyestrone (2-OHE1), 2-methoxyestradiol (2-MeOE2), 2-hydroxyestrone-3-methyl ether (3-MeOE1) and 16-ketoestradiol (16-ketoE2)), there were <10% missing values (Supplementary Table 3). For two EMs (4-methoxyestrone (4-MeOE1) and 4-methoxyestradiol (4-MeOE2)), levels were undetectable in the majority of samples (92 (76.7%) and 107 (89.2%) for 4-MeOE1 and 4-MeOE2, respectively). These two EMs were excluded from further analysis. For the 12 EMs that we were able to measure, we estimated the within-woman variation based on the two sequential samples per woman; intra-class correlation coefficients for individual EMs and grouped EMs are shown in Supplementary Table 4.
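The unit conversion and the rule for combining the two samples per woman can be sketched as below. This is a hedged illustration: it assumes the molar unit (pmol per mg creatinine) used in the figure caption, the function names are hypothetical, and the molecular weight shown for unconjugated oestrone (about 270.37 g/mol) is included only as an example input.

```python
# Approximate molecular weight (g/mol) of unconjugated oestrone (E1); example input only.
E1_MOLECULAR_WEIGHT = 270.37

def to_pmol_per_mg_creatinine(em_ng_per_ml, creatinine_mg_per_ml, molecular_weight):
    """Convert an EM concentration in urine to pmol per mg creatinine.

    ng/ml divided by mg creatinine/ml gives ng per mg creatinine;
    dividing ng by g/mol gives nmol, and multiplying by 1000 gives pmol.
    """
    ng_per_mg = em_ng_per_ml / creatinine_mg_per_ml
    return ng_per_mg / molecular_weight * 1000.0

def per_woman_value(sample_a, sample_b):
    """Combine two sequential samples: mean if both are present, the single
    available value if one is missing, None if both are missing."""
    available = [v for v in (sample_a, sample_b) if v is not None]
    if not available:
        return None  # woman is excluded from the analysis of this EM
    return sum(available) / len(available)

# Hypothetical sample: 2.7037 ng/ml E1 with 1.0 mg/ml creatinine.
example_pmol = to_pmol_per_mg_creatinine(2.7037, 1.0, E1_MOLECULAR_WEIGHT)
```

Pathway variables would then be sums of the converted values for the EMs in each pathway, which is why a molar unit is needed.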
For individual and pathway EMs, we calculated geometric means and 95% confidence intervals (CIs) on the natural logarithm scale and exponentiated the values back to the original scale. Linear regression models of the natural log-transformed EMs were used to estimate per cent differences between CYP3A7*1C carriers and non-carriers for individual and pathway EMs. We carried out unadjusted analyses and analyses that were adjusted for measurement batch (1-6), body mass index (BMI; quartiles), age at first full-term pregnancy (quartiles) and parity (0, 1-2, >2). Adjustment for these covariates did not alter effect sizes substantially and, therefore, unadjusted results are presented. We used t-tests of the linear regression coefficient to estimate P-values. We applied a Bonferroni correction to establish a statistical significance level of P<0.003, based on our measuring of 14 individual EMs and 5 grouped EMs. Statistical analyses were carried out using R (version 3.2.3; http://cran.r-project.org). All reported P-values are two-sided.
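The quantities described above can be illustrated with a short sketch (in Python rather than R, and not the authors' analysis code). It relies on the fact that, in an unadjusted linear regression of log-transformed levels on a binary carrier indicator, the slope equals the difference in mean log levels, so the per cent difference is 100 × (exp(slope) − 1). Function names and data values are hypothetical; confidence intervals and P-values are omitted.

```python
import math

def geometric_mean(values):
    """Mean on the natural-log scale, exponentiated back to the original scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def percent_difference(carriers, non_carriers):
    """Per cent difference in geometric mean, carriers versus non-carriers.

    Equivalent to 100 * (exp(beta) - 1), where beta is the slope from an
    unadjusted regression of log(level) on carrier status (1 = carrier).
    """
    beta = (sum(math.log(v) for v in carriers) / len(carriers)
            - sum(math.log(v) for v in non_carriers) / len(non_carriers))
    return 100.0 * (math.exp(beta) - 1.0)

# Bonferroni threshold for 14 individual EMs plus 5 grouped EMs (19 tests).
bonferroni_threshold = 0.05 / (14 + 5)

# Hypothetical EM levels (pmol per mg creatinine).
carrier_levels = [1.0, 4.0]       # geometric mean 2
non_carrier_levels = [4.0, 16.0]  # geometric mean 8
diff = percent_difference(carrier_levels, non_carrier_levels)
```

A negative value of `diff` corresponds to lower levels in carriers than in non-carriers.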

DISCUSSION
To our knowledge, comprehensive data on urinary EMs in premenopausal women, measured using LC-MS, have been previously reported in three studies (Eliassen et al, 2009, 2012; Faupel-Badger et al, 2010; Maskarinec et al, 2012). The first of these was the Nurses' Health Study II, a prospective study of North American registered nurses aged 25-42 years at recruitment (Eliassen et al, 2009, 2012). The second was a population-based study of incident breast cancer among women of Asian ancestry living in California and Hawaii (Faupel-Badger et al, 2010) and the third was a randomised trial of the effect of consuming soy foods on hormonal outcomes, conducted in women of Caucasian, Native Hawaiian and Asian ancestry living in Hawaii (Maskarinec et al, 2012).
Comparing absolute levels of EMs in our study with those reported by these other studies is not straightforward. Levels of individual EMs and all EMs combined may differ between ethnicities (Maskarinec et al, 2012) and women from several different ethnicities have been included in the reports to date (Asian, Faupel-Badger et al (2010); African-American, Asian, Hispanic and Caucasian, Eliassen et al (2012); Caucasian, Native Hawaiian and Asian, Maskarinec et al (2012)). In addition, the 30 non-carriers that we analysed may not be representative of the general British population, as they were selected on genotype (with a MAF of 4%, we would expect 2.2 CYP3A7*1C carriers in a population-based sample of 30 women) and 6 (20%) were first-degree relatives of breast cancer cases, suggesting that they may be at higher risk than the general population. Our analysis is based on the average of two periovulatory samples; Faupel-Badger et al (2010) analysed both non-luteal phase and luteal phase samples, whereas the studies reported by Eliassen et al (2009, 2012) and Maskarinec et al (2012) focussed on luteal phase samples. There is, however, no evidence that the 2-OHE1:16α-OHE1 ratio differs according to the phase of the menstrual cycle (Faupel-Badger et al, 2010). Comparing this ratio across studies, the 2-OHE1:16α-OHE1 ratio in our non-carriers (3.9) was similar to the 2-OHE1:16α-…
We have previously demonstrated that the CYP3A7*1C allele is associated with lower levels of urinary E1G (Johnson et al, 2016). Here we have demonstrated that CYP3A7*1C carriers have a more pronounced reduction in 2-hydroxylation pathway EMs, increased 16α-hydroxylation pathway EMs and markedly reduced 2-hydroxylation pathway:16α-hydroxylation pathway ratios, as measured by the 2-OHE1:16α-OHE1 ratio (0.39 in CYP3A7*1C carriers compared with population estimates of 2.1-13.0) or all 2- and 16α-pathway metabolites combined (0.10 in CYP3A7*1C carriers compared with population estimates of 0.90 or 0.96 (Eliassen et al, 2009, 2012)). Our data are consistent with expression of the fetal CYP3A7 gene in adult carriers of the CYP3A7*1C allele resulting in (i) a modest reduction in levels of the parent oestrogen E1 and (ii) a specific bias towards 16α-hydroxylation (which can be catalysed by CYP3A7) over 2- or 4-hydroxylation (which are not catalysed by CYP3A7 (Lee et al, 2003)).
Table notes (participant characteristics): In parous women. For quantitative traits (age at urine collection, BMI at urine collection and age at first full-term pregnancy), means and ranges are presented. For parity, the number and percentage of women in each category is shown. For quantitative traits, P-values were from t-tests; for parity, the P-value was from a Fisher's exact test.
Figure caption (axis label: Levels of EMs): Geometric mean levels (pmol mg−1 creatinine) of 12 EMs that we measured in urine samples from 60 premenopausal women; 30 carriers of the CYP3A7*1C allele (light grey) and 30 non-carriers (dark grey). Estimates are the average of two samples per woman, taken on sequential days, calculated to be at or around ovulation based on the woman's usual menstrual cycle length. Error bars represent s.e. Levels of two of the 4-hydroxylation pathway EMs (4-MeOE1 and 4-MeOE2) were below detection in 92 (4-MeOE1) and 107 (4-MeOE2) of the samples analysed. These two EMs were, therefore, excluded.
In their recent review of oestrogen metabolism and breast cancer, Ziegler et al (2015) concluded that the combined evidence from prospective studies using LC-MS to measure EMs in prediagnostic serum, plasma and urine was consistent with the hypothesis that enhanced 2-hydroxylation is associated with reduced risk of breast cancer. They further concluded that the inverse association with enhanced 2-hydroxylation (specifically 2-hydroxylation:16α-hydroxylation and 2-hydroxylation:parent oestrogens ratios) was independent of the strong positive associations between unconjugated E2 and E1 with postmenopausal breast cancer risk (Ziegler et al, 2015). Our analysis supports these conclusions if we assume that breast cancer risk in carriers of the CYP3A7*1C allele is influenced by two components with opposite effects. Based on the combined evidence of prospective studies of premenopausal hormone levels and breast cancer risk, lifetime lower levels of E1 and E2 of ~45% and 27% in carriers of the CYP3A7*1C allele would be predicted to result in a substantial reduction in risk for these individuals. By contrast, an unfavourable (reduced) 2-hydroxylation:16α-hydroxylation ratio would be expected to increase risk. Our previous analyses demonstrate that the reduction in breast cancer risk for carriers of the rare allele of rs10235235 (which is correlated with the CYP3A7*1C allele … consistent with an unfavourable (reduced) 2-hydroxylation:16α-hydroxylation ratio counteracting a potentially more substantial beneficial effect of lower levels of parent oestrogens.
Strengths of our study include our comprehensive analysis of urinary EMs using HPLC-MS/MS and, as genotypes are effectively randomised at birth (Ebrahim and Davey Smith, 2008), our focus on a genetic variant (the CYP3A7*1C allele), which minimises the potential for confounding by unmeasured environmental factors. There are several limitations to our study; owing to the frequency of the CYP3A7*1C allele, the numbers are small and, given our selection procedure, the non-carriers may not be representative of the White British population. Our choice of using two consecutive periovulatory samples makes it difficult to compare EM levels in our study with other published reports, which have mainly analysed a single luteal phase sample (Eliassen et al, 2009, 2012; Faupel-Badger et al, 2010; Maskarinec et al, 2012). In addition, we do not have prospective data on CYP3A7*1C carrier status, hormone levels and breast cancer risk in large numbers of women, and we cannot directly compare breast cancer risk in women with low levels of the parent oestrogen E1 according to their CYP3A7*1C status (and hence 2-hydroxylation:16α-hydroxylation ratio).
Table notes (differences between carriers and non-carriers): A negative difference may be interpreted as lower levels in CYP3A7*1C carriers compared with non-carriers. (c) Missing values: 2-OHE2, 1 sample (non-carrier); 2-MeOE2, 11 samples (11 women, 2 non-carriers, 9 carriers); 3-MeOE1, 11 samples (9 women, 2 non-carriers, 7 carriers); 16-ketoE2, 5 samples (3 women, 2 non-carriers, 1 carrier).
In conclusion, we have demonstrated that the CYP3A7*1C allele has a profound effect on levels of the parent oestrogen E1 and the ratio of 2-hydroxylation:16α-hydroxylation EMs in premenopausal women. To characterise fully the association between parent oestrogens, EMs and subsequent risk of breast cancer, identification of additional genetic variants that influence parent oestrogens and particular pathway EMs, and further prospective studies that analyse a broad spectrum of EMs, will be required.