Genome-wide association and epidemiological analyses reveal common genetic origins between uterine leiomyomata and endometriosis

Uterine leiomyomata (UL) are the most common neoplasms of the female reproductive tract and the primary cause of hysterectomy, leading to considerable morbidity and high economic burden. Here we conduct a GWAS meta-analysis in 35,474 cases and 267,505 female controls of European ancestry, identifying eight novel genome-wide significant (P < 5 × 10⁻⁸) loci, in addition to confirming 21 previously reported loci, including multiple independent signals at 10 loci. Phenotypic stratification of UL by heavy menstrual bleeding in 3,409 cases and 199,171 female controls reveals genome-wide significant associations at three of the 29 UL loci: 5p15.33 (TERT), 5q35.2 (FGFR4) and 11q22.3 (ATM). Four loci identified in the meta-analysis are also associated with endometriosis risk; an epidemiological meta-analysis across 402,868 women suggests at least a doubling of risk for UL diagnosis among those with a history of endometriosis. These findings increase our understanding of genetic contribution and biology underlying UL development, and suggest overlapping genetic origins with endometriosis.


GWAS - FibroGENE Cohort Descriptions
Women's Genome Health Study (WGHS)
WGHS is a nested cohort within the Women's Health Study (WHS) 1 , an ongoing prospective cohort originally launched in 1992 as a randomized controlled trial of female North American health-care professionals focusing on cardiovascular and cancer outcomes. WGHS comprises the WHS participants who provided a blood sample at baseline and consented to blood-based analyses. All participants were at least 45 years of age and free of cardiovascular disease, cancer, or other major chronic illnesses at the time of consent. Health- and lifestyle-related information was collected via questionnaires at enrollment and follow-up time points. WHS participants were asked whether they had ever been diagnosed with UL and their age at diagnosis. Cases were defined as women who self-reported 'yes' to having a history of UL, while controls were classified as women who self-reported 'no'. Women who reported an age of UL diagnosis < 20 or > 70 years were excluded from the analysis. Participants in WGHS were recruited under an IRB-approved protocol by the Partners HealthCare System Human Research Committee. For this study, a total of 12,840 women of white European ancestry were included: 3,375 UL cases and 9,465 controls.
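The WGHS case definition above can be sketched as a small classification rule. This is a minimal illustration only, not the study's code: the function name `classify_ul_status` and its arguments are hypothetical, and keeping self-reported cases with a missing age at diagnosis is an assumption that the text does not specify.

```python
from typing import Optional


def classify_ul_status(self_report: str,
                       age_at_diagnosis: Optional[float] = None) -> Optional[str]:
    """Return 'case', 'control', or None (excluded) for one participant.

    Hypothetical helper: mirrors the stated rule that women reporting
    an age at UL diagnosis < 20 or > 70 years are excluded.
    """
    answer = self_report.strip().lower()
    if answer == "no":
        return "control"
    if answer == "yes":
        # Exclude implausible or out-of-range reported ages at diagnosis.
        if age_at_diagnosis is not None and (
            age_at_diagnosis < 20 or age_at_diagnosis > 70
        ):
            return None
        # Assumption: a 'yes' with missing age at diagnosis stays a case.
        return "case"
    # Any other response is treated as missing and excluded (assumption).
    return None
```

The boundary ages 20 and 70 are retained as cases here because the text excludes only ages strictly below 20 or strictly above 70.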

Northern Finland Birth Cohort (NFBC)
NFBC includes two longitudinal and prospective birth cohorts of white European women and offspring collected at 20-year intervals from the same provinces of Oulu and Lapland in Finland: NFBC1966 and NFBC1986. In this study, we utilized data from NFBC1966. Cases (n=363) with a history of UL were identified through national outpatient and inpatient hospital discharge registers and self-reported diagnosis through postal questionnaire at age 46. The hospital discharge registers include WHO ICD codes for identification of disease diagnoses and dates for each hospital visit. Controls (n=5,000) were drawn from the rest of the cohort population. Informed consent was obtained from all participants using protocols approved by the Ethical Committee of the Northern Ostrobothnia Hospital District.

QIMR Berghofer Medical Research Institute (QIMR)
In the QIMR cohort, women were originally recruited into a study examining predisposition to endometriosis 2 and a twin study of gynecological health 3 . The cohort includes affected sister pair families (aged 15-87 years [affected women] at the end of sample collection) and twin pairs (aged 29-91 years at the return of questionnaire) of white European women. For both studies, women completed questionnaires on various aspects of their reproductive health. Participants who answered "yes" to the "uterine fibroids" option of the question "Have you ever had any of the following conditions?" were selected as cases (n=1,484). Of the 1,484 cases, 585 came from the endometriosis sample: 579 had a surgically confirmed diagnosis of endometriosis, and the remaining six had a family member (daughter, cousin, or 2nd cousin) diagnosed with endometriosis. Controls (n=3,701) were drawn from twin pairs in the gynecological health study in which both sisters answered "no" to a question about medical history of uterine fibroids (one sample per twin pair).
Validation of self-reported hysterectomy has previously been examined in the twin pairs; the diagnosis was confirmed in 97.6% of those who reported hysterectomy and for whom medical response from a physician was available 3 . Informed consent was obtained from all participants.
Approval for the studies was granted by the Human Research Ethics Committee at the QIMR Berghofer Medical Research Institute and the Australian Twin Registry.

UK Biobank (UKBB)
The UKBB is a large national and international health resource following the health and wellbeing of 500,000 male and female volunteer participants, enrolled at ages from 40 to 69 4 . The UKBB study began in 2006 with the aim to follow the participants for at least 30 years thereafter.
Information has been collected from participants during recruitment using questionnaires on socioeconomic status, lifestyle, family history and medical history. Participants have also been followed up for cause-specific morbidity and mortality through linkage to disease registries, death registries, and hospital admission records. For this study, altogether 220,936 women of European ancestry were considered. Based on both hospital-linked medical records and self-report (interview with research nurse), women with a history of UL were selected as cases (n=15,184), while controls (n=205,752) had no previous history of UL. When limited by heavy menstrual bleeding (HMB), a total of 3,409 women with both UL and HMB were selected as cases, and 199,171 women without a history of UL or HMB as controls. For HMB, only cases with hospital-linked medical records were considered (n=9,813). Informed consent was obtained from all participants. The UKBB project is approved by the North West Multi-centre Research Ethics Committee.

23andMe Cohort
Participants were drawn from the customer base of 23andMe (Mountain View, CA, USA). For this study, the 23andMe cohort included 58,655 unrelated European women. Data on participants' history of UL were collected via self-report in online surveys. Medical history of UL was determined with the research question, "Have you ever been diagnosed with uterine fibroids?", which had three response options: yes, no, and not sure. Females who answered "yes" were selected as cases, those who answered "no" as controls, and those who answered "not sure" were excluded from the study, resulting in 15,068 cases and 43,587 controls. All 23andMe research participants provided informed consent and answered surveys online according to a human subject protocol approved by Ethical and Independent Review Services, an external institutional review board.
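As a consistency check, the per-cohort counts reported in the descriptions above can be summed and compared with the meta-analysis totals given in the abstract (35,474 cases and 267,505 controls). The sketch below uses only numbers stated in the text; the dictionary layout and variable names are illustrative.

```python
# (cases, controls) per GWAS cohort, as reported in the cohort descriptions.
cohorts = {
    "WGHS": (3375, 9465),
    "NFBC1966": (363, 5000),
    "QIMR": (1484, 3701),
    "UKBB": (15184, 205752),
    "23andMe": (15068, 43587),
}

total_cases = sum(cases for cases, _ in cohorts.values())
total_controls = sum(controls for _, controls in cohorts.values())

print(total_cases, total_controls)  # 35474 267505
```

The sums match the totals reported for the GWAS meta-analysis, and the per-cohort sample sizes stated in the text (12,840 for WGHS, 220,936 for UKBB, 58,655 for 23andMe) likewise equal cases plus controls.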

Comorbidity analysis - Cohort descriptions and statistical analyses
Nurses' Health Study II (NHSII)
NHSII is an ongoing prospective cohort established in 1989 when 116,429 female registered nurses, aged 25 to 42 years, completed a baseline questionnaire on demographic and lifestyle factors, anthropometric variables, and disease histories. Starting in 1993, participants were asked if they had "ever had physician-diagnosed uterine fibroid(s)," and, if so, the date of diagnosis and whether it had been confirmed by pelvic exam or ultrasound/hysterectomy. Thus, the NHSII participants were queried about endometriosis diagnosis prospectively during reproductive years, in contrast to recalled diagnosis from often many years prior for all of WHS and the majority of UKBB participants. The validity of self-reported UL has previously been examined, with the diagnosis confirmed in 93% of women who reported ultrasound or hysterectomy confirmation 5 .
Definition of UL diagnosis was restricted to women who reported ultrasound or hysterectomy confirmation of their diagnosis. Participants were also asked if they had "ever had physician-diagnosed endometriosis," and, if so, the date of diagnosis and whether it had been confirmed by laparoscopy. The validity of self-reported endometriosis has previously been examined, with the diagnosis confirmed in only 54% of those without report of surgical confirmation but in 96% of these medical professional women who reported laparoscopic confirmation 6 . Therefore, cases of endometriosis were restricted to women who reported laparoscopic confirmation of their diagnosis.
Time-dependent survival analysis methods were applied to the NHSII prospective cohort data.
Participants contributed follow-up time from the return of the 1989 questionnaire until report of UL, diagnosis of any cancer with the exception of non-melanoma skin cancer, hysterectomy, menopause, death, or loss to follow-up, whichever occurred first. No other censoring or exclusion criteria were applied. Cox proportional hazards regression models with age and questionnaire period as the time scale and time-varying covariates were used to estimate hazard rate ratios (HR) and 95% confidence intervals (CI) of ultrasound/hysterectomy confirmed UL in participants with laparoscopically confirmed endometriosis compared to those without. Both prevalent (diagnosed before the start of the cohort) and incident (diagnosed after study enrollment) cases of laparoscopically confirmed endometriosis were included. Time-varying covariates were updated in the analyses whenever new information was available from the biennial questionnaires.
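The follow-up rule described above (time at risk ends at the first of UL report, a censoring event, or end of follow-up) can be sketched as follows. This is an illustrative helper under stated assumptions, not the NHSII analysis code: the function name `follow_up_end`, the use of months since the 1989 questionnaire as the time unit, the `end_of_study` argument, and the handling of ties (a UL report at the same time as a censoring event counts as an event) are all assumptions.

```python
from typing import Dict, Optional, Tuple


def follow_up_end(ul_time: Optional[float],
                  censor_times: Dict[str, Optional[float]],
                  end_of_study: float) -> Tuple[float, bool]:
    """Return (follow-up time, UL event observed) for one participant.

    Times are hypothetical months since return of the 1989 questionnaire.
    censor_times maps censoring reasons (e.g. 'cancer', 'hysterectomy',
    'menopause', 'death', 'loss_to_follow_up') to a time, or None if the
    reason never occurred.
    """
    # Earliest censoring time, capped by administrative end of follow-up.
    observed = [t for t in censor_times.values() if t is not None]
    censor_at = min(observed + [end_of_study])

    # Assumption: a UL report at or before the censoring time is an event.
    if ul_time is not None and ul_time <= censor_at:
        return ul_time, True
    return censor_at, False
```

For example, a participant reporting UL at month 90 but diagnosed with cancer at month 60 would contribute 60 months and no UL event. The resulting (time, event) pairs are the kind of input a Cox proportional hazards routine expects; the time-varying covariate updating described in the text is not shown here.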

Women's Health Study (WHS)
The full WHS cohort includes WGHS described above (WGHS, GWAS - FibroGENE Cohort Descriptions). Healthcare professionals in the WHS cohort encompass registered nurses, licensed practical nurses, licensed vocational nurses, physicians, veterinarians, pharmacists, dietitians, dentists, dental hygienists, speech/hearing/language professionals, physical therapists, and radiology technologists; thus, the overall medical training among the WHS participants in regard to gynecologic conditions was less certain compared to the NHSII participants. For the comorbidity analyses, all ancestries were included (i.e., there was not a restriction to European ancestry as was applied to the GWAS analyses). However, women who did not respond to at least one questionnaire that queried UL and endometriosis were excluded. Briefly, within the full WHS, UL cases included women who at enrollment in 1992 (when they were aged 45 years and older) or on subsequent questionnaires retrospectively self-reported having been diagnosed with UL between the ages of 20 and 70. Women who never reported UL at baseline or at any point during follow-up were classified as not having UL. WHS participants were asked on the 2009 questionnaire (when they were at least 60 years of age) if they ever had physician-diagnosed endometriosis, and if so, whether the diagnosis was confirmed by laparoscopy. As with NHSII, the case definition for the analysis was restricted to women who self-reported physician-diagnosed endometriosis with laparoscopic confirmation of the diagnosis. The unexposed group was defined as participants who never self-reported having been diagnosed with endometriosis. Neither UL nor endometriosis reports from WHS participants have been validated.
No data on the age at or calendar time-period of endometriosis diagnosis were collected, and therefore for the WHS, cross-sectional logistic regression analyses were used to estimate odds ratios (OR) and 95% CI for self-reported history of UL comparing those with a self-reported history of laparoscopically confirmed endometriosis to those without. The WHS cross-sectional analyses did not temporally order whether endometriosis was diagnosed before or after UL.
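To make the reported quantity concrete, the sketch below computes a crude (unadjusted) odds ratio and a Wald 95% CI from a 2x2 table of endometriosis history by UL history. It is a simplification: the analyses described here used multivariable logistic regression with covariate adjustment, which this example does not reproduce. The function name `odds_ratio_ci` and the counts in the usage line are hypothetical and are not study data.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.959964) -> Tuple[float, float, float]:
    """Crude odds ratio with a Wald (Woolf) confidence interval.

    a: exposed (endometriosis) with UL      b: exposed without UL
    c: unexposed with UL                    d: unexposed without UL
    z: normal quantile; 1.959964 gives a 95% interval.
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_, lower, upper = odds_ratio_ci(40, 60, 200, 700)
print(f"OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

In a logistic regression with endometriosis history as the only predictor, the exponentiated coefficient equals this crude odds ratio; adding covariates such as age or BMI yields the adjusted estimates of the kind reported in the paper.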

UK Biobank (UKBB)
The UKBB study is described in detail above (UK Biobank, GWAS - FibroGENE Cohort Descriptions). When the UKBB began in 2006, all participants were at least 40 years of age. UL and endometriosis cases were identified by retrospective self-reports (collected through questionnaires administered at recruitment by trained research nurses) and/or hospital diagnosis data (by linkage to hospital records with ICD-9/ICD-10 codes). The unexposed group was defined as participants who never self-reported having been diagnosed with endometriosis.
For the phenotypic comorbidity analyses all ancestries were included, with main groups based on self-report, including White European ancestry (95%), Asian (2%), Black, mixed, and other.
Logistic regression models were used to estimate OR and 95% CI for history of UL comparing those with a documented history of endometriosis to those without. In this analysis, UL cases were identified by self-reports and/or hospital diagnosis data, with hysterectomy being the main operation through which UL were diagnosed. However, for endometriosis, only self-reported diagnosis was used, as the hospital diagnosis data were artificially inflated (likely due to the high number of hysterectomies resulting in diagnostic bias). As with the WHS analyses, the UKBB cross-sectional analyses did not consider temporality in UL and endometriosis diagnoses.

Covariates
Potential confounders included in multivariable models were defined as factors potentially associated with UL and/or endometriosis risk, including: ancestry, age, body mass index (BMI), smoking status, age at menarche, oral contraceptive use, parity, age at first birth, menopausal status, use of anti-hypertensive medication/diastolic blood pressure, physical activity, and alcohol consumption. Risk factors included in models were defined to be consistent across cohorts whenever possible (see footnote in Table 2 for cohort-specific adjustment details), although the values of dynamic variables (e.g. age or BMI) were defined as time-varying and updated biennially in the NHSII cohort, whereas they were defined cross-sectionally at the time of data collection in WHS and UKBB.

[Figure: regional association plot] … in the GWAS meta-analysis on UL across all cohorts. The labeled SNP represents the most significant SNP for the locus. SNP association P-value is shown on the y axis, while SNP position (with gene annotation) appears on the x axis. Each SNP is colored according to the strength of LD with the lead SNP. Regional association plot was produced in LocusZoom. Also, linkage disequilibrium between the lead SNPs from UL GWAS meta-analysis and heavy menstrual bleeding GWAS at 11p14.1 in women of European ancestry is presented.

Supplementary Table 4. Secondary association signals from GCTA conditional analysis based on summary statistics of the UL meta-analysis including all cases, with an independent UKBB-based sample (N = 5,000) used as the reference to calculate LD.
Footnote: Chr, Chromosome; Genomic position is shown related to GRCh37 (hg19); RA, risk allele; RAF, risk allele frequency; BETA, SE, P, effect size, standard error and P-value from UL final GWAS meta-analysis; Freq_ref, frequency of the risk allele in the reference sample; BETA_cond, SE_cond, P_cond, effect size, standard error and P-value from conditional analyses.