Introduction

Uterine leiomyomas (UL), also known as uterine fibroids, are benign tumours of the uterus1. UL are typically identified during the mid or late reproductive years, and they decrease in size after menopause2. The size and number of fibroids vary among patients3. According to a systematic review published in 2017, the incidence of UL ranges from 217 to 3745 cases per 100,000 woman-years4. The prevalence of UL ranges from 20% in European women to as high as 80% in African-American women4,5. More than 50% of women with UL have no identifiable symptoms6. For the remaining women, clinical symptoms range from abnormal bleeding and pelvic pain to infertility and pregnancy complications6.

Multiple studies have shown that UL are influenced by both environmental and genetic factors7,8,9. Early familial aggregation and twin studies identified a significant genetic component to UL predisposition10,11. Makinen et al. performed a whole-exome sequencing study on 18 UL patients and identified the MED12 gene as contributing to tumourigenesis12. In a GWAS of Japanese populations conducted in 2011, three loci on chromosomes 10q24.33, 22q13.1, and 11p15.5 were found to be significantly associated with UL disease status13. These three loci included several genes, such as STE20 Like Kinase (SLK), Oligosaccharide-Binding Fold-Containing Protein 1 (OBFC1), Trinucleotide Repeat Containing 6B (TNRC6B), Outer Dense Fiber 3 (ODF3), Bet1 Golgi Vesicular Membrane Trafficking Protein Like (BET1L), RIC8 Guanine Nucleotide Exchange Factor A (RIC8A), and Sirtuin 3 (SIRT3). Since then, several follow-up studies have tried to replicate these initial GWAS findings in samples from other ethnic groups14,15,16. However, the results of these subsequent studies have not been concordant and have, at times, been contradictory. Studies with larger sample sizes are still needed to confirm these previous hits.

In this study, we attempted to replicate two initial significant loci, TNRC6B and BET1L, identified in a GWAS conducted by Cha et al.13 by using study subjects with Chinese Han ancestry. A total of 2,055 study subjects were recruited, and 55 SNPs mapped to TNRC6B and BET1L were selected and genotyped in samples from these subjects. In addition to genetic associations between these SNPs and the disease status of UL, we also examined potential associations between targeted SNPs and clinical characteristics of UL. Bioinformatics tools were also utilized to evaluate the potential biological functions of the targeted SNPs.

Methods

Study subjects

In the present study, a total of 674 women with UL and 1,381 healthy female controls without any systemic disease were recruited from the Second Affiliated Hospital of Xi’an Jiaotong University between April 2013 and May 2017. All patients were diagnosed with UL by ultrasonography, with the diagnosis confirmed by at least two senior physicians, and all subjects were screened to exclude other female reproductive system tumours, systemic disease, and a history of malignancy. Self-administered questionnaires were used to collect demographic data, and the characteristics of our study subjects are shown in Table 1. All participants were unrelated Han Chinese individuals, and the UL and control groups were matched by age and body mass index (BMI). Significant differences were identified for duration of menses (P = 0.005) and menstrual cycle (P = 0.003) between UL cases and healthy controls. The size of UL was categorized into three groups (small, medium, and large) based on the diameter of the UL (small: ≤2 cm; medium: >2 cm and <4 cm; large: ≥4 cm). If a subject was diagnosed with multiple UL, the largest one determined the size group. The study protocol was approved by the Ethics Committee of Xi’an Jiaotong University in accordance with the ethical guidelines of the Declaration of Helsinki of 1975 (revised in 2008). Written informed consent was obtained from participants.
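The size grouping rule described above can be written as a short function. This is only an illustrative sketch: the function name `size_group` and the input format (a list of fibroid diameters in centimetres for one patient) are assumptions made here, not part of the study protocol.

```python
def size_group(diameters_cm):
    """Assign a patient to a UL size group from fibroid diameters (cm).

    Hypothetical helper: the largest fibroid determines the group,
    using the cut-offs stated in the text (small <=2, medium >2 and <4,
    large >=4).
    """
    largest = max(diameters_cm)
    if largest <= 2:
        return "small"
    if largest < 4:
        return "medium"
    return "large"


# A patient with two fibroids of 1.0 cm and 3.2 cm is grouped by the larger one.
print(size_group([1.0, 3.2]))
```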

Table 1 The clinical and demographic characteristics of the uterine leiomyoma and control groups.

SNP selection and Genotyping

We searched for all SNPs with a minor allele frequency (MAF) ≥0.05 within the regions of the TNRC6B and BET1L genes in the 1000 Genomes Chinese Han Beijing population (CHB). Tag SNPs were then selected by pair-wise tagging with cut-off criteria of MAF ≥0.05 and r² ≥0.8, which generated 27 and 28 tag SNPs within the TNRC6B and BET1L genes, respectively. General information about these 55 selected SNPs is summarized in Supplemental Table S1. Most of the selected SNPs were non-coding SNPs. Genomic DNA was extracted from peripheral blood leukocytes according to the manufacturer’s protocol (Genomic DNA kit, Axygen Scientific Inc., California, USA). Genotyping was performed for all SNPs using the Sequenom MassARRAY RS1000 system (Sequenom, San Diego, California, USA). The results were processed using Typer Analyser software, and genotype data were generated from the samples17. For quality control, case and control status was blinded throughout genotyping. Five percent of the samples were randomly selected and re-genotyped, and the results were 100% concordant.
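To illustrate the idea behind pair-wise tagging, the following sketch applies a simple greedy selection to a set of SNPs with known MAF and pair-wise r² values. It is an assumption-laden illustration rather than the software actually used in this study: the function name, the input dictionaries, and the SNP labels are all hypothetical, and real tagging tools may break ties or handle missing LD values differently.

```python
def greedy_tag_snps(maf, r2, maf_min=0.05, r2_min=0.8):
    """Greedy pair-wise tagging sketch (hypothetical helper).

    maf: dict mapping SNP id -> minor allele frequency
    r2:  dict mapping frozenset({snp_a, snp_b}) -> pair-wise r^2
         (pairs that are absent are treated as r^2 = 0)
    Returns a list of tag SNPs such that every SNP passing the MAF
    filter is either a tag or has r^2 >= r2_min with some tag.
    """
    candidates = sorted(s for s, f in maf.items() if f >= maf_min)

    def captured_by(snp):
        return {t for t in candidates
                if t == snp or r2.get(frozenset((snp, t)), 0.0) >= r2_min}

    untagged = set(candidates)
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs.
        best = max(candidates, key=lambda s: len(captured_by(s) & untagged))
        tags.append(best)
        untagged -= captured_by(best)
    return tags


# Toy input with made-up SNP labels and values.
toy_maf = {"snpA": 0.20, "snpB": 0.25, "snpC": 0.10, "snpD": 0.01}
toy_r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpA", "snpC")): 0.50,
}
print(greedy_tag_snps(toy_maf, toy_r2))
```

In the toy input, snpD is dropped by the MAF filter and snpB alone captures snpA and snpC at r² ≥0.8, so a single tag SNP suffices.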

Statistical and Bioinformatics Methods

Hardy-Weinberg equilibrium was tested for each SNP within the control samples. χ² tests were performed for each SNP to evaluate the differences in allelic and genotypic distributions between UL cases and controls. Linkage disequilibrium (LD) blocks were constructed for both genes, and haplotype-based analyses were conducted for each block. PLINK was utilized for the analyses mentioned above18. In addition to genetic association analyses focusing on disease status, we also analysed the potential link between significant SNPs and four clinical features of UL (bleeding, pain, number of fibroid nodes, and size of the node) in the subset of our samples comprising UL cases only. χ² tests were performed for these analyses. Bonferroni correction was applied to address multiple comparisons. For single marker-based association analyses, the threshold P value was 0.05/55 ≈ 9 × 10⁻⁴. Genomic control was applied to correct for the potential effects of population stratification19,20. The null distribution of the genomic inflation factor λ was constructed from 10,000 bootstrap replicates.
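The calculations underlying these tests can be sketched in a few lines of standard-library Python. This is an illustrative approximation under stated assumptions, not the PLINK implementation: the function names are hypothetical, the Hardy-Weinberg check uses a simple χ² goodness-of-fit test (which may differ from the test PLINK applies), and the bootstrap resamples the per-SNP χ² statistics with replacement, which is one plausible reading of the procedure described above.

```python
import math
import random
from statistics import median

BONFERRONI_THRESHOLD = 0.05 / 55      # 55 SNPs tested, about 9e-4
CHI2_1DF_MEDIAN = 0.4549364           # median of a chi-square(1 df) distribution


def chi2_sf_1df(x):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Allelic 2x2 chi-square test; returns (chi2, odds ratio, P)."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, (a * d) / (b * c), chi2_sf_1df(chi2)


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    return chi2, chi2_sf_1df(chi2)


def genomic_inflation(chi2_values):
    """Genomic inflation factor: median observed chi2 / expected median."""
    return median(chi2_values) / CHI2_1DF_MEDIAN


def bootstrap_lambda_ci(chi2_values, n_boot=10_000, seed=0):
    """Percentile 95% interval for lambda from bootstrap resamples."""
    rng = random.Random(seed)
    k = len(chi2_values)
    lambdas = sorted(genomic_inflation(rng.choices(chi2_values, k=k))
                     for _ in range(n_boot))
    return lambdas[int(0.025 * n_boot)], lambdas[int(0.975 * n_boot) - 1]


# Made-up allele counts: 10/90 minor/major in cases, 20/80 in controls.
chi2, odds_ratio, p_value = allelic_test(10, 90, 20, 80)
print(chi2, odds_ratio, p_value, p_value < BONFERRONI_THRESHOLD)
```

A SNP is declared significant in the single marker-based analysis only if its P value falls below `BONFERRONI_THRESHOLD`; an inflation factor close to or below 1 suggests that population stratification is not inflating the test statistics.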

The potential biological functions of our selected SNPs were evaluated through RegulomeDB (http://www.regulomedb.org/)21. RegulomeDB is a database that annotates SNPs based on known and predicted regulatory element data from the ENCODE project. A score ranging from 1–6 was assigned to each SNP, and a lower score indicated a more significant biological function. In addition, we also extracted eQTL data from the GTEx database (https://www.gtexportal.org/home/)22 to examine differences in gene expression associated with our significant SNPs.

Results

We identified two significant SNPs in our two candidate genes: rs2280543 in BET1L (3′-untranslated region, χ² = 18.3, OR = 0.64, P = 1.87 × 10⁻⁵) and rs12484776 in TNRC6B (intron, χ² = 19.7, OR = 1.40, P = 8.91 × 10⁻⁶) (Table 2 and Supplemental Table S2). Genotypic analyses verified this result. Genomic control applied to the single marker-based association results showed no significant inflation of the χ² statistics: the inflation factor was less than 1, as was the upper boundary of its 95% confidence interval (Supplemental Figure S1). Six LD blocks were constructed for BET1L, and seven for TNRC6B. Haplotype-based analyses identified two significant two-SNP LD blocks (Table 3): rs2280543-rs4980319 in BET1L (χ² = 77.56, P = 1.44 × 10⁻¹⁷) and rs12485003-rs12484776 in TNRC6B (χ² = 27.69, P = 9.70 × 10⁻⁷) were significantly associated with UL disease status. Further analyses in UL cases only identified rs2280543 as significantly associated with the number of fibroid nodes (P = 0.0007), while rs12484776 was significantly associated with node size (χ² = 54.88, P = 3.44 × 10⁻¹¹) (Table 4).

Table 2 Genetic association of rs2280543 and rs12484776 with UL.
Table 3 Haplotype based genetic associations of BET1L and TNRC6B with UL.
Table 4 Genetic associations of rs2280543 and rs12484776 with clinical characteristics of UL patients.

Data extracted from RegulomeDB showed that both SNPs, rs2280543 and rs12484776, had a RegulomeDB score of 5 (Supplemental Table S1). This score indicates very limited evidence for a regulatory role of these two SNPs. Expression quantitative trait loci (eQTL) data from GTEx for both rs2280543 and rs12484776 were extracted and examined. Significant findings are summarized in Table 5. The threshold P value was 0.05/47 ≈ 0.001. SNP rs2280543 was significantly associated with BET1L gene expression in 15 of 47 human tissues, while rs12484776 was significant only in oesophagus muscularis (effect size = −0.17, P = 4.60 × 10⁻⁴). Neither SNP was significantly associated with gene expression in the uterus (Supplemental Table S3).
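The tissue-level filtering described above amounts to comparing each eQTL P value against a Bonferroni threshold for 47 tissues. The sketch below illustrates this step only; the function name and the tuple layout are assumptions, and the second row is a made-up placeholder included purely to show a non-significant case.

```python
N_TISSUES = 47
EQTL_THRESHOLD = 0.05 / N_TISSUES   # about 0.001


def significant_eqtls(results, threshold=EQTL_THRESHOLD):
    """Keep (tissue, effect_size, p_value) rows with P below the threshold."""
    return [row for row in results if row[2] < threshold]


rows = [
    # Reported in the text for rs12484776.
    ("Oesophagus muscularis", -0.17, 4.60e-4),
    # Hypothetical placeholder row, not a real GTEx result.
    ("Hypothetical tissue", 0.05, 0.02),
]
print(significant_eqtls(rows))
```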

Table 5 Significant eQTL results for rs2280543 and rs12484776.

Discussion

With the widespread application of sequencing and genetic association analyses for studying the genetics of complex diseases, candidate gene-based association studies have successfully mapped susceptibility loci for many complex diseases23,24,25,26,27,28,29. Our data, based on ~2000 study subjects from a Chinese Han population, provide strong evidence for a genetic association between UL and two candidate genes, BET1L and TNRC6B. To the best of our knowledge, this is the first genetic association study of BET1L and TNRC6B with UL in a Chinese population. Our findings of single marker-based associations for both rs2280543 and rs12484776 replicate the initial reports from Cha et al.13. Given that analyses of a limited number of SNPs are not sufficient to draw conclusions30,31,32, we also performed haplotype analyses, which showed a pattern similar to that of the single marker-based associations. However, Bondagji et al. performed a replication study in Saudi women, and rs2280543 from BET1L was not reported to be significant16. This difference might be due to different LD structures arising from different genetic backgrounds. Japanese and Chinese Han populations are both East Asian and are therefore genetically more similar to each other than either is to Saudi women from the Middle East. In addition, the difference in sample size between the two studies might contribute to this discrepancy. We compared our association results for rs2280543 and rs12484776 with three previous reports (Supplemental Table S4). Among these studies, the directions of effect for both SNPs were largely the same. The only exception was rs2280543 in the study of Bondagji et al., which might be due to its small sample size compared with the other three studies.

In the UL case-only sub-group, we identified significant associations between the two targeted SNPs and relevant clinical features of UL. Our data showed that SNP rs2280543 from BET1L was significantly associated with the number of fibroid nodes, while SNP rs12484776 from TNRC6B was significantly associated with node size. rs12484776 of TNRC6B has been reported to be related to node size (volume) in at least two previous studies based on European populations14,15. However, to the best of our knowledge, rs2280543 from BET1L has never been reported to be associated with the number of fibroid nodes. Our findings indicated that the TT and CT genotypes of rs2280543 were related to multiple fibroid nodes rather than a single fibroid node in the Han Chinese population. Studies with comparable sample sizes in other populations are needed to verify these findings in the future.

In this study, we investigated the potential association between UL and two loci, BET1L and TNRC6B. BET1L is a protein-coding gene located at 11p15.5. It encodes a protein, BET1L, that facilitates the Golgi vesicular membrane trafficking process33. TNRC6B, located at chromosome 22q13.1, encodes the trinucleotide repeat-containing 6B protein, which was found to co-purify with a cytoplasmic HeLa cell protein complex. In addition, the TNRC6B protein was reported to be required for microRNA-guided mRNA cleavage in HeLa cell culture34. Beyond these initial studies, no more specific functions of TNRC6B have been reported. As this is a population-based study, it is beyond our scope to investigate the underlying biological mechanisms of these two loci and relate them to the pathogenesis of UL. Experimental studies based on animal models are needed in the future to unravel the roles of both loci in the onset and development of UL.

Both significant SNPs, rs2280543 and rs12484776, appeared to have very limited functional significance based on their RegulomeDB scores, which are derived from regulatory element annotations based on ENCODE data. However, eQTL analyses based on GTEx data showed that both SNPs are significantly associated with the expression of their genes. This eQTL effect was weaker for rs12484776, for which a significant difference in expression was identified in only 1 of 47 human tissues. In contrast, the effect was widespread for rs2280543 and its gene, BET1L. Expression of BET1L was significantly associated with rs2280543 in 15 of 47 human tissues, and the most significant hit, in skeletal muscle, had a P value on the order of 10⁻¹⁸. Interestingly, a similar eQTL pattern was also reported in the initial GWAS conducted by Cha et al.13. Based on in silico analysis, they found that rs2280543 was significantly associated with transcript levels of BET1L in three cell or tissue types: lymphoblastoid cell lines, peripheral blood mononuclear cells, and brain cortex. These findings on the functional consequences of the candidate SNPs suggest that they might be more than surrogate markers and might have real biological functions contributing to UL susceptibility. A potential limitation of our eQTL results is that these data were based on human tissues from normal samples rather than from UL patients. Therefore, we need to be careful not to draw premature conclusions. It is interesting to note that the protective allele T of rs2280543 from BET1L was significantly related to up-regulated expression of BET1L in multiple human tissues. This connection between UL disease risk and BET1L gene expression might point to underlying pathogenic mechanisms of UL, and further studies are needed to unravel this biological mechanism.

In this study, we tried to limit population stratification by recruiting only subjects with a stable area of residence35,36, but potential population stratification could not be completely ruled out. Moreover, as a candidate gene-based study, we mainly focused on several pre-selected common tag polymorphisms. This strategy minimizes the experimental expense at the cost of dropping >90% of the variants of a particular gene. Structural variations and low-frequency and rare variants were not detected in this study. Several recent studies have shown that these undetected DNA variants might play an important role in the susceptibility to complex disorders37. Sequencing-based studies are needed in the future to systematically evaluate the genetic risk of UL.

In conclusion, we showed that both BET1L and TNRC6B contribute to the risk of UL in Chinese women. Significant hits were identified by both single marker-based and haplotype-based analyses. Significant SNPs from BET1L and TNRC6B were also significantly associated with the number of fibroid nodes and the size of the nodes, respectively.