Bioinformatic analysis reveals the importance of epithelial-mesenchymal transition in the development of endometriosis

Background: Endometriosis is a common disease in women that seriously affects their quality of life. However, its etiology and pathogenesis are still unclear. Methods: To identify key genes and pathways involved in the pathogenesis of endometriosis, we retrieved 3 raw microarray datasets (GSE11691, GSE7305, and GSE12768) from the Gene Expression Omnibus (GEO) database, which contain endometriosis tissues and normal endometrial tissues. We then performed in-depth bioinformatic analysis to determine differentially expressed genes (DEGs), followed by gene ontology (GO), Hallmark pathway enrichment and protein-protein interaction (PPI) network analysis. The findings were further validated by immunohistochemistry (IHC) staining of endometrial tissues from endometriosis and control patients. Results: We identified 186 DEGs, of which 118 were up-regulated and 68 were down-regulated. In GO functional analysis, the DEGs were mainly associated with cell adhesion, inflammatory response, and extracellular exosome. We found that epithelial-mesenchymal transition (EMT) ranked first in the Hallmark pathway enrichment. EMT may potentially be induced by inflammatory cytokines such as CXCL12. IHC confirmed the down-regulation of E-cadherin (CDH1) and up-regulation of CXCL12 in endometriosis tissues. Conclusions: Utilizing bioinformatics and patient samples, we provide evidence of EMT in endometriosis. Elucidating the role of EMT will improve the understanding of the molecular mechanisms involved in the development of endometriosis.


Protein-protein interaction (PPI) network analysis.
The PPI network of DEG-encoded proteins was retrieved from STRING (version 11.0) 17 , with the search limited to "Homo sapiens"; interactions with a score > 0.700, corresponding to high confidence, were considered significant. Network construction and analysis were performed in Cytoscape (version 3.7.1). In addition, function and pathway enrichment analysis of the DEGs in the modules was performed with ClueGO (version 2.5.4); a P value < 0.05 was considered significant.
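The confidence filter and network construction described above can be illustrated with a minimal sketch. The edge list is hypothetical (the scores are invented for illustration and are not taken from STRING), and the helper names `build_network` and `connected_components` are assumptions. The sketch only shows how a score above 0.700 might be used to retain edges, and how node degree and connected components could then be read off the resulting network.

```python
from collections import defaultdict

# Hypothetical (protein_a, protein_b, score) tuples; scores are illustrative only.
EDGES = [
    ("CXCL12", "C3", 0.91),
    ("CXCL12", "CCL21", 0.88),
    ("ACTA2", "MYH11", 0.97),
    ("ACTA2", "MYL9", 0.93),
    ("ACTG2", "MYH11", 0.90),
    ("MYL9", "CDH1", 0.42),  # below the threshold, so this edge is dropped
]


def build_network(edges, min_score=0.700):
    """Keep edges whose score exceeds min_score and return an adjacency map."""
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if score > min_score:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Return connected components as sorted lists, using an iterative DFS."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


network = build_network(EDGES)
degrees = {node: len(neighbours) for node, neighbours in network.items()}
components = connected_components(network)
```

Nodes whose only edges fall below the threshold never enter the adjacency map, which mirrors the removal of isolated nodes before a network is drawn.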

Clinical sample collection. From June to October 2019, laparoscopic surgeries were performed at Jiangxi Maternal and Child Health Hospital (Nanchang, China), and 6 cases were pathologically diagnosed as ovarian endometriosis. According to the revised American Fertility Society (AFS-r) staging criteria for endometriosis, all patients with endometriosis were stage IV. Eutopic endometrial tissues were collected. The average age of the patients was (32.71 ± 1.12) years. Meanwhile, 6 endometrial tissue samples were collected from patients with benign ovarian teratoma as the control group. The average age of these patients was (32.18 ± 1.22) years.
All the collected endometrial tissues were confirmed as proliferative endometrium on histopathological examination. There was no significant difference in age between the groups (P value > 0.05). All patients had normal menstrual cycles, were neither pregnant nor lactating, had taken no hormonal medication in the 6 months before the operation, and had no obvious medical or surgical diseases or complications.
This study was approved by the Ethics Committee of Jiangxi Maternal and Child Health Hospital, China (No. EC-KT-201904). All patients signed informed consent for the study protocol. The experimental scheme was approved by the academic committee of Jiangxi Maternal and Child Health Hospital, and the experimental methods were carried out in accordance with the guidelines of the academic committee.
Immunohistochemistry (IHC) and image analysis. Fresh tissue specimens were taken during the operation, rinsed with physiological saline to remove blood and other impurities, fixed with 10% formaldehyde, dehydrated through a conventional ethanol gradient, embedded in paraffin, and serially sectioned with a paraffin microtome. Sections were baked at 65 °C for 1 h for dewaxing; the slides were then removed, soaked in xylene for 40 min and in absolute ethanol for 20 min. After one rinse in PBS, the slides were placed in the prepared sodium citrate solution (pure water: sodium citrate = 1000:1) and heated to boiling. After cooling, the sodium citrate solution was discarded, and the slides were washed with PBS and incubated with anti-CXCL12 antibody (1:200; Proteintech, Wuhan, China, 17402-1-AP) or anti-E-cadherin (CDH1) antibody (1:200; Proteintech, Wuhan, China, 20874-1-AP), followed by incubation with goat anti-mouse/rabbit IgG polymer antibody. After rinsing with PBS three times, staining was visualised using the peroxidase substrate solution diaminobenzidine. After counterstaining with haematoxylin, the slides were dehydrated in graded alcohol and mounted.
Image-Pro Plus software was used to convert the image format and to convert the grayscale units into integrated optical density (IOD) units. Area, density and IOD were then selected for measurement according to the manufacturer's protocol.
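As a rough illustration of what an integrated optical density measurement involves, the sketch below converts 8-bit grey values to optical density with the standard relation OD = log10(I0 / I) and sums the result over a selected region. This is a generic calculation under stated assumptions (a white level of 255, a hypothetical pixel patch and selection mask, and the helper names `optical_density` and `measure`); it does not describe the internals or calibration of Image-Pro Plus.

```python
import math

WHITE_LEVEL = 255  # assumed incident-light grey value for an 8-bit image


def optical_density(gray, white=WHITE_LEVEL):
    """Convert one grey value to optical density, clamping to avoid log(0)."""
    gray = max(1, min(gray, white))
    return math.log10(white / gray)


def measure(gray_rows, mask_rows):
    """Return (area, mean density, IOD) over the pixels selected by the mask."""
    densities = [
        optical_density(g)
        for grays, mask in zip(gray_rows, mask_rows)
        for g, selected in zip(grays, mask)
        if selected
    ]
    area = len(densities)
    iod = sum(densities)
    mean_density = iod / area if area else 0.0
    return area, mean_density, iod


# Hypothetical 2 x 3 pixel patch and a mask marking the stained pixels.
patch = [[255, 128, 64], [255, 32, 255]]
stained = [[False, True, True], [False, True, False]]
area, mean_density, iod = measure(patch, stained)
```

In this form IOD is simply area multiplied by mean density, which is why the three quantities are usually reported together.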
Statistical analysis. Student's t-test was used for statistical analysis between two different groups when variables were normally distributed, which was confirmed by Q-Q plots and the Shapiro-Wilk test (SPSS 18.0, Armonk, NY, USA). P value <0.05 was considered statistically significant.
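The two-group comparison can be sketched as a pooled-variance Student's t-test. The readings below are hypothetical values, not data from this study, and the function name `student_t` is an assumption. The sketch does not include the normality checks (Q-Q plots, Shapiro-Wilk) and does not compute an exact P value; instead it compares the statistic with the two-sided 5% critical value for 10 degrees of freedom (2.228), which matches two groups of six samples.

```python
import math
from statistics import mean, variance

T_CRIT_DF10 = 2.228  # two-sided 5% critical value of Student's t, 10 degrees of freedom


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for the pooled-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical readings for six control and six endometriosis samples.
control = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
endometriosis = [14.0, 15.0, 16.0, 17.0, 18.0, 19.0]

t_stat, df = student_t(endometriosis, control)
# The hard-coded critical value is only valid when df is 10.
significant = df == 10 and abs(t_stat) > T_CRIT_DF10
```

A statistics package would normally report the exact P value from the t distribution; the critical-value comparison here is a simplification chosen to keep the sketch dependency-free.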
Ethics approval and consent to participate. This study was approved by the Ethics Committee of Jiangxi Provincial Maternal and Child Health Hospital, China (No. EC-KT-201904). All patients signed informed consent for the study protocol and reserved the right to withdraw at any time.
... were identified, with 573 up-regulated and 304 down-regulated. In GSE12768, 3,212 DEGs were identified, with 1,627 up-regulated and 1,585 down-regulated. The expression of the top 50 DEGs for each of the three datasets was visualised on heat maps (Fig. 1a-c). All DEGs were highlighted in volcano plots (Fig. 2a-c). By comparing the DEGs across datasets, 186 DEGs that appeared in all 3 datasets were identified (Table 1). ... Table 2). In the cellular component category, DEGs were mainly involved in extracellular exosome, extracellular space and extracellular region (Fig. 3a; Table 3). In the biological process category, DEGs were mainly involved in cell adhesion, epithelial cell differentiation, inflammatory response and extracellular exosome (Fig. 3a; Table 4).
Signaling pathway enrichment in DEGs. Signaling pathway enrichment of DEGs in endometriosis was performed using Metascape. The most significantly enriched pathways were submitted to Hallmark genes hit analysis. Hallmark pathway enrichment analysis identified epithelial mesenchymal transition (EMT), estrogen response late and estrogen response early as top pathways (Fig. 3b; Table 5).
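Over-representation of a gene set among DEGs is commonly scored with a hypergeometric tail probability, and the sketch below shows that calculation in general form. The function name `hypergeom_sf`, the background size (20,000 genes) and the gene-set size (200 genes) are assumptions made for illustration; only the 186 DEGs and the overlap of 15 genes come from the text and its enrichment table. The resulting number is therefore illustrative and is not expected to reproduce the P values reported by Metascape.

```python
from math import comb


def hypergeom_sf(overlap, gene_set_size, n_selected, background_size):
    """P(X >= overlap) when n_selected genes are drawn without replacement."""
    total = comb(background_size, n_selected)
    upper = min(gene_set_size, n_selected)
    tail = sum(
        comb(gene_set_size, k)
        * comb(background_size - gene_set_size, n_selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# 186 DEGs and an overlap of 15 are taken from the text;
# the gene-set size and background size are assumed placeholder values.
p_value = hypergeom_sf(
    overlap=15, gene_set_size=200, n_selected=186, background_size=20000
)
```

Exact integer arithmetic is used up to the final division, so the tiny tail probability is not lost to floating-point underflow.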

Protein-protein interaction (PPI) network analysis in DEGs.
PPI analysis was performed using the online STRING database and Cytoscape software. After removing the isolated nodes and the partially connected nodes, a grid network was constructed using Cytoscape (Fig. 4). Pathway enrichment analysis revealed that the genes were mainly involved in vascular smooth muscle contraction, cell adhesion molecules, the NF-κB pathway, and the complement and coagulation cascade. (Table 5). In PPI network analysis, CXCL12 was found to be connected to the hub gene C3, while ACTG2, ACTA2, MYL9 and MYH11 formed a connected sub-network. In addition, a change in the expression of E-cadherin (CDH1), the prototypical epithelial cell marker, is characteristic of EMT. Therefore, although CDH1 is not listed in the gene set Hallmark_EMT, it was included in further analysis. Expression levels of these 6 genes (CXCL12, ACTA2, MYL9, ACTG2, MYH11 and CDH1) were analysed in the three datasets (Fig. 5). Significant increases were observed in CXCL12, ACTA2, MYL9, ACTG2 and MYH11 across all three datasets. A significant decrease in CDH1 was observed in all three datasets. We further investigated the expression of E-cadherin (CDH1) and CXCL12 in endometriosis and control tissues by IHC. As shown in Fig. 6, E-cadherin was significantly down-regulated in endometriosis (Fig. 6a; P value = 0.028), while CXCL12 was significantly increased in endometriosis (Fig. 6b; P value = 0.015).

Discussion
Endometriosis occurs in about 10-15% of females of reproductive age and its etiology is unknown 1,2 . At present there is no cure and the available treatment options are limited. The disease has a high recurrence rate, which adds to its large socio-economic impact 18 . Endometriosis is the growth of endometrium-derived cells outside the uterus, at sites such as the ovaries, peritoneum, intestines and vagina 19 . In a small number of cases (0.5-1%) endometriosis can lead to tumor formation 20 . The underlying mechanisms of the disease resemble those of malignant tumors, including cell proliferation, differentiation, apoptosis, migration, cell adhesion, invasion, and neurovascularisation 21 .
Utilising data from 3 microarray datasets (GSE11691 11 , GSE7305 12 , GSE12768 13 ), we identified DEGs between endometriosis tissues and normal endometrial samples, including 118 up-regulated and 68 down-regulated genes. GO functional analysis based on these DEGs shows that DEGs are mainly enriched in cell adhesion, inflammatory response, and extracellular exosome. These findings are similar to those previously published 22 .
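The overlap step, in which only DEGs present in all three datasets are retained, can be sketched with plain set operations. The gene calls below are hypothetical and do not reproduce the published gene lists; the function name `shared_degs` and the requirement that a gene change in the same direction in every dataset are assumptions made for this illustration.

```python
def shared_degs(*datasets):
    """Return genes called in every dataset with the same direction of change."""
    common = set(datasets[0]).intersection(*datasets[1:])
    shared = {}
    for gene in sorted(common):
        directions = {calls[gene] for calls in datasets}
        if len(directions) == 1:  # assumed rule: direction must agree everywhere
            shared[gene] = directions.pop()
    return shared


# Hypothetical per-dataset calls: gene -> "up" or "down".
gse_a = {"CXCL12": "up", "ACTA2": "up", "CDH1": "down", "GENE_X": "up"}
gse_b = {"CXCL12": "up", "ACTA2": "up", "CDH1": "down", "GENE_X": "down"}
gse_c = {"CXCL12": "up", "CDH1": "down", "GENE_X": "up", "GENE_Y": "up"}

shared = shared_degs(gse_a, gse_b, gse_c)
up = [gene for gene, direction in shared.items() if direction == "up"]
down = [gene for gene, direction in shared.items() if direction == "down"]
```

Here `ACTA2` is dropped because it is missing from one dataset, and `GENE_X` is dropped because its direction is inconsistent.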
Importantly, Hallmark pathway enrichment analysis identified EMT as the most significant pathway. A number of studies have implicated EMT in the development of endometriosis 23-25 . EMT is a biological process in which immotile epithelial cells acquire the phenotype of motile mesenchymal cells, accompanied by changes in cell morphology and gene expression 26 . It creates favourable conditions for the implantation and growth of endometriotic lesions 27 . During EMT, the expression of a number of epithelial surface markers is lost, including E-cadherin (CDH1), keratin, desmoplakin, mucin-1 and claudin, whilst a number of mesenchymal markers, such as N-cadherin, vimentin and fibronectin, are up-regulated 28,29 . Numerous signaling pathways are suggested to participate in EMT induction, including transforming growth factor β (TGF-β) 30 , the Wnt/β-catenin signaling pathway 31 , estrogen receptor β (ER-β) 32 , epidermal growth factor (EGF) 33 , mitogen-activated protein kinase (MAPK)/extracellular signal-regulated kinase (ERK) 34 , NF-κB 35 , estrogen receptor (ER)-α 36 and hypoxia-inducible factor (HIF)-1α 37 . The activities of these pathways appear to be interconnected and to depend on the particular epithelial or endothelial cell type affected, with different signaling molecules mediating their interconnection or crosstalk. Previous studies have also found that EMT can be induced by pro-inflammatory cytokines in endometriosis, such as TGF-β 38 , tumor necrosis factor (TNF)-α 39 and interleukin (IL)-6 40 . The mechanisms that present or activate TGF-β in the tissue microenvironment are important for the EMT response 41 . TGF-β-induced EMT mediated by inflammatory cells in the tumor microenvironment is promoted by leukotriene B4 receptor 2, which, in response to leukotriene B4, activates reactive oxygen species (ROS) and NF-κB transcriptional activity, facilitating the establishment of EMT by TGF-β 42 .
Table 4. Biological process analysis of DEGs in endometriosis.

Table 5 (fragment; Hallmark pathway enrichment).
M5930  Epithelial mesenchymal transition  15  4.75E-11
Estrogen response early  11  9.58E-09
In this unbiased study, we found that EMT in endometriosis could potentially be induced by inflammatory cytokines such as C-X-C motif chemokine ligand 12 (CXCL12), also known as stromal cell-derived factor 1 (SDF1). CXCL12 is highly expressed in endometriosis in our analysis, which is consistent with a previous report 43 . CXCL12 interacts with its specific receptor, C-X-C motif chemokine receptor 4 (CXCR4), although CXCR4 is not consistently over-expressed across the three datasets. The CXCL12-CXCR4 axis promotes proliferation, migration, and invasion of endometriotic cells 44,45 . In human papillary thyroid carcinoma, the CXCL12-CXCR4 axis promotes EMT by activating the NF-κB signaling pathway 46 . In a murine model of endometriosis, both C-X-C motif chemokine receptor 7 (CXCR7) and CXCL12 expression increased with grafting time 47 . Expression of CXCR7 is enhanced during pathological inflammation and tumor development, and CXCR7 mediates TGF-β1-induced EMT 48 . However, there were no probes for CXCR7 in the microarrays analysed in our study. In endometriosis, it is still unclear whether CXCL12 promotes EMT through the CXCL12-CXCR4 axis or the CXCL12-CXCR7 axis. PPI analysis showed that CXCL12 interacts directly with complement C3 and C-C motif chemokine ligand 21 (CCL21), and a previous study showed that CCL21 is up-regulated in endometriosis and acts through inflammatory responses 49 . In TGF-β-induced EMT, the expression of C-C motif chemokine receptor 7 (CCR7), the CCL21 receptor, is increased, which facilitates breast cancer cell migration 50 . Through IHC, we confirmed that CXCL12 is significantly increased in endometriosis, accompanied by a decrease in the expression of E-cadherin (CDH1), which is consistent with the bioinformatic analysis.
These findings together suggest that CXCL12 may lead to endometriosis through EMT, although further research is required.
EMT in endometriosis has been suggested to be associated with smooth muscle metaplasia and fibrogenesis 51,52 . We found various markers for smooth muscle cells in our analysis, including ACTA2 and MYL9, which interact with ACTG2 and MYH11 in the PPI network analysis. ACTA2 (α-SMA) is considered a marker of fibrosis and is up-regulated in endometriosis 53 , which is consistent with our findings. Previous studies 54,55 have shown that platelet-derived TGF-β1 can activate the TGF-β1/Smad3 signaling pathway, subsequently promoting EMT and fibroblast-to-myofibroblast trans-differentiation (FMT) in endometriotic lesions, which in turn promote smooth muscle metaplasia and ultimately lead to fibrosis.

Conclusion
By comparing 3 microarray datasets, we have identified 186 DEGs (118 up-regulated, 68 down-regulated) which may be involved in the progression of endometriosis. GO functional analysis determined DEGs were mainly enriched in cell adhesion, inflammatory response, and extracellular exosome. EMT was the highest ranked Hallmark pathway enrichment and we proposed that it could be induced by inflammatory cytokines and