Introduction

Pelvic organ prolapse (POP), the dropping of the pelvic organs due to the loss of normal support of the vagina, is an age-related condition associated with enormous physical and emotional discomfort for a vast number of women worldwide. In total, POP affects 40–50% of women, being one of the most common reasons for gynecological surgery1,2,3. Despite a relatively large number of scientific papers on POP, mechanisms of its occurrence remain unclear2, whereas understanding of POP pathophysiology is necessary for its prevention and treatment.

Expression studies provide valuable information for deciphering molecular mechanisms of diseases. Several reviews on POP have included results of the investigations of expression changes in POP mainly focusing on the genes/proteins of collagen, elastin, matrix metalloproteinases and their tissue inhibitors4, 5. The comparison of the expression patterns of multiple genes in different studies may be useful for understanding of disease pathogenesis at the gene level. To the best of our knowledge, no research has been performed so far to analyze all available data from expression studies in POP, on both specific gene and whole-genome/proteome levels. In order to identify new genes and biological processes implicated in POP pathogenesis, we conducted a systematic review of the expression studies and an in silico analysis of publicly available data sets related to POP development.

Results

Overview of studies assessing individual gene expression profile in POP

From a total of 465 studies found by searching the PubMed, Embase and Web of Knowledge resources (Supplementary Figure S1), 78 papers were selected for data analysis (Supplementary Table S1). One hundred twenty-two gene or protein products were studied, 113 of which corresponded to specific genes (Supplementary Table S1). In general, papers on associations between the studied genes or proteins and POP are characterized by high heterogeneity in experimental design and data presentation. Survey data for the genes/proteins investigated in five or more studies were visualized to display the proportions of up-regulation, down-regulation or non-significant associations with POP (Fig. 1). Proportions are indicated for the number of studies and for the number of subjects in these studies. The least variable results, with the largest number of studies performed, were found for MMP2/MMP2 followed by MMP1/MMP1. Data for MMP1 were in agreement with the results of the only meta-analysis in the field, namely a higher expression level of MMP1 protein in POP cases in comparison with controls6. Decreased activity or non-significant results were mainly registered for TIMP Metallopeptidase Inhibitors, Collagen type I alpha 1, Lysyl oxidase, Fibulin 5 and Elastin. Other data appeared to be far more contradictory.
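The per-marker proportions described above (by number of studies and by number of subjects) can be sketched as a simple tally. This is an illustration only: the record layout, the function name `direction_proportions` and the marker label are hypothetical, and the numbers are toy values, not data from the reviewed papers.

```python
from collections import defaultdict

def direction_proportions(records):
    """Tally reported directions per marker.

    records: iterable of (marker, direction, n_subjects) tuples, where
    direction is "up", "down" or "ns" (hypothetical layout).
    Returns two dicts: proportions by study count and by subject count.
    """
    studies = defaultdict(lambda: defaultdict(int))
    subjects = defaultdict(lambda: defaultdict(int))
    for marker, direction, n_subjects in records:
        studies[marker][direction] += 1
        subjects[marker][direction] += n_subjects

    def normalise(table):
        # Convert raw counts into fractions that sum to 1 for each marker.
        return {
            marker: {d: c / sum(counts.values()) for d, c in counts.items()}
            for marker, counts in table.items()
        }

    return normalise(studies), normalise(subjects)

# Toy records, invented for illustration.
toy_records = [
    ("MARKER_1", "up", 40),
    ("MARKER_1", "up", 60),
    ("MARKER_1", "ns", 100),
]
by_study, by_subject = direction_proportions(toy_records)
```

Weighting by subjects rather than by studies can change the picture noticeably when one large study disagrees with several small ones, which is why both views are reported.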

Figure 1

Summary of literature data on POP-related expression changes for selected genes. Numbers in boxes indicate the number of studies (Fig. 1a) and subjects (Fig. 1b) for each marker. Letter symbols (a–g) in the small boxes are explained in the upper right corner.

Next, we applied the KOBAS 3.0 resource7 to perform GO (Gene Ontology)8 gene set enrichment analysis for the gene spectrum considered in the expression studies. The REVIGO9 summary for the cluster GO representatives is provided in Fig. 2. As expected, the highest enrichment (the lowest P-value) was found for the term “extracellular matrix organization”, which is the most specific relative to the other displayed GO terms (it has the lowest frequency in the underlying GO annotation database). Of the 13 genes presented in Fig. 1, 11 were related to extracellular matrix (ECM) organization. These genes also contributed to the other biological processes which are indicated for the whole set of the studied genes (Fig. 2).

Figure 2

(a) Heat map for GO terms cluster representatives for genes considered in POP studies. (b) GO terms associated with the selected genes. Color indicates the user-supplied P-value; the term ‘frequency’ means frequency of the specific term in the underlying GO annotation database.

Overview of whole genome/proteome studies of POP

Papers assessing the whole genome/proteome expression profile in POP did not provide consistent results (Table 1). Two studies on pubococcygeus tissue in POP patients have shown both up- and down-regulation of cytoskeletal genes or proteins10, 11. In full-thickness vaginal wall biopsies, genes related to smooth muscle contraction, proteolysis, response to oxidative stress, transcriptional regulation, cytoskeletal organization, and lipid catabolism were over-expressed12. The analysis of 34 arrays (17 round ligaments (RLs) and 17 uterosacral ligaments (USLs)) has revealed that ‘immunity and defense’ genes were up-regulated in POP13. Expression changes of transcriptional response and signal transduction genes associated with estrogen were detected in the USLs of POP patients14. Genes relevant to cell cycle, proliferation and embryonic development as well as genes related to cell adhesion were down-regulated in USLs of females with uterine prolapse15. In USL samples, differentially expressed genes (DEG) between POP patients and controls were significantly enriched with those related to the canonical Wnt receptor signaling pathway (GO term) and neuroactive ligand-receptor interaction (pathway)16.

Table 1 Characteristics of the whole genome/proteome studies of POP.

Overview of available GEO datasets for POP

Among published whole genome studies, only the study of Brizzolara13 is available in the GEO database repository17. An unpublished study assessing the gene expression profile in prolapsed versus non-prolapsed vaginal tissue sites is also available in the resource (Table 2).

Table 2 Characteristics of the POP datasets in the Gene Expression Omnibus (GEO) database repository.

Since the focus of our study was to find consistent patterns of gene expression profiles in different POP-related datasets, we analyzed the whole dataset of Brizzolara13 stratified by ligament type (USLs or RLs) and menopausal status (premenopausal (PrM) or postmenopausal (PM)). We were guided by the fact that the USLs, being the main supportive structures of the uterus and vagina18, have demonstrated higher tensile biomechanical properties (stiffness and maximum stress) than RLs19. These differences may be linked with distinctive features of the gene expression profile. Additionally, in the study of Brizzolara13 USLs and RLs differed in smooth muscle cell composition; moreover, among the top 250 DEG only five genes coincided for USLs and RLs (the genes F5, NR4A3, CLC, SLC24A4 and GDF15 were up-regulated in both series). With regard to menopausal status, we have taken into account that expression patterns often differ between PrM and PM females20,21,22,23,24,25. Given that menopause is one of the main risk factors for POP, sample stratification based on menopausal status may provide more comparable results.

GO enrichment analysis for the GEO datasets

DEG were analyzed with the KOBAS 3.0 application. Data for GO terms associated with at least five genes in the studied sets are presented in Supplementary Table S2. The number of enriched terms was noticeably larger for up-regulated than for down-regulated genes. Biological process gene ontology terms for the five sets of over-expressed genes were summarized with the REVIGO application (Fig. 3). This analysis showed massive enrichment for genes implicated in tissue renewal and regeneration.

Figure 3

Graph for the results of enrichment analysis for five gene sets up-regulated in POP. More similar nodes are placed closer together. The line width indicates the degree of similarity between GO terms cluster representatives. Clustering by color into super clusters was obtained by using REVIGO TreeMap application.

Up-regulated genes were associated with a number of shared terms for all five gene sets. Interestingly, among the most specific GO categories (not more than 500 background genes in the GO database) terms that include the wording ‘positive regulation’ covered biological processes related to: adhesion (n = 3), locomotion (n = 4), activation (n = 3), transport (n = 2) as well as cytokine production, MAPK cascade and nervous system development. These mechanistic considerations also provide some evidence that genes that control tissue repair are over-represented among the up-regulated genes in all data sets under study.

A bird’s eye view on co-expression data for DEG

Co-expression analysis may give insights into altered regulatory mechanisms between disease and healthy controls, since co-regulated genes tend to exhibit similar expression patterns. Given that up-regulated genes in POP tissues were enriched with those relevant to tissue repair, we found it interesting to perform a comparative co-expression analysis. Pearson correlation coefficients were plotted in heat maps and density plots for paired genes (Fig. 4). Higher levels of co-expression in prolapsed versus healthy tissues were revealed among sets of up-regulated genes in all premenopausal groups (PrM_RLs, PrM_USLs, AVW) and down-regulated genes in the groups PrM_RLs and PrM_USLs. This observation may reflect the underlying activity of transcriptional networks, indirectly supporting the assumption of activation of regeneration processes in POP tissues; however, co-expression of down-regulated genes may reflect some intrinsic problems in the realization of these processes. In postmenopausal sets (PM_RLs and PM_USLs), density plots for up- and down-regulated genes were somewhat similar, thus indicating relatively high levels of positive and negative correlations in both control and POP specimens. A possible explanation of this phenomenon is that in aging tissues disturbances in regulatory mechanisms can predominate and/or obscure the activity of the repair processes. The results may also be random due to the small sample size.
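The comparison described above can be sketched as follows: compute Pearson r for every gene pair within one group (cases or controls) and report the share of pairs with r ≥ 0.7 and r ≤ -0.7, the thresholds used in Fig. 4. This is a minimal illustration, not the code used in the study (the analyses were run in R); the function names and the toy expression values are assumptions.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # undefined for a constant profile
    return cov / (sd_x * sd_y)

def coexpression_summary(expr, threshold=0.7):
    """expr maps gene -> expression values across the samples of one group.

    Returns all pairwise r-values plus the fractions of pairs with
    r >= threshold and r <= -threshold.
    """
    rs = [pearson(expr[g1], expr[g2]) for g1, g2 in combinations(sorted(expr), 2)]
    rs = [r for r in rs if not math.isnan(r)]
    if not rs:
        return rs, 0.0, 0.0
    strong_pos = sum(r >= threshold for r in rs) / len(rs)
    strong_neg = sum(r <= -threshold for r in rs) / len(rs)
    return rs, strong_pos, strong_neg

# Toy expression profiles (invented values, four samples per gene).
toy_cases = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.1, 6.0, 8.2],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
}
case_rs, case_pos, case_neg = coexpression_summary(toy_cases)
```

Running the same function on the control samples and comparing the two pairs of fractions reproduces the kind of contrast summarized in the density plots. With only a handful of samples per group, individual r-values are very unstable, which matches the caveat about sample size stated above.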

Figure 4

Co-expression heatmaps and density plots. Heatmaps display Pearson correlation coefficients for gene expression in controls (above the diagonal) and cases (below the diagonal). Density plots present correlation coefficients distribution for controls (blue) and cases (red) with addition of percentage of Pearson correlations (r-values) ≥ 0.7 and ≤ -0.7. Density plots x-axis: Pearson correlation coefficient (r), y-axis: density.

Enrichment analysis (GO, KEGG, GWAS Catalog) for the shared gene set

A total of 142 up-regulated and 12 down-regulated genes were shared between two or more datasets (Supplementary Table S3). All top DEG in the study of Brizzolara13 appeared to be in the list of shared genes.

The results of GO enrichment analysis for shared up-regulated genes correlated with those obtained for the individual sets (Fig. 5). For down-regulated genes, GO analysis did not yield significant terms.

Figure 5

REVIGO scatterplot for GO terms cluster representatives for shared genes up-regulated in POP tissues. Bubble color indicates the user-supplied P-value; size shows the frequency of the GO term in the GO annotation database.

All shared up-regulated genes were tested for enrichment of metabolic pathways and association signals from GWAS data (Table 3). The enriched KEGG pathways were mainly linked with inflammatory and immune-mediated diseases. Of great interest were the results of the enrichment analysis of GWAS association data which indicated that genes associated with inflammatory bowel disease (IBD) and Crohn’s disease (one of the two main forms of IBD) were overrepresented among shared up-regulated genes. These genes (with the exception for the genes SLC22A4, BORCS5 and CNNM1) are involved in ‘immune system process’ (GO term).

Table 3 The results of gene set enrichment analysis for shared genes up-regulated in POP tissues.

Literature data supporting associations revealed for the shared genes

Several associations revealed in the set of shared genes were partially supported by literature data. Three different kinds of evidence were considered: (i) results from animal models, (ii) evidence from studies of gene polymorphisms and (iii) expression data for any type of prolapse with the same direction of association. The use of (iii) as evidence was based on the rationale that heart valve, cartilage, tendon, and bone development share common regulatory pathways26. The genes ADAMTS1 12, MYH3 27 and SERPINE1 28 were up-regulated in POP tissues. Polymorphic variants in the genes LIN28B 29 and AGT 30 were associated with POP and mitral valve prolapse (MVP), respectively. Nfil3-/- mice developed colitis with a high prevalence of rectal prolapse31. High expression of Nlrp3 was found in Il10-/- mice with colitis combined with rectal prolapse32. The genes CTSK 33, MMP19 and THBS4 34 were over-expressed in MVP. Egr2-/- mice had features of human aortic valve disease, in particular excess proteoglycan deposition and a reduction of collagen fibres35. This information is given in more detail in Supplementary Table S4.

Discussion

This study highlighted some important biological processes and putative candidate genes most likely linked with POP development. We present below our vision of POP pathogenesis which is based on our findings and literature data.

The majority of individual gene expression studies in POP were focused on genes related to ECM. Summary data obtained on dozens and even hundreds of patients (Fig. 2) basically supported the common opinion on expression changes of these genes in POP. The results presented in the whole-genome/proteome studies as well as the results of the in silico analysis did not confirm these observations. The discrepancies may be linked with the small samples in the whole genome sets and with a highly variable design in terms of investigated tissue, menopausal status, ethnicity, experimental methods and other factors. Different designs were also used in the studies of candidate genes; however, in larger samples the differences are smoothed out, resulting in more accurate parameter estimates, which, in turn, lead to a greater probability of finding the desired results. Correction for multiplicity was rarely applied in the studies considering several individual genes; in the genome-based approach, however, some method of selecting the most significant genes was always used (FDR correction, top DEG, top gene sets), with the risk of missing less pronounced but biologically plausible correlations.

Our in silico analysis included three independent (AVW, PrM and PM) and two dependent (USL and RL from the same subjects) sets. The design was focused on the search of expression changes in one set and the validation of found differences in other sets. Genes related to ECM-structure were not represented among the sets of shared genes according to the GO results. The shared terms for up-regulated genes revealed the need for response to stimulus, locomotion, adhesion, immune, rhythmic, developmental and some other biological processes which should precede ECM synthesis. In this context, given the substantial histological differences between prolapsed versus non-prolapsed tissues36, 37, changes in expression for the most studied ECM genes may be found as a consequence of prolapse rather than an underlying cause38. This assumption is in line with the results of a few studies on the role of germline genetic variations in POP. Association studies on the whole-genome level have not revealed genes expected to be linked with connective tissue disorders29, 39. Meta-analyses in this field yielded unstable results40, 41. Contradictions in the studies of germline genetic variations may partially depend on the underestimation of important risk factors such as perineal trauma in childbirth in the majority of POP genetic association studies42, 43. The genetic component of prolapse is rather high44, 45 and, given the presence of different kinds of biological activities of proteins encoded by the studied ECM genes (Fig. 2), the question of whether they are only markers or to a certain extent causative genes for POP is still open.

Our findings that in POP tissues genes involved in tissue repair were up-regulated appeared to be unexpected but biologically plausible. Several levels of evidence supported this statement: (i) in all five gene sets under study as well as (ii) in the shared gene set, DEG were enriched with those involved in biological processes implicated in tissue regeneration; (iii) in three PrM sets there was a high co-expression (conceivably co-regulation) of DEG just in POP specimens. Compensatory mechanisms aimed at the recovery of damaged tissues require coordinated regulation of many biological processes with a crucial step being the recruitment of blood cells which mediate inflammatory and immune responses, promoting tissue repair. Inefficiency of these processes may be linked with many reasons, among which the most important are age-related changes. The proportion of women with pelvic floor disorders is dramatically increasing with age reaching up to 10% in women aged 20 to 39 years and up to 50% in women aged 80 years or older1.

ECM is constantly being remodeled through degradation and reassembly. In response to injury and other stimuli, remodeling rates increase significantly. ECM degradation products induce inflammation46, which is in turn associated with acceleration of proteolytic cascades leading to further destruction of ECM. Inflammation, whether or not linked with infection and other diseases, increases with age and age-related clinical conditions. A balanced immune and anti-inflammatory response is crucial for successful regeneration of vaginal tissues. In youth and maturity, clinically asymptomatic damage is compensated by repair processes and/or other components of the pelvic floor support mechanism. Aging and menopause are associated with oxidative stress and hormonal disturbances, both conditions strongly exacerbating ECM breakdown processes47,48,49. Tissue remodeling may be successful in youth and maturity, but in old age many processes are deregulated, and this factor may at least partially explain the delayed onset of the disease, whose major risk factors (parity and perineal trauma in childbirth) are linked to youth.

An interesting finding in our work is that up-regulated genes in prolapsed tissues were enriched with those related to IBD in the GWAS Catalog. IBD includes Crohn’s disease and ulcerative colitis. Persons with constipation-predominant symptoms may suffer from pelvic floor muscular incoordination and failure of normal relaxation of pelvic floor muscles during attempted defecation. GWAS-implicated variants lie in genes that may be linked with immune-mediated disturbances in physiological homeostasis in both diseases. Another hypothesis explaining shared susceptibility to both disorders might come from the results of animal studies: IBD frequently co-exists with rectal prolapse, which is in turn associated with other types of genital prolapse since rectocele leads to overdistension of the perineal body50.

The study has some limitations, the main one being the small number of datasets and the small number of samples in these datasets. These data appeared to be insufficient for the construction of co-expression networks. The results of the enrichment analysis for the overlapping up-regulated genes with GWAS association signals should be regarded as preliminary. These results raise a question rather than provide an answer on a possible shared genetic component for IBD and POP.

Finally, our analysis provided some in-depth data important for understanding POP pathogenesis. In terms of genetic overlap between IBD and POP, the work has translational impact. The study findings are biologically plausible; however, they require verification in independent studies.

Materials and Methods

Selection of studies assessing individual gene expression profile

The final search of the PubMed, EMBASE and Web of Science databases was performed on 11 January 2017 in compliance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analysis) guidelines51. The following keyword terms were used as search criteria: pelvic organ prolapse, vaginal prolapse, genital prolapse, uterovaginal prolapse, uterine prolapse, prolapse of vaginal vault, pelvic floor dysfunction, pelvic floor disorder, cystocele, rectocele in combination with the terms: expression, production, secretion, gene and protein. Additional articles were identified by checking the reference lists of relevant articles. We used the following inclusion criteria: the article had to be published in English and had to have evaluated POP-related expression changes in pelvic floor supportive tissues in a case-control study in vivo. For overlapping studies, we selected those with the larger number of subjects.

Microarray data processing

Results of two studies submitted to the Gene Expression Omnibus (GEO) repository were processed with GEO2R, an interactive web tool which uses Limma R packages from the Bioconductor project17 for comparison of user-defined groups of samples under the same experimental conditions (http://www.ncbi.nlm.nih.gov/geo/geo2r/). The use of the ‘Value distribution’ option showed that the data have been normalized and are therefore cross-comparable.

Gene set enrichment analysis

We generated gene sets of DEG from the GEO2R data by setting the cut-off P-value < 0.05 (without correction for multiplicity) and fold change ≥2.0. These gene sets were treated with the KOBAS 3.0 resource7 for Gene Ontology (GO) enrichment analysis. The following settings were applied: the minimum number of genes per category was five, while the Benjamini and Hochberg false discovery rate (FDR) corrected P-value threshold was 0.05. The web server REVIGO was utilized for summarizing GO terms, which was guided by the P-value9. We took into account the hierarchical structure among GO terms for data interpretation: when a gene is associated with a term, it is automatically associated with its parent terms8.
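The two thresholds used above can be illustrated with a short sketch: a DEG filter on unadjusted P-value and fold change, and a Benjamini-Hochberg adjustment of the kind applied to enrichment P-values. The sketch assumes a table of (gene, log2 fold change, P-value) rows, so that fold change ≥ 2.0 in either direction corresponds to |log2FC| ≥ 1; this row layout and the function names are assumptions made for illustration, not the GEO2R or KOBAS implementation.

```python
import math

def select_deg(rows, p_cutoff=0.05, fold_cutoff=2.0):
    """Split rows into up- and down-regulated gene lists.

    rows: iterable of (gene, log2_fold_change, p_value) tuples
    (hypothetical layout). A gene passes if its unadjusted P-value is
    below p_cutoff and its fold change is at least fold_cutoff in
    either direction.
    """
    min_abs_logfc = math.log2(fold_cutoff)
    up, down = [], []
    for gene, logfc, p_value in rows:
        if p_value < p_cutoff and abs(logfc) >= min_abs_logfc:
            (up if logfc > 0 else down).append(gene)
    return up, down

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy rows, invented for illustration.
toy_rows = [
    ("GENE_A", 1.5, 0.01),
    ("GENE_B", -1.2, 0.04),
    ("GENE_C", 0.5, 0.001),   # significant but fold change too small
    ("GENE_D", 2.0, 0.2),     # large fold change but not significant
]
up_genes, down_genes = select_deg(toy_rows)
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Note the asymmetry in the design described above: the DEG lists are built without multiplicity correction, while the FDR threshold is applied one step later, to the enriched categories.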

KOBAS 3.0 tool was additionally used for other types of enrichment analyses for shared genes between gene sets under study, namely metabolic pathways analysis (KEGG PATHWAY) and comparison with the NHGRI GWAS Catalog associations.
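Over-representation analyses of this kind are commonly based on a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation for a single category under the assumption that this is the test applied; the function name, argument names and toy counts are illustrative and do not reproduce KOBAS internals.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_term, n_query, n_overlap):
    """Upper-tail hypergeometric P-value for one category.

    n_background: genes in the background set
    n_term:       background genes annotated to the category
    n_query:      genes in the tested list (e.g. shared up-regulated genes)
    n_overlap:    tested genes annotated to the category
    Returns P(X >= n_overlap) for a random draw of n_query genes
    without replacement from the background.
    """
    upper = min(n_term, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy counts: 10 background genes, 5 annotated, 5 tested, all 5 overlap.
toy_p = hypergeom_enrichment_p(10, 5, 5, 5)
```

The resulting per-category P-values would then be corrected for multiple testing across all categories before applying the 0.05 threshold.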

Statistical considerations

Our study was performed to identify common molecular features for POP phenotype. Many promising markers selected in a single data set appeared to be not so promising or even non-significant in independent sets (a winner’s curse problem). External/independent validation is recommended for large-scale (OMICS) studies52. Taking into account these recommendations, we searched for shared DEG with the same direction (down-regulation or up-regulation) of the association in independent sets.
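The search for shared DEG can be sketched as a direction-aware intersection: a gene counts as shared only if it is up-regulated (or only if it is down-regulated) in at least two sets. The dataset labels follow the group names used in this paper, while the gene identifiers, the dictionary layout and the function name `shared_deg` are placeholders chosen for illustration.

```python
from collections import Counter

def shared_deg(deg_sets, min_sets=2):
    """Find genes with the same direction of change in several sets.

    deg_sets: dataset name -> {"up": set of genes, "down": set of genes}
    Returns {"up": ..., "down": ...} with genes present in at least
    min_sets datasets for that direction.
    """
    shared = {}
    for direction in ("up", "down"):
        counts = Counter(
            gene for sets in deg_sets.values() for gene in sets[direction]
        )
        shared[direction] = {g for g, n in counts.items() if n >= min_sets}
    return shared

# Toy DEG sets with placeholder gene names (not study results).
toy_sets = {
    "PrM_RLs": {"up": {"GENE_A", "GENE_B"}, "down": {"GENE_X"}},
    "PrM_USLs": {"up": {"GENE_A"}, "down": {"GENE_B"}},
    "AVW": {"up": {"GENE_C"}, "down": {"GENE_X"}},
}
toy_shared = shared_deg(toy_sets)
```

In the toy data, GENE_B changes in two sets but in opposite directions, so it is not counted as shared. A stricter variant could require the supporting sets to come from different subjects, since the USL and RL sets are not independent of each other.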

Data analyses and visualization were conducted with the R statistical software53.

Compliance with ethical standards

As a secondary analysis of public data the study does not require IRB approval.