Introduction

Pelvic floor disorders (PFDs), are considered as one of the most common complaints among women refering to gynecology clinics. PFDs include a wide range of clinical manifestations that have direct effects on the urinary system, digestive system, and sexual activity. It significantly disrupts women's daily activities and quality of life (QOL). It also imposes a huge financial burden on the health care system every year1.

PFDs include a wide range of symptoms and anatomical changes related to the abnormal behavior of the pelvic floor muscles, classified by the range of symptoms2. The ranges of pelvic floor symptoms in women include lower urinary tract symptoms (LUTS) (urinary incontinence (UI), urgency and increased frequency of urination, feeling of incomplete emptying, impaired urinary elimination), bowel symptoms (fecal incontinence (FI), constipation, obstructed defecation) and sexual problems (dyspareunia, orgasm disorder), pelvic organ prolapse (POP) and genito-pelvic pain in women3.

The pelvic floor is an anatomic unit with a wide range of functionalities, including pelvic organs support and sexual function. Thus, it is expected that the PFDs will lead to more than one simultanous symptom, in more than one area of the pelvic floor4. In this regard, a review study by Vries et al. in 2022 reported that about a quarter to a third of women experienced one or more pelvic floor symptoms (such as urinary incontinence, fecal incontinence, POP) at the same time4. Also, the results of Kenne et al.’s study (2022) showed that at least 32 percent of women suffer from one category of PFDs. The most common of which is bowel dysfunction, with a 24.6 percent share, followed by urinary incontinence and POP, which had a 11.1 and 4.4 percent share, respectively5.

Generally, the prevalence of PFDs varies around the world, according to most studies, 11.5 to 35 percent of women worldwide suffer from PFDs in silence6,7,8. In this regard, the results of a study in the United States of America reported that the demand for the treatment of PFDs will increase by 35% by 20309.

The main cause of PFDs is unknown. Several factors are involved in its occurrence10. Age11, ethnicity12, multiparity13, mode of delivery14,15, traumatic injuries16, history of pelvic surgery17, pregnancy18,19, vaginal delivery20, chronic cough21,22, obesity23, spinal disorders24, family history25, pelvic floor muscle dysfunction and genetics25 are the most common risk factors that can be identified.

Overally, women of reproductive age are at greater risk for PFDs26. Given that early treatment of PFDs allows for conservative, non-surgical treatment options such as Kegel exercises (pelvic floor muscle training (PFMT)) or pessary placement which have shown their effectiveness, therefore, in order to identify this condition early, screening tools will be very useful. The screening of this condition in health centers with a reliable instrument provides an optimal opportunity for counseling and prevention27,28. According to the International Consultation on Incontinence (ICI), the most reliable measure to assess the presence, severity, and impact of PFDs on the patient's activities and health is the patient-reported outcomes measures (PROMs).

PROMs are defined by the Food and Drug Administration (FDA) as "any report of health status that is provided directly by the patient, without interpretation of the patient's response by a physician or any other person"29,30. PROMs, an old and reliable method of data collection, are usually in the form of questionnaires that provide a clear perspective of the patient's problem, especially in multifactorial conditions, such as PFDs. Over hundreds of PROMs are available for use in the screening of pelvic floor disorders and to measure the results of treatment. Although these questionnaires were well-accepted, they often fail to address all aspects of pelvic floor dysfunction. APFQ is a simple but comprehensive tool designed to evaluate PFDs in clinical practice. Although most aspects of the pelvic floor function can be evaluated using several overlapping ICI questionnaires, completing it for patients in a typical clinical environment is very time-consuming because a single questionnaire does not cover all areas31,32.

APFQ, as a validated PROM for routine urogenycological evaluation and research on PFDs, designed in Australia by Baessler et al. in 2009, evaluates PFDs with 43 questions in 4 different factors (bladder function (BLF), bowel function (BF), prolapse symptom (PS) and sexual function (SF)), emphasizing the severity, degree of bother experienced and their impact on the QOL of women33. It has also been translated recently and used in several languages including Chinese34, Turkish35, Serbian36, Spanish37 and French38.

Considering the high prevalence of pelvic floor symptoms and numerous negative consequences both physically and psychologically for Iranian women, there is a great need to develop instruments for screening PFDs and use them in health centers. Since the current version of the Australian Pelvic Floor Questionnaire has not been validated in Iran and its reliability has not been investigated. Therefore, we decided to conduct this study with the aim of determining the psychometric properties of the APFQ in Iranian reproductive age women.

Methods

Research population and setting

This study was a methodological cross-sectional study. It was conducted on reproductive age women referring to health centers in Tabriz, Iran, from May 2022 to September 2022. The purpose of this study was to determine the psychometric properties of the Persian version of the APFQ-IR during five stages, including translation, content validity, face validity, construct validity and reliability assessment in reproductive age of women in Tabriz, Iran.

Translation process

In order to carry out the translation process and determine the psychometric properties of the instrument, in the first step, a permission was requested from the main designers of the tool33. Then, to increase the accuracy of the translation process, the translation was done using two approaches, Dual panel (DP) and Beaton et al.’s five steps (Guidelines for the Process of Cross-Cultural Adaptation of Self-Report Measures). In the first approach (DP), the translation process is done in three steps39. The first panel (expert panel) consisted of 10 reproductive health, obstetrics and nursing education specialists. The second panel (layman panel) consisted of 10 eligible women. In the third stage (the target group panel), 400 eligible women of reproductive age completed the questionnaire in the presence of the researcher. In the second approach, according to Beaton et al.'s guidelines, the translation process was implemented through 5 stages including Stage I: Initial Translation, Stage II: Synthesis of The Translations, Stage III: Back Translation, Stage IV: Expert Committee, Stage VI: Pretesting40.

In the first stage, two translators (T1, T2) whose mother tongue was Farsi, performed the translation completely independently using the Forward-Translation method. The first translator was well learned in the field of PFDs and the second translator should preferably not have a medical or clinical background. This translator is called a naive translator and is more likely to recognize a different meaning of the original text than the first translator, and the translation she/he provides reflects the language used by that population and often highlights ambiguous meanings in the original questionnaire. Finally, two translators and a supervisor combined the results of the translations during sessions using the original questionnaire as well as the versions of the first translator (T1) and the second translator (T2) (production of a joint translation, T-12)41.

In the Backward-Translation stage, two other native English translators (BT1, BT2), who were completely blind to the original version of the questionnaire, re-translated the questionnaire into English using the T-12 version of the questionnaire. The translators also ensured that the final version of the questionnaire was comprehensible to a 12-year-old (approximately 6th grade reading level). The fourth stage is the expert committee’s review.They reviewed all the translations (T1, T2, T12, BT1, and BT2) along with the written reports from four viewpoints of Semantic Equivalence, Idiomatic Equivalence, Experiential Equivalence, and Conceptual Equivalence. The final stage is the pre-test, which seeks to use the pre-final version in the target group. In the pre-test phase, the researcher provided the final translated version of the questionnaire to 30 women of reproductive age who meet the criteria in health centers. This was done to evaluate the clarity and comprehensibility of the final version for the target group and also to examine its internal consistency. After responding, women were asked again about their understanding of the questions, the level of difficulty, and the cultural appropriateness of the phrases. Participants in this stage were encouraged to provide feedback on all sections of the questionnaire so that the final Iranian version of the questionnaire would be more culturaly acceptable in the Iranian community41.

Content and face validity

After the final version of the questionnaire was prepared, in order to assess the content and face validity of AFPQ-IR, the content validity determination form was given to 10 reproductive health, obstetrics and nursing education specialists with expertise in the field of instruments and the field of PFDs and 10 eligible women. In the qualitative section, in order to measure the content validity of the AFPQ-IR, experts' opinions were received in terms of the overall structure of the questionnaire, the content of the items, Persian grammar and correct scoring and then, corrections were made. In addition, in the quantitative section, content validity ratio (CVR) and content validity index (CVI) were calculated42. In order to perform CVR, experts’ opinions were received about each of the items of the instrument using a 3-point Likert scale in terms of the necessity (necessary, useful but unnecessary, and unnecessary) in the instrument. After obtaining the opinions of experts in the present study, according to the analysis of content validity by using Lawshe’s CVR technique, the minimum acceptable value for 10 experts is 0.62. As a result, cases with CVR > 0.62 were kept43. After that, in order to determine CVI, experts were asked to determine the relevance, clarity and simplicity of the items using a 4-point Likert scale based on the Waltz and Bausell index44. CVI values vary between 0 and 1. Items with a CVI more than 0.79 were accepted45. Then, in order to determine the qualitative face validity, the items were assessed in terms of difficulty level, relevance and ambiguity by 10 eligible women (target group). In the quantitative face validity, the item impact method using a 5-point Likert scale from unimportant (1) to very important (5) was used to determine the impact score, and finally, the items with Impact score ≥ 1.5 were kept46.

Construct validity

Finally, construct validity was assessed using exploratory factor analysis (EFA), with Kaiser-Meyer Olkin (KMO) and Bartlett’s test of sphericity criteria. Also, in order to determine the factors, principal component analysis method with varimax rotation (direct oblimin) was used. The amount of factor loading was considered above 0.347,48. In confirmatory factor analysis (CFA), a series of indices such as the root mean square error of approximation (RMSEA < 0.08), standardized root mean square residual (SRMR < 0.10), normed Chi2 (χ2/df) < 5, comparative fit Indices including comparative fit index (CFI > 0.90) and Tucker–Lewis Index (TLI) > 0.90 were used to assess the fit of the model49,50. Finally, after factor analysis and removal of inappropriate items from the instrument, the floor and ceiling effect (F/C) was assessed, i.e. the samples with the highest and the lowest possible scores were judged whether the ceiling and floor effects are true about them or not, respectively.

Based on a rule of thumb, the sample size for exploratory factor analysis is classified as 50 = very poor, 100 = poor, 200 = fair, 300 = good, 500 = very good and 1000 = excellent51. The number of samples for construct validity assessment in factor analysis is 5 to 10 samples for each instrument item. Therefore, by considering 5 samples in each item with design effect equal to 1.5 and considering 30% attrition, 400 eligible women who had referred to Tabriz health centers were selected using cluster sampling method. For sampling, a quarter of the centers were randomly selected using the website http://www.random.org and the list of samples were selected based on SIB system (integrated health system). The inclusion criteria included all women regardless of having diagnosis of PFD or not, women of reproductive age (15–49 years), having sexual activity, monogamous husband and not having a known pregnancy at the time of the study. Women with a dementia, psychological disorders such as depression, intellectual disabilities, schizophrenia, addiction to drugs and/or alcohol, previous or current malignancy, being in 12 months after delivery, recent history of urinary tract infection (UTI), a history of gynecological surgery including reconstructive, cosmetic surgery and pelvic surgery, sexually transmitted diseases (STDs), and White Blood Cell (WBC) > 3 in the urine analysis test (U/A) and illiterate women were excluded from the study.

Then, after providing a comprehensive explanation about the research to the participants and receiving informed consent, the researcher provided them with socio-demographic and obstetric characteristics questionnaire and the Persian version of the Australian Pelvic Floor Questionnaire (APFQ-IR). Socio-demographic and obstetric characteristics included information such as age, body mass index (BMI), gravidity, parity, education level, occupation, income, smoking status, type of delivery, hysterectomy, prolapse surgery, and family history of PFDs. The APFQ questionnaire, was used to investigate PFDs. Higher scores indicate more severe pelvic floor disorders. Its validity (content validity, face validity, construct validity) and reliability were assessed in this study. This instrument was designed by Baessler et al. in Australia, and it contains 43 questions and is divided into four factors: BLF (Q1-15), BF (Q16-27), PS (Q28-32), and SF (Q33-43). The scoring is not based on the Likert scale. Most of the questions are scored from 0 to 3 using different descriptions such as Never, Occasionally, Frequently, and Daily to evaluate intensity/repetition, and Not at all, Slightly, Moderately, and Greatly to estimate bothersome symptoms. The scores in each area are calculated separately, divided by the number of questions in each field and then multiplied by 10. The overall score for each area is between 0 and 10, and the maximum score for PFDs is 4052. The higher the score, the more intense the PFDS.

Reliability

On the other hand, to determine the reliability of the questionnaire, test–retest reliability and internal consistency were used53. To determine the test–retest reliability, the questionnaire was completed by 30 eligible women of reproductive age who had referred to the health centers of Tabriz city by random sampling method in two stages with a time interval of two weeks. Internal consistency was also assessed by determining Cronbach's alpha coefficient and Mcdonald’s omega coefficient for each factor and the whole instrument. Intra-class correlation coefficient (ICC) greater than 0.6 and Cronbach's alpha coefficient and Mcdonald’s omega coefficient above 0.7 were considered favorable54.

Ethical consideration

The present study was approved by the Ethics Committee of Tabriz University of Medical Sciences (Ethics code: IR.TBZMED.REC.1400.1073). All ethical principles, including obtaining necessary permission from the main designers of the instrument (Baessler et al.), obtaining written informed consent from all participants, ensuring the confidentiality of their information, and freedom to withdraw from the study were observed at every step.

Statistical analysis

SPSS Statistics 14 (IBM Corp, Armonk, NY, USA) and STATA 14 (Statcorp, college station, Texas, USA) and R software 4.2 (Psych package) were used for data analysis. In this study, the socio-demographic and obstetric characteristics, content validity, face validity, construct validity and reliability were determined respectively through Mean (SD) for quantitative variables and frequencies (percentages) for qualitative variables, CVR and CVI, Impact score, EFA and CFA and finally, Cronbach's alpha coefficient, McDonald's omega and ICC were assessed.

Ethics approval and consent to participate

The current study was approved by the Ethics Committee of Tabriz University of Medical Sciences [ref: IR.TBZMED.REC.1400.1073]. Written informed consent to participate in the study was obtained from all the participants before enrolment. All methods were performed in accordance with the Declaration of Helsinki.

Results

400 women of reproductive age with a mean (SD) age of 34.4 (7.2) (range 16–49) were included in this study between May 2022 and September 2022, with cluster sampling method. The mean (SD) of body mass index was 26.9 (4.1) and more than three quarters of them (82.3%) were housewives. Other socio-demographic and obstetric characteristics of the participants are summarized in Table 1.

Table 1 Socio-demographic and obstetric characteristics of participants for factor analysis of APFQ-IR (n = 400).

The mean (SD) of the entire APFQ in the present study was equal to 0.72 (0.67), and for the 4 extracted factors including SF, BLF, BOF, and PS, it was 0.56 (0.77), 0.97 (1.06), 0.20 (0.66) and 1.16 (1.32) respectively (Table 1).

Assessing the content validity of the tool, CVI, CVR were obtained as 0.94 and 0.96, respectively, which indicates the good reliability of the instrument. But on the other hand, question 36 of the instrument (vaginal sensation during intercourse) (SF4) was corrected because it had a CVI = 0.58. Moreover, in the face validity review, all the items were described as appropriate and without ambiguity and difficulty and received a minimum score of 1.5. The details of the results of content and face validity assessment are shown in Table 2.

Table 2 The results for the content and face validity of the Iranian version of APFQ-IR (n = 10).

In the construct validity assessment, the total number of questions in the original version of the instrument was 43, but due to the difference in the way of answering and missing (because only sexually inactive women had to answer them), 3 questions (SF1-SF3) were not included in the exploratory factor analysis. Finally, exploratory factor analysis was performed only on 40 items, which led to the extraction of 4 factors that explained 45.53% of the variance. During the process of exploratory factor analysis, questions 17 (BOF2), 24 (BOF9), 28 (PS1) and 41 (SF9) were also removed due to factor loading less than 0.3 and finally the number of questions was reduced from 43 to 36 questions (Fig. 1).

Figure 1
figure 1

Factor structure model of the APFQ-IR based on CFA. (All factor loadings are significant at P < 0.001). BLF bladder function, BOF bowel function, PS prolapse symptom, SF sexual function.

The four factors extracted during exploratory factor analysis are: The first factor was BLF, which includes 15 questions, accounting for 14.99% of the total variance. The second factor is BOF and has 10 questions, which explains 11.90% of the total variance. Finally, the third and fourth factors are PS, with 4 questions, and SF, with 7 questions, explaining 8.75% and 9.88% of the total variance, respectively (Table 3). The results indicating the adequacy of the sample size (Kaiser–Meyer–Olkin = 0.750) were obtained at a significance level of less than 0.001. Also, the result of Bartlett's sphericity test was significant, which indicated the acceptable performance of factor analysis according to the correlation matrix in the studied sample (P ≤ 0.001) (Table 4).

Table 3 Result of factor analysis of the APFQ-IR scale based on EFA (n = 400).
Table 4 KMO and Bartlett's test of the Iranian version of APFQ-IR (n = 400).

In confirmatory factor analysis (CFA), 4 factors obtained in exploratory factor analysis (36 items) were examined. The results showed that this model achieved a good level of fit, based on which the factorial structure can be confirmed (RMSEA = 0.08, SRMR = 0.08, TLI = 0.90, CFI = 0.93, χ2/df = 3.52) (Table 5).

Table 5 The model fit indicators of the APFQ-IR (n = 400).

Finally, to determine the reliability of the tool, (Cronbach's alpha = 0.85; McDonald's omega (95% CI) = 0.85 (0.83–0.87) and Intraclass Correlation Coefficient (95% CI) = 0.88 (0.74–0.94)), were obtained, showing a good instrument reliability. Moreover, ceiling effect was not observed in the overall value and sub-domains, but the floor effect in the overall score (APFQ-IR) was equal to 8.5% and the values of the sub-domains are detailed in Table 6.

Table 6 Reliability statistics, floor and celling effect of the APFQ-IR.

Discussion

Pelvic floor disorders (PFDs) are a significant health problem among women living in low and middle-income countries. This problem exists because many women with PFDs, due to misconceptions and lack of awareness of the existence of treatment options, fear of discrimination, feeling of shame and society's culture, hide their problem55,56. Therefore, the existence of a reliable instrument to measure the symptoms of PFDs, seems necessary. This study, aiming to psychometrically evaluate the APFQ among Iranian women, indicates that the Persian version of this questionnaire (APFQ-IR) has acceptable psychometric properties for evaluating PFDS, and can be used as a valid and reliable tool among Iranian women.

At the present time, despite the design of numerous questionnaires to evaluate PFDS with emphasis on the domains of urinary incontinence57,58, fecal incontinence59 and some for pelvic organ prolapse60, there are only a few valid questionnaires that cover all domains (bladder, bowel, prolapse and sexual domains), merged together. Among them, the ICI questionnaires (www.iciq.net), despite having strong criteria for assessment, are not designed to be used in clinical practice61.

Although the Pelvic Floor Distress Inventory-20 (PFDI-20) questionnaire, and the Pelvic Floor Impact questionnaire (PFIQ-7) are recommended by ICI with grade B60,62, these questionnaires (long and short form), are not designed to be used in routine urogenycological operations, since they are aimed at measuring the intensity and frequency of symptoms (never, occasionally, frequently, etc.), not at specifically evaluating sexual function. Therefore, these questionnares, alone are not useful in clinical practice. The only questionnaire that integrates all areas is The Global Pelvic Floor Bother Questionnaire, which has 9 questions, but one of its disadvantages is the lack of dedicated sections for each area and the allocation of only one question (question 9) to sexual activity63.

During the process of exploratory factor analysis in this study, 4 factors were extracted for 36 questions of the questionnaire, including BLF, BOF, PS and SF, which explained about 45.53% of the variance. As a part of the instrument’s validity assessment, the value of KMO and the significance of Bartlett's test was also assessed which confirmed the adequacy of the model. Although the psychometric properties of APFQ have been investigated in several languages worldwide, the construct validity has only been examined in Arabic and Spanish versions. The first study was conducted on the Arabic version (2021) by Malaekah et al., who identified four factors explaining 36.64% of the variance. The results of exploratory analysis showed KMO = 0.806 and Bartlett sphericity test = 4150.4664. The second study was performed on the Spanish version (2022) by Molina-Torres et al., who identified two factors explaining 31.26% of the variance. The results of exploratory factor analysis showed KMO = 0.858, with a significant value in the Bartlett sphericity test (P < 0.001)37. The factors extracted during exploratory factor analysis are parallel and in line with the factors reported in Pelvic Floor Distress Inventory-20 (PFDI-20) and Pelvic Floor Impact Questionnaire (PFIQ-7) measures62, with the difference that the APFQ questionnaire, having a sexual performance section, is more complete than the two questionnaires mentioned.

Despite the ICC's emphasis on the use of these two questionnaires, APFQ has some advantages over them. The advantage and strength of the APFQ questionnaire compared to other questionnaires is the existence of a separate section for examining women's sexual performance, the answering method which focuses on measuring the frequency and severity of symptoms, level of bother and issues related to the quality of life in these special conditions.

In order to determine internal consistency, Cronbach's alpha coefficient for the APFQ-IR and its range for the subscales were obtained as 0.85, 0.68–0.78, which in the reported values for these parameters in original scale were 0.74–1.00, almost within the range of the original scale33. The obtained results were also similar to the Turkish version (0.73–0.86)35, and Serbian (0.82–0.84)36, but compared to the Spanish version (0.83–0.93)37 and Chinese (0.83–0.89)34 were smaller. Moreover, to determine the reliability of this study, the ICC was obtained 0.77–0.95, which is in accordance with the range of the original instrument (0.74–1.00)33 and higher than the Chinese version (0.22–0.88)34 and the Spanish version (0.59–0.96)37.

Because of the strong role of women in the family’s and society’s health, the high prevalence of the PFDs and the wide range of complications caused by this situation is considered as a silent alarm about reducing the QOL and interfering with the roles of women in society. As a result, the importance of a dedicated instrument to evaluate the symptoms of PFDs by health care providers in health centers, and subsequently, training, diagnostics and therapeutic measures, becomes more prominent.

Strength and limitation

Some of the strengths of this study are: Investigating the psychometric properties of APFQ-IR for the first time in Iran, the comprehensiveness of the tool and coverage of all areas of PFDs, especially the sexual function section, the way of responding with an emphasis on measuring the severity and frequency of symptoms compared to other tools in this field, using the combination of the dual panel method and the five stages method for the purpose of the translation process and finally providing the possibility of comparison with other versions from other countries. Using the same set of data in order to perform exploratory and confirmatory factor analysis, not including general quality of life questions in the questionnaire, and the smaller number of women respondents under the sexual function scale due to sexual inactivity are the limitations of the present study.

Conclusion

The Persian version of the questionnaire, APFQ-IR, has a good validity and reliability and has acceptable psychometric properties, thus can be used both for research purposes and for clinical evaluation of PFDs symptoms in health centers. Thus, health policy makers should do their best to design special programs for screening, training, diagnosis and treatment measures in order to evaluate and follow up women with PFDs with the aim of improving their performance and QOL.