Validation of perinatal post-traumatic stress disorder questionnaire for Spanish women during the postpartum period

To determine the psychometric properties of the Perinatal Post-Traumatic Stress Disorder (PTSD) Questionnaire (PPQ) in Spanish, a cross-sectional study of 432 Spanish puerperal women was conducted, following ethical approval. The PPQ was administered online through midwives' associations across Spain. The Edinburgh Postnatal Depression Scale was used to assess the risk of postnatal depression when examining criterion validity. Data were collected on sociodemographic, obstetric, and neonatal variables. An exploratory factor analysis (EFA) was performed, together with convergent and criterion validation. Internal consistency was evaluated using Cronbach's α. The EFA identified three components that explained 63.3% of variance. Convergent validation of the PPQ associated the risk of PTSD with variables including birth plan, type of birth, hospital length of stay, hospital readmission, admission of the newborn to a care unit, skin-to-skin contact, maternal feeding at discharge, maternal perception of partner support, and respect shown by healthcare professionals during childbirth and puerperium. The area under the ROC curve for the risk of postnatal depression (criterion validity) was 0.86 (95% CI 0.82–0.91). Cronbach's α for internal consistency was 0.896. The PPQ showed adequate psychometric properties when used to screen for PTSD in postpartum Spanish women.

Data collection. An online questionnaire was developed and distributed from November to December 2019 with the collaboration of Spanish midwives' associations. This questionnaire included items collecting sociodemographic and clinical variables, the PPQ and the Edinburgh Postnatal Depression Scale (EPDS).
The following sociodemographic and clinical data were recorded: maternal age, education level, whether or not the pregnancy was desired, live newborn, parity, induction of labor, use of natural analgesia, use of epidural analgesia, use of general anesthesia, type of birth, episiotomy, perineal tear, skin-to-skin contact, admission of the newborn to a care unit, degree of partner support, feeling respected by healthcare staff, type of feeding after discharge, surgical intervention, and postnatal hospital readmission. Some of these variables were used to describe the population, and others to assess convergent validity.
The second component of the survey consisted of a series of questionnaires. First was the modified PPQ (Appendix S1: Spanish version), a 14-item measure assessing post-traumatic symptoms related to the childbirth experience, including intrusiveness or re-experiencing, avoidance behaviors, and hyperarousal or numbing of responsiveness. The PPQ also contains one item pertaining to feelings of guilt. Response options were modified from the original dichotomous scale to a five-level Likert-type scale (scored 0 to 4). Mothers were instructed to provide responses that reflected their experience during the targeted time frame (1 to 18 months postpartum). The total possible score on the modified PPQ ranged from 0 to 56. In the current investigation, internal consistency was superior to previous investigations using the dichotomous scaling, with an α = 0.90 6 .
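The scoring rule described above can be sketched in a few lines. This is only an illustration, not the authors' scoring procedure: the function name `score_ppq` is hypothetical, and the default cut-off of 19 is the one this paper attributes to Callahan et al. in the Discussion.

```python
PPQ_ITEMS = 14        # number of items in the modified PPQ
MAX_ITEM_SCORE = 4    # five-level Likert-type response, scored 0 to 4


def score_ppq(responses, cutoff=19):
    """Return (total score, at-risk flag) for one respondent.

    `score_ppq` is a hypothetical helper name; `cutoff` defaults to the
    >= 19 threshold mentioned in the Discussion.
    """
    if len(responses) != PPQ_ITEMS:
        raise ValueError(f"expected {PPQ_ITEMS} responses, got {len(responses)}")
    if any(r not in range(MAX_ITEM_SCORE + 1) for r in responses):
        raise ValueError("each response must be 0, 1, 2, 3, or 4")
    total = sum(responses)            # possible range: 0 to 56
    return total, total >= cutoff


# Made-up answers for one respondent (not study data).
total, at_risk = score_ppq([2, 1, 0, 3, 2, 1, 1, 0, 2, 3, 1, 2, 1, 0])
```

Rejecting out-of-range answers up front keeps a data-entry error from silently shifting a respondent across the cut-off.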
The EPDS is a 10-item self-report scale designed specifically to detect postnatal depression and has been validated in the Spanish population during pregnancy 16 and postnatally 17 . With a cut-off point of ≥ 10, the sensitivity was 79% and the specificity 95.5%. The positive predictive value was 63.2% and the negative predictive value 97.7% 17 . Moreover, it is a simple tool that is widely accepted by clinical practitioners 18 . The EPDS was included to establish criterion validity.
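For readers less familiar with the four indices quoted above, the following minimal sketch shows how each is derived from a 2×2 classification table. The function name and the counts are hypothetical and are not taken from the EPDS validation studies.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Screening indices from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),   # cases correctly flagged
        "specificity": tn / (tn + fp),   # non-cases correctly cleared
        "ppv": tp / (tp + fp),           # flagged women who are true cases
        "npv": tn / (tn + fn),           # cleared women who are true non-cases
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=10, tn=140)
```

Unlike sensitivity and specificity, the predictive values depend on how common the condition is in the sample, so they do not transfer directly between populations.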
Data analysis. For sociodemographic and clinical data, the absolute and relative frequencies were used to describe the qualitative variables, and the mean and standard deviation (SD) used to describe the quantitative variables.
First, to determine the validity of the scale used, we analyzed three of the most common validity types: construct validity, convergent validity, and criterion validity.
For construct validity, we opted to carry out an exploratory factor analysis (EFA) to determine the underlying factors through a principal component analysis (PCA). Before carrying out the EFA, we computed the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity to determine whether it was appropriate to apply this analysis. For this to be the case, the KMO should be above 0.6 and as close as possible to 1, and the p-value of Bartlett's test, which is a statistical hypothesis test, should be less than 0.05 in order to reject the null hypothesis of sphericity and ensure that the factor model is adequate to explain the data. In the EFA, we used Varimax rotation to help clarify the assignment of items to the different factors. To determine the number of factors to retain, we used the Kaiser criterion, one of the most widely used criteria, which retains factors with eigenvalues greater than one 19 .
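The retention rule described above can be illustrated as follows. This is a sketch under stated assumptions: the eigenvalues are invented for a six-item toy scale (they are not the PPQ eigenvalues), and the function names are hypothetical. It relies on the fact that, for a PCA on a correlation matrix, the eigenvalues sum to the number of items, so the explained share is the retained sum divided by the total.

```python
def kaiser_retain(eigenvalues):
    """Return the eigenvalues kept by the Kaiser criterion (eigenvalue > 1)."""
    return [ev for ev in sorted(eigenvalues, reverse=True) if ev > 1.0]


def explained_variance(eigenvalues, retained):
    """Share of total variance carried by the retained components."""
    return sum(retained) / sum(eigenvalues)


# Invented eigenvalues of a 6 x 6 correlation matrix (they sum to 6).
toy_eigenvalues = [2.8, 1.3, 0.8, 0.5, 0.4, 0.2]
kept = kaiser_retain(toy_eigenvalues)                 # two components retained
share = explained_variance(toy_eigenvalues, kept)     # about 0.68
```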
Within construct validity, we also analyzed convergent validity in order to establish the relationship between the PPQ and factors believed to be associated with PTSD risk, such as type of birth, admission of the newborn to a neonatal intensive care unit (NICU), type of feeding, and hospital length of stay, among others. To this end, a bivariate analysis was performed using Pearson's chi-squared test or Student's t-test, depending on whether the variable was qualitative or quantitative. The results were considered statistically significant when p < 0.05.
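As an illustration of the bivariate test for a qualitative variable, the sketch below computes Pearson's chi-squared statistic for a 2×2 table (for example, PTSD risk yes/no against a binary clinical factor). It is a simplified example, not the authors' analysis: the counts are invented, no continuity correction is applied, and the p-value shortcut is valid only for one degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Invented counts: rows = factor present/absent, columns = at risk / not at risk.
statistic, p_value = chi_squared_2x2([[10, 20], [20, 10]])
significant = p_value < 0.05
```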
To study criterion validity, the Edinburgh scale was used with a cut-off point of ≥ 10. To do this, we carried out a sensitivity and specificity study with an analysis of the area under the receiver operating characteristic curve (AUC), interpreted using Swets' criteria 20 . We also carried out a bivariate analysis between the scores obtained on the PPQ scale and the Edinburgh scale. We again used non-parametric statistical tests and considered associations significant when p < 0.05.
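The AUC mentioned above has a simple rank interpretation: it is the probability that a randomly chosen woman above the EPDS cut-off has a higher PPQ score than a randomly chosen woman below it, with ties counted as one half. The sketch below computes it that way; the function name and scores are hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the share of positive/negative pairs ranked correctly."""
    wins = 0.0
    for sp in scores_positive:
        for sn in scores_negative:
            if sp > sn:
                wins += 1.0       # positive case scored higher
            elif sp == sn:
                wins += 0.5       # ties count as half
    return wins / (len(scores_positive) * len(scores_negative))


# Invented PPQ totals, grouped by EPDS >= 10 (positive) or < 10 (negative).
auc = roc_auc([22, 30, 17, 25], [5, 12, 17, 9, 3])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.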
Reliability was analyzed using Cronbach's α to evaluate the internal consistency (IC). The IC indicates to what extent the items in the questionnaire are correlated with each other, and how well they fit together and measure the same concept. The α is one of the most widely used measures to assess the reliability of a scale 21 . Its values range from 0 to 1. One of the most accepted rules is to consider α > 0.9 as excellent, α > 0.8 as good, α > 0.7 as acceptable, α > 0.6 as questionable, α > 0.5 as poor, and α < 0.5 as unacceptable 22 .
Before starting the questionnaire, the participants read a fact sheet about the study and its objectives and ticked a box to give their consent to participate, i.e., they provided online informed consent (ticking the option if they wished to participate and leaving it unticked if they declined). We followed the protocols established for this type of research intended for publication and dissemination to the scientific community. The study was conducted according to the STROBE guidelines and the Declaration of Helsinki, and all procedures involving human subjects were approved by the Ethics Committee. All women involved in this study completed informed consent and data processing forms to enter the study, in accordance with the ethical standards of the Ethics Committee. All participants received written information on the study, including the fact that participation was entirely voluntary with anonymity guaranteed.
Table 2 presents the scale items together with their respective factor weights.

Characteristics of participants.
Convergent validity. Next, convergent validity was analyzed using a bivariate analysis of the PPQ scores and various sociodemographic and clinical factors. A statistically significant relationship was observed between PTSD risk and the following variables: birth plan, type of birth, hospital length of stay, hospital readmission, skin-to-skin contact, admission of the newborn to the NICU, degree of partner support, feeling respected by healthcare staff, and type of feeding on discharge.
Criterion validity. Using the EPDS as a comparative instrument, it was found that the PPQ, translated and transculturally adapted into Spanish, presented an AUC of 0.86 (95% CI 0.82-0.91), with a good capacity to classify the subjects according to Swets' criteria. The ROC curve can be seen in Fig. 1. The bivariate analysis between the scores of the PPQ and Edinburgh scales shows a significant positive relationship (r = 0.69, p < 0.001).
Internal consistency. To evaluate internal consistency, we calculated α for the questionnaire as a whole, as well as for each of the dimensions found with the EFA. For the total scale, α was 0.896. When any single item was removed, α remained above 0.880 and the overall α did not increase by more than 0.01; therefore, we decided to keep all the items. The α values for each factor are shown in Table 3.
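The internal-consistency check described above can be sketched as follows, using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores) and recomputing α with each item left out in turn. The function names and the response matrix are hypothetical; this is an illustration, not the authors' code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item removed in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in rows])
        for i in range(k)
    ]


# Invented responses from five women to a four-item toy scale (scored 0 to 4).
toy = [[0, 1, 0, 1], [1, 1, 2, 1], [2, 3, 2, 2], [3, 3, 4, 3], [4, 4, 3, 4]]
alpha_total = cronbach_alpha(toy)
alphas_without_item = alpha_if_item_deleted(toy)
```

Comparing each value in `alphas_without_item` with `alpha_total` mirrors the decision rule in the text: an item is a candidate for removal only if dropping it raises α appreciably.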

Discussion
Our analyses demonstrate the internal consistency and the construct and criterion validity of the Spanish PPQ. This allows for confidence in the use of the PPQ tool in a Spanish setting, something that had not been assured prior to our study. Another important aspect to consider is the detected prevalence of PTSD risk, which stood at 11.1% in our sample. This may seem high; however, the prevalence of PTSD varies depending on the established cut-off point and the type of study population. In 2017, a systematic review and meta-analysis of PTSD reported prevalence rates of 4.0% (95% CI 2.77-5.71) in the general population, with 18.5% (95% CI 10.6-30.38) of women at risk 23 . In addition, it should be clarified that the PPQ is a screening tool and not a diagnostic one; it is therefore to be expected that it yields a higher prevalence than that of diagnosed cases.
With regard to construct validity, the values obtained in the KMO measure and Bartlett's test of sphericity were adequate; thus, we conducted the EFA. Three components accounted for 63.3% of variance. The English, Korean, and Chinese versions explained 65%, 67%, and 51% of variance, respectively [7][8][9] . Regarding the distribution of components, no two published versions share the same overall structure, as can be seen in Table 2. However, the Korean, Chinese, and Spanish versions coincide in the items of the first component.
The questionnaire presents adequate convergent validity, as it is associated with variables previously linked with PTSD risk, such as type of birth 11, 12, 23-30 , prematurity 5, 31 , neonatal admission to a care unit 25 , skin-to-skin contact 11 , and type of feeding 23 . Furthermore, other associations with PTSD risk were observed, including the degree of partner support and feeling respected by healthcare staff. To define the risk of PTSD, we also used the same cut-off point (≥ 19) as Callahan et al. 6 , which is the one those authors consider for clinical application; this brings the validation closer to its application in a real clinical setting.
The criterion validity was then evaluated using the EPDS. We opted for this tool because other authors have observed a strong correlation between PTSD and postpartum depression 33 , and because it is a very well-known instrument used by professionals in clinical practice 18 . Specifically, we used EPDS scores of ≥ 10 to determine the predictive capacity and found ROC AUC values that were close to excellent.
Finally, internal consistency was evaluated, and we found values very close to those of the English version (Cronbach's α = 0.90) 7 and the Korean version (Cronbach's α = 0.91) 8 . The lowest internal consistency values were found in the Chinese version (Cronbach's α = 0.837) 9 , and especially in the French version (Cronbach's α = 0.77) 10 . This large difference could be explained by the fact that all the versions except the French used a Likert-type scale, while the French version used dichotomous responses.
With the validation of this questionnaire, practitioners can count on a new tool to identify those women who are at risk of developing PTSD after delivery. The tool is simple and easy to apply, so it could be included as another assessment tool during the postpartum period, just as the EPDS is used almost systematically for postnatal depression screening 18 . With this type of tool, health professionals can direct efforts towards the early detection and prevention of the consequences of a prevalent and increasingly common problem that has important consequences for the health of women and their offspring 32,33 .
The validation of this instrument has special relevance in the field of PTSD research, since to date there has been no specific instrument for assessing the risk of perinatal PTSD in the Spanish-speaking population. Validations, as recommended by scientific societies 34 , are essential so that researchers can use the assessment instruments in future research, obtain valid results, establish comparisons, and measure the impact on women's health.
The strengths of our study include the opportunity to evaluate the Spanish PPQ in a sociodemographically and clinically diverse sample of puerperal women. Our sample size was ample enough to obtain precise estimates. We also decided to include only women who had given birth 6 months ago to reduce memory bias as much as possible. There are various limitations of our study. Once the women who declined to take part in the study had been considered, there was no reason to believe there had been any selection bias, as the number of non-participants was small and the sample was consecutively selected. Regarding information bias, using an online questionnaire to collect data could be a limitation because some women may lack access; however, Callahan et al. had previously used this system for validation 6 .