Development and validation of a prediction model on severe maternal outcomes among pregnant women with pre-eclampsia: a 10-year cohort study

Pre-eclampsia is a severe hypertensive disorder of pregnancy that can lead to severe maternal morbidity and death. Our study aimed to develop and validate a prognostic prediction model for severe maternal outcomes among Chinese women with pre-eclampsia. We conducted a 10-year cohort study in a referral center that included all pregnant women who were diagnosed with pre-eclampsia and delivered from 2005 to 2014. A composite of severe maternal outcomes was adopted, including maternal near-miss as defined by the World Health Organization, cortical blindness/retinal detachment, temporary facial paralysis and maternal death. We used a logistic regression model to develop Model 1 by retaining predictors with p < 0.05, and further constructed Model 2 by adding quadratic terms and interaction terms to Model 1. We undertook a bootstrapping validation and estimated the model performance. A total of 397 pregnant women suffered severe maternal outcomes among 2,793 eligible participants, an incidence of 14.21% (95% confidence interval (CI) 12.91%–15.51%). A total of 13 predictors were finally selected in Model 1. With quadratic and interaction terms added, Model 2 showed a higher area under the ROC curve (82.2%, 95% CI 79.6%–84.7%) and good calibration. Bootstrapping validation showed similar model performance.

Regarding the classification and definition of candidate predictors, the registered residential place was divided into city and rural area. Pre-pregnancy BMI was calculated as pre-pregnancy body weight divided by the square of height. Parity was divided into multipara and nullipara. The number of fetuses was divided into singleton and multiple gestations. Placenta previa was defined as a placenta that covered or was close to the cervix, as diagnosed by prenatal ultrasound. Oligohydramnios was defined as an amniotic fluid index of less than 5 cm or a maximum depth of the amniotic pool of less than 2 cm, as diagnosed by ultrasound. Chronic nephritis included diabetic nephropathy, nephrotic syndrome, immunoglobulin A nephropathy, hypertensive nephropathy, and lupus nephropathy. Other urinary system diseases included hydronephrosis and kidney stones. Cardiac diseases included congenital heart disease, rheumatic heart disease, and cardiomyopathy. Neurological and psychiatric disorders included any type of epilepsy, depression, and anxiety. Immune system diseases included systemic lupus erythematosus and antiphospholipid syndrome. Qualitative proteinuria at admission was graded as −, +, ++, +++ and ++++.
www.nature.com/scientificreports/
Development and validation of the model. First, the association between each candidate predictor and the target outcomes was analyzed by univariable analysis. Candidate predictors with p < 0.1 were retained for further analysis. Second, a correlation matrix of all candidate predictors with p < 0.1 was established. If the correlation coefficient between two variables was more than 0.8, one of them was selected for the model on the basis of clinical judgment, and the other was dropped. Furthermore, according to the missing ratio in the original data, no further treatment was applied to variables with less than 5% missing values.
For predictors with a missing ratio of more than 5%, the data were imputed using multivariate imputation by chained equations (MICE), with the number of iterations set to 10. Finally, Rubin's rules were used to combine the estimates from the logistic regression models 22.
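The pooling step can be illustrated with a minimal Python sketch of Rubin's rules for a single regression coefficient. The study itself used STATA; the function name `pool_rubin` and the toy numbers are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset.
    variances: the squared standard error from each imputed dataset.
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical coefficients and variances from three imputed datasets.
estimate, std_error = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(f"pooled estimate {estimate:.3f}, pooled SE {std_error:.3f}")
```

The pooled standard error is larger than any single-dataset standard error whenever the estimates disagree across imputations, which is how the uncertainty due to missing data enters the final model.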
Variables were retained as predictors if the p-value of the partial regression coefficient was less than 0.05. We developed Model 1 based on the above process.
Next, a nonlinear (quadratic) term was considered for all quantitative variables selected by the above process. We added the quadratic term of each quantitative variable retained in Model 1. If the regression coefficient of the quadratic term was statistically significant (p < 0.05), both the linear and quadratic terms of the quantitative variable were included as predictors; otherwise, only the linear term was retained.
Additionally, on the basis of clinical judgment, possible interactions between the included predictors were considered. If the regression coefficient of the interaction term was statistically significant (p < 0.05), the interaction term was included in the prediction model; otherwise, the interaction term was dropped. We finally constructed Model 2 by adding quadratic terms and interaction terms to Model 1.
To assess the performance of the prediction model, we conducted an internal validation with bootstrapping. After 400 bootstrap replications, the performance and the optimism of the model were calculated.
Performance of the model. We reported the χ² statistic, pseudo R², Pearson χ² and Akaike Information Criterion (AIC) to evaluate the overall goodness-of-fit of the model. The discrimination ability of the model was measured by the area under the receiver operating characteristic (ROC) curve; we also reported the sensitivity, specificity, predictive accuracy, positive predictive value, negative predictive value and the discrimination slope. The Hosmer-Lemeshow test was used to assess the calibration of the model, and a calibration plot was drawn. Smoothing with the Loess algorithm was used to describe the relationship between the observed and predicted probabilities of the outcomes. To identify as much of the high-risk population as possible, the prediction model was considered more valuable when the sensitivity was > 70%. Therefore, an optimum cut-off value of the model was selected to achieve a sensitivity of > 70% and a prediction accuracy of > 70%. All statistical analyses were conducted with STATA 13.0 (StataCorp, College Station, TX, USA).
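The discrimination measures listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the STATA routines used in the study: the function names and the toy predictions are hypothetical, and treating a predicted probability equal to the cut-off as high risk (`>=`) is an assumption.

```python
from typing import Dict, Sequence


def auc(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen event has a higher predicted risk than a non-event (ties count 0.5)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def metrics_at_cutoff(probs: Sequence[float], labels: Sequence[int], cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, PPV and NPV at a probability cut-off."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        high_risk = p >= cutoff  # assumption: ties are classified as high risk
        if high_risk and y == 1:
            tp += 1
        elif high_risk:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def discrimination_slope(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean predicted risk among events minus mean predicted risk among non-events."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Hypothetical predicted probabilities and observed outcomes (1 = severe outcome).
toy_probs = [0.05, 0.10, 0.20, 0.30, 0.12, 0.80]
toy_labels = [0, 0, 1, 0, 1, 1]
print(auc(toy_probs, toy_labels))
print(metrics_at_cutoff(toy_probs, toy_labels, cutoff=0.14))
print(discrimination_slope(toy_probs, toy_labels))
```

Scanning `metrics_at_cutoff` over a grid of cut-offs and keeping those with sensitivity and accuracy both above 0.70 is one way to implement the selection rule described in the text.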

Results
Characteristics of the cohort. A total of 2,793 eligible pregnant women with pre-eclampsia who delivered between May 1, 2005 and December 31, 2014 were enrolled in this study. The number of participants per year ranged from 108 to 390. Among these women, the median age was 30 years (interquartile range, 27-35 years). The median gestational age was 36 weeks (33-38 weeks). A total of 56.1% of the women were urban residents, 36.9% were multipara, and 13.9% had multiple pregnancies. The characteristics of the included population are presented in Table 1. A total of 397 pregnant women with pre-eclampsia suffered severe maternal outcomes, an incidence of 14.21% (95% CI 12.91%–15.51%). The median time from diagnosis of pre-eclampsia to delivery was 48 h (interquartile range, 24-120 h). The numbers of pregnant women with respiratory dysfunction, cardiovascular dysfunction, renal dysfunction, coagulation dysfunction, hepatic dysfunction, neurological dysfunction, uterine dysfunction, specific symptoms and maternal death were 124, 113, 155, 233, 8, 31, 6, 13, and 5, respectively (Table 2).
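The reported incidence and its interval can be checked with a short Python sketch. The text does not state which interval method was used; the normal-approximation (Wald) interval below is an assumption, and the helper name `wald_ci` is hypothetical. With 397 events among 2,793 women it gives roughly 12.9% to 15.5%, close to the reported interval.

```python
from math import sqrt
from typing import Tuple


def wald_ci(events: int, n: int, z: float = 1.959964) -> Tuple[float, float, float]:
    """Proportion with a normal-approximation (Wald) 95% confidence interval."""
    p = events / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts taken from the text: 397 severe outcomes among 2,793 participants.
incidence, lower, upper = wald_ci(397, 2793)
print(f"{100 * incidence:.2f}% ({100 * lower:.2f}% to {100 * upper:.2f}%)")
```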

Selection of predictors.
Normality tests were conducted for all quantitative variables; of those, hemoglobin, platelet, fibrinogen, alanine transferase, aspartate transferase, total bilirubin, urea nitrogen, and creatinine levels at admission had skewed distributions and were included in our analysis after logarithmic transformation. Univariable analysis of the associations between candidate predictors and outcomes showed that there were 32 variables with p < 0.1: maternal age, pre-pregnancy BMI, registered residential place, gestational week at delivery, multiple gestations, use of ART, placenta previa, diabetes mellitus, GDM, HBsAg positivity, heart diseases, IDA, chronic glomerulonephritis, immune system diseases, fatty liver, hypoproteinemia, edema, chest pain, dyspnea, nausea and vomiting, dizziness and headache, blurred vision, admission systolic pressure, admission diastolic pressure, log-transformed platelets, log-transformed fibrinogen, log-transformed alanine transferase, log-transformed aspartate transferase, log-transformed total bilirubin, log-transformed urea nitrogen, log-transformed creatinine, and qualitative proteinuria (Table S1).
The correlation matrix showed that log-transformed alanine transferase and log-transformed aspartate transferase levels were highly correlated (correlation coefficient 0.914), as were systolic and diastolic blood pressure at admission (correlation coefficient 0.803). In consideration of clinical value, we therefore retained log-transformed aspartate transferase levels and systolic blood pressure at admission.
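The screening for highly correlated predictors can be sketched as follows. The text does not say which correlation coefficient was used; Pearson correlation and the use of its absolute value are assumptions here, and the column names and values are made-up illustrations rather than study data.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def highly_correlated_pairs(
    columns: Dict[str, Sequence[float]], threshold: float = 0.8
) -> List[Tuple[str, str, float]]:
    """Return every pair of columns whose |correlation| exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical values for three candidate predictors.
toy_columns = {
    "log_alt": [1.0, 2.0, 3.0, 4.0],
    "log_ast": [1.1, 2.1, 2.9, 4.2],
    "log_platelets": [5.0, 3.0, 6.0, 4.0],
}
for name_a, name_b, r in highly_correlated_pairs(toy_columns):
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
```

The function only flags pairs; which member of a flagged pair to keep remains a clinical judgment, as described in the text.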
Model performance. Multiple indicators showed that both logistic regression models were significant overall, with good goodness-of-fit (p > 0.05). After inclusion of the quadratic and interaction terms, the pseudo R² significantly increased and the AIC value decreased (Table 4). With the quadratic and interaction terms added, Model 2 showed the higher area under the ROC curve (82.2%, 95% CI 79.6-84.7%) (Fig. 1, Table 4). When we set the cut-off at 0.14 in consideration of the incidence of the outcome, Model 2 had the higher sensitivity (72.7%), specificity (76.1%), predictive accuracy (75.6%), and discrimination slope (0.27) (Table 4).
The Hosmer-Lemeshow tests for both models showed p > 0.05. Combined with the calibration plots, this suggested that the calibration of Model 2 was better than that of Model 1 (Fig. 2, Table 4). Therefore, Model 2 was finally selected as the prediction model for severe maternal outcomes in pregnant women with pre-eclampsia.
Internal validation by bootstrapping showed that the optimism of the prediction model was 0.004, which indicated no apparent over-fitting (Table S3).
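The idea behind the optimism estimate can be sketched generically in Python. This assumes the usual optimism bootstrap (refit on each resample, then compare performance on the resample with performance on the original data); the text does not spell out the exact procedure, and `bootstrap_optimism`, `fit_mean` and `neg_mse` are hypothetical names. In the study the fitted model would be the logistic regression and the score a measure such as the area under the ROC curve; a toy mean predictor stands in for both here.

```python
import random
from typing import Callable, List, Sequence, TypeVar

Row = TypeVar("Row")
Model = TypeVar("Model")


def bootstrap_optimism(
    data: Sequence[Row],
    fit: Callable[[List[Row]], Model],
    score: Callable[[Model, Sequence[Row]], float],
    n_boot: int = 400,
    seed: int = 0,
) -> float:
    """Average of (score on bootstrap sample - score on original data)
    for models refitted on each bootstrap sample."""
    rng = random.Random(seed)
    n = len(data)
    total = 0.0
    for _ in range(n_boot):
        # Resample n rows with replacement from the original data.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        model = fit(sample)
        total += score(model, sample) - score(model, data)
    return total / n_boot


# Toy stand-ins: the "model" is a mean, the score is negative mean squared error.
def fit_mean(rows: List[float]) -> float:
    return sum(rows) / len(rows)


def neg_mse(model: float, rows: Sequence[float]) -> float:
    return -sum((r - model) ** 2 for r in rows) / len(rows)


values = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
optimism = bootstrap_optimism(values, fit_mean, neg_mse, n_boot=400, seed=1)
apparent = neg_mse(fit_mean(values), values)
print(f"apparent {apparent:.3f}, optimism {optimism:.3f}, corrected {apparent - optimism:.3f}")
```

An optimism close to zero, such as the 0.004 reported above, means the apparent performance barely changes after correction.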
Given that severe acute azotemia and massive transfusion accounted for two-thirds of all composite outcomes, we conducted a sensitivity analysis, regardless of gestational age, that included only these two outcomes. The model showed slightly better discrimination and calibration (Table S4).

Discussion
Main findings. Using a 10-year cohort of 2,793 pregnant women who were diagnosed with pre-eclampsia at West China Second University Hospital of Sichuan University from 2005 to 2014, we developed and internally validated a prediction model for severe maternal outcomes in pregnant women with pre-eclampsia, and obtained good discrimination (an area under the ROC curve of 82.2%). The model was presented as follows: logit(P) = 33.468 − 0.051 gestational week + 1.033 placenta previa + 0.690 HBsAg positivity + 1.308 cardiac diseases + 1.060 IDA + 1.075 dyspnea + 0.006 systolic blood pressure − 11.976 log-transformed platelets − 0.964 log-transformed fibrinogen + 0.198 log-transformed aspartate transferase + 0.391 log-transformed bilirubin − 0.497 log-transformed creatinine + 0.946
Comparison with previous studies. A small number of prediction models for severe maternal outcomes among pregnant women with pre-eclampsia have been developed previously 9,23-26. Of those, the fullPIERS model 9 and the miniPIERS model 25 were the main parts of the PIERS study, and two extended models were also reported (a miniPIERS model with SpO2 24 and a combined cardiorespiratory symptom model 23). Another recent model, PREP, was developed in 2017 by enrolling 946 individuals in the UK 26. Compared with these models, first, our study involved an Asian population and fully considered the characteristics of Chinese pregnant women. Second, we finally used 13 variables for prediction and obtained model performance similar to that of previous models (e.g. area under the ROC curve of more than 0.80, sensitivity and specificity of more than 0.70); of those, three predictors had not been considered in previous models, namely placenta previa, HBsAg positivity and IDA. This emphasizes the specific characteristics and heterogeneous baseline risk of the Chinese population, and more attention should be paid to these subgroups. We discuss the implications of these three predictors in a later paragraph.
Notably, all the predictors in our model are readily available across Chinese settings; in contrast, some predictors such as SpO2, which has significant predictive ability in the PIERS model, are difficult to acquire universally in Chinese settings, and SpO2 also had more than 50% missing values in the PREP study.
Third, our study included all pregnant women diagnosed with pre-eclampsia, whereas the PREP model included only pregnant women with early-onset pre-eclampsia diagnosed before 34 weeks of gestation.
Predictors from six aspects (demographic and sociological characteristics, basic characteristics of pregnancy, pregnancy and childbirth history, admission diagnosis, admission symptoms, and laboratory tests) were considered in this study. A total of 13 variables were included in the final optimal logistic regression model as predictors. We found that gestational age at admission was a predictor of severe adverse outcomes. Pregnant women with a lower gestational age at admission appeared to have worse outcomes than pregnant women with a higher gestational age. Pre-eclampsia can be divided into two types according to the time of onset: early-onset pre-eclampsia and late-onset pre-eclampsia. Early-onset pre-eclampsia usually occurs before 34 weeks of gestation, while late-onset pre-eclampsia occurs after 34 weeks of gestation 27,28.
Previous studies have shown that there are significant differences in etiology and prognosis between the two groups, and that early-onset pre-eclampsia has worse maternal and perinatal outcomes than late-onset pre-eclampsia 28. Therefore, the predictor of gestational age at admission in our study reflects the time at which pregnant women were admitted after diagnosis of pre-eclampsia, and should not be read as encouraging pregnant women to delay admission. Timely diagnosis and treatment are beneficial to the prognosis. Placenta previa, HBsAg positivity and IDA were included for the first time in prediction models of this kind. Placenta previa has always been deemed an important predictor of severe adverse outcomes 29. Several published studies have shown that there might be an interaction between pre-eclampsia and placenta previa. The incidence of placenta previa in women with pre-eclampsia was significantly lower than that in pregnant women without pre-eclampsia 30,31, and the reverse is also true 32. Although the mechanism of this interaction is still unclear, both pre-eclampsia and placenta previa are associated with abnormal infiltration of villous trophoblasts. Shallow trophoblast infiltration in women with pre-eclampsia (shallow implantation) leads to a decrease in placental perfusion, whereas deep trophoblast infiltration in pregnant women with placenta previa (deep implantation) leads to an increase in placental perfusion. The interaction of these two factors may reduce the risk of placenta previa to a certain extent 33. Despite the above-mentioned mechanism, some pregnant women with pre-eclampsia still have placenta previa, and severe adverse outcomes in these patients reflect the combined effect of placenta previa and pre-eclampsia. This may significantly increase the risk of placental abruption, postpartum hemorrhage, hysterectomy, and other adverse outcomes 33.
Given the high cesarean section rate in China 34 and the increased risk of placenta previa in a subsequent pregnancy after cesarean delivery 35, we suspected that placenta previa was the main reason for massive blood transfusion, which accounted for a large proportion of the composite outcomes.
Besides, cardiac diseases, HBsAg positivity and IDA had a significant effect on the outcome. The main reason for this finding is that these comorbidities in pregnancy may aggravate damage to the heart, liver, and circulatory system of pregnant women. Patients with pre-eclampsia commonly have a high total vascular resistance index, partly mediated by a substantial increase in sympathetic vasoconstrictor activity 36, which could increase the burden on the heart. In addition, cardiac output and stroke volume increase from early pregnancy to delivery, so women with cardiovascular compromise were more likely to develop pulmonary edema during the second stage of labor and the postpartum period 37,38. The risk of severe adverse outcomes is then significantly increased.
A previous study reported that nearly 3% of pregnancies are complicated by some form of liver disease, which can have fatal consequences for pregnant women and their offspring 39. More notably, hepatitis B is highly endemic in China. A meta-analysis estimated that approximately 7.6% of pregnant women were affected 40. Although the association between hepatitis B virus infection and pre-eclampsia is still uncertain 41-43, hepatitis B virus carriage and replication may aggravate the abnormalities of liver function that coincide in pregnancy and pre-eclampsia. To our knowledge, this is the first report of the predictive ability of HBsAg positivity for severe maternal outcomes among women with pre-eclampsia. Given the relatively high prevalence of hepatitis B in the Asia Pacific region and sub-Saharan Africa 44, its effects should not be ignored in similar populations.
In our study, pre-eclampsia was defined as a composite of such disorders as maternal acute kidney injury, liver dysfunction, neurological features, hemolysis or thrombocytopenia. For this condition, several laboratory markers, such as platelets and creatinine, are important prognostic factors. The outcome of interest is a composite of severe maternal outcomes that comprises multiple organ dysfunctions. One may observe that some laboratory markers measured at baseline were used to predict the occurrence of a composite outcome that included the same variable at a predefined level (e.g. creatinine over 300). Although it is obvious that laboratory markers at baseline may well predict their own future levels, including such markers as predictors also helps to predict the occurrence of the other outcomes. For instance, systolic blood pressure and proteinuria are core indicators for the diagnosis and prognosis of pre-eclampsia 45,46. Our study suggested that increased systolic blood pressure may lead to worse outcomes. Proteinuria is an important indicator of renal dysfunction in pre-eclampsia, but in the raw data of this study we found that more than 20% of pregnant women had no quantitative proteinuria measurement. Therefore, in the current study, qualitative proteinuria was used as a candidate predictor instead of quantitative proteinuria. In future studies, our team will further determine the effect of quantitative proteinuria to optimize the model. Other predictors, including log-transformed platelets, log-transformed fibrinogen, log-transformed aspartate transferase, log-transformed bilirubin, and log-transformed creatinine, are the main indicators of the functional status of vital organs. This is consistent with the mechanism of vital organ damage caused by pre-eclampsia.
In addition, we included quadratic and interaction terms in Model 2. The choice of quadratic and interaction terms in the prediction model was mainly driven by model performance, and the clinical significance of these mathematically sophisticated terms may be less readily interpretable. As such, we presented two models in our study. Model 1 did not include the quadratic and interaction terms, but Model 2 did. In addition, these two models require different computational capacities. Clinicians may choose the one that works best for their own purposes.
Scientific Reports | (2020) 10:15590 | https://doi.org/10.1038/s41598-020-72527-0
Implication of our model. Our model provides a plausible predictive tool for identifying high-risk pregnant women diagnosed with pre-eclampsia in the Chinese population. When a pregnant woman is diagnosed with pre-eclampsia, the model offers a probability of severe maternal outcome, which can assist clinicians in determining whether she needs to be hospitalized and monitored more closely, and in deciding on the timing of delivery. This is particularly the case for junior clinicians and midwives in primary healthcare institutions. Given that medical resources and healthcare capability are heterogeneous among Chinese medical institutions, timely transfer of women with pre-eclampsia at high risk might be an effective response to reduce the risk of severe maternal outcomes. Our model showed that a cut-off of 0.14 for the predicted probability gave relatively good predictive ability (a sensitivity of 72.70% and a specificity of 76.13%). Note that this threshold might be most useful in settings similar to our hospital. Further external validation and recalibration of this model are warranted before clinical use in different medical environments and broader populations.
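To make the use of the cut-off concrete, the Python sketch below converts a linear predictor (the value of logit(P)) into a predicted probability and compares it with the 0.14 threshold. The function names are hypothetical, the example logit values are arbitrary rather than computed from the model coefficients, and treating a probability equal to the cut-off as high risk is an assumption.

```python
from math import exp, log


def inverse_logit(linear_predictor: float) -> float:
    """Convert logit(P) into a probability P, avoiding overflow for large |x|."""
    if linear_predictor >= 0:
        return 1.0 / (1.0 + exp(-linear_predictor))
    e = exp(linear_predictor)
    return e / (1.0 + e)


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return log(p / (1.0 - p))


def is_high_risk(linear_predictor: float, cutoff: float = 0.14) -> bool:
    """Flag a woman as high risk when her predicted probability reaches the cut-off."""
    return inverse_logit(linear_predictor) >= cutoff


# Arbitrary example values of logit(P), not derived from patient data.
for lp in (-3.0, -1.0):
    print(f"logit = {lp:+.1f}, P = {inverse_logit(lp):.3f}, high risk: {is_high_risk(lp)}")
print(f"cut-off 0.14 on the logit scale: {logit(0.14):.3f}")
```

Because the transformation is monotone, comparing the probability with 0.14 is equivalent to comparing logit(P) with logit(0.14).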
Strengths and limitations of the study. This study has several strengths. First, to the best of our knowledge, this is the first prediction model for severe maternal outcomes in women with pre-eclampsia based on an Asian population; by collecting data from a regional maternal referral center over 10 years (2005-2014), we present the largest sample size among similar studies. Second, it emphasizes the accessibility of all predictors across Chinese settings and identifies three comorbidities in pregnancy that were not included in previous studies, namely placenta previa, HBsAg positivity and IDA. Third, we achieved good discrimination and calibration through a strict data collection process and a transparent statistical approach to model development and internal validation.
Our study also has the following limitations. First, this was a retrospective cohort study. Because of the comparatively low incidence of pre-eclampsia among pregnant women, collecting a sufficient number of subjects in a short period of time would be difficult in a prospective study. Therefore, a retrospective design was adopted. Second, according to the WHO definition of severe maternal outcomes, some subjective symptoms were included in the composite outcome, such as cyanosis, gasping, and jaundice. Although we used a pre-designed CRF to extract these from the medical records by chart review, the accuracy of the symptoms is subject to the judgment and records of clinicians. Third, the current study covered only the development and internal validation of the model. External validation in other medical institutions will be conducted in the next stage, and the results will then be reported.

Conclusions
In this study, a prediction model for severe maternal outcomes in pregnant women with pre-eclampsia was established using a multivariable logistic regression model. The model showed good predictive ability on internal validation. Further external validation is required to clarify the clinical applicability of our model.

Data availability
Data sharing is possible upon ethical approval and mutual agreement on authorship. Both the Chinese Evidence-based Medicine Center and West China Second Hospital are jointly responsible for granting the use of the study data.