Routinely collected antenatal data for longitudinal prediction of preeclampsia in nulliparous women: a population-based study

The objective was to evaluate the sequentially updated predictive capacity for preeclampsia during pregnancy, using multivariable longitudinal models including data from antenatal care. This population-based cohort study in the Stockholm-Gotland Counties, Sweden, included 58,899 pregnancies of nulliparous women 2008–2013. Prospectively collected data from each antenatal care visit was used, including maternal characteristics, reproductive and medical history, and repeated measurements of blood pressure, weight, symphysis-fundal height, proteinuria, hemoglobin and blood glucose levels. We used a shared-effects joint longitudinal model including all available information up until a given gestational length (week 24, 28, 32, 34 and 36), to update preeclampsia prediction sequentially. Outcome measures were prediction of preeclampsia, preeclampsia with delivery < 37, and preeclampsia with delivery ≥ 37 weeks’ gestation. The area under the curve (AUC) increased with gestational length. AUC for preeclampsia with delivery < 37 weeks’ gestation was 0.73 (95% CI 0.68–0.79) at week 24, and increased to 0.87 (95% CI 0.84–0.90) in week 34. For preeclampsia with delivery ≥ 37 weeks’ gestation, the AUC in week 24 was 0.65 (95% CI 0.63–0.68), but increased to 0.79 (95% CI 0.78–0.80) in week 36. The addition of routinely collected clinical measurements throughout pregnancy improve preeclampsia prediction and may be useful to individualize antenatal care.

illustrates these features with an example for systolic blood pressure, one of the timevarying predictors. The average trajectory of systolic blood pressure in the non-preeclamptic population is indicted by the solid curve, and that of a fictitious woman by a dashed curve. The observed measures of her systolic blood pressures are displayed as dots. The woman's trajectory has higher level, steeper trend, and more pronounced curvature, than the population average. Figure S1. The non-preeclamptic population average trajectory of systolic blood pressure (solid curve), and the predicted trajectory (dashed curve) and the observed measures (dots) of a fictitious woman.
We estimated the predictive models in three steps: first, we defined and estimated a mixedeffects model with data from the population of women without pre-eclampsia; second, we used the empirical best linear unbiased predictor to obtain the u-scores for level, trend, and curvature, for all women; third, we included the u-scores along with the other baseline predictors in logistic regression models. The following three subsections describe the three steps in detail.
Although the above second-order polynomial model was an approximation to the actual trajectories of the different time-varying predictors, we deemed it adequate to capture their main features, namely level, trend, and curvature. The level represented the average of the predictor if this was constant over time. The level was 0 for the population and ( 0 + ,0 ) for the -th woman. The trend represented the slope of the trajectory if this was linear. The trend was 1 for the population and ( 1 + ,1 ) for the -th woman. Finally, the curvature represented the convexity of the trajectory as measured by its second derivative. The curvature was 2 for the population and ( 2 + ,2 ) for the -th woman.
The random effects ,0 , ,1 , ,2 represented the departure of the -th woman's trajectory from the non-preeclamptic population-average trajectory of the level, trend, and curvature, respectively. Figure S2 helps visualize their interpretation. Figure S2. The distribution of the u-score in the non-preeclamptic population for any given feature (level, trend, and curvature) of a woman's trajectory. The x-axis indicates the standard deviations from the population average. When the u-score of a given feature and woman is equal to zero, that feature of that woman' trajectory is equal to population average. A positive (negative) u-score indicates that the feature is larger (smaller) than that of population.
The random vector = ( ,0 , ,1 , ,2 )′ was assumed to follow a multivariate normal distribution with mean equal to the three-dimensional vector of zeros and covariance matrix equal to = [ 0,0 0,1 0,2 0,1 1,1 1,2 0,2 1,2 2,2 ] The variance parameters 0,0 , 1,1 , and 2,2 were constrained to be positive. The remaining covariance parameters 0,1 , 0,2 , and 1,2 were left unconstrained. The residual term , in the mixed model was assumed to follow a zero-mean normal distribution with variance 2 . The vector and the residual , were assumed independent of each other and of the gestational week. The parameters and were estimated by maximizing the corresponding likelihood function (McCulloch et al 2008).
Second step: obtain the u-scores For all the women in the sample, we obtained the u-scores with the empirical best linear unbiased predictor (EBLUP), where the vector contained the measures of the time-varying predictor at each visit, indicated the matrix with the -th row equal to (1, week , , week , 2 ) for the -th visit for the -th woman, and indicated the identity matrix. The number of rows of the matrices and was equal to the number of visits for the -th woman, which varied across women. The parameters and contained in the above expression were replaced with the estimates obtained in the first step, as described in the previous section.
The EBLUP has desirable properties and an interesting interpretation (McCulloch et al 2008). To predict a given woman's trajectory, it optimally merges the information contained in the observations available for that woman and that contained in the sample of all women. More specifically, the EBLUP predicts the departure of her trajectory from the population-average trajectory by shrinking the observed residual ( − ) by a factor ′( ′ + 2 ) −1 . When the shrinkage factor is large, the predicted trajectory is close the non-preeclamptic population-average trajectory. When shrinkage factor is small, the predicted trajectory is close to the woman's observed values. The level of shrinkage depends on two quantities: (1) the number of observations (visits) available, and (2) the relative magnitude of the variability between and within women. When the number of available observations is large and the trajectories vary substantially from woman to woman, little shrinkage takes place for that woman. Conversely, when the number of visits and the woman to woman variability are small, the shrinkage is considerable. Because the shrinkage depends on the number of observations available on each woman, it varies across women.
Third step: predict the probability of pre-eclampsia For each binary outcome, (pre-eclampsia, preeclampsia with delivery before 37 weeks' gestation, preeclampsia with delivery from 37 weeks' gestation, diagnosis of preeclampsia before, and from 37 weeks' gestation), we estimated a generalized linear model with logit link and normal family distribution: logit ( = 1) = 0 + 1 ,1 + ⋯ + , + +1 ,1 + ⋯ + , + , where indicates the binary outcome of the -th woman, , the -th of her covariates, and , the -th of her u-scores. The baseline predictors were: maternal age at baseline, region of birth, family situation, height, smoking 3 months before pregnancy and at first antenatal visit, previous miscarriage, infertility duration, infertility treatment, family history of preeclampsia and hypertension, chronic diseases (cardiovascular disease, endocrine disease, pre-excisting diabetes, thrombosis history, systemic lupus erythematosus (SLE), chronic hypertension, Mb Crohn/Ulcerative colitis, kidney disease and blood group . The time-varying predictors capillary glucose and proteinuria were treated as binary and categorical, respectively. The u-scores were calculated for systolic blood pressure, diastolic blood pressure, maternal weight, symphysis fundal height and haemoglobin level. The baseline predictors, capillary glucose, proteinuria and the u-scores entered the predictive models without any further selection.
On the estimation of the standard errors The three-step approach described above implied that the standard errors for the coefficients of the logistic regression calculated in Step 3 were possibly underestimated, as they did not take into account the uncertainty inherent in the estimates of the quantities obtained in Steps 1 and 2. Bootstrapping the full three-step process or maximizing the joint likelihood would provide correct standard errors. We performed neither of these alternative approaches, however, because standard errors, p-values, and confidence intervals, were inconsequential in any of the above steps. In addition, bootstrapping or maximizing a joint likelihood with our large sample was unfeasible with the computing resources available to us.

Assessment of the performance of the predictive models
The goodness of fit of the model as assessed with the Hosmer-Lemeshow test and its sensitivity and specificity were summarized by the area under the curve (AUC), and by sensitivity for 10% false positive rate.
We assessed the performance of the predictive models under five different scenarios: 1) included all available visits up to 24 fully gestational weeks (168 days), 2) all visits up to 28 gestational weeks (196 days), 3) all visits up to 32 gestational weeks (224 days), 4) all visits up to 34 gestational weeks (238 days) and 5) all visits up to 36 gestational weeks (252 days). The scenarios allowed evaluating the performance of the predictive models when the measures of the time-varying predictors at future visits are still unknown. At the 24 th gestational week prediction, the u-scores for symphysis fundal height were unavailable, as this predictor is generally measured at later times during gestation.
The number of visits available on each women varied across the five scenarios, which meant that the number of rows of the vector and of the matrices and for the calculation of the uscores in the second step also varied across scenarios. The parameters , , and , however, were constant across the scenarios, as these were estimated in the first and third step.  * For each time-varying predictor, a set of three u-scores captured the departure of each woman's trajectory from the non-preeclamptic population average trajectory with respect to three features: level, trend, and curvature. A positive or negative u-score indicates that the feature is larger or smaller than that of the non-preeclamptic population.