Early prediction of live birth for assisted reproductive technology patients: a convenient and practical prediction model

Live birth is the most important concern for assisted reproductive technology (ART) patients. Therefore, in the medical reproductive centre, obstetricians often need to answer the following question: “What are the chances that I will have a healthy baby after ART treatment?” To date, our obstetricians have had no reference on which to base the answer to this question. Our research aimed to solve this problem by establishing prediction models of live birth for ART patients. Between January 1, 2010, and May 1, 2017, we conducted a retrospective cohort study of women undergoing ART treatment at the Reproductive Medicine Centre, Xiangya Hospital of Central South University, Hunan, China. The birth of at least one live-born baby per initiated cycle or embryo transfer procedure was defined as a live birth, and all other pregnancy outcomes were classified as no live birth. A live birth prediction model was established by stepwise multivariate logistic regression. All eligible subjects were randomly allocated to two groups: group 1 (80% of subjects) for the establishment of the prediction models and group 2 (20% of subjects) for the validation of the established prediction models. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of each prediction model at different cut-off values were calculated. The prediction model of live birth included nine variables. The area under the ROC curve was 0.743 in the validation group. The sensitivity, specificity, PPV, and NPV of the established model ranged from 97.9% to 24.8%, 7.2% to 96.3%, 44.8% to 83.8%, and 81.7% to 62.5%, respectively, at different cut-off values. A stable, reliable, convenient, and satisfactory prediction model for live birth in ART patients was established and validated, and this model could be a useful tool for obstetricians to predict the live birth rate of ART patients. It also provides a reference for obstetricians seeking to create good conditions for infertility patients in preparation for pregnancy.


Scientific Reports | (2021) 11:331 | https://doi.org/10.1038/s41598-020-79308-9

subsequent treatments, this value decreased to less than 0.68. Although many live birth prediction models have been established, they are rarely used in clinical practice. The main reasons may include the following: (1) they cannot be applied to all ART patients because each model is based on only one or two types of ART patients; (2) some predictors require complicated and expensive laboratory tests; (3) the models are not sufficiently convenient to use; and (4) some models are less accurate than others for predicting live birth. We aimed to establish a convenient and practical live birth prediction model that has higher predictive value and that can be applied to all ART patients.

Results
There were 15,717 ART treatments performed from 2012 to 2017. Of these, 1891 subjects who had missing information on their live birth outcomes were excluded, leaving 13,826 subjects for analysis. Among them, 80% of the subjects (11,071) were allocated to group 1 (establishment model), and 20% of them (2755) were allocated to group 2 (validation model).
Univariate analysis results. We analysed the relationships between the variables and live birth outcomes by univariate analysis. Twenty-two variables were found to be significantly associated with live birth outcomes (p-value < 0.05, Tables 1, 2).

Logistic regression analysis and prediction model establishment.
Based on our univariate analysis results, we found that maternal age, body mass index, number of previous ART treatments, female infertility duration, number of previous pregnancies, number of abortions, basal FSH, sperm concentration, endometrial thickness before embryo transfer, number of antral follicles, total number of oocytes, sperm viability, sperm progressive motility, type of embryo transfer, quality of transferred embryos, maternal education, infertility diagnosis, uterine volume, artificial insemination technology used, stimulation protocol, total number of transferred embryos, and total dose of gonadotropin were significantly associated with live birth. We used these variables as independent variables to perform multiple logistic regression analysis. The variable assignments are presented in Table 3. We found no multicollinearity among these variables (Tol > 0.1, VIF < 10, shown in Table 4). The likelihood ratio forward stepwise method (α = 0.05 for entry and α = 0.10 for removal) was used in the logistic regression. Finally, the prediction model of live birth was established, including nine variables (shown in Table 4). Each step of this model is shown in Supplementary Table S1. The area under the ROC curve was 0.722 (95% CI, 0.709-0.735). The model is as follows:

$$\operatorname{Logit} P = -1.857 + 0.199X_{1} + 0.150X_{2} + 0.276X_{3} + 0.077X_{5} - 0.149X_{8} + 1.205X_{9} + 0.690X_{12} + 0.770X_{13} + 0.534X_{19}$$

Verification of the prediction model. We validated the prediction model of live birth by applying it to the validation group [20% of the subjects (2755)]. The area under the ROC curve was 0.743 (95% CI, 0.719-0.768) (Fig. 1). Table 5 shows the practical predictive value for live birth with different cut-off values. The sensitivity (SE) and specificity (SP) ranged from 97.9% to 24.8% and from 7.2% to 96.3%, respectively, at different cut-off values.
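To make the use of the fitted equation concrete, the following is a minimal sketch of how Logit P can be converted into a predicted probability with the inverse logit, P = 1 / (1 + exp(−Logit P)). The intercept and coefficients are copied from the equation reported above; the function names and the dictionary layout are illustrative assumptions, and each X is treated as a single coded value exactly as the equation is printed. The caller is assumed to supply values coded according to Table 3.

```python
import math

# Intercept and coefficients copied from the fitted equation reported above.
# Keys are the variable indices (1 means X1, 2 means X2, and so on).
INTERCEPT = -1.857
COEFFICIENTS = {
    1: 0.199, 2: 0.150, 3: 0.276, 5: 0.077, 8: -0.149,
    9: 1.205, 12: 0.690, 13: 0.770, 19: 0.534,
}


def logit_score(codes):
    """Return Logit P for a dict mapping variable index -> coded value.

    The coded values are assumed to follow the assignment in Table 3.
    """
    missing = sorted(set(COEFFICIENTS) - set(codes))
    if missing:
        raise ValueError(f"missing coded values for X{missing}")
    return INTERCEPT + sum(coef * codes[i] for i, coef in COEFFICIENTS.items())


def live_birth_probability(codes):
    """Apply the inverse logit: P = 1 / (1 + exp(-Logit P))."""
    return 1.0 / (1.0 + math.exp(-logit_score(codes)))


if __name__ == "__main__":
    # Hypothetical input for illustration only: every coded value set to 0.
    example = {i: 0 for i in COEFFICIENTS}
    print(round(live_birth_probability(example), 3))
```

The resulting probability can then be compared with a chosen cut-off value to classify a subject as likely or unlikely to achieve a live birth.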

Discussion
We established a prediction model of live birth by multivariable logistic regression analysis, and this model included 9 common variables. In the establishment group, the area under the ROC curve was 0.722, and there was good calibration. In the validation group, the area under the ROC curve was 0.743, and the model was calibrated well. Our model has several advantages. First, it is convenient and practical because information on all the variables included in the model is generally available in the clinic, and there is no need for any special test. Second, some predictors in our model, such as endometrial thickness, stimulation protocol, and embryo parameters, can be the focus of interventions. Therefore, the model has both predictive value and instructional clinical value in early treatments. Third, the model can predict a live birth in the early pregnancy stage because information on almost all the variables included in the model is available at the beginning of pregnancy. Moreover, our model has acceptable clinical predictive value: the area under the ROC curve reached 0.743 in the validation group, which is higher than the values of most of the previously reported models that were similar to our model 10,13,16. Although the ROC values of some models are larger than ours, the variables in these models, such as gene or granulosa cell biomarkers, may be inconvenient to assess 1,17. Moreover, a variable such as gene expression may be unchangeable and have no preventive value 1. Furthermore, information on variables such as HCG and progesterone may only be attainable after pregnancy is achieved and cannot be used for the prediction of a live birth 9,18,19. Many previous prediction models of live birth are not applicable to all ART patients but only to a specific infertility subgroup 10,20-22; however, our model is suitable for different artificial insemination technologies and all ART patients.
We have established a highly discriminatory, well-calibrated, robust, and practical prediction model that can use available clinical data to predict the live birth rate and may be transferred to corresponding computer software for easy operation. Clinicians and public health workers can easily use this model to identify high-risk populations for management by ART.

Table 2. Comparison of the live birth rates of the different sub-groups of the 13,826 ART patients from 2010-2017 (categorical variables). a IUI, intrauterine insemination. b Quality of the transferred embryos: I is the best-quality embryo, followed by II and III.
The conception of a new life is a very complex process that is affected by many known and unknown factors, and for infertility patients the situation is even more complex and variable. The level of medical technology of different hospitals and doctors also plays a fundamental role. To date, it has been hard to predict live birth rates before embryo transfer. In the early stage of infertility treatment, patients and doctors are most concerned with "How many normal oocytes are there?", "How many normal sperm are there?" and "How many embryos can be transferred?" The prediction of the live birth rate therefore presupposes, at the least, a successful embryo transfer. Nevertheless, our findings can be used as guidance for creating good conditions for infertility patients in preparation for pregnancy.

Table 3. Variable assignment in the multivariate logistic regression analysis. a X2, X4, X7, X9, X10, X13, X22 were entered into the models as dummy variables, and the group with the highest live birth rate was selected as the reference group, which was assigned "0".

Variable                            Symbol   Assignment
…                                   …        … = 1, 2 = 2, 3 = ≥ 3
Uterine volume (mL) a               X4       1 = < 30, 0 = 30~, 2 = 50~, 3 = ≥ 70
No. of abortions                    X5       1 = 0, 2 = 1, 3 = 2, 4 = ≥ 3
Sperm concentration (million/mL)    X6       …

Our live birth prediction models were further validated with a separate sample, allowing us to evaluate the predictive performance of the models when they are applied to other populations. We also examined the impacts of different cut-off values on sensitivity, specificity, PPV and NPV to establish an appropriate reference range. Clinicians and public health workers could conveniently select different cut-off values in their live birth assessment process to obtain optimal results.
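The effect of the cut-off value on sensitivity, specificity, PPV and NPV can be illustrated with a short sketch. The code below is not the procedure used in the study (the analysis was run in SPSS and Excel); it is a minimal illustration using the standard definitions of the four measures, and the function name, the outcome coding (1 = live birth, 0 = no live birth) and the toy data are assumptions made for the example.

```python
def classification_metrics(y_true, y_prob, cutoff):
    """Sensitivity, specificity, PPV and NPV at a given probability cut-off.

    y_true: observed outcomes, 1 = live birth, 0 = no live birth (assumed coding).
    y_prob: predicted probabilities of live birth.
    A subject is classified as "live birth" when its probability >= cutoff.
    """
    tp = fp = tn = fn = 0
    for outcome, prob in zip(y_true, y_prob):
        predicted_positive = prob >= cutoff
        if predicted_positive and outcome == 1:
            tp += 1
        elif predicted_positive and outcome == 0:
            fp += 1
        elif not predicted_positive and outcome == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Made-up outcomes and probabilities, for illustration only.
    outcomes = [1, 1, 0, 0, 1, 0]
    probabilities = [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]
    for cutoff in (0.25, 0.5, 0.75):
        print(cutoff, classification_metrics(outcomes, probabilities, cutoff))
```

Raising the cut-off classifies fewer subjects as likely live births, which lowers sensitivity and raises specificity; this is the trade-off a clinician weighs when choosing a cut-off.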
There are several limitations to this study. Our data were obtained from a single large reproductive medicine centre, and the ROC values of our model are not the largest among the reported models. Therefore, we do not regard the model as final. With the development of medical technology, new variables may be added to our model, and our prediction model will be continuously optimized. In addition, the applicability of the model in other clinics needs to be further verified, which will be our next research work.
In conclusion, a prediction model for live birth in ART patients was established and validated. The model is stable, reliable, convenient, and satisfactory; furthermore, it could be a useful tool for obstetricians to predict in early gestation whether an ART patient has a high probability of a live birth and to help create good conditions for ART patients in preparation for pregnancy.

Outcome measures. In our study, we focused on live birth. The birth of at least one live-born baby per initiated cycle or embryo transfer procedure was defined as a live birth, and all other pregnancy outcomes were classified as no live birth.

Statistical analysis.
A live birth prediction model was established by stepwise multivariate logistic regression (α = 0.05 for entry and α = 0.10 for removal). When establishing the model, the criteria for selecting predictive variables were as follows: first, the p value was less than 0.05; second, the variable contributed to improving the area under the ROC curve. All eligible subjects were randomly allocated to two groups: group 1 (80% of subjects) for the establishment of the prediction models and group 2 (20% of subjects) for the validation of the established prediction models. The two groups were mutually exclusive, with no subject appearing in both. The areas under the ROC curve generated by the logistic regression model were used to evaluate the effectiveness of the prediction models. We further calculated the sensitivity, specificity, positive predictive value, and negative predictive value of the prediction models with different cut-off values. All data were managed and analysed using the Statistical Package for the Social Sciences (SPSS) software version 17.0 (Chicago, IL, SPSS Inc. 2008) and Excel (Microsoft Corp., Redmond, WA, USA). Measurement data are described as the mean ± standard deviation (SD), and enumeration data are described as numbers (percentages). All p values corresponded to two-sided tests, and p < 0.05 was considered statistically significant.
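As an illustration of the allocation and evaluation steps described above, the following sketch shows one way to split subjects at random into two non-overlapping groups and to compute the area under the ROC curve. It is not the SPSS procedure used in the study; the function names, the fixed random seed, and the pairwise (Mann-Whitney) formulation of the area under the curve are assumptions chosen to keep the example self-contained.

```python
import random


def split_indices(n_subjects, train_fraction=0.8, seed=0):
    """Randomly allocate subject indices to two non-overlapping groups.

    Returns (establishment_indices, validation_indices). The seed is an
    arbitrary choice that only makes the example reproducible.
    """
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_subjects * train_fraction))
    return indices[:cut], indices[cut:]


def roc_auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen live birth (1) receives a
    higher predicted probability than a randomly chosen no live birth (0),
    with ties counted as one half.
    """
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    establishment, validation = split_indices(10)
    print(len(establishment), len(validation))
    # Made-up outcomes and probabilities, for illustration only.
    print(roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]))
```

The pairwise computation is quadratic in the number of subjects, so for a cohort of this size a rank-based implementation from a statistics library would be the more practical choice.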
Ethical approval and consent to participate. The study was approved by the Ethics Committee of Xiangya Hospital of Central South University; the approval mainly covered the use of data from the various clinical examinations and laboratory tests of patients. All infertility patients presenting to the Reproductive Medicine Centre, Xiangya Hospital of Central South University, Hunan, China, who were scheduled for ART treatment and who signed the informed consent form between January 1, 2010, and May 1, 2017, were enrolled in the study. In addition, we confirm that all methods were performed in accordance with the relevant guidelines and regulations for assisted reproductive technology.

Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.