Introduction

Since the first baby was born after in vitro fertilization (IVF) in the United Kingdom in 1978 [1], assisted reproductive technology (ART), including IVF and embryo transfers (ETs), has been widely used for infertility treatment worldwide. The International Committee for Monitoring Assisted Reproductive Technologies reported that more than one million babies were born after ART between 2008 and 2010 [2]. An increased use of ART is also found in Japan, with 51,001 babies reportedly born following ART in 2015, accounting for approximately 1 in 19.7 births [3].

Despite the dramatic increase in pregnancies following ART, the safety of these techniques continues to be a matter of concern. Observational studies have suggested that fresh ETs are associated with adverse perinatal outcomes, such as lower birth weight, preterm delivery (PTD) and perinatal death, compared with frozen ETs [4]. Recent randomized controlled trials (RCTs) demonstrated that babies born after fresh ETs were significantly smaller than babies born after frozen ETs, for women with or without polycystic ovary syndrome [5,6]. Although various processes and procedures related to ART, such as multiple gestations and vanishing twins following multiple embryo transfers, can carry a risk for these adverse perinatal outcomes, the hormonal environment caused by ovarian stimulation in fresh ET may also influence these perinatal outcomes [7,8,9].

Ovarian stimulation plays a vital part in ART, allowing the retrieval of multiple oocytes and increasing the live birth rate per fresh cycle. Several ovarian stimulation protocols have been developed to optimize the number of oocytes retrieved and minimize the risk of complications, including protocols using gonadotropin-releasing hormone (GnRH) agonist [10] or GnRH antagonist [11,12], and mild ovarian stimulation using clomiphene citrate (CC) or natural cycle IVF (natural cycle) [13,14]. It has been suggested that children born following ovarian stimulation may exhibit lower birth weight and a higher risk of PTD compared with those born following natural cycles [7,8]. Whether the increased risk differs between distinct ovarian stimulation protocols used in fresh ET cycles remains unknown.

We investigated whether ovarian stimulation protocols were associated with birth weight and gestational length in singletons born after fresh single ETs, using a nationally representative ART sample from Japan.

Results

Baseline characteristics

Baseline characteristics stratified by ovarian stimulation protocols are shown in Table 1. The sample included natural (n = 4058), CC (n = 4715), CC + gonadotropin (n = 5443), GnRH agonist (n = 16,566) and GnRH antagonist (n = 7483) protocols. Mean maternal age was higher for the CC and natural cycle cohorts, in which 15.4% and 12.6% of women, respectively, were more than 40 years of age. The proportion of cases with tubal factor/endometriosis was highest for the GnRH agonist protocol, while unexplained infertility was most frequent in the natural cycle and CC cohorts. The number of oocytes retrieved was highest for the GnRH agonist protocol, followed by the GnRH antagonist protocol, in which more than 10 oocytes were retrieved in approximately 30% of cases. For the ovarian stimulation protocols using GnRH agonist or antagonist, over 40% of each cohort used blastocyst ET, while early cleavage ET dominated for the natural cycle, CC and CC + gonadotropin protocols. For luteal support, progesterone was most frequently used in the natural cycle and CC protocols, while estrogen + progesterone was used frequently in the GnRH agonist and antagonist protocols.

Table 1 Baseline characteristics of sample population stratified by ovarian stimulation protocols (n = 38,220).

Neonatal outcomes according to ovarian stimulation protocols

Pregnancy and neonatal outcomes stratified by ovarian stimulation protocols are shown in Table 2. In the natural cycle cohort, term deliveries were the most frequent (90.1%), while PTD and very PTD (VPTD) were the least frequent (5.4% and 0.89%, respectively). Similarly, low birth weight (LBW) and very LBW (VLBW) were least frequent in the natural cycle cohort (8.2% and 0.69%, respectively). The proportion of small for gestational age (SGA) was highest in the CC + gonadotropin cohort (9.5%), whereas the natural cycle cohort had the lowest frequency of SGA (5.4%). Cesarean section (CS) was most frequent in the CC cohort (31.0%).

Table 2 Pregnancy and neonatal outcomes stratified by ovarian stimulation protocols.

Ovarian stimulation protocols and neonatal outcomes

Crude and adjusted odds ratios (ORs) of ovarian stimulation protocols for pregnancy and neonatal outcomes are shown in Table 3. Compared with the natural cycle, all ovarian stimulation protocols showed a significantly increased risk for PTD, LBW, and SGA. The CC and CC + gonadotropin protocols showed the highest crude and adjusted ORs for LBW, VLBW and SGA compared with other protocols. These two protocols also exhibited a significantly decreased risk for large for gestational age (LGA) and were significantly associated with CS.

Table 3 Crude and adjusted ORs of ovarian stimulation protocols compared with natural cycle for pregnancy and neonatal outcomes.
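
As an illustration of what a crude OR measures, the following is a minimal sketch of an OR with a Wald 95% confidence interval computed from a 2 × 2 table. The function name `crude_odds_ratio` and the counts are hypothetical and are not taken from Table 3. The sketch also ignores clustering within ART institutions, so it does not reproduce the generalized estimating equation models used in this study.

```python
import math


def crude_odds_ratio(exposed_events, exposed_total, ref_events, ref_total, z=1.96):
    """Crude OR of an outcome for an exposed group vs. a reference group.

    Returns (odds_ratio, lower, upper), where the interval is a Wald
    interval on the log-odds scale (z = 1.96 gives a 95% interval).
    """
    a = exposed_events                   # exposed, outcome present
    b = exposed_total - exposed_events   # exposed, outcome absent
    c = ref_events                       # reference, outcome present
    d = ref_total - ref_events           # reference, outcome absent
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (NOT data from this study):
# 120 LBW births among 1000 stimulated cycles vs. 80 among 1000 natural cycles.
or_, ci_low, ci_high = crude_odds_ratio(120, 1000, 80, 1000)
print(f"crude OR = {or_:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level. The adjusted ORs in Table 3 additionally account for the covariates listed in the Methods.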

Subgroup analysis according to different ART treatments

The results of the subgroup analyses restricted to maternal age under 35, luteal support using progesterone, and early cleavage stage ET are shown in Table 4. For PTD, the CC and GnRH antagonist protocols showed a significant association in all three subgroup analyses. Similarly consistent significant associations were observed between the CC or CC + gonadotropin protocols and LBW, VLBW, SGA and CS. For the GnRH agonist and antagonist protocols, significant associations with LBW, and for the GnRH antagonist protocol with SGA, were observed in all three subgroup analyses. For VLBW, however, the results were attenuated in some of the subgroup analyses and became non-significant.

Table 4 Adjusted ORs of ovarian stimulation protocols compared with natural cycle for pregnancy and neonatal outcomes among subgroup of different ART treatment.

Subgroup analysis restricting the number of oocyte retrievals

Results of the subgroup analysis comparing CC with natural cycle and restricting the number of oocyte retrievals to one are shown in Table 5. Even after restricting the analysis to retrievals that collected a single oocyte, there was a significantly increased risk of PTD, LBW, SGA and CS for ovarian stimulation using CC compared with the natural cycle.

Table 5 Crude and adjusted ORs of ovarian stimulation using clomiphene citrate compared with natural cycle for pregnancy and neonatal outcomes among subgroup of one oocyte retrieval.

Sensitivity analysis

Results of the sensitivity analysis restricted to term deliveries are shown in Supplemental Table 1. Even among term deliveries, all the ovarian stimulation protocols were associated with LBW, and significant associations were observed between the CC or CC + gonadotropin protocols and SGA, LGA and CS. Further, the sensitivity analysis excluding cycles with missing values gave almost the same results, although several significant associations were attenuated and became marginally significant or non-significant (Supplemental Tables 2–4).

Discussion

Using a nationally representative ART sample from Japan, we found that ovarian stimulation protocols were significantly associated with lower birth weight compared with natural cycles, even for singleton deliveries following fresh single ET. In particular, ovarian stimulation using CC was associated with worse neonatal outcomes than other stimulation protocols, and was significantly associated with PTD, SGA and CS. Our study suggests that ovarian stimulation may affect birth weight, and that CC may have an adverse effect on neonatal outcomes in fresh cycles.

Few studies have investigated the association between ovarian stimulation protocols and neonatal outcomes, and their findings have been conflicting. Mak et al. recently reported perinatal outcomes among singleton deliveries following natural cycle IVF (n = 190) and stimulated IVF using GnRH agonist or antagonist (n = 174) in a single center between 2007 and 2013 [8]. That study suggested that neonates born following natural cycle IVF had a significantly lower risk for LBW (adjusted OR, 0.07, 95% confidence interval [CI], 0.014–0.35). The PTD rates were high in both groups, but significantly lower in natural cycle IVF than in stimulated IVF (31.5% vs. 42.0%, respectively, P = 0.03). However, another study used nationwide U.K. data to investigate perinatal outcomes of singleton births following natural (n = 262) and stimulated IVF cycles (n = 98,667) from 1991 to 2011. The analysis of U.K. data found that ovarian stimulation was not associated with a significantly increased risk of LBW (adjusted OR, 1.58, 95% CI, 0.96–2.58) or PTD (adjusted OR, 1.43, 95% CI, 0.91–2.26). Both studies included natural cycle samples that were too small to draw strong conclusions, and neither stratified by ovarian stimulation protocol. In Japan, mild ovarian stimulation using CC or natural cycle IVF has been broadly applied in ART institutions [15,16,17], resulting in adequate sample sizes, especially for natural cycles, to investigate the association between ovarian stimulation and neonatal outcomes.

Among ovarian stimulation protocols, those using CC demonstrated a higher risk for PTD, LBW, SGA and CS. Similar adverse outcomes following CC have been suggested in non-ART populations. A nationwide retrospective cohort study from Denmark reported that intrauterine insemination with ovulation induction using CC had a significantly increased risk for LBW (adjusted OR, 1.5, 95% CI, 1.1–2.1) and SGA (adjusted OR, 1.6, 95% CI, 1.1–2.4) compared with natural cycle intrauterine insemination [18]. Another study investigating perinatal outcomes of 623 infants born naturally or following CC or letrozole protocols found that birth weight was significantly lower in the CC group compared with natural (P < 0.02) or letrozole cycles (P < 0.02), even among singletons [19]. These results do not eliminate the possibility that multiple ovulation, resulting in higher serum estradiol levels, may mediate the association between CC and adverse perinatal outcomes [9]. However, our study demonstrated a significant association even after restricting the analysis to cycles in which a single oocyte was retrieved, suggesting that CC itself may have an adverse effect on perinatal outcomes. CC has both estrogen agonistic and antagonistic properties, which cause depletion of estrogen receptors in the hypothalamus, leading to increased GnRH secretion [20]. However, the antiestrogenic effects of CC on the endometrium, implantation and subsequent gestation remain unknown. One study reported that although more than 85% of CC was eliminated in approximately 6 days, significant plasma concentrations of the Z-isomer of CC were detected 1 month after administration [21]. Other research suggested that CC may suppress endometrial receptivity [22,23] or cause morphological changes in the endometrium [24,25]. An ovarian stimulation protocol administering CC during the whole stimulation phase was reported to prevent the premature surge of luteinizing hormone [15,26]. In such cases, any negative effect of CC may be stronger than with the shorter dosing regimen normally used for ovulation induction.

One strength of the current study is that we restricted our analysis to singleton deliveries following fresh single ET from ovulatory women, to eliminate the influence of multiple pregnancies, vanishing twins and polycystic ovary syndrome (PCOS) on neonatal outcomes. After the introduction of the single embryo transfer (SET) policy in 2007, single ET now represents more than 70% of all ETs in Japan [27], resulting in improvements in perinatal outcomes. However, there are several limitations in our study. First, the specific indications for selecting an ovarian stimulation protocol were unavailable, which leaves the possibility of residual confounding by indication. Second, we lacked data on important confounders such as parity, duration of infertility, number of previous ART failures, maternal body mass index and smoking status, which may also confound the findings. Third, other mediating factors such as embryo quality may play a role in the association between ovarian stimulation protocols and neonatal outcomes. Finally, the registry consists of cycle-specific information, so it was not possible to adjust for correlations among deliveries by the same woman during the study period. However, since Japan has one of the lowest birth rates in the world (total fertility rate of 1.5 in 2015) [28], the number of women who had multiple deliveries between 2007 and 2013 would be small. Given these limitations, further studies, especially randomized controlled trials investigating the effect of ovarian stimulation protocols on neonatal outcomes, are essential.

Although it has been reported that perinatal outcomes of fresh ET cycles tend to be worse than those of frozen cycles, even for singletons, our study suggests that ovarian stimulation protocols play an important role in birth weight and gestational length in fresh cycles. Considering that the endometrium can be affected by ovarian stimulation [29], and given the improvements in vitrification, frozen ET may be a better option than fresh ET following ovarian stimulation for achieving better perinatal outcomes.

In conclusion, using a nationally representative Japanese ART sample, we found that ovarian stimulation was significantly associated with lower birth weight after fresh cycles. In particular, the use of CC in ovarian stimulation was associated with a higher risk of adverse perinatal outcomes compared with other stimulation protocols, and was significantly associated with PTD, SGA and CS. Considering our current findings, frozen ET may be an alternative option from the perspective of perinatal outcomes. Further studies, especially randomized controlled trials, are needed to investigate the effect of ovarian stimulation using CC on the endometrium, implantation and subsequent gestation.

Methods

Study sample

This is a retrospective cohort study using a Japanese national ART registry assembled by the Japan Society of Obstetrics and Gynecology (JSOG). The JSOG launched the ongoing registration system in 2007 for all ART clinics and hospitals to report cycle-specific information on-line. The registry has mandatory reporting, and patients cannot receive government subsidies if a clinic or hospital does not register their information. The database included cycle-specific information such as infertility diagnosis, ovarian stimulation protocols, IVF or intracytoplasmic sperm injection (ICSI), embryo stage at transfer, and pregnancy and obstetric outcomes. The JSOG requires all participating clinics and hospitals to report pregnancy and obstetric outcomes. ART clinics without delivery facilities usually receive a hospital delivery report, and if they do not obtain the delivery report, the JSOG recommends ART facilities contact mothers directly to obtain obstetrical outcomes. Since the use of donor oocytes or embryos is prohibited during ART in Japan, all embryos transferred were autologous. Preimplantation genetic testing for chromosomal aneuploidy is prohibited in Japan.

We included singleton live births after 22 weeks of gestation, or with birth weight > 500 g when gestational length was unknown, following fresh single ETs between 2007 and 2013. We excluded cycles with polycystic ovary syndrome or anovulation, ICSI using testicular sperm extraction, and gamete intra-fallopian transfers. A detailed flow diagram of the cohort selection process is shown in Fig. 1. Among 248,848 single embryo transfer cycles, 52,603 cycles resulted in clinical pregnancy. After excluding cycles with miscarriages, ectopic pregnancies, single fetal demise in twin pregnancies, terminated cases, stillbirths, delivery before 22 weeks or after 42 weeks, and multiple pregnancies, 38,220 cases were included in this study.

Figure 1 Flow diagram of cohort selection and comparison groups.
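
The proportions implied by the cohort selection can be checked with a short calculation. The sketch below uses only the three counts reported in the text. The variable and function names are illustrative, and the derived percentages are simple ratios of those counts, not figures reported by the registry.

```python
# Counts reported in the cohort-selection paragraph
single_et_cycles = 248_848           # fresh single embryo transfer cycles
clinical_pregnancies = 52_603        # cycles resulting in clinical pregnancy
included_singleton_births = 38_220   # cases retained after exclusions


def share(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


pregnancy_rate = share(clinical_pregnancies, single_et_cycles)
included_rate = share(included_singleton_births, clinical_pregnancies)
excluded_after_pregnancy = clinical_pregnancies - included_singleton_births

print(f"Clinical pregnancy per single ET cycle: {pregnancy_rate:.1f}%")
print(f"Clinical pregnancies retained in the analysis: {included_rate:.1f}%")
print(f"Pregnancies excluded: {excluded_after_pregnancy}")
```

On these counts, roughly one in five single ET cycles led to a clinical pregnancy, and a little under three quarters of those pregnancies met the inclusion criteria.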

Ethical approval

This study was approved by the institutional review boards of the National Center for Child Health and Development and Saitama Medical University, and by the ethics committee of the JSOG. After approval of the study, the JSOG provided data without any personally identifying information. The study was conducted in accordance with Japanese law and the STROBE Guidelines. No informed consent was obtained from the patients because the study was retrospective.

Outcomes examined

Our main outcomes were birth weight and gestational length. LBW was defined as birth weight less than 2500 g, and VLBW as birth weight less than 1500 g. PTD was defined as delivery before 37 weeks of gestation, and VPTD as delivery before 32 weeks. SGA and LGA were defined as birth weight below the 10th percentile and above the 90th percentile, respectively, for neonates born between 22 and 41 weeks according to the national reference [30]. We also investigated delivery by CS as a secondary outcome.
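
The outcome definitions above can be expressed as a small classification function. This is a minimal sketch, not the registry's coding: the function name `classify_outcomes` and the `weight_percentile` argument are hypothetical. The percentile is assumed to have been looked up beforehand in the national reference for the neonate's gestational age. LGA is taken here as above the 90th percentile, the conventional cut-off, which is an assumption.

```python
from typing import Dict, Optional


def classify_outcomes(
    birth_weight_g: float,
    gestational_weeks: float,
    weight_percentile: Optional[float] = None,
) -> Dict[str, bool]:
    """Flag birth-weight and gestational-length outcomes for one delivery.

    weight_percentile is the birth-weight percentile for gestational age,
    assumed to come from an external reference chart; SGA and LGA are
    only reported when it is supplied.
    """
    flags = {
        "LBW": birth_weight_g < 2500,    # low birth weight
        "VLBW": birth_weight_g < 1500,   # very low birth weight
        "PTD": gestational_weeks < 37,   # preterm delivery
        "VPTD": gestational_weeks < 32,  # very preterm delivery
    }
    if weight_percentile is not None:
        flags["SGA"] = weight_percentile < 10   # small for gestational age
        flags["LGA"] = weight_percentile > 90   # assumed cut-off for LGA
    return flags


# Hypothetical examples, not registry records
very_preterm = classify_outcomes(1400, 31)
term_average = classify_outcomes(3000, 39, weight_percentile=50)
print(very_preterm)
print(term_average)
```

Note that the categories are nested: every VLBW neonate is also LBW, and every VPTD delivery is also a PTD.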

Other variables

Ovarian stimulation protocols included natural (i.e., unstimulated), CC alone, CC with gonadotropin (CC + gonadotropin), GnRH agonist and GnRH antagonist protocols. We also used maternal age, infertility diagnosis, number of oocytes retrieved, fertilization method (IVF, ICSI or split-ICSI) and embryo stage at transfer (early cleavage or blastocyst).

Statistical analysis

We compared baseline characteristics and perinatal outcomes according to ovarian stimulation protocols using the χ² test or one-way analysis of variance. We calculated the crude and adjusted OR of each ovarian stimulation protocol compared with natural cycles for neonatal outcomes using generalized estimating equations with robust variance estimation, adjusting for correlations within ART institutions. The a priori covariates for adjusted analysis were maternal age (categorized into 5-year age groups), infertility diagnosis, fertilization method (i.e., IVF/ICSI), fetal sex and reported year of cycle. Since we included cycles with incomplete data on obstetric outcomes, there were missing values for delivery method (8.7%), gestational age at delivery (8.8%), birth weight (7.7%) and sex of neonate (7.5%). For those variables, we performed multiple imputation by chained equations, generating 10 imputed datasets, and then conducted regression analysis. Further, we conducted a subgroup analysis restricted to maternal age under 35 years to exclude the effect of advanced maternal age on perinatal outcomes. Since luteal support and embryo stage at transfer are mediating factors between ovarian stimulation and perinatal outcomes, and adjusting for those variables is not appropriate [31], we conducted subgroup analyses restricted to cycles with progesterone-only luteal support or to cycles with early cleavage ETs. Finally, to remove the effect of multiple oocytes collected in a single retrieval, we compared neonatal outcomes following ovarian stimulation using CC alone with those following natural cycles among ART cycles in which just one oocyte was retrieved.
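
After multiple imputation, the estimates from the imputed datasets have to be combined into a single result. The following is a minimal sketch of that pooling step, assuming the standard Rubin's rules were applied; the text does not state the pooling method. The function name `pool_rubin` and all numbers are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: point estimates (e.g. log ORs), one per imputed dataset.
    variances: squared standard errors of those estimates.
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = sum(estimates) / m
    within = sum(variances) / m                                # average within-imputation variance
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log-OR estimates from 10 imputed datasets (NOT study results)
log_ors = [0.30, 0.32, 0.28, 0.31, 0.29, 0.33, 0.27, 0.30, 0.31, 0.29]
squared_ses = [0.010] * 10

pooled_log_or, pooled_se = pool_rubin(log_ors, squared_ses)
pooled_or = math.exp(pooled_log_or)
print(f"pooled OR = {pooled_or:.2f}, SE(log OR) = {pooled_se:.3f}")
```

The pooled standard error is larger than any single-imputation standard error because the between-imputation term carries the extra uncertainty due to the missing values.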

We conducted two sensitivity analyses. First, we restricted the sample to term deliveries (gestational age at delivery between 37 and 41 weeks). Second, we repeated all analyses as complete-case analyses (i.e., excluding cycles with missing values). All analyses were performed using the STATA SE statistical package, version 13.1 (Stata, College Station, TX, USA). A two-tailed P value of < 0.05 was considered statistically significant.