Recurrent pregnancy loss (RPL) is defined as two or more losses at any time during pregnancy1,2. Most of these occur before 10 weeks of gestation. Identifiable causes of RPL include antiphospholipid syndrome, uterine anomalies, and parental and embryonic chromosomal abnormalities1,2. Embryonic or fetal aneuploidy is the most common cause of RPL, with a frequency of 40–50%3,4. As maternal age rises, the probability of delivering a newborn with aneuploidy increases. The frequency of aneuploidy in embryos and miscarried products of conception (POC) is also high in women of advanced maternal age. Oocyte aneuploidy increases as dictyate arrest lengthens with age, and the segregation errors occur mainly in the first division of meiosis.

Sister chromatids replicated during the DNA synthesis phase (S phase) of the cell cycle are held together by the cohesin protein complex. The components of cohesin differ between somatic cell division and meiosis. Stromal antigen 3 (STAG3) is a meiosis-specific component of cohesin that is expressed only in the early embryonic ovary. Stag3-deficient mice have been reported to be sterile, with oocyte formation stopping early in development5. In humans, familial ovarian insufficiency can occur through STAG3 deletion6. No new cohesin is known to be produced during meiotic arrest, and cohesin is lost from the chromosomes with advancing age7.

Polo-like kinase 4 (PLK4), in contrast, is a unique member of the PLK family that plays vital roles in centriole biogenesis during mitosis. It has been well characterized as the master regulator of centriole duplication, a key component of the centrosome cycle, and is essential for bipolar spindle formation during the first cell division in mouse embryos8,9,10. Depletion of maternal PLK4 prevents nucleation and growth of microtubules and results in abnormal spindle formation, which leads to cytokinesis failure10.

Recently, 472 variants in 187 genes have been reported to be associated with unexplained RPL. A meta-analysis revealed a significant association between recurrent miscarriage and 21 genetic variants, with odds ratios (ORs) ranging from 0.51 to 2.37 (ref. 11). However, genes related to embryonic aneuploidy have not yet been identified. Although PLK4 is associated with aneuploidy of mitotic origin, the rs2305957 polymorphism has been reported to be related to unexplained recurrent miscarriage12. That study, however, did not focus on patients whose POC showed aneuploidy.

In the present study, we examined the associations of STAG3 polymorphisms and the PLK4 rs2305957 polymorphism with RPL in patients whose previous POC showed aneuploidy.

Materials and methods

Study population

We analyzed data from 184 Japanese patients with a history of two or more unexplained pregnancy losses, at least one of which was a miscarriage whose POC showed aneuploidy. All patients were recruited from Nagoya City University Hospital and Aoki Ladies Clinic. Before a subsequent pregnancy, all patients underwent a systematic examination that included 4D ultrasonography and/or hysterosalpingography, chromosome analysis of both partners, determination of antiphospholipid antibodies, including lupus anticoagulant, diluted activated partial thromboplastin time, diluted Russell's viper venom time, and β2 glycoprotein I-dependent anticardiolipin antibody, and blood tests for hypothyroidism and diabetes mellitus. Patients with antiphospholipid syndrome, an abnormal chromosome in either partner, or a uterine anomaly were excluded. Patients whose previously miscarried POC exhibited triploidy or 45,X were also excluded.

A total of 190 women with at least one child and no history of miscarriage were examined as control subjects. The control subjects were recruited from Nagoya City University Hospital.

Ethical approval

This study was conducted with the approval of the Research Ethics Committee of Nagoya City University Graduate School of Medical Sciences. Each patient provided written consent after being given a full explanation of the purpose of the study and the methods to be employed.

Single-nucleotide polymorphism selection of the genes

We focused on STAG3 as a meiosis-related gene and on PLK4 as a risk gene for aneuploidy of mitotic origin. Genotype information on 218 STAG3 single-nucleotide polymorphisms (SNPs) was downloaded from Phase III of the 1000 Genomes JPT population database. The 1000 Genomes data were analyzed using HaploView software (ver4.2). SNPs were chosen by applying the following selection criteria: (i) a minor allele frequency (MAF) threshold of >0.05 in the 1000 Genomes JPT population and (ii) an r² threshold of ≥0.8. A total of 115 SNPs meeting the criteria were selected (Fig. 1).

Fig. 1 Single-nucleotide polymorphism (SNP) selection for the STAG3 gene.

From the 115 SNPs, those predicted to lie in transcription factor binding sites and those showing significant associations in e-QTL analysis were selected using the RegulomeDB database. Next, 27 SNPs with RegulomeDB scores higher than 2b (supported by e-QTL data and by transcription factor binding/DNase peak data) were selected as potential candidates. RegulomeDB collects and annotates SNPs with known and predicted regulatory elements in the intergenic regions of the human genome. The RegulomeDB score provides a model integrating functional genomics features along with continuous values such as the ChIP-seq signal, DNase-seq signal, informational content changes, and DeepSEA scores13. Of these, five tag SNPs were selected by the Tagger algorithm implemented in HaploView and, together with an additional SNP reported previously, were tested for association with miscarriages involving an abnormal number of embryonic chromosomes.

DNA analysis

Genomic DNA was extracted from peripheral blood samples using the Midi Blood DNA Extraction Kit (QIAGEN).

All genotyping was carried out using TaqMan polymerase chain reaction (PCR) assays (Applied Biosystems) in 96-well arrays that included blank wells as negative controls according to the manufacturer’s instructions. TaqMan Predesigned SNP Genotyping and TaqMan MGB probes were used. TaqMan PCR and genotyping analyses were carried out on the Applied Biosystems 7500 Fast Real-Time PCR system. The reaction mixtures consisted of 2 µL template DNA (5 ng/µL), 5 µL 2×TaqMan GTXpress Master Mix, 0.25 µL 40×primer/probe mix, and 2.75 µL double-distilled H2O in a 10 µL volume. The cycling conditions were as follows: initial denaturation at 95 °C for 20 s, followed by 40 cycles at 95 °C for 3 s and then 60 °C for 30 s. The results were automatically analyzed on the Applied Biosystems 7500 Real-Time PCR system with an allelic discrimination assay program14.
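The reaction composition and cycling program above lend themselves to a quick arithmetic check. The following is a minimal sketch in Python, not part of the study protocol: the per-reaction volumes, template concentration, and cycling times are taken from the text, whereas the 10% pipetting overage and all function names are assumptions introduced for illustration.

```python
# Per-reaction volumes in microliters, as listed in the text above.
PER_REACTION_UL = {
    "template_dna": 2.0,
    "gtxpress_master_mix_2x": 5.0,
    "primer_probe_mix_40x": 0.25,
    "water": 2.75,
}
TEMPLATE_CONC_NG_PER_UL = 5.0  # template DNA concentration from the text


def reaction_total_ul() -> float:
    """Total volume of one reaction (expected to equal the stated 10 uL)."""
    return sum(PER_REACTION_UL.values())


def template_mass_ng() -> float:
    """Mass of genomic DNA delivered to one well."""
    return PER_REACTION_UL["template_dna"] * TEMPLATE_CONC_NG_PER_UL


def premix_volumes(n_wells: int, overage: float = 0.10) -> dict:
    """Volumes of the shared (non-template) components for n_wells reactions.

    The overage fraction is a hypothetical pipetting allowance, not a value
    stated in the protocol.
    """
    scale = n_wells * (1.0 + overage)
    return {
        name: volume * scale
        for name, volume in PER_REACTION_UL.items()
        if name != "template_dna"
    }


def cycling_hold_seconds(cycles: int = 40) -> int:
    """Programmed hold time: 20 s initial step plus cycles of 3 s + 30 s.

    Ramp times between temperatures are instrument dependent and ignored.
    """
    return 20 + cycles * (3 + 30)


if __name__ == "__main__":
    print("Reaction volume (uL):", reaction_total_ul())
    print("Template per well (ng):", template_mass_ng())
    print("Premix for a 96-well plate (uL):", premix_volumes(96))
    print("Programmed hold time (s):", cycling_hold_seconds())
```

The component volumes sum to the stated 10 µL, and each well receives 10 ng of template DNA.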

Statistical analysis

Departure from Hardy–Weinberg equilibrium for the six SNPs was assessed using the exact test15. Because previous studies have shown that the mainland Japanese population is genetically homogeneous, we did not examine or correct for population substructure in our sample16,17.
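As an illustration of how an exact test of Hardy–Weinberg equilibrium works, the sketch below enumerates every heterozygote count compatible with the observed allele counts and sums the probabilities of all outcomes no more likely than the one observed. It is a generic textbook formulation written for this illustration and is not necessarily the implementation cited above; the function name and the genotype counts in the usage lines are hypothetical.

```python
from math import exp, lgamma, log


def _log_factorial(k: int) -> float:
    return lgamma(k + 1)


def hwe_exact_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Two-sided exact p-value for Hardy-Weinberg equilibrium at a biallelic SNP."""
    n = n_aa + n_ab + n_bb          # number of individuals
    n_a = 2 * n_aa + n_ab           # copies of allele A
    n_b = 2 * n_bb + n_ab           # copies of allele B

    def log_prob(het: int) -> float:
        # Probability of `het` heterozygotes conditional on the allele counts.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (
            het * log(2)
            + _log_factorial(n)
            - _log_factorial(hom_a)
            - _log_factorial(het)
            - _log_factorial(hom_b)
            - (_log_factorial(2 * n) - _log_factorial(n_a) - _log_factorial(n_b))
        )

    # The heterozygote count must have the same parity as the allele count.
    possible_hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {het: exp(log_prob(het)) for het in possible_hets}
    observed = probs[n_ab]
    # Sum over outcomes at most as probable as the observed one.
    p_value = sum(p for p in probs.values() if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, AB, BB), for illustration only.
    print(hwe_exact_p(25, 50, 25))   # counts close to HWE proportions
    print(hwe_exact_p(50, 0, 50))    # complete absence of heterozygotes
```

A large p-value indicates no detectable departure from equilibrium, which is the outcome the study reports for all six SNPs.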

For association analyses of individual SNPs, univariate logistic regression analyses were performed using the presence or absence of RPL as the dependent variable. Because the underlying genetic model is uncertain, and to avoid the multiple comparisons that arise from fitting several inheritance models separately, we used the max-statistic, which selects the largest test statistic from the dominant, recessive, and additive models, as implemented in the SNPassoc package18. Since the AA genotype of rs941288 was not observed in the sample, we conducted Fisher’s exact test for this SNP instead of the max-test.
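The idea behind the max-statistic can be sketched as follows. This is a simplified illustration and not the SNPassoc implementation: it computes a 1-df chi-square statistic for the dominant and recessive 2×2 tables and a Cochran–Armitage trend statistic for the additive model, takes the largest, and obtains a p-value by permuting case/control labels. The permutation step is a swapped-in technique chosen for simplicity; the package derives its p-value differently. All function names and the genotype counts in the usage lines are hypothetical.

```python
import random


def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def trend_chi2(cases, controls) -> float:
    """Cochran-Armitage trend statistic with scores 0, 1, 2 (additive model)."""
    r, s = sum(cases), sum(controls)
    n = r + s
    col = [cases[i] + controls[i] for i in range(3)]
    t = sum(i * (cases[i] * s - controls[i] * r) for i in range(3))
    var = (r * s / n) * (
        sum(i * i * col[i] * (n - col[i]) for i in range(3))
        - 2 * sum(i * j * col[i] * col[j] for i in range(3) for j in range(i + 1, 3))
    )
    return 0.0 if var == 0 else t * t / var


def max_statistic(cases, controls):
    """Return (best_model, statistic); counts are ordered (GG, GA, AA)."""
    stats = {
        "dominant": chi2_2x2(cases[0], cases[1] + cases[2],
                             controls[0], controls[1] + controls[2]),
        "recessive": chi2_2x2(cases[0] + cases[1], cases[2],
                              controls[0] + controls[1], controls[2]),
        "additive": trend_chi2(cases, controls),
    }
    best = max(stats, key=stats.get)
    return best, stats[best]


def permutation_p(cases, controls, n_perm: int = 1000, seed: int = 0) -> float:
    """Permutation p-value for the max-statistic (labels shuffled across subjects)."""
    rng = random.Random(seed)
    genotypes = [g for g in range(3) for _ in range(cases[g] + controls[g])]
    n_cases = sum(cases)
    observed = max_statistic(cases, controls)[1]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(genotypes)
        perm_cases = [genotypes[:n_cases].count(g) for g in range(3)]
        perm_controls = [genotypes[n_cases:].count(g) for g in range(3)]
        if max_statistic(perm_cases, perm_controls)[1] >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical genotype counts (GG, GA, AA), for illustration only.
    cases, controls = (70, 85, 29), (75, 90, 25)
    print(max_statistic(cases, controls))
    print(permutation_p(cases, controls, n_perm=500))
```

Because the three statistics are computed on the same data and are correlated, the maximum cannot be referred to an ordinary 1-df chi-square distribution; the permutation step accounts for this selection.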

To characterize the linkage disequilibrium (LD) pattern, we estimated the r² values for all pairs of SNPs. Haplotype analysis was conducted within LD blocks, defined as r² ≥ 0.6, using the R haplo.stats package19. The association of common haplotypes with the risk of recurrent pregnancy loss was estimated with the R function haplo.glm. Haplo.glm applies a haplotype-trait association test based on a generalized linear model framework using maximum likelihood estimates for haplotype effects, allowing for ambiguity of the haplotype phase. A log-additive risk model was assumed, in which haplotype-specific regression coefficients represent the change in the log odds of disease for every additional copy of the haplotype compared with homozygotes for the reference haplotype.
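For readers unfamiliar with r², the sketch below computes it for each SNP pair from haplotype frequencies, using r² = D² / (pA(1 − pA) pB(1 − pB)) with D = pAB − pA·pB. As input it uses the four major haplotype frequencies for SNPs 3 to 5 reported in the Results, renormalized to sum to 1. This is an approximation for illustration only: rarer haplotypes are ignored, and the study estimated r² from unphased genotype data rather than from this summary. The function names are hypothetical.

```python
from itertools import combinations

# Major haplotypes (SNP3, SNP4, SNP5) and their frequencies from the Results.
HAPLOTYPE_FREQS = {
    ("G", "G", "G"): 0.573,
    ("A", "A", "A"): 0.321,
    ("A", "A", "G"): 0.0556,
    ("G", "A", "A"): 0.0356,
}


def pairwise_r2(hap_freqs: dict, i: int, j: int, allele: str = "A") -> float:
    """r^2 between positions i and j of the haplotypes, tracking `allele`."""
    total = sum(hap_freqs.values())  # renormalize because rare haplotypes are dropped
    p_i = sum(f for h, f in hap_freqs.items() if h[i] == allele) / total
    p_j = sum(f for h, f in hap_freqs.items() if h[j] == allele) / total
    p_ij = sum(
        f for h, f in hap_freqs.items() if h[i] == allele and h[j] == allele
    ) / total
    d = p_ij - p_i * p_j
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


def all_pairs_r2(hap_freqs: dict) -> dict:
    """r^2 for every pair of positions in the haplotypes."""
    n_snps = len(next(iter(hap_freqs)))
    return {
        (i, j): pairwise_r2(hap_freqs, i, j)
        for i, j in combinations(range(n_snps), 2)
    }


if __name__ == "__main__":
    for (i, j), r2 in all_pairs_r2(HAPLOTYPE_FREQS).items():
        # Positions 0, 1, 2 correspond to SNP3, SNP4, SNP5.
        print(f"SNP{i + 3}-SNP{j + 3}: r2 = {r2:.3f}")
```

Under these simplifying assumptions, all three pairs exceed the r² ≥ 0.6 threshold used to define the LD block.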

The max-test is similar to or marginally better in power than the chi-square test based on the codominant model20. Therefore, the power analysis was carried out based on the codominant model using CaTS21. Given our sample size, our study has 60–80% power to detect a genotype risk ratio of 1.5 under the multiplicative model when the allele frequency is within the range observed in the sample (20–50%) (Supplementary Fig.). Note that the power estimate based on the codominant model is slightly conservative for the max-test.

All statistical analyses were conducted with R software (v.3.6.1)22. P < 0.05 was considered statistically significant.


Results

Characteristics of RPL patients and control subjects are presented in Table 1. The mean (SD) ages of the patients and control subjects were 36.8 (4.3) and 36.8 (5.5) years, respectively. The mean (SD) number of previous miscarriages was 3.09 (1.13).

Table 1 Characteristics of patients with RPL whose previous aborted conceptus was ascertained to exhibit aneuploidy.

A total of six SNPs were analyzed: rs2305957 (G>A at PLK4; SNP1), rs941288 (G>A at STAG3; SNP2), rs13230744 (G>A at STAG3; SNP3), rs1061230 (G>A at STAG3; SNP4), rs1624099 (G>A at STAG3; SNP5), and rs4727450 (G>A at STAG3; SNP6).

The genotype frequencies for the six SNPs were in Hardy–Weinberg equilibrium, giving no indication of sampling bias or genotyping error. The MAF of PLK4 rs2305957 was 0.367 in patients and 0.361 in controls (Table 2), similar to the MAF (A allele) of 0.365 reported for the JPT in the Integrative Japanese Genome Variation Database. Based on the max-statistic, the recessive model was adopted. Under this model, the distribution of rs2305957 (G>A) genotypes did not differ significantly between patients and controls (Table 3).

Table 2 Genotype and allele frequencies of PLK4 and STAG3 polymorphisms among RPL patients and controls.
Table 3 Results of maximizing association statistics.

The MAF of STAG3 rs941288 was 0.079 in patients and 0.053 in controls. The corresponding MAFs were 0.364 and 0.393 for rs13230744, 0.440 and 0.411 for rs1061230, 0.359 and 0.374 for rs1624099, and 0.245 and 0.261 for rs4727450. Since the AA genotype of rs941288 was not observed in the sample, we conducted Fisher’s exact test for this SNP and found no association with RPL (p-value = 0.168). For the remaining five SNPs, associations were assessed using maximized χ2 statistics, that is, the largest test statistic among the dominant, recessive, and additive models.
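A two-sided Fisher's exact test on a 2×2 table can be sketched as below. The genotype counts are not taken from Table 2: they are back-calculated for illustration from the reported MAFs (0.079 and 0.053), the sample sizes (184 and 190), and the absence of the AA genotype, and the choice to compare GA carriers with GG homozygotes is likewise an assumption. The result should therefore not be read as a reproduction of the reported analysis.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    # Sum over all tables at most as probable as the observed one.
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Assumed counts, back-calculated from the reported MAFs (not from Table 2):
    # with no AA genotype, carriers = MAF x 2 x group size, rounded.
    patients_ga, patients_gg = 29, 155   # 184 patients
    controls_ga, controls_gg = 20, 170   # 190 controls
    p = fisher_exact_two_sided(patients_ga, patients_gg, controls_ga, controls_gg)
    print(f"Fisher's exact p (illustrative counts): {p:.3f}")
```

Fisher's exact test is preferred here because one genotype class is empty, which makes the recessive-model table degenerate and the max-statistic inapplicable.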

The results of the max-statistic are shown for all five polymorphisms (Table 3); none of the five SNPs showed a statistically significant association. We also performed an LD analysis of the five STAG3 SNPs (Fig. 2), which revealed an LD block comprising SNPs 3, 4, and 5. Analysis of the region within the LD block indicated the presence of four major haplotypes: G-G-G (57.3%), A-A-A (32.1%), A-A-G (5.56%), and G-A-A (3.56%). Together these accounted for >98% of the haplotypes, and no haplotype was significantly associated with RPL risk compared with the most common haplotype, G-G-G, under the log-additive model (Table 4).

Fig. 2 Linkage disequilibrium (r² value) of the five STAG3 single-nucleotide polymorphisms.

Table 4 Results of haplotype analysis.


Discussion

In the present study, we found no significant differences in MAF or under any of the genetic models for the six SNPs of PLK4 and STAG3 between the RPL patients and controls.

We believe that the finding of the present study, that the PLK4 rs2305957 polymorphism was not associated with RPL, is important. McCoy et al.23 examined day 3 blastomeres and blastocysts used for preimplantation genetic testing for aneuploidy and found an association between maternal PLK4 rs2305957 and aneuploidy of mitotic origin in a genome-wide association study. That study did not find a gene associated with aneuploidy of meiotic origin. Mitotic errors produce mosaic embryos with two or more cell lineages, each possessing a distinct chromosomal complement. Women carrying the risk genotypes contributed fewer blastocysts, suggesting that their embryos were less likely to survive to blastocyst formation. In a study of 2015 infertile women, Zhang et al.12 demonstrated that women with the AA genotype showed significantly lower blastocyst formation and that the variant rs2305957 might represent a maternal risk factor for RPL. The MAF (A allele) was reported to be 0.489 in RPL patients and 0.425 in controls. However, no significant difference in early miscarriage rates or live birth rates was observed among women with AA, AG, and GG genotypes who underwent the standard IVF protocol12.

In contrast, we found no association between maternal rs2305957 spanning PLK4 and RPL. We focused on RPL patients whose POC showed autosomal aneuploidy, almost all of which were trisomies. Early studies demonstrated that most aneuploidies in miscarriages and affected children are due to errors in maternal meiosis and that increasing maternal age is a strong contributor to the occurrence of aneuploidy24. Because PLK4 is associated with aneuploidy of mitotic origin, we speculate that its risk allele might contribute not to miscarriage but rather to mosaicism and implantation failure.

We therefore focused on STAG3, a gene implicated in aneuploidy of meiotic origin. Cohesin, which controls sister chromatid cohesion, decreases with age, and we performed polymorphism analysis of a gene that encodes one of its components. Chromosomes replicate before meiosis, and sister chromatids are held together through the action of this protein complex25. In the first division of meiosis, cohesin at the centromere is not degraded, so sister chromatids stay together while spindle microtubules pull homologous chromosomes to opposite poles. In the second division of meiosis, the remaining cohesin is cleaved by separase, and the sister chromatids are distributed to opposite poles. Cohesin thus protects sister chromatid cohesion and supports the mono-orientation of kinetochores. No new cohesin is known to be produced during meiotic arrest, and it eventually disappears from the chromosomes in an age-dependent manner7. We examined STAG3 because it encodes a cohesin component and is a meiosis-specific gene expressed only in the early embryonic ovary. However, we failed to show any association with RPL.

REC8 (meiotic recombination protein 8) forms a synaptonemal complex, promotes recombination, recruits Shugoshin and MEIKIN proteins to protect centromeric cohesion, and maintains mono-orientation of the centromere structure26,27. MEIKIN (meiosis-specific kinetochore protein) is specifically expressed and localized in the first division of meiosis in mouse germ cells; in an analysis of knockout mice, loss of MEIKIN impaired mono-orientation of the kinetochores, indicating that it also plays an important role in centromeric cohesion28. One limitation of the present study is that we did not analyze any meiosis-associated genes other than STAG3. Even after narrowing down the polymorphisms retrieved from a database, the number with predicted functional effects remains too large to genotype in practice. In addition, relatively recently discovered genes have many unreported polymorphisms, making a database-dependent analysis difficult.

We could not find any association between PLK4 or STAG3 and RPL. A low level of anti-Müllerian hormone can indicate a poorly responsive ovarian follicle but cannot predict the future capability of a live birth. In contrast, a combination of genetic risk alleles might be useful for predicting future age-dependent infertility as well as RPL caused by aneuploidy. A greater knowledge of the collective risk of these alleles may cause women to think carefully about their life plan and encourage them to consider getting pregnant at a younger age. Such information could be highly useful. With this in mind, a genome-wide association study should be undertaken in the near future, as it will benefit not only women in Japan but also those elsewhere.