Variant in the synaptonemal complex protein SYCE2 associates with pregnancy loss through effect on recombination

Two-thirds of all human conceptions are lost, in most cases before clinical detection. The lack of detailed understanding of the causes of pregnancy losses constrains focused counseling for future pregnancies. We have previously shown that a missense variant in synaptonemal complex central element protein 2 (SYCE2), in a key residue for the assembly of the synaptonemal complex backbone, associates with recombination traits. Here we show that it also increases risk of pregnancy loss in a genome-wide association analysis on 114,761 women with reported pregnancy loss. We further show that the variant associates with more random placement of crossovers and lower recombination rate in longer chromosomes but higher in the shorter ones. These results support the hypothesis that some pregnancy losses are due to failures in recombination. They further demonstrate that variants with a substantial effect on the quality of recombination can be maintained in the population.

Human Reproduction and Embryology and American Society for Reproductive Medicine) have varied between two or more and three or more sporadic pregnancy losses, sometimes but not always required to be consecutive. Furthermore, data on women are often incomplete, leading to underrepresentation of recurrent events.
To avoid the complications inherent in the recurrent pregnancy loss phenotype, we defined a single pregnancy loss phenotype that includes women with any reported pregnancy loss, sporadic or recurrent.
This resulted in a case group of 114,761 women with one or more pregnancy losses, self-reported or identified through electronic health records. The ICD codes included were ICD10:O02.× 10 −8 , OR = 1.4). We found no association (P = 0.12) between this variant and our general pregnancy loss phenotype, which differs from the discovery phenotype in including recurrent pregnancy loss cases. However, given that recurrent pregnancy loss cases are only a small proportion of cases (based on both studies), it seems unlikely that this difference in phenotype definition can account for the lack of association.
Three variants were reported to associate with multiple consecutive miscarriage (rs7859844, MAF = 6.4%, P = 1.3 × 10⁻⁸, OR = 1.7; rs143445068, MAF = 0.8%, P = 5.2 × 10⁻⁹, OR = 3.4; and rs183453668, MAF = 0.5%, P = 2.8 × 10⁻⁸, OR = 3.8). We were unable to test rs183453668 in our data because this variant failed our sequencing quality control. We further note that this variant was not available in FinnGen (R8) data and did not pass quality filters in gnomAD (gnomad.broadinstitute.org), raising concerns regarding the quality of this variant in the discovery data. We tested the remaining two variants for association in our subset of 2,253 recurrent pregnancy loss cases and found no association (P > 0.05). As described above, the definition of recurrence is not the same in the two datasets. Our data are based on ICD codes only (ICD10:N96 and O26.2), while the discovery data of 750 cases include only ICD10:N96 but in addition use information derived from questionnaires and other data. It is, therefore, difficult to know whether the same phenotype is being tested. We do note that these are low-frequency variants, discovered in a small dataset with modest P-values.
Two points indicate that our pregnancy loss phenotype and the sporadic pregnancy loss phenotype defined by Laisk et al. are related. We tested the genetic correlation between the two traits and found a correlation of 0.73, P = 0.0001. Because of sample overlap, UK Biobank data were excluded from our meta-analysis for this analysis. Furthermore, the SYCE2 signal associates with sporadic miscarriage in the Laisk et al. study with P = 5.68 × 10⁻⁷, OR = 1.31, consistent with our results.

Supplementary Table 1. Pregnancy loss cases
Supplementary Table 2. Association results for SYCE2:p.His89Tyr in individual populations
We used logistic regression to test for association between genotype count as predictor and disease status as outcome in each dataset, adjusting for covariates such as year of birth (YOB), sex and population structure. P values are two-sided without Bonferroni correction. The table includes the MAF in each dataset and the imputation info as an estimate of the imputation quality. The combined results, P value and OR, are obtained from a fixed effect inverse variance weighted meta-analysis of the results from the individual datasets.
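The fixed effect inverse variance weighted combination named in this caption can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function name `fixed_effect_meta`, the input format (per-dataset odds ratios and standard errors of the log odds ratio) and the normal approximation for the combined two-sided P value are assumptions made for illustration only.

```python
import math


def fixed_effect_meta(odds_ratios, std_errors):
    """Combine per-dataset odds ratios with inverse-variance weights.

    odds_ratios: per-dataset OR estimates (hypothetical inputs).
    std_errors:  standard errors of the corresponding log(OR) estimates.
    """
    betas = [math.log(o) for o in odds_ratios]      # work on the log-odds scale
    weights = [1.0 / se ** 2 for se in std_errors]  # weight = 1 / variance
    total_w = sum(weights)

    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)

    # Two-sided P value from a normal approximation (an assumption here).
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))

    ci95 = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
    return {"or": math.exp(beta), "se": se, "p": p, "ci95": ci95}


if __name__ == "__main__":
    # Made-up numbers, not values from the study.
    print(fixed_effect_meta([1.10, 1.25, 1.18], [0.08, 0.12, 0.05]))
```

Datasets with smaller standard errors receive larger weights, so the combined estimate is pulled toward the most precisely estimated cohorts.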

Table 3. Association of SYCE2:p.His89Tyr with individual phenotypes included in the definition of pregnancy loss
The table includes, for each trait, the total number of cases and controls in the meta-analysis, and the OR, P-value and 95% CI from a fixed effect inverse variance weighted meta-analysis of the results for the individual cohorts included for each trait. P values are two-sided without Bonferroni correction.

Table 5. Effects of SYCE2:p.His89Tyr on the average telomere distance of maternal crossovers in offspring of carriers.
The table shows the association results with 95% CI. Effects are listed in units of Mb and standard deviation (SD). Association analysis was performed with linear regression under an additive genetic model, where the phenotype corresponds to the average telomere distance of crossovers in all offspring. CIs are inferred from P-values, which are computed with a two-sided t-test and unadjusted for multiple comparisons. The P-values shown in column 4 pertain to the association results in SD units. The singleton results in the second half of the table show the association results when we restrict to probands with a single crossover chromosome.
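Inferring a CI from a two-sided P-value, as this caption describes, can be sketched as follows. This is an illustrative approximation rather than the authors' code: it replaces the t distribution with a standard normal (reasonable only for large samples), and the function name `ci_from_p` and its arguments are hypothetical.

```python
from statistics import NormalDist


def ci_from_p(effect, p_value, level=0.95):
    """Back out a confidence interval from an effect and its two-sided P value.

    Assumes 0 < p_value < 1, a nonzero effect, and a normal approximation
    to the t distribution (an assumption of this sketch).
    """
    nd = NormalDist()
    # |z| implied by the two-sided P value; using the lower tail keeps
    # precision for very small P values.
    z_obs = -nd.inv_cdf(p_value / 2.0)
    se = abs(effect) / z_obs
    z_crit = nd.inv_cdf(0.5 + level / 2.0)
    return effect - z_crit * se, effect + z_crit * se


if __name__ == "__main__":
    # Made-up effect (in Mb) and P value, not values from the table.
    print(ci_from_p(-1.5, 1e-4))
```

By construction, an effect with P exactly equal to 0.05 yields a 95% interval with one endpoint at zero.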

Table 6. Effects of SYCE2:p.His89Tyr on the average per-chromosome maternal recombination rate in offspring of carriers.
The table shows the association results with 95% CI. Effects are listed in units of cM and standard deviation (SD). Association analysis was performed with linear regression under an additive genetic model, where the phenotype corresponds to the average of the per-chromosome recombination rate in all offspring. CIs are inferred from P-values, which are computed with a two-sided t-test and not adjusted for multiple comparisons. The P-values shown in column 4 pertain to the association results in SD units. Recombination rates are corrected for maternal age before the carrier effects are computed.
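The two-step analysis in this caption (maternal age correction, then an additive-model regression on allele count) could look roughly like the sketch below. The caption does not say how the age correction was done, so linear residualization on maternal age is an assumption, as is expressing the SD-unit effect by dividing by the sample SD of the corrected phenotype. The names `ols_slope_intercept` and `carrier_effect` are hypothetical.

```python
from statistics import mean, stdev


def ols_slope_intercept(x, y):
    """Simple least-squares fit of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def carrier_effect(rates_cm, maternal_ages, allele_counts):
    """Return the per-allele effect in cM and in SD units.

    Step 1 (assumed form of the correction): remove a linear maternal-age
    trend from the recombination rates.
    Step 2: regress the age-corrected rates on allele count (0, 1 or 2),
    which is the additive genetic model.
    """
    age_slope, age_icpt = ols_slope_intercept(maternal_ages, rates_cm)
    corrected = [r - (age_icpt + age_slope * a)
                 for r, a in zip(rates_cm, maternal_ages)]
    beta_cm, _ = ols_slope_intercept(allele_counts, corrected)
    return beta_cm, beta_cm / stdev(corrected)


if __name__ == "__main__":
    # Toy data, not from the study: rate = 99 + 0.5 * age - 2 * allele_count.
    ages = [20, 30, 20, 30]
    counts = [0, 0, 1, 1]
    rates = [109.0, 114.0, 107.0, 112.0]
    print(carrier_effect(rates, ages, counts))
```

Residualizing first and regressing second recovers the per-allele effect exactly only when age and genotype are uncorrelated in the sample; a joint regression with age as a covariate is a common alternative.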

Supplementary Table 7. Effect of SYCE2:p.His89Tyr on number of children
The table includes the number of mothers included for each dataset and their mean (and 95% CI) number of children broken down by genotype status of SYCE2:p.His89Tyr. The association between genotype count and number of children, in each dataset, was tested assuming a Conway-Maxwell Poisson distribution implemented in the glmmTMB package in R, for both the additive and the recessive genotype model, and the resulting P values and ORs were combined using a fixed effect inverse variance weighted meta-analysis. All P values are two-sided.
*Icelandic women born 1918-1983, Danish women born 1957-1973 and all women from UK Biobank were included in the analysis.

The Laisk et al. study defined two phenotypes. Their sporadic miscarriage phenotype (N = 49,996) was limited to one or two self-reported miscarriages, or ICD-10 codes O02.1 and O03 on one or two separate time-points (at least 90 days between episodes). Their second phenotype, multiple consecutive miscarriage (N = 750), was defined as follows: (i) five or more self-reported miscarriages, one live birth, no pregnancy terminations, (ii) three or more self-reported miscarriages, no live births, no pregnancy terminations, or (iii) three or more consecutive miscarriages (the first two criteria were used to ensure the consecutive nature of the miscarriages); and (iv) ICD-10 diagnosis code N96. Unlike our study, they do not include ICD10 O26.2, pregnancy care for patient with recurrent pregnancy loss, in their definition of pregnancy loss cases. Other than that, the two studies use the same ICD codes to define pregnancy loss. It is worth noting that both studies include data from the UK Biobank and that, based on the phenotype definitions, the overlap between our study and the Laisk et al. study likely extends to most of the 37,105 sporadic miscarriage cases as well as the 421 multiple consecutive miscarriage cases from the European part of the UK Biobank cohort included in their study.