Fertilization success is strongly dependent on gamete-level biochemical interactions (reviewed by Kekäläinen and Evans 2018), which are mediated by chemical signals released by the eggs and the reproductive tract of the female (Eisenbach and Giojalas 2006). These female-derived cues guide sperm toward unfertilized eggs (chemotaxis) and trigger remarkable changes in sperm physiology, including capacitation (functional maturation), hyperactivation, and the acrosome reaction (Yoshida et al. 2008; Cramer et al. 2016; Flegel et al. 2016; Brown et al. 2017; Kekäläinen and Evans 2017). In mammals, this complex signaling process ensures that only a minor proportion of ejaculated spermatozoa is capable of entering the oviducts, and that only a few of these cells eventually reach the egg(s) (Robertson 2005; Eisenbach and Giojalas 2006; Holt and Fazeli 2015). Accumulating evidence from nonhuman species indicates that these cellular-level signaling mechanisms may have an additional function in sexual selection by facilitating mate choice (“partner selection”) at the level of the gametes (Gasparini and Pilastro 2011; Alonzo et al. 2016; Rosengrave et al. 2016; reviewed by Kekäläinen and Evans 2018). The resulting nonrandom interactions between gametes frequently favor assortative fusion between genetically compatible sperm and eggs (Kekäläinen and Evans 2018). Recent studies have suggested that major histocompatibility complex (MHC) immune genes play an important role as mediators of this gamete-level selection process (Løvlie et al. 2013; Gasparini et al. 2015; Lenz et al. 2018).

In humans, mating preferences are known to be influenced by body odors associated with human leukocyte antigens (HLA), a gene complex that encodes MHC genes in humans (Wedekind et al. 1995; Jacob et al. 2002; Roberts et al. 2008; Sorokowska et al. 2018). Interestingly, there is evidence that both HLA molecules (Martín-Villa et al. 1996; Paradisi et al. 2000; Zhu et al. 2019; Sereshki et al. 2019) and HLA-linked olfactory receptors that detect these molecules (Ziegler et al. 2010; Flegel et al. 2016) are present on the surface of human sperm. Consequently, Ziegler et al. (2010) proposed that sperm HLA-linked olfactory receptors might have a key role in signaling the “self” (identity) and HLA genotype of the male (sperm) to females. Similarly, the secretions of the female reproductive tract, such as follicular fluid, are known to contain soluble HLA molecules (e.g., Rizzo et al. 2007; Ouji-Sageshima et al. 2016), which in turn may reveal the identity of the female to the sperm (Ziegler et al. 2010). Together, the above-mentioned findings suggest that MHC (and HLA) genes may mediate mating preferences at both the individual and the gamete level (Ziegler et al. 2010; see also Rülicke et al. 1998; Yeates et al. 2009; Løvlie et al. 2013; Firman et al. 2017; Geßner et al. 2017; Lenz et al. 2018, for gamete-level MHC preferences in animals). However, to the best of our knowledge, no earlier study has experimentally tested whether gamete-level HLA preferences exist in humans (Holt and Fazeli 2015; Kekäläinen and Evans 2018). Furthermore, the relative importance of HLA-mediated gamete interactions for human fertilization success (or infertility) has remained unexplored.

According to the current definition of the World Health Organization (WHO), infertility is a disease of the reproductive system, manifested by the inability to achieve pregnancy after 12 months of unprotected sexual intercourse (Zegers-Hochschild et al. 2009). Accordingly, the causes of infertility are divided into male- and female-derived pathological factors (Gardner et al. 2018). However, reliable assessment of human infertility is currently extremely challenging (e.g., Ray et al. 2012; Gelbaya et al. 2014; Skakkebaek et al. 2016; Oehninger and Ombelet 2019), and in a significant proportion of couples the cause of infertility remains unknown (Ray et al. 2012). Thus, at present, infertility diagnoses have been argued to be essentially prognoses rather than strict medical diagnoses (McLernon et al. 2014; Somigliana et al. 2016). Diagnostic challenges often arise due to high individual- and couple-specific variation in the probability of conceiving. Along with the above-mentioned findings, this indicates that rather than being just a pathological condition, infertility could also be caused by the immunogenetic incompatibility of the gametes.

In the present study, we investigated the role of follicular fluid, that is, the bioactive liquid surrounding the egg in an ovarian follicle, as a potential mediator of gamete-level mate preferences in humans. Follicular fluid contains secretions of the cumulus-oocyte complex, and these chemical factors are released into the oviduct during ovulation (e.g., Ezzati et al. 2014). Given that the cumulus-oocyte complex secretes sperm-activating factors both prior to ovulation within the follicle and after ovulation in the oviduct (Eisenbach and Giojalas 2006), the biochemical composition of follicular fluid can be expected to have important implications for the fertilization process. Supporting this view, the physiological response of sperm to follicular fluid has been demonstrated to predict both the fertilization success of the sperm and the success rate of fertility treatments (e.g., Ralt et al. 1991; Huang et al. 2007). We treated the sperm of eight men with the follicular fluid of ten women, in all possible male–female combinations (full-factorial design: n = 80 combinations). We then measured motility, hyperactivation, acrosome reaction, and viability of sperm in all of these combinations. Finally, we genotyped all the participants with a genome-wide SNP array, imputed their HLA class I and II alleles at the unique protein sequence level, and then studied whether the degree of HLA similarity (or genome-wide SNP divergence) between males and females predicts sperm fertilization competence. This experimental design allowed us to study functionally important pre-fertilization physiological changes in sperm, from early capacitation to the final major pre-fertilization modification, the acrosome reaction, and to compare how these physiological responses vary among independent male–female (and HLA genotype) combinations. Importantly, the applied full-factorial design permitted us to evaluate gamete-level compatibility differences between multiple male–female pairs without experimentally fertilizing the eggs, which would be practically impossible in humans.


Study subjects and sample collection

The participants (n = 10 women and eight men) of this study were recruited via the fertility clinic of the North Karelia Central Hospital, Joensuu, Finland. All the participants were Caucasian and their mean age was 33.2 (± 4.4 s.d.) years for females and 33.5 (± 4.4 s.d.) years for males. Follicular fluid samples were collected from females undergoing transvaginal follicular aspiration for in vitro fertilization. A transvaginal follicular puncture was performed under local anesthesia, using ultrasound guidance. Prior to collection, follicle maturation was hyperstimulated with follicle-stimulating hormone, and premature ovulation was prevented using a gonadotrophin-releasing hormone antagonist. To control for the potential impact of hyperstimulation on the follicular fluid composition, ovulation induction of all the participants was performed using an identical protocol. When the diameter of the largest follicle reached 18–20 mm, human chorionic gonadotrophin was administered. After collection, follicular fluid samples were centrifuged at 500 × g for 10 min, and the supernatant was aliquoted and stored in liquid nitrogen for later use (see below). Of the ten women, two were diagnosed with an ovulation disorder; the remaining eight had no infertility diagnoses.

All the male participants provided semen samples by masturbation after 2–5 days of sexual abstinence. After collection, semen samples were first allowed to liquefy for 30 min at +37 °C, and the spermatozoa were then separated from the seminal plasma by two-layer (40 and 80%) density gradient centrifugation (PureSperm® 40 and 80, Nidacon International AB, Mölndal, Sweden) according to the manufacturer’s instructions. After the density gradient centrifugation, spermatozoa were rinsed by an additional centrifugation in PureSperm® Wash solution (Nidacon). The resulting final sperm concentration was ca. 40 million cells/ml. All the male subjects were diagnosed as normozoospermic according to the WHO’s criteria (WHO 2010).

Sperm follicular fluid treatment and motility measurements

Follicular fluid samples of each of the ten women were divided into two independent replicate samples, A and B (20 samples in total, Supplementary Fig. 1). Washed sperm aliquots from each of the eight men were then combined with these 20 follicular fluid samples (25 µl sperm + 25 µl follicular fluid = follicular fluid treatment) in all possible male–female combinations (full-factorial design: n = 8 males × 10 females × two subsamples = 160 combinations in total). The selected follicular fluid concentration (50% dilution) was based on earlier findings demonstrating that follicular fluid stimulates sperm motility across a wide concentration range (Kulin et al. 1994; Ralt et al. 1994; Getpook and Wirotkarun 2007). Thus, even though the postovulatory follicular fluid concentration in the human oviducts is unclear, the physiological response of sperm to a 50% follicular fluid dilution can be expected to predict the sperm response in vivo. All the above-mentioned sperm follicular fluid dilutions were kept at +37 °C during the entire experimental period. For each male, all the sperm measurements (see below) were performed on the day of semen collection (i.e., by using fresh sperm).

The effect of follicular fluid treatment on sperm motility (curvilinear velocity: VCL, linearity of the swimming trajectory: LIN, and amplitude of the lateral head displacement: ALH) was determined using computer-assisted sperm analysis (Integrated Semen Analysis System, v. 1.2, Proiser, Valencia, Spain), with a negative phase contrast microscope (×100 magnification) and a capture rate of 100 frames s⁻¹. Sperm motility was measured by adding 1 μl of each of the 160 different sperm follicular fluid dilutions to pre-warmed (+37 °C) Leja 4-chamber (chamber height 20 μm) microscope slides (Leja, Nieuw-Vennep, the Netherlands). We recorded sperm motility at four different time points (30, 90, 180, and 300 min after the beginning of the follicular fluid treatment). The time points were selected based on the knowledge that the capacitated state of human sperm lasts ca. 50–240 min in vitro (Eisenbach and Tur‐Kaspa 1999). Within each male–female combination (and time point), sperm motility measurements included two independent recordings within both replicates, resulting in four motility recordings per male–female pair (n = 320 recordings in total). To avoid a potential time effect on the measured sperm traits, sperm motility in the first (A) replicate was always measured in the following order: FF1, FF2, …, FF10, whereas the second (B) replicate was measured in the opposite order: FF10, FF9, …, FF1 (Supplementary Fig. 1). The mean total number of measured sperm cells per male–female combination was 1160 (±23.7 s.e.m.).
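The design and measurement order described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the authors' actual workflow: the labels `M1`…`M8` and `FF1`…`FF10`, the function names, and the nesting of the two recordings inside each fluid are assumptions made for the example.

```python
from itertools import product

MALES = [f"M{i}" for i in range(1, 9)]      # eight men (hypothetical labels)
FLUIDS = [f"FF{i}" for i in range(1, 11)]   # follicular fluid of ten women
REPLICATES = ("A", "B")                     # two replicate aliquots per fluid
TIME_POINTS_MIN = (30, 90, 180, 300)        # minutes after start of treatment
RECORDINGS_PER_REPLICATE = 2                # two recordings within each replicate


def fluid_order(replicate):
    """Replicate A is measured FF1..FF10, replicate B in the reverse order."""
    if replicate == "A":
        return list(FLUIDS)
    if replicate == "B":
        return list(reversed(FLUIDS))
    raise ValueError(f"unknown replicate: {replicate!r}")


def build_schedule():
    """List every recording for one male as (time, replicate, fluid, recording).

    Assumption: both recordings of a fluid are taken back to back.
    """
    schedule = []
    for minutes in TIME_POINTS_MIN:
        for replicate in REPLICATES:
            for fluid in fluid_order(replicate):
                for recording in range(1, RECORDINGS_PER_REPLICATE + 1):
                    schedule.append((minutes, replicate, fluid, recording))
    return schedule


# Full-factorial treatment list: 8 males x 10 fluids x 2 replicates.
treatments = list(product(MALES, FLUIDS, REPLICATES))

# Recordings taken across all males at a single time point.
recordings_per_time_point = len(MALES) * sum(
    1 for row in build_schedule() if row[0] == 30
)

print(len(treatments))             # 160
print(recordings_per_time_point)   # 320
```

Reversing the order for replicate B means that a fluid measured early in one replicate is measured late in the other, so any drift over the measurement session tends to average out within each male–female pair.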

Measurement of sperm physiological state

The proportion of hyperactivated sperm cells at each of the above-mentioned four time points was determined based on three motility parameters that have been shown to characterize hyperactivated sperm motility (Kay and Robertson 1998): sperm VCL (VCL > 150 µm/s), LIN (LIN < 50%), and ALH displacement (ALH > 2.0). The resulting hyperactivated sperm subpopulation represented 20.9% of the total sperm count, which closely corresponds with earlier findings that ~20% of human sperm undergo hyperactivation simultaneously (reviewed by Kay and Robertson 1998).
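As a minimal sketch of this classification rule, the snippet below flags a sperm track as hyperactivated when it exceeds all three thresholds and reports the flagged share of all tracks. The `Track` class, the function names, and the example values are hypothetical; requiring all three criteria at once is our reading of the text, and the ALH unit is left as reported by the analysis system.

```python
from dataclasses import dataclass

VCL_MIN = 150.0   # curvilinear velocity threshold, µm/s
LIN_MAX = 50.0    # linearity threshold, %
ALH_MIN = 2.0     # lateral head displacement threshold (unit not stated in the text)


@dataclass(frozen=True)
class Track:
    """Motility parameters of one tracked sperm cell (hypothetical container)."""
    vcl: float
    lin: float
    alh: float


def is_hyperactivated(track):
    """True only if all three criteria are met (strict inequalities)."""
    return track.vcl > VCL_MIN and track.lin < LIN_MAX and track.alh > ALH_MIN


def hyperactivated_fraction(tracks):
    """Share of hyperactivated cells among all measured cells."""
    tracks = list(tracks)
    if not tracks:
        raise ValueError("no tracks supplied")
    return sum(is_hyperactivated(t) for t in tracks) / len(tracks)


# Made-up example values, not data from the study.
example = [
    Track(vcl=180.0, lin=40.0, alh=3.1),   # meets all three criteria
    Track(vcl=120.0, lin=60.0, alh=1.5),   # meets none
    Track(vcl=160.0, lin=45.0, alh=1.9),   # fails on ALH only
    Track(vcl=200.0, lin=30.0, alh=2.5),   # meets all three criteria
]
print(hyperactivated_fraction(example))    # 0.5
```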

Sperm acrosome reaction and viability in each male–female combination (300 min after the beginning of the follicular fluid treatment) were determined by staining the sperm samples with fluorescein isothiocyanate-labeled peanut agglutinin (PNA, 25 µg/ml) and propidium iodide (PI, 0.5 µg/ml), respectively. After staining, the sperm were immobilized with 1% formaldehyde on a microscope slide. The proportions of acrosome-reacted (PNA-stained) and/or dead (PI-stained) sperm were determined by imaging the sperm with a Zeiss Axioplan 2 fluorescent microscope, a Proiser C13-ON camera, and XIMEA CamTool software. The mean total number of imaged sperm cells per male–female combination was 511 (±15.2 s.e.m.).

Genotyping of study subjects

DNA of all the 18 subjects was extracted from EDTA blood using an automated Qiasymphony SP instrument (Qiagen). Extracted DNA samples were genotyped on the Illumina GlobalScreeningArray-24v2-0 + Multi-Disease beadchip at the Institute for Molecular Medicine Finland (FIMM). All genotypes were called with GenomeStudio 2.0.3 software, and a subset was checked manually based on pre-determined selection criteria such as low call rates, bad cluster separation, low signal intensity, quality scores, and heterozygote excess. The parameters used in the quality control of X chromosomal and autosomal markers were: call rate < 97%; cluster separation < 0.3; AB R Mean (low intensity SNPs) < 0.3; AB T Mean (SNPs where two clusters are close to each other) < 0.2 or > 0.8; heterozygote excess < −0.3 (not X chromosome) or > 0.2; AA or BB T Dev > 0.05; and AB T Dev (too many or CNV like clusters) > 0.07. PLINK v1.90b6.6 (Chang et al. 2015) was used to manage and filter the genotyping data. For genotyping quality control, the following settings were applied: missing genotype frequency (geno) < 0.05; P value for deviation from Hardy–Weinberg equilibrium > 1e−5; minor allele frequency > 0.01; individual missing genotype rate (mind) < 0.05. A total of 31,514 of 759,993 variants (4.1%) were discarded during the initial quality control procedure. 402,013 variants and all individuals passed the genotyping filters, and 2977 of the passed variants were located within the MHC region on chromosome 6.
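The four PLINK-stage thresholds listed above amount to simple pass/fail rules. The sketch below applies them to pre-computed summary statistics. It is not PLINK and does not reproduce PLINK's boundary handling; the `VariantStats` class, the function names, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass

GENO_MAX = 0.05     # maximum missing genotype frequency per variant
HWE_P_MIN = 1e-5    # minimum P value for the Hardy-Weinberg test
MAF_MIN = 0.01      # minimum minor allele frequency
MIND_MAX = 0.05     # maximum missing genotype rate per individual


@dataclass(frozen=True)
class VariantStats:
    """Pre-computed summary statistics for one variant (hypothetical container)."""
    variant_id: str
    missing_rate: float
    hwe_p: float
    maf: float


def passes_variant_qc(v):
    """Keep a variant only if it satisfies all three variant-level criteria."""
    return v.missing_rate < GENO_MAX and v.hwe_p > HWE_P_MIN and v.maf > MAF_MIN


def passes_sample_qc(missing_rate):
    """Keep an individual only if its missing genotype rate is below the limit."""
    return missing_rate < MIND_MAX


# Made-up example statistics, not values from the study.
variants = [
    VariantStats("snp_ok", missing_rate=0.00, hwe_p=0.40, maf=0.25),
    VariantStats("snp_missing", missing_rate=0.11, hwe_p=0.40, maf=0.25),
    VariantStats("snp_hwe", missing_rate=0.00, hwe_p=1e-8, maf=0.25),
    VariantStats("snp_rare", missing_rate=0.00, hwe_p=0.40, maf=0.0),
]
kept = [v.variant_id for v in variants if passes_variant_qc(v)]
print(kept)   # ['snp_ok']
```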

The HLA imputation of seven classical HLA genes (HLA-A, -B, -C, -DRB1, -DQA1, -DQB1, and -DPB1) at four-digit (i.e., protein level) resolution was carried out in R v3.4.4 (R Core Team 2018), using the package HIBAG v1.18.1 (Zheng et al. 2014) with default settings and the European reference panel (European-HLA4-hg19.RData) for the human genome build GRCh37. The HLA similarity between each male–female combination was determined by (1) calculating the number of shared HLA alleles (0–14) over the seven imputed HLA genes; and (2) calculating the genetic distance between HLA alleles using the hlaDistance function from the HIBAG R package. The whole-genome similarity measure between each male–female combination was calculated as the total sum of the number of shared genotypes for all genotyped and quality-filtered bi-allelic single nucleotide polymorphisms. In other words, genotyped variants were aggregated into a single value, which prevented any single variant from dominating the genome-wide similarity estimate, which in turn can be expected to increase the accuracy of the overall estimate.
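One plausible way to compute the two count-based similarity measures is sketched below. The exact counting rules are not spelled out in the text, so two assumptions are made: shared HLA alleles are counted per gene as the multiset overlap of the two partners' allele pairs (maximum 2 per gene, 14 overall), and a "shared genotype" means an identical genotype at a SNP. Function names and the example genotypes are hypothetical, and the distance-based measure from HIBAG is not reproduced here.

```python
from collections import Counter

HLA_GENES = ("A", "B", "C", "DRB1", "DQA1", "DQB1", "DPB1")


def shared_hla_alleles(male, female):
    """Count alleles shared between two people over the seven genes (0-14).

    Each argument maps a gene name to a pair of four-digit allele names.
    Per gene, the overlap is the multiset intersection of the two pairs.
    """
    total = 0
    for gene in HLA_GENES:
        overlap = Counter(male[gene]) & Counter(female[gene])
        total += sum(overlap.values())
    return total


def shared_genotype_count(male_snps, female_snps):
    """Count SNPs at which both people carry the same genotype.

    Genotypes are assumed to be coded as minor-allele counts (0, 1, or 2).
    """
    if len(male_snps) != len(female_snps):
        raise ValueError("genotype vectors must have equal length")
    return sum(1 for a, b in zip(male_snps, female_snps) if a == b)


# Made-up genotypes for illustration only.
male = {
    "A": ("02:01", "03:01"), "B": ("07:02", "15:01"), "C": ("07:02", "03:04"),
    "DRB1": ("15:01", "04:01"), "DQA1": ("01:02", "03:01"),
    "DQB1": ("06:02", "03:02"), "DPB1": ("04:01", "04:01"),
}
female = {
    "A": ("02:01", "24:02"), "B": ("08:01", "35:01"), "C": ("07:01", "04:01"),
    "DRB1": ("15:01", "03:01"), "DQA1": ("01:02", "05:01"),
    "DQB1": ("06:02", "02:01"), "DPB1": ("04:01", "02:01"),
}
print(shared_hla_alleles(male, female))                           # 5
print(shared_genotype_count([0, 1, 2, 1, 0], [0, 2, 2, 1, 1]))    # 3
```

Under the multiset rule, a homozygous allele in one partner counts only once against a single copy in the other, as in the DPB1 example above.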

Statistical analyses

The effects of male, female, and male–female interaction (combination) on sperm motility, hyperactivation, acrosome reaction, and viability were tested in linear mixed-effects models. In the models, sperm follicular fluid treatment replicate (A and B, see above) was used as a fixed factor, and male effect, female effect, and male–female interaction effect were used as random factors. The association between male–female HLA similarity or genome-wide similarity and the measured sperm traits was studied by adding the number of shared HLA alleles, the genetic similarity of HLA alleles, or the genome-wide SNP similarity of each of the n = 80 male–female combinations as an additional fixed effect in the above-mentioned models. Model assumptions were graphically verified using Q–Q plots and residual plots. All P values presented are from two-tailed tests, with α = 0.05. Mixed model analyses were conducted using the lmerTest package in R (version 3.5.1).
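Written out, the model structure described above could take the following form. The notation is ours rather than the authors': i indexes males, j females, and k replicates; r_k codes the replicate (A or B), and s_ij is the similarity measure of pair (i, j), which enters only in the models that test HLA or genome-wide similarity. Normally distributed random effects are the usual assumption in such models and are assumed here.

```latex
y_{ijk} = \mu + \beta_{\mathrm{rep}}\, r_k + \beta_{\mathrm{sim}}\, s_{ij}
          + m_i + f_j + (mf)_{ij} + \varepsilon_{ijk},
\qquad
m_i \sim \mathcal{N}(0, \sigma_m^2),\quad
f_j \sim \mathcal{N}(0, \sigma_f^2),\quad
(mf)_{ij} \sim \mathcal{N}(0, \sigma_{mf}^2),\quad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

In this form, a nonzero interaction variance \sigma_{mf}^2 means that the response of a given male's sperm depends on whose follicular fluid it is exposed to, beyond what the male and female main effects predict.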


Sperm motility, acrosome reaction, and viability

Sperm swimming velocity (VCL) and the proportion of hyperactivated sperm cells were affected by male, female, and male–female interaction (Table 1). Time point-specific analyses revealed that the male–female interaction effect for sperm swimming velocity and hyperactivation was statistically significant 180 and 300 min after follicular fluid treatment (Figs. 1 and 2, Supplementary Tables S1 and S2). Similarly, sperm viability was affected by all three random effects (male, female, and male–female interaction), whereas the sperm acrosome reaction was affected only by male and male–female interaction (Table 2 and Fig. 3). The proportion of hyperactivated sperm cells at 300 min was positively associated with the strength of the sperm acrosome reaction (t = 3.99, P < 0.001), but no association was detected between the proportion of hyperactivated sperm and sperm viability (t = −0.29, P = 0.78).

Table 1 Linear mixed model statistics for the effect of male, female, and male–female interaction (M × F) on sperm swimming velocity (VCL) and hyperactivation.
Fig. 1: Sperm motility in different male-female combinations.

The effect of male–female interaction (combination) on sperm swimming velocity (±s.e.m.) (a) and the proportion of hyperactivated sperm cells (±s.e.m.) (b) 300 min after the initiation of the follicular fluid treatment. Bar colors represent female identity (n = 10), within each male (1–8).

Fig. 2: Relative contribution of males, females, and male–female interaction on sperm motility.

Proportion of variance explained by males, females, and male–female interaction (combination) in sperm swimming velocity (VCL) (a) and hyperactivation (b) 30–300 min after the initiation of the follicular fluid treatment.

Table 2 Linear mixed model statistics for the effects of male, female, and male–female interaction (M × F) on sperm acrosome reaction (%) and proportion of dead sperm.
Fig. 3: Sperm acrosome reaction and viability in different male-female combinations.

The effect of male–female interaction (combination) on the sperm acrosome reaction (±s.e.m.) (a) and the proportion of dead sperm cells (±s.e.m.) (b). Bar colors represent female identity (n = 10), within each male (1–8).

The effect of HLA similarity and genome-wide similarity

The number of shared HLA alleles and the genetic similarity of HLA alleles between males and females were negatively associated with sperm swimming velocity (number of shared alleles: t = −3.14, P = 0.003 (Fig. 4); genetic similarity: t = −2.91, P = 0.005) and the proportion of hyperactivated sperm (number of shared alleles: t = −2.74, P = 0.008; genetic similarity: t = −2.50, P = 0.015). No statistically significant associations were found between HLA similarity and the proportion of acrosome-reacted sperm (number of shared alleles: t = −0.43, P = 0.67; genetic similarity: t = 0.22, P = 0.83) or the proportion of dead sperm cells, although the genetic similarity of HLA alleles tended to be associated with higher sperm mortality (number of shared alleles: t = 1.49, P = 0.142; genetic similarity: t = 1.82, P = 0.073). None of the four measured sperm parameters was associated with genome-wide male–female similarity (P > 0.16 in all cases).

Fig. 4: The association between male–female HLA similarity (number of shared alleles) and sperm swimming velocity (VCL).

The slope in the figure describes the predicted (i.e., modeled) average effect of HLA similarity on VCL in eight males. To control for between-male differences in intrinsic sperm quality, the slope was modeled at the level of individual males; the average slope across the eight males was clearly negative (intercept: 177.6; slope: −1.10, P = 0.003). The slope of the association did not vary across males (P = 0.28).
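As a rough worked reading of the reported average line, let n be the number of shared HLA alleles. The arithmetic below assumes that the intercept refers to n = 0 (an uncentered predictor) and that VCL is expressed in µm/s, as in the hyperactivation threshold; neither assumption is stated explicitly in the caption, and the value n = 5 is chosen only for illustration.

```latex
\widehat{\mathrm{VCL}}(n) = 177.6 - 1.10\, n,
\qquad
\widehat{\mathrm{VCL}}(0) = 177.6,
\qquad
\widehat{\mathrm{VCL}}(5) = 177.6 - 5.5 = 172.1
```

Under these assumptions, each additional shared allele corresponds to a predicted decrease of about 1.1 in average VCL.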


Our results show that the effects of follicular fluid on sperm motility, acrosome reaction, and viability were all strongly dependent on the male–female combination (interaction). In other words, follicular fluids (i.e., women) that have only a minor effect on the measured sperm traits in some males can have a major effect on sperm physiology in others. The relative importance of the male–female interaction effect increased with time, which is in line with earlier findings showing that follicular fluid-induced selective recruitment of sperm for fertilization takes a minimum of ca. 1–4 h (e.g., Cohen-Dayag et al. 1995). Finally, sperm motility and hyperactivation were negatively associated with the HLA similarity of male–female combinations, but were not affected by the genome-wide similarity of the males and females. Together, these results indicate that the functional (fertilization) competence of sperm is shaped by the HLA combination (dissimilarity) of males and females. Although previous studies have demonstrated the importance of HLA genes in human mate choice prior to copulation (Milinski et al. 2013; Kromer et al. 2016; Winternitz et al. 2017; Dandine-Roulland et al. 2019), the function of these genes in gamete-level mate choice has not previously been demonstrated in humans. However, Scofield et al. (1982) hypothesized that self–nonself recognition of the gametes may be the original adaptive function of ancestral MHC genes. Along with the present findings, this indicates that MHC molecules and/or MHC-associated odor receptors play an important role in partner selection at both the individual and the gamete level, possibly across the animal kingdom (see also Holt and Fazeli 2016).

Follicular fluid is released into the oviduct during ovulation, where it is also present during fertilization (e.g., Getpook and Wirotkarun 2007). Earlier studies have shown that follicular fluid has a critical function in the regulation of sperm pre-fertilization physiology, such as sperm motility and directed migration toward unfertilized eggs (chemotaxis) (Ralt et al. 1994; Fabro et al. 2002). It has also been shown that males exhibit high intra-specific differences in the strength of the sperm physiological response to female-derived reproductive secretions (reviewed by Kekäläinen and Evans 2018). However, it has remained unclear whether female-derived reproductive fluids could mediate selective fusion between gametes in mammals (but see Satake et al. 2006). This largely reflects the fact that gamete-level communication processes that naturally occur within the female reproductive tract are technically challenging to investigate (Firman et al. 2017). In the present study, the full-factorial experimental design allowed us to demonstrate that follicular fluid may have an important function in the gamete-level selection process in humans.

Despite the potential relevance of such nonrandom gamete-level interactions from both an evolutionary and a clinical point of view, the molecular-level mechanisms of gamete-mediated mate choice have remained largely unclear (but see Kekäläinen and Evans 2017; Chen et al. 2019). Ejaculates are known to trigger a strong immune response in the reproductive tract of females, and the strength of the response shows considerable individual variation across human males (Sharkey et al. 2007). It has therefore been envisaged that the immune system may play an important role in post-copulatory sexual selection (or gamete-mediated mate choice) (Morrow and Innocenti 2012; Wigby et al. 2019). Thus, in addition to its “normal” function in immune defense, the female immune system may also be involved in ensuring the immunogenetic compatibility of the gametes prior to their fusion (Robertson 2010; Kekäläinen and Evans 2018; see also Scofield et al. 1982). Interestingly, Sereshki et al. (2019) recently showed that mature human spermatozoa express HLA class I and II molecules on their surface, which indicates that HLA genes act as an important mediator of this gamete-level compatibility verification process. Together with these earlier studies, our present findings raise the novel possibility that infertility problems may not result exclusively from male or female pathological factors. Instead, fertilization problems may also arise from nonrandom gamete-level biochemical interactions that can reduce (or prevent) the fertilization success of HLA-similar partners.

From an evolutionary perspective, gamete-level mechanisms that discourage the fusion of such gametes are entirely plausible, since HLA allelic dissimilarity in offspring is associated with a broader peptide antigen presentation capability of HLA molecules and thus with a better ability to fight infections. Given that pathogens are thought to be the strongest selective agent in human evolution (e.g., Winternitz et al. 2017), selective fusion among HLA-dissimilar gametes can help to optimize the offspring’s immune response against prevailing pathogens (Wedekind et al. 1996; Lenz et al. 2018). Alternatively, a preference for HLA-dissimilar gametes may be independent of immunogenetic benefits if HLA alleles serve as a marker of the genetic relatedness of the mating partners (Winternitz et al. 2017). In this scenario, the primary function of HLA-dissimilarity preferences may be to prevent mating between close relatives (Huchard et al. 2013). In the present study, we found no association between the measured sperm traits and the genome-wide similarity of males and females, indicating that our results could be explained by direct gamete-level HLA-dissimilarity preferences rather than as a by-product of inbreeding avoidance based on HLA-independent cues.

In conclusion, our results demonstrate that sperm fertilization capability in humans shows major differences between different male–female combinations and is negatively affected by the HLA similarity of the partners. This raises the possibility that, along with male and female pathological conditions, infertility problems may also arise as a consequence of cellular-level HLA incompatibility avoidance processes that occur prior to gamete fusion. In other words, fertilization failure may not necessarily represent a disease of the reproductive system but may also indicate that the gametes of some partners are immunologically less compatible than those of others. We envisage that a better integration of the gamete compatibility concept into current clinical practices may open novel possibilities for the development of more accurate (personalized) infertility diagnostics. This would facilitate the planning and optimization of infertility treatments for individual couples, and in this way reduce the total duration and costs of these treatments.