Genome-wide association study to identify genomic regions and positional candidate genes associated with male fertility in beef cattle

Fertility plays a key role in the success of calf production, but there is evidence that reproductive efficiency in beef cattle has decreased during the past half-century worldwide. Therefore, identifying animals with superior fertility could significantly impact cow-calf production efficiency. The objective of this research was to identify candidate regions affecting bull fertility in beef cattle and positional candidate genes annotated within these regions. A GWAS using a weighted single-step genomic BLUP approach was performed on 265 crossbred beef bulls to identify markers associated with scrotal circumference (SC) and sperm motility (SM). Eight windows containing 32 positional candidate genes and five windows containing 28 positional candidate genes explained more than 1% of the genetic variance for SC and SM, respectively. These windows were selected to perform gene annotation, QTL enrichment, and functional analyses. Functional candidate gene prioritization analysis revealed 14 prioritized candidate genes for SC of which MAP3K1 and VIP were previously found to play roles in male fertility. A different set of 14 prioritized genes were identified for SM and five were previously identified as regulators of male fertility (SOD2, TCP1, PACRG, SPEF2, PRLR). Significant enrichment results were identified for fertility and body conformation QTLs within the candidate windows. Gene ontology enrichment analysis including biological processes, molecular functions, and cellular components revealed significant GO terms associated with male fertility. The identification of these regions contributes to a better understanding of fertility associated traits and facilitates the discovery of positional candidate genes for future investigation of causal mutations and their implications.


Results
Scrotal circumference genome wide association study and QTL enrichment. After quality control, 379,591 markers remained for analysis based on a call rate greater than 95% and a minor allele frequency greater than 5% with non-autosomal markers. The Manhattan plot in Fig. 1a shows the non-overlapping windows 1 Mb apart which explain the highest proportion of variance for SC. Eight windows explained more than 1% of the genetic variance for SC which were located on BTA9, BTA10, BTA20, BTA24 and BTA29, and explained 13.19% of the total genetic variance (Table 1). Thirty-two positional candidate genes were identified within these windows explaining more than 1% of the genetic variance for SC. ToppGene prioritization analysis revealed 14 of the 32 positional candidate genes were prioritized for spermatic-related processes and the windows which these genes are mapped within explained 9.76% of the total genetic variance for SC. The QTL annotation revealed several reproduction QTLs, annotated within the coordinates of the windows explaining > 1% of  Table 1. Significant associated windows explaining more than 1% of the variance for scrotal circumference and sperm motility in crossbreed beef cattle and the overlapping genes located within those windows. *Prioritized candidate gene identified by GUILDify and ToppGene analysis. Sperm motility genome wide association study and QTL enrichment. The Manhattan plot in Fig. 1b shows the non-overlapping SNP windows 1 Mb apart that explain the highest proportion of variance for SM. Five windows explained more than 1% of the genetic variance for SM which were located on BTA9, BTA13, BTA20, and BTA24, and explained 7.17% of the total genetic variance (Table 1). Twenty-eight positional candidate genes were identified within these SNP windows explaining more than 1% of the genetic variance for SM. ToppGene prioritization revealed 14 of the 28 positional candidate genes were prioritized for spermatic-related processes and the windows which these genes are mapped within explained 7.16% of the total genetic variance for SM. The QTL annotation revealed several reproduction QTLs, annotated within the coordinates of the win- . The x-axis shows the richness factor for each QTL, representing the ratio of number of QTLs and the expected number of that QTL. www.nature.com/scientificreports/ dows explaining > 1% of the total genetic variance for SM, account for 8.11% of the QTLs annotated in these regions (Fig. 3a). The QTL enrichment analysis revealed 13 significant QTLs (FDR-corrected p value ≤ 0.05) on BTA9 and BTA13 annotated for traits related to reproduction and exterior conformation ( Table 2; Fig. 3b).
Functional analysis of prioritized candidate genes for scrotal circumference and sperm motility. The network constructed with the 14 prioritized candidate genes for SC and SM (total of 28 genes) using NetworkAnalyst had 1442 nodes (genes) related to 120 GO:BP (Table S2a), 78 GO:MF (Table S3a) and 69 GO:CC (Table S4a) terms. One hundred and twenty GO:BP were significant (FDR-corrected p value ≤ 0.05) and 64 were identified as related to male fertility and reproduction (Table S2b). These 64 GO:BP terms were selected to construct the first module containing 752 genes (Table S2c). A second module was generated composed only by the prioritized candidate genes for SC and SM and its directly connected nodes leading to a network constructed of 20 genes and 26 GO:BP terms (Fig. 4a, Table S2d). None of the GO:BP in this module were significant (the smallest FDR-corrected p value was 0.128), however nine were related to male fertility and reproduction of which regulation of MAPK cascade, spermatid differentiation, and regulation of hormone secretion were the most notable (Table 3). In this second module, five of the 20 genes were prioritized candidate genes for SM and one gene was prioritized for SC. Of the 78 GO:MF, 75 were significant (FDR-corrected p value ≤ 0.05) and 34 were identified as related to male fertility and reproduction functions ( Table S3b). The first module for GO:MF terms was extracted using these 34 GO:MF terms containing 684 genes (Table S3c). A second module for GO:MF was generated composed only by the prioritized candidate genes for SC and SM and its directly connected nodes revealing a network constructed of 32 genes and 20 GO:MF terms (Fig. 4b, Table S3d). In this second module, 10 GO:MF were significant (FDRcorrected p value ≤ 0.05) and six of these were related to male fertility and reproduction including, acetyltransferase activity, zinc ion binding, lipase activity, endonuclease activity, nuclease activity, and cation channel activity ( Table 3). Five of the 32 genes in this second module for GO:MF were prioritized candidate genes for SM and four were prioritized candidate genes for SC.
Of the 69 GO:CC terms, 55 were significant (FDR-corrected p value ≤ 0.05) and 28 were related to male fertility and reproduction functions and processes (Table S4b). These 28 GO:CC terms were selected to extract the first module for GO:CC which contained 992 genes (Table S4c). "Batch Selection" of the prioritized candidate genes for SC and SM was used to extract the second module for GO:CC which revealed a network containing 51 genes and 21 GO:CC terms (Fig. 4c, Table S4d). In this second module, five GO:CC were significant (FDR-corrected p value ≤ 0.05) and three of these were related to male fertility and reproduction including, kinesin complex, spindle microtubule and cytosol ( Table 3). Five of the 51 genes in the second GO:CC module were prioritized candidate genes for SM and seven genes were prioritized candidate genes for SC. A graphical representation of the functional analysis can be found in Fig. 5.

Discussion
The ability to identify cattle with greater reproductive performance would significantly improve the efficiency of the beef cattle industry. Thus, the discovery of genetic markers related to male fertility through GWAS can contribute to the evaluation of fertility traits, such as SC and SM, and their implications on male fertility. Spermatic www.nature.com/scientificreports/ related traits, including SM are commonly measured in dairy cattle where semen is widely used for artificial insemination and measuring semen parameters is a standard practice 15,21,[24][25][26][27] . However, SM measurements are much less common in the beef industry where natural breeding is often used, which can be seen in our sample size. Testicular related traits, including SC have higher heritability values and correlations with fertility traits, although the majority of studies on SC have been on purebred bulls 18,21,[28][29][30][31][32] . Few SC GWAS have been conducted in crossbred beef cattle 26,33 , which represents a significant proportion of the beef industry 20 . To the best of our   26 . used SC records on 392 bulls in a GWAS and identified the STATU2 gene which participates in multiple biological processes, including reproduction, developmental and immune systems. When taken together, the limited studies conducted in crossbred beef cattle, the high heritability of SC and SM, and the evaluation of these traits together results in a study that can bring new considerations for the current stage of the literature regarding male fertility traits in beef cattle. The ssGBLUP method is based on an infinitesimal model and thus assumes equal variance for all SNP effects, posing an unrealistic situation for traits of economic interest, such as fertility 34 . Therefore, a weighted ssGBLUP (WssGBLUP) approached was used that combines pedigree, phenotype, and genotype data with the integration of different weights for markers in an iterative process to update SNP solutions. For SC, 8 windows, located on BTA9, BTA10, BTA24, and BTA29, explained more than 13.19% of the total genetic variance. For SM, 5 windows explained more than 7.17% of the genetic variance and were located on BTA9, BTA13, BTA20, and BTA24. Of these, BTA9 explained the highest proportion of the variance for SC (4.03%) and SM (2.76%), respectively. Chromosome-wise and genome-wise associations have previously been identified in BTA9 for SC at 420 days in Canchim beef cattle 26 . Moreover, a region on BTA9 in Nellore cattle was found to be associated with testicular hypoplasia, which is defined by a reduced size of both testicles, and consequently reduced scrotal circumference and sperm physiology 35 .
Of the 14 prioritized candidate genes identified for SC (SASH1, VIP, FBX05, MTRF1L, RGS17, SLC24A5, SEMA6D, MAP3K1, SRPRA, TIRAP, DCPS, ST3GAL4, KIRREL3, CDON), two, MAP3K1 and VIP, were previously reported as related to male reproduction. Mitogen-activated protein kinase kinase kinase 1 (MAP3K1), was mapped within a window that explained 1% of the variance in SM (BTA20: 22.18-23.18 Mb) and was of interest for its anti-and pro-apoptotic functions in germ cells 36 . Guan et al. 37 found the MAP3K1 gene was differentially expressed in testis from underfed and well-fed sexually mature sheep, indicating it could be a marker of germ cell apoptosis and therefore change SC and the efficiency of sperm production. The vasoactive intestinal peptide (VIP; BTA9: 89.51-90.51 Mb window), another prioritized candidate gene mapped within a window for SC that explained 1.61% of the variance, is known to act directly on the testis, promoting the production of testosterone in mice and rats 38,39 . In a study conducted by Lacombe et al. 39 , VIP null male mice exhibited a reduction in circulating concentrations of testosterone and follicle stimulating hormone (FSH), inhibiting the morphology of testicular seminiferous tubules. This gene has also been found to play a role in mammalian folliculogenesis, ovarian development, and puberty 5,40,41 .
Five of the prioritized candidate genes for SM have also been previously identified as candidate genes in the regulation of male fertility. These include, superoxide dismutase 2, T-complex protein 1, parkin co-regulated gene, sperm flagella 2 gene, and prolactin receptor (SOD2, TCP1, PACRG, SPEF2, PRLR). One of the main causes of sperm chromatin damage is through oxidative stress caused by an imbalance between reactive oxygen species and scavenger systems 42 . Superoxide dismutase isoenzymes, like SOD2 (BTA9: 95.18-96. 18 Mb window), destroy these toxic superoxide radicals that are normally produced within cells 43 . They have been found to be highly expressed in mammalian semen and their activity is positively associated with sperm concentration and motility 44,45 . Another gene found to be associated with spermatogenesis is TCP1 (BTA9: 95.18-96.18 Mb window), a member of the cytosolic chaperonin-containing TCP1 complex 46 , which has been identified in the cytosolic fraction 47 and plasma membrane 48 of bovine spermatozoa. The PACRG and SPEF2 genes are essential for Table 3. List of significant gene ontology terms associated with biological processes, molecular functions, and cellular components related to fertility identified by NetworkAnalyst using the protein-protein interaction network built by the prioritized candidate genes for scrotal circumference and sperm motility and the directly connected nodes. a False discovery rate. *Prioritized candidate gene identified by GUILDify and ToppGene analysis.  52 . This gene was also found to play a role in fertilization and survival rates through SNP-SNP interactions 53 .  18 Mb window) genes have been previously studied in bovine. The KIRREL3 gene was found to be highly expressed in granulosa cells and may act as a metabolic messenger linking metabolism, body composition and fertility 56 . The ACAT2 gene was associated with both daughter pregnancy rate and cow conception rate 57 , and MAS1 gene has been suggested to play a role in the regulation of www.nature.com/scientificreports/ ovulation 58 . Therefore, despite these genes being identified as prioritized candidate genes regulating male fertility traits, SC and SM, they also play a role in female fertility and regulation. Through functional analysis, three modules were created with the prioritized candidate genes for SC and SM for the GO:BP, GO:MF, and GO:CC terms related to male fertility and reproduction. Even though nine GO:BP in the module (created via selection of the GO terms related to reproductive processes and the prioritized candidate genes for SC and SM) were related to male fertility and reproduction, due to the few genes contained in the network, only one or two genes were associated with these BPs, thus, none of the terms were significantly enriched. Despite this, it is worthy to highlight the most notable BPs in this network: regulation of MAPK cascade, spermatid differentiation, and regulation of hormone secretion. The MAPK cascade regulates spermatogenesis, sperm maturation, and the acrosome reaction, thereby playing an important role in male reproductive processes 59 . One of the genes involved in the regulation of MAPK cascade is the T-complex protein 1 subunit alpha, TCP1, a prioritized candidate gene for SM and is involved in the assembly of actin and tubulin filaments 60 . Actin filaments are present in mammalian germ cells and are involved in a number of changes that occur during spermatogenesis such as, the determination of cell shape, motility, maturation of spermatozoa, and capacitation 61 . In particular, actin proteins change their distribution in the sperm head during maturation and control the balance between globular actin and fibrous actin 61 . Another GO:BP, spermatid differentiation, is also an important part of spermatogenesis that involves the differentiation of spermatids into mature spermatozoa in the seminiferous tubules 62 . The prolactin receptor, a SM prioritized candidate gene, was found to be directly linked with spermatid differentiation. Moreover, spermatogenesis and reproductive success is completely dependent on the secretion of various hormones, including FSH, androgen, and testosterone. For example, testosterone levels and SC in bulls are known to be correlated, with peak testosterone levels being lower in bulls with smaller SC 63 . Six of the significant GO:MF were related to male fertility and reproduction, including acetyltransferase activity, zinc ion binding, lipase activity, endonuclease activity, nuclease activity, and cation channel activity. Acetyltransferases, specifically, histone acetyltransferases, act during spermatogenesis through the differentiation of spermatocytes 64 . Zinc is incorporated into spermatozoa where it is bound by seminal fluid Zn-interacting proteins and plays a protective role for sperm chromatin decondensation and sperm motility 65 . Moreover, a number of cation channels, including potassium and chloride channels, are involved in sperm motility, maturation, and the acrosome reaction 66 . Three of the significant GO:CC terms were related to male fertility and reproduction, including, kinesin complex, spindle microtubule and cytosol. A number of kinesin families play key roles in mammalian spermatogenesis, including mitosis, meiosis, acrosome biogenesis, and tail formation 67 . Moreover, kinesins, specifically kinesin-13 s, regulate the spindle microtubule dynamics and control spindle assembly and kinetochore-microtubule attachments 67 . One example of the importance of the cytosol in male fertility is the cytosolic fraction of bovine spermatozoa, which exhibits tyrosine kinase activity associated with sperm capacitation and acrosome reaction 47 . The GO:BP term spindle microtubule contained the TIRAP and RGS17 genes which are prioritized candidate genes for SC and PRLR, a prioritized candidate gene for SM, indicating that these genes could be important biomarkers of bovine fertility.
The QTL annotation revealed reproduction QTLs account for 7.85% and 8.11% of the QTLs annotated in the windows explaining > 1% of the total genetic variance for SC and SM, respectively. These reproduction QTLs included, reproductive efficiency, age at first calving, calving ease, daughter pregnancy rate, interval from first to last insemination, fertility index, conception rate, and interval to first estrus after calving. The largest proportion of QTLs in this study were related to milk production for both SC and SM (31.94% and 54.05%, respectively). A QTL enrichment analysis was conducted as the simple bias of investigation in the QTL database for cattle can result in a larger proportion of records in the database. The enrichment analysis for the windows explaining > 1% of the total genetic variance for SC and SM revealed QTLs annotated for exterior conformation traits, including five different body depth, three different feet and leg conformation, five different strength, and five different stature QTLs annotated within the candidate windows for SC, and three different stature QTLs and three different body weight gain QTLs annotated within the candidate windows for SM ( Table 2; Table S5). In a study conducted by Schenkel et al. 68 , SC was found to be genetically correlated (P < 0.05) to average daily gain, ultrasound backfat thickness, mid-test metabolic weight and hip height in 13,151 bulls (r g = 0.24, 0.19, 0.31, 0.16, respectively). Another study revealed a positive correlation for SC and body weight in both pubertal and postpubertal Holstein bulls (r g = 0.76 and 0.45, respectively 69 ). Therefore, bulls with larger SC have a larger body size and faster growth [68][69][70] . This suggests that these regions may be regulating both fertility and conformation traits, however the biological mechanisms associated with this correlation is not well understood 6,71,72 . Other enriched QTLs for SC included udder depth, udder attachment, teat placement-rear, and udder cleft ( Table 2; Table S5). Cows with tightly attached udders and proper teats tend to remain in a herd longer, are easier to nurse, and less susceptible to mastitis, therefore udder traits may be used in conjunction with body conformation traits for the indirect selection of longevity in beef cattle 73,74 . The QTL enrichment analysis for SM also revealed QTLs related to fertility on BTA9, one for daughter pregnancy rate and two different QTLs for interval to first estrus after calving ( Table 2; Table S5). This is consistent with other studies that reveal a genetic correlation between male and female fertility traits. For example, Johnston et al. 75 found SM at 18 months of age to be highly correlated with female reproduction traits in Brahman and Tropical Composite cows, such as conception (0.53 and 0.72, respectively) and pregnancy rate (0.58 and 0.95, respectively). Gargantini et al. 76 identified a genetic correlation between yearling SC and age at puberty and pregnancy rate in heifers was − 0.57 and 0.35, respectively. Thus, BTA9 may be a candidate region for bovine fertility. A study conducted by McClure et al. 77 identified 41 SC QTLs in Angus cattle, of which three QTLs were identified on BTA9. The overlap between the candidate regions identified in the present study and previous studies reinforces the association of these genomic regions with the regulation of genes and biological processes responsible for male fertility traits, including VIP and SOD2 genes, mapped on BTA9, and the regulation of MAPK cascade and spermatid differentiation. www.nature.com/scientificreports/ To conclude, functional analysis for the prioritized candidate genes identified in this study revealed significant GO terms associated with biological processes and molecular functions related to male fertility and reproduction. For scrotal circumference, both MAP3K1 and VIP identified genes control testis function and could be used as potential biomarkers of spermatogenesis and apoptosis. For sperm motility, the SOD2, TCP1, PACRG, SPEF2, and PRLR genes were related to sperm concentration, development, and motility. These results help to better understand the genetic bases of scrotal circumference and sperm motility specifically in crossbred beef bulls and revealed positional candidate genes with additional functional evidence that might ultimately improve bull genomic prediction for these traits. Moreover, these candidate regions, specifically those mapped on BTA9, have known genetic correlations to other economically important traits including, conformation, female fertility, and udder structure. However, future research on these candidate genes and their impact on bull fertility is warranted.

Materials and methods
Population structure and phenotypic data. Data used in this study are from animals cared for under protocols approved by the University of Guelph Animal Care Committee, which follows guidelines of the Canadian Council on Animal Care (1993).
The population for this study came from the Ontario Beef Research Centre; the University of Guelph research farm located in Elora, ON, Canada and consisted of 265 crossbred bulls. At the time of collection, bulls had an average age of 384 days and average weight of 555 kg. The predominant breeds (and corresponding average composition of that breed in the test group, see Table S1) of these crossbred bulls were Angus (AN: 52%), Simmental (SM: 24%), Piedmontese (PI: 6.6%), Gelbvieh (GV: 6.3%), Charolais (CH: 3.8%), and Limousine (LM: 1.1%).
The fertility traits used in this study were SC and SM. Scrotal circumference was measured by palpating the testes into the lower half of the scrotum and measuring the greatest circumference using a looped tape as described by Awada et al. 19 . Semen ejaculates were collected between 12 and 15 months of age immediately after the SC measurement using electroejaculation (Pulsator IV-Auto Adjust Electro-Ejaculator: Lane Manufacturing, Inc, Denver, CO). Sperm motility was visually estimated immediately and ejaculates were then extended, cooled, placed in liquid nitrogen and thawed as previously described by Pursel & Johnson 78 . Sperm motility was assessed as described by Awada et al. 79  Genotyping and quality control. Genotyping was completed from 265 animals using the Affymetrix Genechip Bovine Genome High Density Array, which included 648,874 SNPs. Marker coordinates were converted to the new bovine genome assembly ARS-UCD 1.2. Quality control was performed using Plink 80 and the following criteria were used for the exclusion of SNPs: non-autosomal regions; minor allele frequency < 0.05; and a call rate < 95%.
Genome-wide association study. The programs of BLUPF90 family were used for the weighted singlestep genomic BLUP (WssGBLUP) analysis 81 . The GWAS results were reported as the proportion of the variance explained by non-overlapping genomic windows of 1 Mb 82 .
A WssGBLUP method was used to estimate SNP effects. The observed phenotypes of SC and SM were used as dependent variables in a single-trait animal model: where y represented a vector of observed phenotypes for animals (SC or SM); X is the incidence matrix of fixed effects; b is the vector of fixed effects, Z a is the incidence matrix of additive genetic effects; a is the random vector of additive genetic effects; and e is the vector of residual effects. Fixed effects for SC included 25 levels of herd-year-season (HYS), body weight, age as a third-degree polynomial, and breed composition for the most prevalent breeds (AN, SM, PI, GV, CH, and LM). Fixed effects for SM included 25 levels of HYS, age, and breed composition for the most prevalent breeds.
The solutions of the SNP effects (û) were obtained using the AIREMLF90 81 algorithm with two iterations, as proposed by Wang et al. 34 .
For each iteration of the algorithm D (n) = I and G (n) = ZD (n) Z'λ, where D is a diagonal matrix of weights for SNP variances, G is the weighted genomic relationship matrix, Z is a matrix relating genotypes of each locus, λ is a variance ratio, and n is the iteration number; the breeding values (â v ) were calculated using single-step genomic best linear unbiased predictor (ssGBLUP); SNP effects were calculated via û (n) = λD (n) Z′G (n) −1 â v ; SNP weights were calculated via d i(n+1) = û 2 i(n) 2p i (1 − p i ), where i is the ith SNP; the weights were normalized via D (n+1)= tr(D (0) ) tr(D (n+1) ) D(n+1) so the additive genetic variance remains constant; and the Genomic matrix was recalculated via G (n+1) = ZD ( n+1) Z'λ to obtain the SNP effects. These iterations were used to calculate the proportion of variance explained by non-overlapping windows.
The proportion of variance explained by non-overlapping windows were estimated using the PostGSf90 algorithm by summing the variance of SNPs within 1 Mb 82 . Windows that explained greater than 1% of the genetic variance for SC and SM were selected for QTL and gene annotation 83  www.nature.com/scientificreports/ enrichment analysis was also conducted using the GALLO R package for all the QTL information annotated within the candidate windows using a chromosome-based enrichment analysis. Briefly, a bootstrap approach was used to compare the observed number of QTL for each trait in each chromosome annotated using GALLO with the expected number for each trait estimated using 1000 iteration rounds of random sampling from the whole Cattle QTLdb. Using this approach, a p value for the QTL enrichment status of each annotated trait within the candidate windows was calculated, Additionally, the p values were corrected for multiple testing using FDR (5%).
Gene prioritization analysis. Functional candidate gene prioritization was conducted using the Topp-Gene Suite 85 . A trained list of genes associated with keywords outlined by Fonseca et al. 21 using the GUILDify 86 software and a species-specific (Homo sapiens) interaction network, including, "scrotal circumference, " "scrotal, " "testicular, " "testis, " "testes, " "sperm, " "semen, " "spermatozoa, " "spermatogenesis, " and "fertility, " made up the "trained list" of genes. From this analysis, the top 100 genes ranked using an algorithm based on network topology on GUILDify were used to create the "trained" gene list. This "trained" gene list was used in conjunction with the "test" gene list containing the genes within windows explaining greater than 1% of the genetic variance for SC and SM. An annotation-based prioritization analysis through a multivariate approach was conducted using ToppGene. Gene Ontology terms for molecular function (MF), biological process (BP), and cellular component (CC); human and mouse phenotypes; metabolic pathways; Pubmed publications; and diseases were utilized to retrieve functional information for the trained list and for the list of positional candidate genes (those genes annotated within windows explaining more than 1% of the genetic variance for the trait). Overall p values were obtained using a combination of the p values obtained from the intermediate values from the above functional information using a random sampling of 5000 genes from the whole genome for each annotation information. Significant prioritized genes were selected based on a FDR 5% multiple correction, meaning these genes have a functional profile that is significantly similar to the functional profile of the "trained" gene list. As the "trained" gene list is known to be associated with fertility, by the guilty by association principle, it is probable that the prioritized genes will also be associated with fertility.

Functional analysis.
A protein-protein interaction network analysis was performed in order to identify interactions between the positional candidate genes and other genes in the genome that are relevant to fertility. Therefore, the prioritized candidate genes for both SC and SM were inputted together into NetworkAnalyst 3.0 ( 87 https ://www.netwo rkana lyst.ca) and the STRING interactome protein-protein interaction database was used with a confidence score cutoff of 900. A second-order interaction network was generated containing 1442 nodes (genes) with 4888 edges. Then, gene ontology (GO) enrichment analyses, including the three main categories biological process (BP), molecular functions (MF), and cellular components (CC), was performed using the genes comprising the network 88 . Functional evidence of the relationship between the significant GO terms (FDR-corrected p value ≤ 0.05) and the target phenotypes (SC and SM) was identified. The GO terms related to reproductive processes were selected to extract the first module from this network composed only by the nodes (genes) associated with the selected GO:BP, GO:MF and GO:CC. Next, a second module was extracted, from the previous sub-network. This second module was composed only by the positional prioritized candidate genes for SC and SM and its direct connected nodes. The GO:BP, GO:MF and GO:CC enriched for this second module were analyzed for its association with reproductive processes.