Machine learning and hypothesis driven optimization of bull semen cryopreservation media

Tu, Frankie; Bhat, Maajid; Blondin, Patrick; Vincent, Patrick; Sharafi, Mohsen; Benson, James D.

doi:10.1038/s41598-022-25104-6

Published: 25 December 2022

Frankie Tu^1,4,
Maajid Bhat^2,4,
Patrick Blondin³,
Patrick Vincent³,
Mohsen Sharafi^3,4 &
…
James D. Benson⁴

Scientific Reports volume 12, Article number: 22328 (2022)

Abstract

Cryopreservation provides a critical tool for dairy herd genetics management. Due to widely varying inter- and within-bull post-thaw fertility, recent research on cryoprotectant extender medium has not dramatically improved suboptimal post-thaw recovery in industry. Progress is stymied by the interactions between samples and the many components of extender media and is often compounded by industry-irrelevant sample sizes. To address these challenges, here we demonstrate blank-slate optimization of bull sperm cryopreservation media by supervised machine learning. We considered two supervised learning models: artificial neural networks and Gaussian process regression (GPR). Eleven media components and initial concentrations were identified from publications in bull semen cryopreservation, and an initial 200 extender-post-thaw motility pairs were used to train and 32 extender-post-thaw motility pairs to test the machine learning algorithms. After coupling differential evolution with GPR, the median post-thaw motility increased from 52.6 ± 6.9% at generation 7 to 68.3 ± 6.0% at generation 21, with several media performing dramatically better than control media counterparts. This is the first study in which machine learning was used to determine the best combination of constituents to optimize bull sperm cryopreservation media, and it provides a template for optimization in other cell types.

Introduction

Semen cryopreservation is a critical aspect of modern animal agriculture and the economization of animal genetic programs¹. Moreover, semen is at the forefront of cryopreservation research, dating back to the 1600s, and was the tissue type that enabled the discovery of permeating cryoprotectants that define modern cryopreservation². Freezing protocols are associated with detrimental effects on different compartments of sperm, consequently affecting their post-thaw fertility potential³. To ameliorate this damage, sperm cryopreservation protocols use extenders to preclude both lethal and sublethal damage to sperm including preserving sperm motility, metabolic function, and fertility^3,4. However, post-thaw fertility is still dramatically reduced in all species⁵ and varies according to species and strain and with components and formulation of extenders, cryoprotectants used, and cooling rates¹.

Extender components play a critical role in determining the post-thaw fertility of frozen-thawed sperm⁶. Extenders contain many ingredients, such as membrane stabilizers (egg yolk, soybean lecithin, or milk), permeating cryoprotectants (glycerol, ethylene glycol, or dimethyl sulfoxide), buffers (TRIS or TES), sugars (glucose, lactose, raffinose, saccharose, or trehalose), salts (sodium citrate, citric acid), and antioxidants (enzymatic and non-enzymatic). In spite of decades of attempts to optimize the formulation of extenders, improvements have been limited, and as such, the majority of extenders cannot support approximately 40% of sperm during the freeze–thaw process^1,7,8. Conventional optimization of sperm cryopreservation has been attempted using very few changes in ingredients, such as adding or removing one, two, or three extender components, but these experimental designs are generally unable to discover the effects of each individual ingredient in interaction with others. Moreover, most sperm cryopreservation studies have been fairly limited in sample size, addressing these single-component differences among animal donor counts of less than 20, and typically less than 10. Because of the significant within- and among-animal variability in cryo-recovery, and the significant herd-to-herd or strain-to-strain differences in cryo-recovery, generalizing individual studies to sperm cryopreservation at large is challenging at best. This variability is a considerable challenge for sperm cryopreservation optimization: post-thaw recovery varies significantly even among ejaculates from the same bull^9,10. Progress towards optimal sperm cryopreservation protocols is camouflaged by this inter- and intra-bull variability, suggesting that for any protocol to be generally optimal, a very wide selection of bulls must be made, and repeat selections from within a herd must be incorporated into the experimental design.

Sperm cryopreservation is complicated by interactions of extender components¹¹. For example, recent sperm cryopreservation research is dominated by the study of antioxidants as a major component of sperm cryopreservation medium^12,13,14. However, since each step in the cryopreservation process and each component in cryopreservation medium has either oxidative or reductive effects, the choice and concentration of antioxidants to ensure homeostasis is challenging to generalize beyond the scope of a single experimental design. In fact, experimental evaluation of a few factors at a time (e.g. antioxidants with other extender components) can be very resource intensive¹⁵. Therefore, to make progress towards determining optimal media components for sperm cryopreservation that accounts for the multi-factorial interactions, a new experimental strategy is needed.

Recently, machine learning was used to optimize a cell culture medium for T cells¹⁶ and to design an integral membrane channel rhodopsin for efficient eukaryotic expression and plasma membrane localization¹⁷. Similarly, a differential evolution algorithm was used to optimize cryopreservation conditions for Jurkat cells and mesenchymal stem cells^18,19. In these experimental designs, protocols or media are optimized via information obtained through physical experiments guided by the algorithm. Machine learning advantageously addresses a large experimental parameter space (e.g. multiple media component concentrations) with significantly fewer total experiments than traditional factorial designs²⁰. This approach can therefore serve as an alternative to the prevalent strategy for optimizing extender compounds, with substantial gains in the efficiency of the process and in the final output¹⁶.

Here we consider two supervised learning models, artificial neural networks and Gaussian process regression, to determine the optimal machine learning model for our system (see Fig. 1). Artificial neural network models require no prior model knowledge or intuition regarding the relationship between the inputs and experimental results during the model training process^21,22. This allows the artificial neural network to approximate a wide range of different models²³. Gaussian process regression models are probabilistic and have been used to analyze and optimize media composition for lipid productivity²⁴. Unlike the artificial neural network, Gaussian process regression incorporates error estimates in its model predictions, and accounts for the error in experimental results²⁴.

A broad range of ingredients with preservative characteristics have been used in semen extenders during cryopreservation. For bull sperm, egg yolk^25,26 or milk²⁷ are the major extracellular components used to protect the plasma membrane against cryopreservation damage through the dynamic substitution of membrane phospholipids that ameliorates membrane phospholipid loss^28,29. Glycerol is a key cryoprotective agent (CPA) in most bull sperm extender media, though some reports of the success of the CPA ethylene glycol have been published³⁰. Both of these CPAs support dehydration at lower temperatures, reduce intracellular ice formation, and thus increase survival during cryopreservation¹⁴. Sugars such as fructose, sucrose, and trehalose are another common ingredient of extender media that provide some protection during cryopreservation. Sugars typically are not able to diffuse across the plasma membrane; instead, they create an osmotic pressure in the media that induces cell dehydration, a lower incidence of intracellular ice formation, and an increased likelihood of vitrification in the unfrozen space surrounding sperm at low temperatures³¹. Sucrose and trehalose are considered standard extracellular CPAs and are associated with the promotion of extracellular glass during cooling, but also with membrane stabilization and ice recrystallization inhibition^32,33. Fructose is generally an energy source for sperm, but it also facilitates adjusting extender osmolality and acts as a CPA³⁴. The combination of sugar and tris(hydroxymethyl)aminomethane (Tris) buffer has a great impact on the success of the cryopreservation medium³⁵.

During cryopreservation, sperm are subjected to many biochemical, mechanical, and ultrastructural stresses, causing detrimental effects on post-thaw parameters and fertility potential³⁶. Reactive oxygen species (ROS) are a major source of this damage, and many studies have hypothesized that antioxidants would be able to support sperm by counteracting ROS during the cryopreservation process. While there are myriad choices for antioxidant classes and species, here we focus on two: melatonin, a secretory product of the pineal gland that serves as an effective antioxidant and eliminates ROS during cryopreservation³⁷, and nerve growth factor (NGF), an antioxidant that supports cellular signal transduction and improves capacitation and acrosome protection; acrosome integrity is a critical characteristic of sperm during fertilization and needs to be protected during freeze-thaw^38,39.

In this manuscript we propose to identify an optimal bull sperm cryopreservation medium from a set of components listed in Table 1. To identify the optimal concentration of each component in this medium, we employ an iterative feedback loop informed by supervised machine learning and experimentation on semen from a large herd of production bulls. This process was repeated over generations of media designs and resulted in significant improvement in post thaw recovery against industry standard medium in a large sample group of bulls.

Table 1 Media components used in iterative design and their purpose. Each component concentration is varied during each trial according to predictions from the differential evolution algorithm.

Results

Optimal theoretical model

To determine the optimal model, 200 extender-post-thaw motility pairs were used to train and 32 extender-post-thaw motility pairs to test the machine learning algorithms. Cross validation produced an average mean squared error of 525 and 576 for the artificial neural network and Gaussian process regression, respectively. Post-thaw motility predictions by the artificial neural network and Gaussian process regression had mean squared errors of 275 and 883, respectively. Overall, the artificial neural network had less prediction error than the Gaussian process regression.

The median post-thaw motility at generation 6 for the artificial neural network predictions, Gaussian process regression predictions, and experimental results was 43.6 ± 2.9%, 36.0 ± 20.7%, and 47.4 ± 16.5% (median ± SD), respectively. The Kruskal–Wallis test indicated that there was no significant difference between artificial neural network predictions, Gaussian process regression predictions, and experimental results (χ² = 3.90, df = 2, p = 0.14; (Experimental vs. ANN) α = 0.05, m = 32, n = 32, ε₁ = 0.15, ε₂ = 0.15, W₊ = 0.56, \(\widehat{\sigma}\) = 0.08, CRIT = 0.31, REJ = 0; (Experimental vs. GPR) α = 0.05, m = 32, n = 32, ε₁ = 0.1, ε₂ = 0.1, W₊ = 0.63, \(\widehat{\sigma}\) = 0.07, CRIT = 0.57, REJ = 0). Pairwise comparison of the predictions with the Dunn test showed that artificial neural network and Gaussian process regression predictions were not significantly different (p = 0.196; α = 0.05, m = 32, n = 32, ε₁ = 0.1, ε₂ = 0.1, W₊ = 0.61, \(\widehat{\sigma}\) = 0.08, CRIT = 0.31, REJ = 0). The Fligner-Killeen test showed that the experimental results and the artificial neural network predictions were heteroscedastic (χ² = 9.23, df = 1, p = 0.002), whereas the experimental results and the Gaussian process regression predictions were homoscedastic (χ² = 1.20, df = 1, p = 0.27; Fig. 2).
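The two rank-based tests named above can be sketched with SciPy. This is only an illustration under stated assumptions: the three samples below are synthetic draws, not the study's data, the variable names are hypothetical, and the Dunn and equivalence tests are not reproduced here.

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for 32 generation-6 values per group (NOT the study data).
rng = np.random.default_rng(1)
experimental = rng.normal(47.0, 16.0, 32)
ann_pred = rng.normal(44.0, 3.0, 32)    # narrow spread, like a low-variance model
gpr_pred = rng.normal(40.0, 18.0, 32)   # wide spread, closer to the experiment

# Kruskal-Wallis: do the three groups share the same location?
h_stat, p_location = stats.kruskal(experimental, ann_pred, gpr_pred)

# Fligner-Killeen: do two groups share the same variance?
_, p_var_ann = stats.fligner(experimental, ann_pred)
_, p_var_gpr = stats.fligner(experimental, gpr_pred)

print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_location:.3f}")
print(f"Fligner-Killeen vs ANN-like sample: p = {p_var_ann:.3f}")
print(f"Fligner-Killeen vs GPR-like sample: p = {p_var_gpr:.3f}")
```

A small Fligner-Killeen p-value for one pairing and a large one for the other would correspond to the heteroscedastic and homoscedastic outcomes described in the text.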

Optimum extender composition

The median post-thaw motility from generation 1 to 6 increased from 25.5 ± 20.3% to 47.4 ± 16.5% (median ± SD) based solely on differential evolution. After coupling differential evolution with GPR the median post-thaw motility increased from 52.6 ± 6.9% at generation 7 to 66.7 ± 3.1% at generation 17 (median ± SD) using total motility as the optimization metric. The median post-thaw motility increased from 50.0 ± 14.0% at generation 18 to 68.3 ± 6.0% at generation 21 (median ± SD) using relative total motility as the optimization metric (Fig. 3).

The median post-thaw motility for the total motility metric at generations 15, 16, and 17 was 63.0%, 55.4%, and 66.7%, respectively. At generation 17, median post-thaw motility for the control medium was 70.5%. There was no difference in motility between the generation 17 extenders taken together and the commercial medium (χ² = 1.58, df = 1, p = 0.21; α = 0.05, m = 32, n = 4, ε₁ = 0.15, ε₂ = 0.15, W₊ = 0.30, \(\widehat{\sigma}\) = 0.15, CRIT = 0.10, REJ = 0). The median post-thaw motility for the relative total motility metric at generations 19, 20, and 21 was 63.3%, 66.5%, and 68.3%, respectively. At generation 21, median post-thaw motility for the control medium was 65.3%. The median post-thaw motility for all but one extender in generation 21 exceeded that of the control medium. However, there was no difference in motility between the generation 21 extenders taken together and the commercial medium (χ² = 2.13, df = 1, p = 0.14; α = 0.05, m = 32, n = 4, ε₁ = 0.15, ε₂ = 0.15, W₊ = 0.73, \(\widehat{\sigma}\) = 0.07, CRIT = 0.44, REJ = 0). Convergence testing between generations 19, 20, and 21 indicated that there were no differences between the three generations and that the algorithm was approaching a set optimal media composition (χ² = 4.31, df = 2, p = 0.12; (Gen 19 vs Gen 20) α = 0.05, m = 32, n = 4, ε₁ = 0.15, ε₂ = 0.15, W₊ = 0.44, \(\widehat{\sigma}\) = 0.07, CRIT = 0.52, REJ = 0; (Gen 20 vs Gen 21) α = 0.05, m = 32, n = 4, ε₁ = 0.1, ε₂ = 0.1, W₊ = 0.43, \(\widehat{\sigma}\) = 0.07, CRIT = 0.51, REJ = 0; (Gen 19 vs Gen 21) α = 0.05, m = 32, n = 4, ε₁ = 0.15, ε₂ = 0.15, W₊ = 0.34, \(\widehat{\sigma}\) = 0.07, CRIT = 0.63, REJ = 0). We did not observe any difference in post-thaw motility between the total post-thaw motility and relative post-thaw motility optimization metrics (χ² = 1.73, df = 1, p = 0.18; α = 0.05, m = 32, n = 4, ε₁ = 0.15, ε₂ = 0.15, W₊ = 0.60, \(\widehat{\sigma}\) = 0.07, CRIT = 0.10, REJ = 0) (Fig. 4).

Validation of top media designs

A set of five of the 196 algorithm-driven extenders was chosen for comparison with the standard control egg-yolk-based extender. The criterion for choosing the five extenders was relative post-thaw motility (see Fig. 5). The mean post-thaw motility and progressive motility were 31.9 ± 1.3% and 23.2 ± 1.1%, respectively, for the top five extenders, compared with 25.2 ± 1.1% and 16.8 ± 1.1%, respectively, for the corresponding controls. This corresponds to a 26.6% and 37.6% improvement in post-thaw motility and progressive motility, respectively. Table 2 shows the components of the top extenders and the concentration of each individual component. The post-thaw motion characteristics and flow cytometry parameters of the top extenders and the standard control extender are presented in Table 3. We used a stepwise approach to identify the top algorithm-driven extenders by comparing the various motion characteristics and flow cytometry parameters of post-thaw semen in the top algorithm-driven extenders with those in the commercial extender used as control. Total motility in the top extenders ranged from 57.4% to 63.4%, which is comparable to the 60.5% observed for the control extender. Progressive motility followed a similar trend: the average post-thaw progressive motility was equivalent to that of the control (Table 3). While the kinetic parameters VAP, VCL, ALH, and BCF had values similar to those of the control extender, LIN, an important fertility-associated metric, was significantly higher in the algorithm-driven extenders (Table 3). There was no significant difference in the percentage of mitochondrial activity, membrane functionality, or acrosome integrity within and between the five top extenders and the standard control extender (Table 3).

Table 2 The components and concentrations of the five top extenders driven by algorithm design.

Table 3 Post-thaw motion and flow cytometry parameters of bull sperm in 5 top algorithm driven extenders.

Discussion

Extender components play a critical role in protecting sperm during cryopreservation, and each component has a specific role as a stabilizer, cryoprotectant agent, energy provider, or antioxidant. Extenders are composed of a number of these ingredients, and there are multiple interactions between the individual components. Conventional methods of extender development are not able to discover the effects of each individual ingredient in interaction with others and often are only able to evaluate one or two factors at a time. These complications are compounded by the high within- and among-bull variability in post-thaw recovery from sample to sample. Our approach allows an experimental exploration of these interactions to determine optimal freezing media. In the current study, we used machine learning to optimize the composition of bull semen freezing media, with the hope that it would provide a tool robust against the widely varying within- and among-bull post-thaw fertility. This is a new strategy in bull semen cryopreservation.

The first objective of this study was to compare the post-thaw motility predictions of two machine learning models, artificial neural networks and Gaussian process regression. An artificial neural network is generic in structure and is able to learn from historical data, and from relatively less data than Gaussian process regression²¹. Bas and Boyasi²² used similar modeling techniques for enzyme kinetics and found that artificial neural networks are better than response surface methodology for both data fitting and estimation capabilities. We observed the same result within our study: the artificial neural network produced a smaller mean squared error than the Gaussian process regression model. However, the Kruskal–Wallis test and the Mann–Whitney test of equivalence indicated that there was no significant difference between the experimental results and either of our two models. This may indicate that mean squared error alone is insufficient as a metric for selecting the optimal theoretical model.

Although the mean squared error showed that the artificial neural network model is superior to the Gaussian process regression, it may not be the optimal model. In fact, its predictions differed markedly from the experimental data in their spread. The artificial neural network model predicted a small variation in post-thaw motility, whereas the experimental data showed a large variation in post-thaw motility. This may indicate that artificial neural network models are more suitable for the cryopreservation of homogeneous samples like cultured Jurkat cells or mesenchymal stem cells¹⁸. However, sperm samples are much more heterogeneous, with random sampling, genetic differences, and seasonal fluctuation producing samples that are highly variable. As such, we observed that the Gaussian process regression produced variance that was more comparable with experimental results. As a model, the Gaussian process regression accurately predicted experimental post-thaw motility and captured the sample variance, whereas the artificial neural network matched only the central tendency of post-thaw motility.

The second objective of this study was to optimize the formulation of ingredients in extender for improved post-thaw motility in bull sperm. To do this, we combined Gaussian process regression and differential evolution to optimize the extender components, producing media that were comparable to our commercial control extender. We needed 21 generations to create extenders that were comparable to commercial extenders that have been optimized over 60 years. Our approach also required significantly fewer trials than a simple factorial approach, which, with three levels (e.g. zero, low, high) for each factor (a crude estimate of real protocols), would have needed nearly 200,000 treatment combinations. Given the variability among and even within bull semen, running the number of experiments required for this many treatments would be impossible.
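The size of the factorial design mentioned above can be checked with a short calculation. This is a sketch under assumptions: it takes the eleven media components named in the abstract as the factors, and it uses the 196 algorithm-driven extenders reported in the validation section as the number of recipes actually tested.

```python
# Assumption: one factor per media component, eleven components as in the abstract.
levels_per_factor = 3        # zero, low, high
n_components = 11

full_factorial = levels_per_factor ** n_components
print(full_factorial)        # 177147, i.e. "nearly 200,000"

# Assumption: 196 algorithm-driven extenders, the count quoted in the validation section.
tested_recipes = 196
fraction_tested = tested_recipes / full_factorial
print(f"{fraction_tested:.4%} of the full factorial design")
```

Even this count ignores replication across bulls, which a factorial design would need in order to average out the sample-to-sample variability.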

Differential evolution is a parallel direct search method based on biological evolution mechanisms⁴⁰. Pollock et al.¹⁸ were the first to apply differential evolution to optimize media composition and cooling rate for the cryopreservation of cultured Jurkat (white blood) cells and mesenchymal stem cells. We found similar results when applying these techniques to bull sperm cryopreservation. This iterative approach has several benefits compared with conventional experimental factorial approaches, including adaptation to new samples, the ability to add or subtract ingredients easily, and the ability to tailor media to samples with low cryotolerance. Thus, we expect that this modeling-informed experimental design can be applied to other species’ sperm and to the cryopreservation of other nonhomogeneous cells and tissues. In the current study, with the total motility metric, mean post-thaw motility increased from one generation to the next more often than not. With the relative total motility metric, post-thaw motility began to plateau over generations 19 to 21. In the final generation for each optimization metric (Fig. 3), we observed that differential evolution using relative total motility produced more media that were on par with or better than the control.

During cryopreservation, extender components and their interactions dramatically affect sperm survival⁴¹. Sperm motility is the most frequently used quality control parameter for selecting sperm with the highest fertility potential⁴¹. There is a strong positive correlation between motility and fertility. Therefore, we selected motility as the main criterion for selecting the best extenders in each generation. The post-thaw motility of sperm in the latest version of the designed extender was comparable to that in the commercial control extender, which is in regular use for artificial insemination. To validate the motility data, we performed a flow cytometric experiment, and there was a logical relationship between mitochondrial activity, apoptosis, and acrosome integrity and the post-thaw motility observed in the previous step.

In the current study, extender formulations were parameterized using a number of components including buffer, sugars, CPAs, antioxidants, and membrane stabilizers. These parameters defined the extender recipes that were tested experimentally. The output of the experiment (total motility) was the input into the algorithm that prescribed the next set of extender components. This process was repeated sequentially until a stopping condition was observed. On average, our post-thaw motility increased 0.5% per trial, and some trials had media exceeding the post-thaw motility from standard protocols. Based on the post-thaw recovery in the best media, we repeated the testing of five final media obtained from our algorithmic approach that had comparable or superior post-thaw recovery compared to the control standard extender. Unfortunately, the gains seen in previous trials during the main experiment, where recovery was well above control medium, were not seen as dramatically in our follow-up experiment. However, an improvement in kinetic parameters, especially LIN, was observed in our top media. This improvement is notable because LIN affects field fertility. The algorithm-driven extenders have a variety of compounds that can stabilize the membrane integrity of sperm and buffer osmotic stress during freeze–thaw. Cholesterol-loaded cyclodextrin and trehalose were two major compounds of our extenders that are thought to be membrane stabilizers facilitating osmotic resistance against damage.

During the optimization process, the differential evolution algorithm eliminated some variables, and we manually removed others as new research emerged. There are many studies that compare glycerol and ethylene glycol as CPAs^42,43,44,45. Recently, it was shown that it may be possible to replace glycerol with ethylene glycol without affecting sperm quality or cryosurvival⁴². These studies looked at the individual effects of glycerol and ethylene glycol on post-thaw parameters, as ethylene glycol and glycerol both perform similar roles as permeating CPAs. We initially hypothesized that an extender with both ethylene glycol and glycerol may increase post-thaw motility. However, this was not the case. We found that extenders that contained ethylene glycol almost universally had lower post-thaw motility than those with glycerol only. The differential evolution algorithm often generated extenders that had zero ethylene glycol during the optimization process. At generation 21, we found that the extenders with the highest post-thaw motility contained no ethylene glycol. The differential evolution algorithm also eliminated fructose in some trials. Fructose is an integral component of extender recipes³, and it would be uncontroversial to state that removing it could have been detrimental. However, we observed the opposite: extenders without fructose had good post-thaw motility. It is unknown whether this positive result was merely caused by interactions between components in the extender. We leave the exploration of the component interactions for future work. The differential evolution algorithm included growth factor for 18 generations; however, work by Kowser et al.⁴⁶ indicated that growth factor may be detrimental to post-thaw motility. Thus, to explore this hypothesis, we manually removed growth factor for generation 19. We found that post-thaw motility increased after removing the growth factor. Note, however, that had this not been the case, we could have continued with the original recipe with growth factor included, without any loss or setback, as all additional data could be used to further train the Gaussian process regression. This shows that our optimization approach makes it possible to incorporate new information and explore ongoing hypotheses during the whole optimization process without any negative impact (e.g. wasted time or resources).

We were able to replicate some industry standard recipes after 21 generations (Table 2). Glycerol is a common CPA used in many extenders. The standard glycerol amount is 6% (v/v)⁴⁷, and the average glycerol amount we arrived at was approximately 6% (v/v). Typical tris-based extenders have 20% (v/v) egg yolk; our extenders needed slightly less. The standard fructose concentration is about 55.5 mM³⁴, and our new extenders that contained fructose needed either slightly less than the standard (50 mM) or very minimal amounts. This reduction may simply be due to the number of extender components that we explored, which reduced the amount of egg yolk and fructose that was needed. Our approach allowed us to replicate industry standards as well as reduce the amount or concentration of other necessary components. Therefore, we have demonstrated that this approach is a feasible methodology to verify that current extenders are optimal, to reduce the cost of manufacturing extenders, and to develop complex cryopreservation media for cells and tissues for which there is not a mature cryopreservation medium.

Conclusions

In this study, the Gaussian process regression model was optimized for predicting the post-thaw motility of sperm frozen in extenders generated by differential evolution. These approaches are the first steps towards improving post-thaw functionality, and the optimization still needs to be defined based on multiparameter fertility metrics such as a combination of motion and flow cytometric parameters. The post-thaw results obtained by this approach were comparable to commercial and industry standards; however, only a field fertility trial will provide complete evidence of the true fertility of each sample as a function of preservation medium.

Materials and methods

Chemicals and reagents

All chemicals used in the current study were provided by Sigma (St. Louis, MO, USA) and Merck (Darmstadt, Germany), unless otherwise indicated. All animal experimental procedures were approved by the University of Saskatchewan University Animal Care Committee (UAP 002CatA2018) and were performed in accordance with relevant guidelines and regulations. ARRIVE guidelines have been followed in the methods.

Bulls and semen collection

Ejaculates were collected using an artificial vagina from 68 Holstein bulls, aged between 15 and 30 months, regularly used for breeding purposes at Semex (Quebec, Canada). Semen samples were kept in a water bath at 33 °C while preliminary analysis of fresh semen was performed to measure concentration, motility, and morphology. Samples that did not meet minimum quality criteria (namely greater than 10⁹ sperm/mL, motility greater than 70%, and less than 15% abnormal morphology) were excluded from the study. All semen samples used in the study were collected as part of the regular production schedule from a production herd, and as such were a random selection of the greater production herd (~ 200 bulls) on site.

Extender preparation and cryopreservation

A standard control tris-based extender was prepared by dissolving 200 mmol/L tris (hydroxymethyl-aminomethane), 66.7 mmol/L citric acid and 55.5 mmol/L fructose. The tris-based solution was then mixed with 200 ml fresh egg yolk and distilled water to make 1 L of extender⁴⁸. At each generation, eight tris-based extenders were prepared by dissolving the prescribed concentrations of the ingredients presented in Table 1. For four bulls per generation, diluted semen straws were prepared in triplicate for each of the eight algorithmically generated media extenders and the control extender. To do so, diluted semen samples were cooled to 4 °C for a 4 h equilibration time in each medium with one-step processing, packaged in 0.25 ml French straws, and then frozen in a controlled rate freezer (Digitcool 007,262, IMV, France) as follows: from 4 °C to −12 °C at −4 °C/min, from −12 °C to −40 °C at −40 °C/min, and from −40 °C to −140 °C at −50 °C/min before plunging into liquid N2. After at least 24 h of storage, frozen straws were thawed by transferring them directly to a 37 °C water bath for 45 s before evaluation.
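As a quick check on the freezing ramp described above, the time spent in each segment follows directly from the temperature span and the cooling rate. The sketch below only restates the three segments given in the text; the function and variable names are illustrative, and any hold times or freezer lag are ignored.

```python
# Each segment: (start temperature in deg C, end temperature in deg C, cooling rate in deg C per min)
segments = [
    (4.0, -12.0, 4.0),
    (-12.0, -40.0, 40.0),
    (-40.0, -140.0, 50.0),
]

def segment_minutes(start_c, end_c, rate_c_per_min):
    """Time needed to cover the temperature span at a constant cooling rate."""
    return abs(end_c - start_c) / rate_c_per_min

durations = [segment_minutes(*segment) for segment in segments]
total_minutes = sum(durations)

for (start_c, end_c, rate), minutes in zip(segments, durations):
    print(f"{start_c:>6.1f} to {end_c:>6.1f} C at {rate:.0f} C/min: {minutes:.1f} min")
print(f"total controlled-rate time: {total_minutes:.1f} min")
```

Under these assumptions the controlled-rate portion takes a little under seven minutes before the straws are plunged into liquid nitrogen.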

Algorithm design

The machine learning approach comprised two parts: a model of the outcome as a function of all inputs, implemented as both an artificial neural network and a Gaussian process regression model, and the differential evolution algorithm that informs the next component selection.

The differential evolution algorithm used in this study was developed from strategy 1 of Storn and Price⁴⁰ (DE/rand/1/bin, which has better accuracy where sharp changes may appear in the optimization space) and was coded in Python; its output, the test population, was used for empirical testing. A schematic of the process is outlined in Fig. 1. Briefly, the differential evolution algorithm randomly generates an initial population (generation 0) that spans the entire parameter space. Then, a test population (generation 0.5) is generated from generation 0 by either crossover, in which specific design variable values from generation 0 are carried over into the test population, or mutation using strategy 1. Strategy 1 generates test population values by randomly selecting values from three other extenders in generation 0: we take the difference between two of the three values, multiply it by a mutation factor, and add the result to the third value. This process is repeated for every individual in generation 0. Then, an experiment is performed with the initial population and the test population, and every individual is scored. Only the highest scoring individuals from both populations are selected, and the selected individuals are labeled generation 1. The process is then repeated, using generation 1. After 10 generations, strategy 1 was modified to increase variation in the test population: on top of strategy 1, we added the parameter's reference value (Table 1) multiplied by a random value between ± (0–10) %. We set the crossover rate to 0.5 (i.e., crossover occurred 50% of the time) and the mutation factor for strategy 1 to 0.9; these values fell within the optimal range described by Pi et al.¹⁹ for strategy 1. The algorithm was considered complete, and convergence achieved, when post-thaw motility began to plateau across multiple generations.
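The generation step described above can be sketched in plain Python. This is a minimal illustration, not the study's code: the three-component recipe, its bounds, and the `toy_motility` scoring function are hypothetical stand-ins for the wet-lab measurement, and forcing at least one mutated component per trial follows the usual convention for binomial crossover rather than anything stated in the text. The mutation factor (0.9), crossover rate (0.5), and the "keep the best of both populations" selection follow the description.

```python
import random

F = 0.9    # mutation factor reported in the text
CR = 0.5   # crossover rate reported in the text

def make_trial(population, i, bounds, rng):
    """Build one test recipe for individual i (mutation plus binomial crossover)."""
    target = population[i]
    others = [p for j, p in enumerate(population) if j != i]
    a, b, c = rng.sample(others, 3)          # three other extenders
    forced = rng.randrange(len(target))      # assumption: at least one mutated component
    trial = []
    for k, (lo, hi) in enumerate(bounds):
        if rng.random() < CR or k == forced:
            value = c[k] + F * (a[k] - b[k])  # difference of two values, added to the third
            value = min(max(value, lo), hi)   # keep concentrations inside their range
        else:
            value = target[k]                 # carry the parent's value over
        trial.append(value)
    return trial

def next_generation(population, bounds, score, rng):
    """Score parents and test recipes together and keep the highest scorers."""
    trials = [make_trial(population, i, bounds, rng) for i in range(len(population))]
    pooled = sorted(population + trials, key=score, reverse=True)
    return pooled[:len(population)]

def toy_motility(recipe):
    """Hypothetical score standing in for measured post-thaw motility (percent)."""
    optimum = [6.0, 18.0, 50.0]               # arbitrary values, for illustration only
    return 70.0 - sum((x - o) ** 2 for x, o in zip(recipe, optimum)) / 50.0

rng = random.Random(0)
bounds = [(0.0, 10.0), (0.0, 25.0), (0.0, 100.0)]   # hypothetical concentration ranges
population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(8)]
initial_best = max(toy_motility(p) for p in population)
for _ in range(20):
    population = next_generation(population, bounds, toy_motility, rng)
final_best = max(toy_motility(p) for p in population)
print(f"best toy score: {initial_best:.1f} -> {final_best:.1f}")
```

Because parents compete with their test recipes, the best score can never decrease from one generation to the next, which is why a plateau is a natural stopping signal. In the experimental setting each call to the scoring function corresponds to a freeze-thaw measurement, so the population size and number of generations directly set the laboratory workload.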

The optimal theoretical post-thaw motility model used in this study was created in Python. An artificial neural network and a Gaussian process regression model were built using the TensorFlow and Scikit-Learn modules, respectively. The loss function for both models was the mean squared error of predicted vs measured post-thaw motility. The artificial neural network had 12 input neurons (i.e., design variables), 1 hidden layer with 12 neurons, and 1 output neuron (i.e., post-thaw motility). We used the rectified linear unit as the activation function and, to prevent overfitting, L2 regularization with a value of 0.0001. After 5 generations of differential evolution, we used tenfold cross validation to measure the theoretical fitness of each machine learning model. The tenfold cross validation method produced 10 mean squared error estimates, which were averaged. A smaller average mean squared error indicates better model fitness. Then, we trained each model using data from generations 1–5 (200 post-thaw motility–extender pairs) and predicted post-thaw motility for the extenders produced by differential evolution at generation 6 (32 post-thaw motility–extender pairs). The Kruskal–Wallis test was used to compare the post-thaw motility predictions from each model with experimental post-thaw motility. A p < 0.05 was deemed significant. We selected the model that most closely emulated the experimental post-thaw motility as the optimal theoretical model and coupled it with differential evolution at generation 7.
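The tenfold cross validation step can be illustrated with a short, dependency-free Python sketch. It is an assumption-laden illustration rather than the study's code: `kfold_mse` and `fit_mean_baseline` are hypothetical names, the data are placeholders, and a trivial mean predictor stands in for the TensorFlow neural network and Scikit-Learn Gaussian process models, either of which could be wrapped in a `fit` function with the same shape.

```python
import random

def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def kfold_mse(fit, X, y, k=10, seed=0):
    """Return the k per-fold MSE estimates and their average.

    `fit(X_train, y_train)` must return a function mapping one input row
    to a prediction. Assumes len(X) >= k so that no fold is empty.
    """
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        predictions = [predict(X[i]) for i in held_out]
        fold_errors.append(
            mean_squared_error([y[i] for i in held_out], predictions)
        )
    return fold_errors, sum(fold_errors) / k

def fit_mean_baseline(X_train, y_train):
    """Placeholder model: always predicts the mean of the training targets."""
    mean = sum(y_train) / len(y_train)
    return lambda row: mean

# Placeholder data: 200 rows of 12 design variables with made-up targets.
rng = random.Random(0)
X = [[rng.random() for _ in range(12)] for _ in range(200)]
y = [rng.uniform(30.0, 70.0) for _ in range(200)]
fold_errors, average_mse = kfold_mse(fit_mean_baseline, X, y)
print(len(fold_errors), average_mse)
```

Comparing `average_mse` across candidate models, with the same folds for each, mirrors the rule stated above that a smaller average mean squared error indicates better fitness.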

Assessment of semen parameters post-thaw

For the first 17 generations, we used post-thaw motility as the optimization metric to bring our model within a comparable range of the post-thaw recovery of our control extender medium. At generation 18, we changed the optimization metric to relative total motility, which gives more power to our post-thaw metrics by controlling for individual bull variability. The relative total motility of each sample was calculated as the ratio of its total motility in the algorithm-derived extender to its total motility in the control extender. Three additional iterations were performed using the new optimization metric.
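As a small worked illustration of the ratio, the following Python sketch computes relative total motility for one sample. The function name and the example values are hypothetical and are not taken from the study's data.

```python
def relative_total_motility(algorithmic_tm, control_tm):
    """Ratio of total motility in the algorithm-derived extender to the control.

    Both arguments are total motility percentages for the same sample.
    """
    if control_tm <= 0:
        raise ValueError("control total motility must be positive")
    return algorithmic_tm / control_tm

# Hypothetical sample: 60% total motility in the test extender, 50% in control.
print(relative_total_motility(60.0, 50.0))
```

A value above 1 means the algorithm-derived extender outperformed the control for that sample, and a value below 1 means it underperformed, regardless of how motile that bull's semen was overall.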

Motility was evaluated using the Sperm Class Analyzer (SCA; Microptic, Spain), capturing videos at 50 frames per second for one second⁴⁸. The diluted semen (2.5 µl) was placed on a pre-warmed chamber slide (37 °C, Life optic slide, 20 µm depth chamber), and motility characteristics were determined using a phase-contrast microscope (Nikon, Canada) with a 10× objective at 33 °C.

Validation of top performing media recipes

We used a stepwise approach to identify the top algorithm-driven extenders, comparing the motion characteristics and flow cytometry parameters of post-thaw semen between the top algorithm-driven extender and the Semex commercial extender used as the control. The validation experiment was carried out with 15 bulls in one trial, and bulls were considered replicates. No selection criteria were applied to the bulls, as they were in normal production.

Motion characteristics were measured with the SCA as briefly explained above. Flow cytometric evaluations were performed using a BD LSRII cytometer (BD Biosciences, CA, USA). Three lasers with wavelengths of 355 nm (ultraviolet laser, 25 mW output), 488 nm (blue laser, 20 mW output) and 633 nm (red laser, 17 mW output) were used to analyze semen samples. For each analysis, 10,000 sperm were measured, and data were saved as FCS files. Propidium iodide (PI) was used to measure the membrane integrity of sperm. PI binds to the DNA of cells with a damaged membrane and displays red fluorescence after excitation with a 488 nm laser. Sperm showing red fluorescence after excitation with the 488 nm laser were counted as sperm with a non-intact membrane, reported as the percentage of membrane damage (MD). Acrosome reaction was measured using peanut (Arachis hypogaea) agglutinin (PNA), a protein that binds to proteins of the acrosome. PNA was conjugated with fluorescein isothiocyanate (FITC), a molecule that is excited at 488 nm and emits green fluorescence at 518 nm. Sperm showing green fluorescence after excitation with the 488 nm laser were counted as acrosome-reacted (AR) sperm. MitoTracker Deep Red, which fluoresces red after excitation with 633 nm laser light, was used to measure the mitochondrial activity of sperm. MitoTracker Deep Red can cross the plasma membrane and enter active mitochondria. Sperm showing red fluorescence after excitation with the 633 nm laser were counted as sperm with active mitochondria, reported as a percentage.

Statistical analysis

All statistical analyses were conducted in R (version 4.1.2, R Core Team (2021)⁴⁹) using the packages FSA: Fisheries Stock Analysis⁵⁰ and EQUIVNONINF: Testing for Equivalence and Noninferiority⁵¹. To assess the predictive ability of the machine learning models, a Kruskal–Wallis test, a Fligner–Killeen test, and a Mann–Whitney test for equivalence were used to compare model predictions with experimental results. A post-hoc Dunn test was then used to test pairwise relationships between experimental results, artificial neural network post-thaw motility predictions, and Gaussian process regression post-thaw motility predictions. We also compared the post-thaw motility produced by generation 17 and generation 21 extenders with that of commercial extenders to assess optimization metric performance (64 motility–extender pairs). Generations 19 to 21 were compared using the Kruskal–Wallis test to assess whether differential evolution was converging to an optimum. Comparisons were considered significant for p < 0.05. We defined a conservative symmetrical equivalence margin of 15%, corresponding to (−0.15, 0.15)⁵². Our equivalence margins were chosen to match the industry standard sperm motility quality acceptance region of 30%. Medians and standard deviations are reported for all values unless otherwise specified.
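For readers who want to see what the Kruskal–Wallis comparison computes, the following dependency-free Python sketch implements the H statistic with the usual tie correction. It is illustrative only: the study ran its analyses in R, the function names and example data here are hypothetical, and the p-value helper covers only the three-group case (2 degrees of freedom), where the chi-square approximation reduces to exp(−H/2).

```python
import math

def kruskal_wallis(*groups):
    """Return (H, degrees of freedom) for the Kruskal-Wallis rank test.

    Tied values receive the average of the ranks they span. Assumes the
    pooled values are not all identical (the tie correction would be zero).
    """
    pooled = sorted((value, g) for g, group in enumerate(groups) for value in group)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    start = 0
    while start < n_total:
        end = start
        while end + 1 < n_total and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            rank_sums[pooled[k][1]] += average_rank
        t = end - start + 1
        tie_term += t ** 3 - t
        start = end + 1
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        r * r / len(group) for r, group in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    return h / correction, len(groups) - 1

def p_value_three_groups(h):
    """Chi-square upper tail for 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2)

# Hypothetical motility values for measured, ANN-predicted and GPR-predicted groups.
measured = [52.0, 55.5, 49.0, 61.0]
ann_predicted = [58.0, 63.5, 60.0, 66.0]
gpr_predicted = [53.0, 54.5, 50.5, 59.0]
h, df = kruskal_wallis(measured, ann_predicted, gpr_predicted)
print(h, df, p_value_three_groups(h))
```

A p-value below the 0.05 threshold stated above would indicate that at least one group's distribution differs, which is what the post-hoc Dunn test then resolves pairwise.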

Data availability

The datasets generated during the current study are available on request.

References

Ugur, M. R. et al. Advances in cryopreservation of bull sperm. Front. Vet. Sci. 6, 1–15 (2019).
Benson, J. D., Woods, E. J., Walters, E. M. & Critser, J. K. The cryobiology of spermatozoa. Theriogenology 78, 1682–1699 (2012).
Amirat, L. et al. Bull semen in vitro fertility after cryopreservation using egg yolk LDL: A comparison with Optidyl®, a commercial egg yolk extender. Theriogenology 61, 895–907 (2004).
Yoon, S. J., Kwon, W. S., Rahman, M. S., Lee, J. S. & Pang, M. G. A novel approach to identifying physical markers of cryo-damage in bull spermatozoa. PLoS ONE 10, 1 (2015).
Medeiros, C. M. O., Forell, F., Oliveira, A. T. D. & Rodrigues, J. L. Current status of sperm cryopreservation: Why isn’t it better?. Theriogenology 1, 5327–5344 (2002).
Pojprasath, T., Lohachit, C., Techakumphu, M., Stout, T. & Tharasanit, T. Improved cryopreservability of stallion sperm using a sorbitol-based freezing extender. Theriogenology 75, 1742–1749 (2011).
Lonergan, P. Historical and futuristic developments in bovine semen technology. Animal 12, s4–s18 (2018).
Mousavi, S. M. et al. Comparison of two different antioxidants in a nano lecithin-based extender for bull sperm cryopreservation. Anim. Reprod. Sci. 209, 1 (2019).
Murphy, E. M. et al. Influence of bull age, ejaculate number, and season of collection on semen production and sperm motility parameters in holstein friesian bulls in a commercial artificial insemination centre. J. Anim. Sci. 96, 2408–2418 (2018).
Thurston, L. M., Watson, P. F. & Holt, W. V. Semen cryopreservation: A genetic explanation for species and individual variation?. Cryo-Letters 23, 255–262 (2002).
Sieme, H., Oldenhof, H. & Wolkers, W. F. Mode of action of cryoprotectants for sperm preservation. Anim. Reprod. Sci. 169, 2–5 (2016).
Ashrafi, I., Kohram, H. & Ardabili, F. F. Antioxidative effects of melatonin on kinetics, microscopic and oxidative parameters of cryopreserved bull spermatozoa. Anim. Reprod. Sci. 139, 25–30 (2013).
ChaithraShree, A. R. et al. Effect of melatonin on bovine sperm characteristics and ultrastructure changes following cryopreservation. Vet. Med. Sci. 6, 177–186 (2020).
Barbas, J. P. & Mascarenhas, R. D. Cryopreservation of domestic animal sperm cells. Cell Tissue Bank. 10, 49–62 (2009).
Galbraith, S. C., Bhatia, H., Liu, H. & Yoon, S. Media formulation optimization: current and future opportunities. Curr. Opin. Chem. Eng. 22, 42–47 (2018).
Grzesik, P. & Warth, S. C. One-time optimization of advanced T cell culture media using a machine learning pipeline. Front. Bioeng. Biotechnol. 9, 1 (2021).
Bedbrook, C. N., Yang, K. K., Rice, A. J., Gradinaru, V. & Arnold, F. H. Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization. PLoS Comput. Biol. 13, 1 (2017).
Pollock, K., Budenske, J. W., McKenna, D. H., Dosa, P. I. & Hubel, A. Algorithm-driven optimization of cryopreservation protocols for transfusion model cell types including Jurkat cells and mesenchymal stem cells. J. Tissue Eng. Regen. Med. 11, 2806–2815 (2017).
Pi, C. H., Dosa, P. I. & Hubel, A. Differential evolution for the optimization of DMSO-free cryoprotectants: Influence of control parameters. J. Biomech. Eng. 142, 1–10 (2020).
Li, R., Hornberger, K., Dutton, J. R. & Hubel, A. Cryopreservation of human iPS cell aggregates in a DMSO-free solution—An optimization and comparative study. Front. Bioeng. Biotechnol. 8, 1 (2020).
Desai, K. M., Survase, S. A., Saudagar, P. S., Lele, S. S. & Singhal, R. S. Comparison of artificial neural network (ANN) and response surface methodology (RSM) in fermentation media optimization: Case study of fermentative production of scleroglucan. Biochem. Eng. J. 41, 266–273 (2008).
Baş, D. & Boyacı, İH. Modeling and optimization II: Comparison of estimation capabilities of response surface methodology with artificial neural networks in a biochemical reaction. J. Food Eng. 78, 846–854 (2007).
Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).
Myers, R. H. & Montgomery, D. C. Response surface methodology: Process and product in optimization using designed experiments (1995).
Mayer, D. T. & Lasley, J. F. The factor in egg yolk affecting the resistance, storage potentialities, and fertilizing capacity of mammalian spermatozoa. J. Anim. Sci. 4, 261–269 (1945).
Pace, M. M. & Graham, E. F. Components in egg yolk which protect bovine spermatozoa during freezing. J. Anim. Sci. 39, 1144–1149 (1974).
Liu, Z., Foote, R. H. & Brockett, C. C. Survival of bull sperm frozen at different rates in media varying in osmolarity. Cryobiology 37, 219–230 (1998).
Purdy, P. H. & Graham, J. K. Effect of cholesterol-loaded cyclodextrin on the cryosurvival of bull sperm. Cryobiology 48, 36–45 (2004).
Raheja, N., Grewal, S., Sharma, N., Kumar, N. & Choudhary, S. A review on semen extenders and additives used in cattle and buffalo bull semen preservation. J. Entomol. Zool. Stud. 6, 239–245 (2018).
Forero-Gonzalez, R. A. et al. Effects of bovine sperm cryopreservation using different freezing techniques and cryoprotective agents on plasma, acrosomal and mitochondrial membranes. Andrologia 44, 154–159 (2012).
El-Sheshtawy, R. I., Sisy, G. A. & El-Nattat, W. S. Effects of different concentrations of sucrose or trehalose on the post-thawing quality of cattle bull semen. Asian Pac. J. Reprod. 4, 26–31 (2015).
Woelders, H., Matthijs, A. & Engel, B. Effects of trehalose and sucrose, osmolality of the freezing medium, and cooling rate on viability and intactness of bull sperm after freezing and thawing. Cryobiology https://doi.org/10.1006/cryo.1997.2028 (1997).
Ahmad, E. & Aksoy, M. Trehalose as a cryoprotective agent for the sperm cells: A mini review. Anim. Heal. Prod. Hyg. 1, 123–129 (2012).
Foote, R. H. & Kaproth, M. T. Large batch freezing of bull semen: Effect of time of freezing and fructose on fertility. J. Dairy Sci. 85, 453–456 (2002).
Purdy, P. H. A review on goat sperm cryopreservation. Small Rumin. Res. 63, 215–225 (2006).
Fowler, A. & Toner, M. Cryo-injury and biopreservation. Ann. N. Y. Acad. Sci. 1066, 119–135 (2006).
Hardeland, R., Reiter, R. J., Poeggeler, B. & Tan, D. The significance of the metabolism of the neurohormone melatonin: Antioxidative protection and formation of bioactive substances. Neurosci. Biobehav. Rev. 17, 347–357 (1993).
Li, C. et al. Detection of nerve growth factor (NGF) and its specific receptor (TrkA) in ejaculated bovine sperm, and the effects of NGF on sperm function. Theriogenology 74, 1615–1622 (2010).
Saeednia, S., Shabani Nashtaei, M., Bahadoran, H., Aleyasin, A. & Amidi, F. Effect of nerve growth factor on sperm quality in asthenozoosprmic men during cryopreservation. Reprod. Biol. Endocrinol. 14, 1–8 (2016).
Storn, R. & Price, K. Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11, 341–359 (1997).
Awad, M. M. Effect of some permeating cryoprotectants on CASA motility results in cryopreserved bull spermatozoa. Anim. Reprod. Sci. 123, 157–162 (2011).
Gororo, E., Makuza, S. M., Chidzwondo, F. & Chatiza, F. P. Variation in sperm cryosurvival is not modified by replacing the cryoprotectant glycerol with ethylene glycol in bulls. Reprod. Domest. Anim. 55, 1210–1218 (2020).
Rota, A., Milani, C., Cabianca, G. & Martini, M. Comparison between glycerol and ethylene glycol for dog semen cryopreservation. Theriogenology 65, 1848–1858 (2006).
Mehta, V., Pareek, P., Kumar, A. & Purohit, G. N. Comparative effect of different concentrations of glycerol and ethylene glycol and temperature on cryopreservation of ram semen. Res. J. Vet. Pract. 8, 1 (2020).
Swelum, A. A., Mansour, H. A., Elsayed, A. A. & Amer, H. A. Comparing ethylene glycol with glycerol for cryopreservation of buffalo bull semen in egg-yolk containing extenders. Theriogenology 76, 833–842 (2011).
Kowsar, R., Ronasi, S., Sadeghi, N., Sadeghi, K. & Miyamoto, A. Epidermal growth factor alleviates the negative impact of urea on frozen-thawed bovine sperm, but the subsequent developmental competence is compromised. Sci. Rep. 11, 1–13 (2021).
Miller, W. J. & Vandemark, N. L. The influence of glycerol level, various temperature aspects and certain other factors on the survival of bull spermatozoa at sub-zero temperatures. J. Dairy Sci. 37, 45–51 (1954).
Bhat, M. H., Blondin, P., Vincent, P. & Benson, J. D. Low concentrations of 3-O-methylglucose improve post thaw recovery in cryopreserved bovine spermatozoa. Cryobiology 95, 15–19 (2020).
R Core Team. R: A Language and Environment for Statistical Computing. (2021).
Ogle, D. H., Wheeler, P. & Dinno, A. FSA: Fisheries Stock Analysis. (2020).
Wellek, S. & Ziegler, P. EQUIVNONINF: Testing for Equivalence and Noninferiority. (2021).
Wellek, S. Testing Statistical Hypotheses of Equivalence. (Chapman and Hall/CRC, 2010).
Tu, F. Experimental and Computational Approaches to Optimizing Bovine Gamete Cryopreservation. (University Of Saskatchewan, 2021).

Acknowledgements

Funding for this research was provided in part by the Natural Sciences and Engineering Research Council (CRD PJ531082) and Mitacs (to JB). We acknowledge that this research occurred on the traditional territory of the Anishinabewaki, Ho-de-no-sau-nee-ga (Haudenosaunee), Omàmìwininìwag (Algonquin), Wendake-Nionwentsïo, Wabanaki (Dawnland Confederacy), N’dakina (Abenaki / Abénaquis), and Treaty 6 territory, the traditional territory of the Niitsítpiis-stahkoii (Blackfoot / Niitsítapi), Cree, Michif Piyii (Métis), and Očhéthi Šakówiŋ. An earlier version of this work was published in the thesis of F. Tu⁵³. We thank the reviewers for the time and effort they spent reviewing this paper. We also thank J. Kusch for providing comments and suggestions during the preparation of this manuscript.

Author information

Authors and Affiliations

Department of Computer Science, Memorial University of Newfoundland, St John’s, NL, Canada
Frankie Tu
Ro, Clinical Strategy, New York, NY, USA
Maajid Bhat
Semex Alliance, Saint-Hyacinthe, Canada
Patrick Blondin, Patrick Vincent & Mohsen Sharafi
Department of Biology, University of Saskatchewan, Saskatoon, Canada
Frankie Tu, Maajid Bhat, Mohsen Sharafi & James D. Benson

Contributions

F.T. consulted on media design, designed the M.L. algorithm, analyzed experimental data, contributed to writing the paper, prepared figures, reviewed and edited the paper. M.B. consulted on initial media components, collected experimental data, interpreted results, contributed to writing the paper, reviewed and edited the paper. P.B. & P.V. consulted on initial media components, coordinated sample collection, interpreted results, reviewed and edited the paper. M.S. collected experimental data, interpreted results, analyzed experimental data, prepared figures, contributed to writing the paper, reviewed and edited the paper. J.B. conceived of the project and developed experimental design, obtained funding, consulted on initial media components, coordinated sample collection, interpreted results, contributed to writing the paper, reviewed and edited the paper. All authors reviewed and approved the final draft.

Corresponding author

Correspondence to James D. Benson.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Tu, F., Bhat, M., Blondin, P. et al. Machine learning and hypothesis driven optimization of bull semen cryopreservation media. Sci Rep 12, 22328 (2022). https://doi.org/10.1038/s41598-022-25104-6

Received: 23 August 2022
Accepted: 24 November 2022
Published: 25 December 2022
DOI: https://doi.org/10.1038/s41598-022-25104-6
