## Introduction

Sheep are relatively prolific small ruminants and an important source of animal protein contributing to human diets worldwide. Sheep gestation is relatively short (about 150 days) and litter sizes consisting of two or more offspring are common. As a result, sheep farm profitability is highly correlated to reproductive efficiency. Formally, reproductive efficiency for sheep farmers is expressed as the number of lambs born annually per ewe exposed to a ram at breeding. Breed type and prolificacy, nutrition, environment, age at first mating, conception rate, embryo and fetus viability, and flock age structure are some of the determining factors contributing to reproductive efficiency. However, outcomes of ewe fertility management can vary considerably among flocks. Identifying pregnant ewes and determining the number of fetuses they carry are key components of breeding management in sheep production1. Pregnancy testing during the critical early period of the mating season allows for re-breeding or the culling of non-pregnant ewes, resulting in increasing flock pregnancy rates2. If producers miss this opportunity, they can adjust their management practices by separating the open ewes from the pregnant mob to feed each group based on their physiological needs. Another benefit to early determination of pregnancy and litter size (PLS) is the acquisition of valuable data for selection and breeding purposes.

In addition to detecting pregnancy, predicting or determining litter size is instrumental to successful reproductive management. Maternal nutrition during gestation directly impacts ewe prolificacy3 as well as lamb survivability and performance. These lamb performance traits include growth4,5, reproductive capacity6 and hormonal development7. Thus, early detection of ewe PLS elevates income for producers by increasing the number of pregnant ewes and the number of healthy lambs born. Costs of production are reduced by preventing over-feeding of open ewes, and optimizing rations based on nutritional needs of the pregnant animals in an attempt to reduce the number of overweight singles, small-sized multiples and the incidence of pregnancy toxemia.

Ultrasonography is the gold standard and the most commonly performed method for PLS detection in sheep8. This method requires producers to either invest in an ultrasound machine and develop the appropriate skills for scanning or they must contract the services from a veterinarian. Ultrasound pregnancy detection is commonly practiced between 45 and 90 days into gestation9. However, detecting the number of fetuses is not straightforward and depends on the time of scanning as well as operator experience10. The breeding season is also a busy time for ultrasound professionals, limiting the number of farms they can serve. The cost of ultrasonography, currently CAD5–8/ewe in Alberta in Canada, also varies depending on flock size and geographical location of the farm. This makes ultrasonography more expensive for medium-to-small size flocks and those that are not conveniently accessible. In some jurisdictions, including the province of Alberta, delivering ultrasound services specifically for pregnancy diagnosis is restricted to veterinarian professionals, which limits its widespread use. Molecular biomarkers, such as proteins or metabolites found in blood, urine or milk, are a promising alternative for the indirect measurement or prediction of different traits in many livestock species11,12. Biomarkers are most suited for traits that have higher economic value. Likewise, biomarkers are particularly useful if the trait measurement needs to be performed within a short timeframe, or if the direct measurement of the trait involves lengthy trials, is labour-intensive, leads to loss of the animal or is expensive. While plasma progesterone (P4) levels can be used to detect sheep pregnancy as early as 18 days, P4 does not accurately detect open, non-pregnant ewes13,14. Likewise, there is no commercial kit that provides the service to farmers in any part of the world (including in Alberta). Recent literature indicates promising results when applying metabolomics to detect pregnancy in other livestock species11,15,16. However, there are no publications using high throughput metabolomics platforms to characterize non-hormonal metabolite biomarkers that can be used for sheep PLS detection in readily accessible biofluids at early stages of gestation. Therefore, a metabolomic study on early-stage sheep PLS detection is warranted. Livestock metabolomics is an emerging field that has led to the discovery of useful biomarkers in many livestock species12. However, only one study has used metabolomics to investigate non-hormonal metabolic changes during ewe pregnancy17. Most other metabolomic studies have measured hormones or individual metabolites associated with ewe pregnancy18,19,20,21,22. Previously, we have shown that metabolomics can be used to identify candidate blood biomarkers for detecting several economically important traits in sheep, such as residual feed intake and carcass merit23. Based on that success, we decided to investigate if blood biomarkers of sheep PLS could be identified and validated. Given the metabolic changes that occur due to pregnancy, we hypothesized that ewe pregnancy and the number of lambs delivered per pregnant ewe can be predicted at early stages of pregnancy using blood biomarkers. Therefore, the objectives of this study were to: (1) profile the blood metabolome associated with ewe PLS, and (2) identify and validate blood biomarkers of ewe PLS prior to 60 days of gestation. These findings could provide an alternative route for ewe pregnancy detection and enhance the reproductive management of sheep flocks. Indirect measurement of sheep PLS through blood biomarkers is also expected to increase the profitability of sheep production by reducing the proportion of open ewes during the breeding season. It will also improve the health and welfare of pregnant ewes through better nutritional management based on their pregnancy requirements. ## Results The results from our metabolomic studies on sheep PLS are divided into three sections. The first describes the changes detected in serum metabolite levels of ewes during different timepoints of pregnancy. The second (discovery phase) describes the identification of serum-based PLS biomarkers at different stages of pregnancy through pairwise comparisons of pregnant and non-pregnant ewes, as well as via pairwise comparisons of pregnant ewes with different litter sizes (based on pregnancy outcome). The third describes validation or replication of the PLS biomarkers identified at day 50 of gestation in the discovery phase on an independent (hold-out) larger cohort of ewes. ### Changes in the serum metabolome of ewes during pregnancy The first objective of this study was to comprehensively and quantitatively characterize the serum metabolome of ewes from seven days pre-breeding to 70 days post-breeding. The Livestock Metabolome Database (LMDB12) currently includes 375 compounds assigned to the sheep metabolome, 300 of which were previously reported and quantified in the serum/plasma metabolome of non-pregnant sheep. As there are no published reports regarding the serum metabolome of sheep during gestation we undertook a targeted, quantitative metabolomic analysis of sheep serum using two analytical platforms, NMR spectroscopy and LC–MS/MS. We were able to identify and quantify 107 metabolites with unique chemical structures in the serum of 131 pregnant/non-pregnant ewes over 5 different timepoints (the classification of these metabolites based on each platform is provided in Table 1). Details regarding the most significant longitudinal changes and most differentiating metabolites are described below. ### Identifying PLS biomarkers via pairwise metabolomic comparisons For the discovery phase of the study, the flocks were divided into six different groups based on their pregnancy and litter status (CNT = controls or open non-pregnant, PRG = pregnant, MLP = multiplet, SNG = singlets, TWN = twins, TRP = triplets). Each of the six groups were compared (pairwise) at each of the five different timepoints (7 days pre-breeding [− 7 day], day 0, 35, 50 and 70 post-breeding). In total 15 different pairwise comparisons were done over five timepoints (75 total comparisons). The outcomes from univariate and multivariate analyses of those comparison groups that yielded significant candidate biomarkers are presented in Tables 2, 3 and 4, respectively. The data show that as ewes progress through gestation, the serum metabolome of pregnant ewes compared to open ewes, as well as pregnant ewes with different litter sizes, significantly diverges. Moreover, within each group, the blood metabolome significantly (p-value < 0.05) differed between each timepoint as determined by two-way ANOVA. Over the five timepoints tested, day 50 and day 70 yielded the most promising results. In particular, the volcano plot and the partial least squares discriminant analysis (PLS-DA) plot identified statistically significant metabolites that differentiated each group within each comparison. T-test results were most significant for the last two timepoints (days 50 and 70) between the most divergent comparison groups (CNT vs PRG and CNT vs MLP). Based on these results we then focused on identifying serum candidate biomarkers at day 50 and day 70 of gestation. ### Longitudinal assessment of significant metabolites during pregnancy Longitudinal assessment of the t-test results (Table 2) revealed three significant metabolites (acetic acid, urea, and l-arginine) differentiating pregnant and open ewes at day 50 and day 70 of gestation. All the metabolites that were significantly different by day 50 (using a p-value threshold of < 0.05) for the CNT vs MLP groups were also significant in the CNT vs PRG comparison, except l-carnitine. Similarly, differentiating metabolites from day 70 (according to the t-test) of the CNT vs MLP groups were all similar to the CNT vs PRG group, except isoleucine. The similarities between these two comparisons were expected since the PRG group is composed of both MLP and SNG ewes. Longitudinal assessment of the volcano plots (Table 3) among all pairwise comparison groups revealed that acetic acid was significantly different between the CNT vs MLP groups from day 35 of gestation. However, acetic acid was only significantly different from day 50 for the CNT vs PRG groups. At day 70 post-breeding, choline was significantly different in all comparison groups except the TWN vs TRP groups. We also observed that comparison of CNT against PRG and MLP at later timepoints of gestation shared the largest number of metabolite similarities among other data sets and comparisons. Longitudinal assessment using PLS-DA and variable importance of projection (VIP; Table 4) showed that l-lysine and acetic acid were two of the 15 most differentiating metabolites throughout all timepoints of gestation (days 0, 35, 50 and 70) in the CNT vs MLP comparison. Three other metabolites (urea, 3-hydroxybutyric acid, and methanol) were also commonly observed in three of the four post-breeding timepoints (days 35, 50 and 70). Moreover, acetic acid and urea were the two highest scoring VIP metabolites on day 50 and day 70 in both the CNT vs PRG and CNT vs MLP comparisons. This further confirms the trend observed in univariate analyses and underlines how the CNT group, when compared against the PRG and MLP groups, typically shared more metabolic similarities in later pregnancy timepoints. Temporal trends were then investigated. For the CNT vs PRG comparison, one group of significantly altered metabolites at day 50 was identified (acetic acid, l-arginine, SM (OH) C24:1, lysoPC a C26:0, lysoPC a C26:1, tryptophan, C3 [propionylcarnitine], putrescine, trimethylamine N-oxide,), while another group was identified at day 70 (acetic acid, l-arginine, urea, glycine, dimethylamine, dimethyl sulfone, 3-hydroxybutyric acid, sarcosine, l-lysine). These metabolites were consistently identified by all statistical analyses. Temporal comparison of the CNT group against the MLP group at days 0 and 35 identified l-ornithine as a significantly altered metabolite. l-ornithine was found to be significant in all analyses for both timepoints. Acetic acid was another significantly altered metabolite at day 35. At day 50 of gestation, the metabolites that exhibited the greatest difference included acetic acid, l-arginine, tryptophan and carnosine. At day 70, nine other significantly altered metabolites were identified, including urea, l-arginine, choline, glycine, acetic acid, dimethylamine, formate, 3-hydroxybutyric acid, dimethyl sulfone and acetoacetate. In contrast, we did not identify any temporal pattern using univariate or multivariate statistical analyses of the SNG vs TRP groups or the TWN vs TRP groups. ### Candidate biomarkers of ewe pregnancy To identify candidate biomarkers of ewe pregnancy, we compared the CNT ewes against all other pregnant ewes regardless of their litter size (PRG). To seek further confirmation and examine the extremes in terms of litter size, we removed the SNG ewes from the PRG dataset and also compared the CNT and MLP ewes. The advantage of the latter comparison is that the outcome biomarkers could help inform producers not only if the animal is pregnant but also that the ewe is expected to deliver more than one lamb. A detailed summary of the results is presented in Table 5. An additional table (Supplementary Table 1) shows the average individual concentration values (at day 50) for each of the metabolites and conditions mentioned in Tables 2, 3, 4 and 5. A complete list of metabolites identified in sheep serum and their corresponding concentrations at all timepoints reported in this study is also listed in the open access LMDB (the Livestock Metabolome Database; www.lmdb.ca). We identified no statistically useful serum biomarkers until day 35 of gestation when comparing the CNT group with the PRG group. However, at day 50 of the CNT vs PRG comparison, we identified a panel of five metabolites (methanol, l-carnitine, d-glucose, l-arginine, and urea; with an area under the receiver operating characteristic curve (AU-ROC) = 0.76) that could serve as candidate biomarkers for detecting pregnant ewes. At day 70, we identified a panel of two metabolites for ewe pregnancy that had an AU-ROC of close to 1.0 with very high statistical significance (p-value < 0.001). Comparing the CNT and MLP groups, we identified no useful biomarkers at day-7, while the other four timepoints revealed potentially useful biomarkers. The AU-ROC value and statistical significance of the biomarkers improved substantially later in the gestation, i.e., at day 70. Among the different timepoints assessed, day 50 had the largest panel of biomarkers, and these biomarkers were identical to the candidate biomarkers found at day 50 of the CNT vs PRG comparison. Given the value of detecting PLS at the earliest timepoint in gestation, a logistic regression equation was developed for the candidate biomarkers found at day 50 using the CNT vs PRG comparison. This equation is given below: \begin{aligned} {\text{logit}}\left( {\text{P}} \right) \, & = {\text{ log}}\left( {{\text{P }}{/}\left( {{1} - {\text{P}}} \right)} \right) \, = { 1}.{599 } + { 1}.{217}\,{\textsc{l}}{\text{{-}arginine }} + { 2}.0{95}\,{\text{urea }} \\ & \quad + { 1}.{222}\,{\textsc{l}}{\text{{-}carnitine }} + \, 0.{137}\,{\text{methanol }}{-} \, 0.{5}0{5}\,{\textsc{d}}{\text{{-}glucose,}} \\ \end{aligned} (1) where P is the probability of pregnancy occurring with a cut-off of 0.81. Because the concentrations of the metabolites used in the CNT vs PRG comparison were sum normalized, log transformed and Pareto scaled, the metabolite values used in the equation must be adjusted. These adjustments are provided in Table 6. This same logistic regression equation was later used to predict the pregnancy status of ewes in the validation phase. ### Candidate biomarkers of ewe litter size Comparisons were made of CNT vs MLP groups (to identify pregnant ewes that deliver more than one lamb), SNG vs TRP groups (pregnant ewes that deliver a single or more than two lambs) and TWN vs TRP groups (pregnant ewes that deliver a twin or more than two lambs). A detailed summary of results is presented in Table 5. Candidate biomarkers were identified at all five timepoints for the SNG vs TRP comparison. This comparison revealed three to four candidate biomarkers at each timepoint with AU-ROC values varying from a low of 0.74 on day 0 to a high of 0.81 on day 70. All biomarkers were statistically significant except for the markers identified for day 35, which only had a statistical tendency. l-carnitine was the most frequently observed candidate biomarker, appearing at days − 7, 35 and 50. Since day 50 of gestation was the earliest timepoint to detect pregnancy, this timepoint was used to develop a logistic regression equation for the panel of candidate biomarkers (methionine and l-carnitine) of the SNG vs TRP comparison. This equation is given below: $${\text{logit}}\left( {\text{P}} \right) \, = {\text{ log}}\left( {{\text{P}}/\left( {{1} - {\text{P}}} \right)} \right) \, = \, 0.{211 }{-}{ 4}.{464}\,{\text{methionine }} + { 4}.{393}\,{\textsc{l}}{\text{{-}carnitine,}}$$ (2) where P is the probability of delivering more than two lambs with a cut-off of 0.70. Because the concentrations of the metabolites used in this study were median normalized, cube root transformed and Pareto scaled, the metabolite values must be adjusted. These adjustments are provided in Table 6. With regard to the TWN vs TRP group comparison, l-carnitine was also identified as the most frequently recurrent metabolite at all timepoints. For this comparison group, biomarkers at days − 7 and day 50 only had a statistical tendency, while other timepoints had statistically significant biomarkers. All AU-ROC values were below 0.80 and most panels consisted of a relatively larger number of metabolites. The candidate biomarkers (isobutyric acid, l-lactic acid, l-carnitine, valine, tyrosine, and methanol) identified for the TWN vs TRP comparison groups at day 50 of gestation were used to develop a logistic regression model as follows: \begin{aligned} {\text{logit}}\left( {\text{P}} \right) \, & = {\text{ log}}\left( {{\text{P }}{/} \left( {{1} - {\text{P}}} \right)} \right) \, = \, - 0.{124 } + \, 0.{4}0{6}\,{\text{isobutyric}}\,{\text{acid }}{-} \, 0.{388}\,{\textsc{l}}{\text{{-}lactic}}\,{\text{acid }} \\ & \quad {-} \, 0.{771}\,{\textsc{l}}{\text{{-}carnitine }} + \, 0.{593}\,{\text{valine }} + \, 0.{144}\,{\text{tyrosine }} + \, 0.{683}\,{\text{methanol,}} \\ \end{aligned} (3) where P is the probability of triplets over twins occurring with a cut-off of 0.57. Because the concentrations of the metabolites used in this study were sum normalized, cube root transformed and auto scaled, the metabolite values used in the equation must be adjusted. These adjustments are provided in Table 6. The above two equations were later used to predict litter size status of pregnant ewes in the validation phase. ### Validation phase Given that we determined the ideal time to assess PLS in ewes via serum metabolomics was at day 50 post-breeding, the sample collection for the validation phase was conducted only at day 50 of gestation. This section describes the validation of the same panel of day 50 candidate biomarkers, and the prediction of the validation dataset using the logistic regression equations developed in the discovery phase. In conducting this validation phase, we looked at approximately twice the number of samples analyzed in the discovery phase from commercial flocks located in different regions and under different management practices (in two of the top sheep producing provinces in Canada, Alberta and Ontario). ### Validated biomarkers of ewe pregnancy Statistical analyses of the validation dataset for the five candidate biomarkers of pregnancy (presented previously) improved the AU-ROC to ≥ 0.90 (Fig. 1) and the p-value to < 0.05 (Table 5). Methanol, l-carnitine, d-glucose, l-arginine, and urea were confirmed to be robust biomarkers to detect ewe pregnancy at day 50 of gestation. Supplementary Fig. 1 shows the boxplots for these five metabolites comparing their normalized/scaled values between pregnant and non-pregnant ewes. The same logistic regression model (Eq. 1) presented for the candidate biomarkers in the discovery phase was used to predict the pregnancy status of the validation dataset. This regression model was successful in making predictions with a sensitivity of 69% and a specificity of 85%. ### Validated biomarkers of ewe litter size The AU-ROC value for candidate biomarkers (methionine and l-carnitine) of SNG vs TRP improved from 0.78 in the discovery phase to 0.84 in the validation set (Fig. 2). This was accompanied by improved significance from a p-value < 0.05 to a p-value < 0.001 (Table 5). Therefore, methionine and l-carnitine appear to be robust biomarkers of ewe litter size. Supplementary Fig. 2 shows the box plots for these two metabolites comparing their normalized/scaled values between SNG and TRP pregnancies. The same logistic regression model (Eq. 2) developed in the discovery phase to distinguish SNG vs TRP was used in the validation dataset. The regression model was successful in predicting litter size (SNG vs. TRP) with a sensitivity of 56% and a specificity of 91%. The candidate biomarkers (isobutyric acid, l-lactic acid, l-carnitine, valine, tyrosine, and methanol) identified for the TWN vs TRP comparison also reached statistical significance with an improved AU-ROC of 0.81 (Fig. 3). These compounds were confirmed as robust biomarkers of ewe litter size. The same logistic regression model (Eq. 3) was used for the panel of candidate biomarkers of TWN vs TRP comparison groups developed in the discovery phase to predict the validation dataset. Supplementary Fig. 3 shows the box plots for these six metabolites comparing their normalized/scaled values between TWN and TRP pregnancies. This regression model was successful in predicting litter size (TWN vs. TRP) with a sensitivity of 66% and specificity of 85%. Biomarkers of pregnancy overlapped with those of the CNT versus MLP comparison groups indicating that if a ewe tests positive for the panel, not only is she pregnant but she is also expected to carry multiple fetuses. On the other hand, if the animal tests negative, she is not pregnant. To get a more precise measure of the litter size, further evaluation of the pregnant ewe’s blood using the other panels of litter size biomarkers will likely be required. Therefore, if a pregnant ewe tests positive for the triplet biomarker panel (methionine, l-carnitine), the ewe is expected to deliver more than two lambs while a negative test does not necessarily indicate that the ewe will deliver a single lamb. On the other hand, pregnant ewes that test negative for biomarkers of twin vs triplet biomarker panel (isobutyric acid, l-lactic acid, l-carnitine, valine, tyrosine, and methanol) are expected to deliver twins. ## Discussion Over the past decade, livestock metabolomics research has gained considerable momentum. Currently the number of papers being published on the subject is almost doubling every 2 years. However, sheep metabolomics is still lagging behind the research activities for other livestock species such as cattle and pigs. For this reason, we focused on further characterizing the sheep metabolome and identifying candidate biomarkers associated with production traits of high economic value such as residual feed intake, carcass merit23 and reproductive performance. In this study, we examined sheep serum using NMR and LC–MS/MS-based metabolomics to identify robust and useful metabolite biomarkers of PLS. The initial step involved profiling the sheep serum metabolome during the first half of pregnancy. In doing so, we identified and quantified a total of 107 serum metabolites. Although no new sheep serum metabolites were identified (after comparison to the data in the LMDB12), the proportion of quantified sheep serum metabolites in the LMDB were increased from 49 to 52%. Data from this experimental work also adds to the reference values obtained from healthy pregnant sheep in the LMDB. Moreover, the study provides quantitative information about the metabolic dynamics of the ewe serum metabolome from seven days prior to breeding to day 70 of gestation. These data are now publicly accessible in the LMDB (www.lmdb.ca). The central objective of this study was to identify serum metabolite biomarkers for sheep PLS using high-throughput, quantitative metabolomic platforms. As far as we are aware, this is the first study to identify non-hormonal metabolite biomarkers of both pregnancy and litter size, and to provide logistic regression models to predict pregnancy status in domestic sheep. It is important to note, however, that there are other compounds or biomarkers that have shown promise for assessing ewe PLS. These include genes, proteins and metabolites, some of which are described below. ### Previously identified PLS biomarkers Efforts to identify specific gene transcript levels and genetic markers for sheep PLS have been previously described. For example, changes in the expression levels of the interferon-tau-stimulated gene in the thymus24 and endometrium25 have been found to signal pregnancy at early gestation. There are also a number of studies on genes responsible for sheep litter size26. The Booroola gene, located on ovine chromosome 6, has a major impact on ovulation rate and is a major determining factor for litter size in sheep27. This gene has at least 23 different variants. Certain Booroola variants increase follicle sensitivity to the follicle-stimulating hormone, thereby inducing a faster follicle maturation28. Moradband et al.29 found that heterozygotes in the Iranian Baluchi sheep breed had increased the litter size. Ewes that are homozygous for the variant almost double their ovulation rate. However, their lambs have a low survival rate with a lower growth rate and weaning rate28. The Booroola gene is associated with the bone morphogenetic protein receptor 1B (BMPR-1B26). Increased blood concentrations of the BMPR-1B protein have been reported to benefit follicular development, yielding better ovulation and increased litter size30. A separate study that evaluated proteins in the follicular fluid (FF) of ewes found that the FF of larger follicles compared to smaller follicles had increased glucose and cholesterol concentrations, but lower concentration of triglycerides, lactate, alkaline phosphatase and lactate dehydrogenase31. These metabolites and proteins appear to be correlated with ovulation rate, suggesting their relevance to prolific ewes and the litter they carry. In another study, Koch et al.32 used MS-based proteomics to identify 15 signature proteins from the uterine luminal fluid of ewes as indicators of pregnancy and involved with embryonic growth, immune regulation and nutritional needs. As yet, none of these protein markers have been rigorously validated by ROC curve analysis and none are commercially used in sheep PLS testing. Another example of a protein biomarker in pregnant ewes is the pregnancy-associated glycoprotein (PAG). The PAG is a placental-secreted factor that is detected in maternal serum upon implantation of the fetus onto the endometrium. This protein can be measured as early as 30 days in gestation33, with increasing concentrations as the pregnancy progresses34. Pregnancy specific protein B (PSPB is a form of PAG that is released by the fetus to maintain the corpus luteum35. Also, PSPB along with other PAGs increases with increasing number of fetuses carried by the ewe (Pickworth et al. 36). However, PSPB is breed-specific (Redden and Passavant37) which limits its application for all sheep breeds. Generally, PAGs are also positively correlated with maternal serum P4 levels34. In a study by Karen et al.13, blood PAG had 93.5% sensitivity for detecting pregnancy at day 22 of gestation, however, their results were skewed by the abnormally low (17%) pregnancy rate of the flock. In addition to genetic and protein biomarkers of sheep PLS, a number of metabolite biomarkers have also been explored. Progesterone is a promising example of a hormonal metabolite biomarker that could be used for assessing sheep PLS. Progesterone is predominantly produced by the CL at the beginning of gestation and later (day 50 onwards) is produced by the placenta to maintain the pregnancy34,38. The concentration of P4 in ewe blood increases over the course of gestation and has been used as an indicator of pregnancy, as well as placental and fetal wellbeing34. However, identifying ewe PLS through measurements of P4 concentrations at around days 50–80 of gestation has a sensitivity varying between 65 and 85% and a specificity between 65 and 93%21,39. While potentially promising, blood P4 concentrations are not considered sufficiently accurate indicators of non-pregnant ewes13 and are not useful for differentiating ewes based on litter size21. Furthermore, LC–MS-based metabolomic analysis of a panel of eight steroid hormones, including P4, in our own sheep serum samples (n = 94), showed no improvement in biomarker performance when using P4 independently or in combination with non-hormonal metabolites to detect sheep PLS status (unpublished data). Another steroid hormone, estradiol, has also been used for detecting litter size after 50 days into gestation40. Despite P4 and estradiol being significant reproductive hormones and associated with ewe PLS, to date there is insufficient evidence and validation based on ROC analysis or regression modeling to make these hormones useful for assessing sheep PLS status41. Other (non-hormonal) metabolites have also been identified as potential pregnancy markers in other livestock species. A recent study of pregnant buffaloes identified five milk metabolites detected by LC–MS on day 18 after artificial insemination as candidate biomarkers of pregnancy15. Likewise, in beef cattle, four plasma metabolites were detected by NMR at day 40 of gestation16. These reports suggest that measurement of non-hormonal metabolites may serve as an indirect means of pregnancy and/or litter size detection in ruminants. To date, few studies have reported non-hormonal metabolites associated with sheep PLS. Sun et al.17 used NMR to investigate pregnant ewe metabolism in relation to in utero fetal growth at four timepoints from day 50 of gestation onwards. They reported 13 serum metabolites that are associated with protein and lipid metabolism of twin-bearing pregnant ewes. In another study using MS-based analysis of FF and ovarian vein serum in the Han sheep breed42, a total of eight metabolites (glucose 6-phosphate, glucose 1-phosphate, aspartate, asparagine, glutathione oxidized, cysteine-glutathione disulfide, γ-glutamylglutamine, and 2-hydroxyisobutyrate) were significantly associated with ewe litter size. Another recent metabolomic study using LC–MS/MS revealed that sphingolipid and amino acid metabolism is important for maintaining the uterine environment to increase embryo survival rate43. In addition to these studies, there are a few other reports that measured individual metabolites in pregnant sheep18,19,20,22. None of these studies identified or rigorously assessed the reported metabolites as robust PLS biomarkers. Overall, existing data suggests that individual genes, proteins and metabolites may be useful for assessing sheep PLS. However, as of yet, there have been no metabolomic studies that have attempted to rigorously identify and validate a panel of readily accessible non-hormonal metabolite blood biomarkers for assessing sheep PLS. A common feature of the serum biomarkers presented in this study is that all are detectable by NMR spectroscopy. While the identification and validation of a set of useful sheep PLS biomarker panels was our primary interest in this study (see Table 5), we also believe it is important to provide some biological context and to suggest how some of these metabolites may play a role in sheep pregnancy. Indeed, the biological role of some of these metabolites appears to tie in with the reproductive physiology of sheep. However, some metabolites have not previously been identified as having a role in pregnancy, litter size or gestation and so it is difficult to understand their biological context. The following section further discusses the known biological relevance of each metabolite biomarker identified in this study. It also elaborates on the potential impact that these biomarkers may have for the sheep industry. ### Potential biological roles of the PLS biomarkers identified in this study L-arginine is an essential amino acid that is known to be important for successful pregnancy. At day 50 of gestation, l-arginine was significantly (p-value < 0.05; Table 2) elevated in pregnant ewes (214 ± 85 µM) relative to non-pregnant controls (174 ± 78 µM). Arginine appears to play a role in a number of physiological pathways related to pregnancy. Luther et al.44 provided pregnant ewes with l-arginine supplementation and observed enhanced ovarian function along with elevated numbers of viable fetuses. The same study identified a direct positive correlation between l-arginine and P4, leading to improved pregnancy maintenance and early embryonic growth. Our results appear consistent with these reports and show that pregnant ewes as well as ewes that delivered more lambs had a higher serum concentration of l-arginine. Furthermore, maternal administration of this amino acid in the later portion of gestation has been shown to increase lamb birth weight, enhance blood flow and increase nutrient transport to the fetus through synthesis of nitric oxide45,46. l-arginine also improves pancreatic and brown adipose tissue growth during fetal development47, and increases post-partum brown fat storage and the survivability of female lambs48. Serum l-arginine is associated with improved post-partum weaning weight and the weaning rate of lambs49. Administering this amino acid to prolific ewes further improves the lambing rate by nearly 60%, increases lamb birth weight by over 20% without negatively impacting maternal body weight, and decreases lamb mortality rate at birth by more than 20%50. Another metabolite identified as a strong biomarker of litter size was urea. At day 50 of gestation, the average urea concentration was significantly (p-value < 0.001) lower in pregnant ewes (1823 ± 667 µM) compared to open ewes (2518 ± 871 µM). Urea is a source of nitrogen for rumen microbes and is produced through the degradation of amino acids. Elevated blood concentration of urea in ewes seems to reduce conception and pregnancy rate51. Likewise, high concentrations of circulating urea have adverse impacts on embryonic development52. Our results are in agreement with these findings as pregnant ewes as well as ewes with a greater litter size have a lower concentration of blood urea compared to non-pregnant ewes. One of the more interesting biomarkers we identified for litter size was methionine. We found that the average methionine serum concentration was significantly lower (28 ± 9 µM, p-value < 0.001) in pregnant ewes that delivered more than two lambs compared to ewes that delivered just one lamb (33 ± 9 µM). Methionine is an essential amino acid that plays an important role in general animal performance53, as well as the growth and development of lambs in early life54. Methionine is also a methyl group supplier for epigenetic alteration of DNA, especially in late gestation55. Indeed, Sinclair and associates56 reported widespread epigenetic alterations in progeny, mostly male lambs, resulting from restricted supply of dietary methionine to the pregnant dam. Alterations to the genome induced by metabolites such as methionine are responsible for modification of health-related phenotypes, cell growth, host immunity, and protein production56,57,58,59. l-lactic acid is another biomarker of litter size that is traditionally associated with muscle metabolism. However, during pregnancy its concentration increases with the progression of gestation60. Average l-lactic acid concentration was significantly higher (3293 ± 1948 µM, p-value = 0.01) in pregnant ewes that delivered more than two lambs compared to ewes that delivered only two lambs (2432 ± 989 µM). Lactate can be used as an alternative source of energy by the fetal brain61. Therefore, a ewe with a higher number of fetuses is expected to have a higher concentration of serum l-lactic acid. Valine is another biomarker we found to be associated with ewe litter size, and it decreased with increasing number of lambs. The average valine serum concentration was significantly higher (219 ± 74 µM, p-value = 0.007) in TWN versus TRP (191 ± 64 µM) pregnant ewes. This metabolite is a branched-chain amino acid that stimulates protein synthesis in fetal muscle62,63. Therefore, ewes that deliver three or more lambs and have an overall higher fetal protein synthesis compared to those that deliver twins are expected to have a higher utilization of this amino acid and lower concentration in the serum. Branched-chain amino acids are also integral to the immune system by supporting the growth of lymphocytes and natural killer cells to remove viral infections64. Pregnant ewes are more prone to immune challenges and an increased number of fetuses increases immune vulnerability of the ewe65,66. Therefore, ewes that have the largest litter size, i.e., triplets vs twins, are expected to draw more valine from the maternal serum, which aligns with our results. ### Comparison to ultrasonography The current gold standard for sheep PLS assessment is ultrasonography. Ultrasound is mostly used to determine pregnancy status (open vs pregnant). However, certain experienced ultrasound operators can detect the number of fetuses in pregnant ewes as early as approximately 40–45 days of pregnancy and onwards (based on industry data in Canada). In fact, our field observations indicate that most Canadian ultrasound technicians identify litter size as one fetus or more than one. Ultrasound scanning is relatively rapid (2–5 min/ewe) and costs CAD5–8/ewe (depending on the location of the farm, travel required for the operator to reach the farm, and the number of ewes being scanned). All sheep used in this study were characterized via ultrasound analysis by trained technicians at day 50 of pregnancy.

Using records from 166 ewes with complete data from ultrasound scanning and corresponding pregnancy outcome, we determined that the sensitivity of ultrasound was 0.55, the specificity was 0.70 and the AU-ROC of using ultrasonography for pregnancy detection was 0.65. With regard to ultrasonography results for litter size, we found that for distinguishing SNG vs TRP, the sensitivity was 0.51 while the specificity was 0.18. With regard to distinguishing TWN vs TRP, the sensitivity of ultrasonography was 0.43 while the specificity was 0.18. It is noteworthy that the consistency of ultrasound prediction varied between farms mainly due to the expertise and experience of the technician who tended to underestimate singles and triplets while overestimating twins. Comparing our metabolomics results to these ultrasound measurements (Table 7) serum metabolite markers performed better than ultrasonography by 24% in terms of AU-ROC, 20% in terms of sensitivity, and 18% in terms of specificity for detecting ewe pregnancy. Likewise, if we compare our predictive biomarker panels for detecting litter size against ultrasonography, metabolite panels performed 9–35% better in terms of sensitivity and nearly 80% better in terms of specificity for predicting litter size. These results indicate serum metabolite measurements are significantly more accurate than ultrasound in detecting and assessing sheep PLS in this study.

In order for any alternative tool to compete with ultrasound for sheep PLS assessment, it would have to be either cheaper, more accurate, more convenient or able to detect PLS at earlier gestational timepoints. The metabolite panels identified in this study are more accurate, however, could they compete with the cost of ultrasound? Ultrasound tests cost between CAD$5–8 per ewe, for those producers who can access ultrasound technicians. Currently metabolite tests consisting of three or four metabolites conducted on MS instruments can be done for as little as CAD$5 per sample (excluding shipping costs). These costs can be reduced further if testing were to be optimized or more widespread. If the metabolite tests could be converted to a handheld device (such as a lateral flow assay or a simple colorimetric test) for pen side testing, then both the lower cost (perhaps as little as \$3 a test) and improved convenience would make these sorts of blood tests very appealing to producers. These biomarkers have a better performance when it comes to predicting larger litter sizes in pregnant ewes. Even if we assume that these biomarkers perform comparably to ultrasound, the cost of the blood test would not vary (as it does for ultrasound scanning) based on flock size and geographical location of the farm. This would permit farms with smaller flocks and farms located in remote areas to benefit from blood-based PLS detection. If serum markers could be found effective much earlier in gestation (say at day 25 or 35) with a sensitivity or specificity that is comparable to ultrasound, then the potential of a blood test for sheep PLS would be even greater.

### Future prospects

We have shown that targeted, quantitative metabolomics technologies can be used to discover and validate serum metabolite biomarkers of sheep pregnancy and litter size. Using a large cohort of samples collected from multiple commercial flocks across Canada, we successfully identified four panels of biomarkers that can determine ewe PLS with good accuracy and precision. The performance of these markers appears to exceed that seen with ultrasound measurements within the context of this experiment. Therefore, we believe that if these biomarkers could be further optimized (for high throughput off-site assays) or translated to hand-held or pen-side tests (similar to the urine-based pregnancy detection kit for women), they could be used to routinely assess PLS in Canadian sheep flocks. We are working on developing a pen-side kit, using the panel of five biomarkers identified and validated in this study, to detect ewe pregnancy 50 days into gestation. If producers require the exact number of the litter size, a second test incorporating the two panels of biomarkers reported here could also be developed. In conclusion, translating these results for on-farm, pen-side use could significantly improve reproduction management and profitability of sheep breeding enterprises.

## Methods

All animal procedures were approved by the University of Alberta Animal Care Committee (AUP00002510) and all methods were carried out in accordance with relevant guidelines and regulations. Moreover, all methods associated with animal experiments are in accordance with the ARRIVE guidelines (https://arriveguidelines.org).

### Experimental design

The experiments were designed in two phases: (1) a discovery phase to identify candidate serum biomarkers of ewe pregnancy and litter size at the earliest timepoint in gestation, and (2) a validation phase to validate the candidate biomarkers using a sample size approximately two times larger than that used in the discovery phase.

### Discovery phase sampling

In the discovery phase, ewes were selected from two farms (Olds College and a private farm) in Alberta, Canada, consisting of Suffolk × Dorset crosses (n = 91) and Rideau Arcott (n = 152) ewes, respectively. Blood was drawn from all animals over five timepoints throughout this phase, including seven days prior to exposing the ewes to rams (day − 7), day 0 (day of ram turnout for breeding), days 35, 50 and 70 of gestation (Fig. 4A). These animals were synchronized for estrus and the number of lambs delivered was recorded.

Based on the pregnancy outcome of all the animals included in this phase, two broad groups (Fig. 4B) were formed for statistical analyses: controls (CNT; n = 32) composed of non-pregnant, open ewes, and pregnant ewes (PRG) that delivered one or more lambs (n = 99). The CNT animals were comprised of ewes that were bred and did not deliver any lambs (n = 9) as well as the negative controls (n = 23) which were not exposed to rams. We divided the PRG animals to form three subgroups including ewes that delivered a single lamb (SNG; n = 30), ewes that delivered a twin (TWN; n = 36) and those that delivered a triplet or more (TRP; n = 33). The remaining ewes (n = 112) were not included in the analyses due to poor sample collection, missing data, and/or the producer’s decision to cull the animal.

### Animal feed

During the discovery phase, the Olds College ewes were group-housed outdoors and fed a ration of grass mix alfalfa hay with whole barley grain and a mineral supplement. Ewes at the private farm were group housed indoors in a climate-controlled barn and fed corn silage with supplemental mineral and vitamin. Initially, it was assumed all animals were pregnant with twins, and the feed rations were formulated using the SheepBytes program (https://www.sheepbytes.ca/) in compliance with National Research Council recommendations67. Each ewe received nutrients based on live weight of 70–75 kg (equivalent to 1.51 Mcal net energy maintenance) in early gestation.

### Estrus synchronization and breeding management

All ewes were synchronized with progesterone-bearing controlled internal drug release (CIDRs; Zoetis Canada Inc.) 14 days prior to ram turn out for breeding. To install the CIDRs, ewes were first lined in the chute and then the CIDR was inserted into the applicator by folding its wings and the tip of the applicator was gently lubricated to facilitate insertion of the device into the ewe. If the vulva appeared to be dirty, it was cleaned prior to implanting the CIDR. The applicator was then gently inserted into the vagina to release the CIDR. The applicators were disinfected between each use by dipping in a warm water and iodine solution.

Upon CIDR removal, ewes received pregnant mare serum gonadotropin (NOVORMON™, Syntex S.A., Buenos Aires, Argentina) by intramuscular injection in the rump (1 ml/ewe for the prolific Rideau Arcott breed and 2 ml/ewe for the Suffolk x Dorset crosses).

All ewes, except for the CNT group, were then grouped with the breeding rams at a ratio of no more than 10 ewes per ram. Ram turnout at the Alberta private farm location occurred on November 4th, 2017, with ewes lambing between March 29th and April 5th, 2018. Ram turnout at the Olds College location occurred on October 4th and 11th, 2017 (groups A and B, respectively), with ewes lambing between February 26th and March 28th, 2018. Lambing at each location was observed and recorded by farm staff.

### Laparoscopic reproductive examination

A subset of the negative controls was examined at day 50 of gestation using laparoscopy to visually observe and approve ovarian health. Animals were restrained using a cradle and anesthetized by intravenous injection of a combined sedative of 0.6 mg/mL xylazine (Vetoquinol Canada Inc., ON, Canada) and 2 mg/mL Ketamine (Vetoquinol Canada Inc., ON, Canada). Once on the cradle, the anesthetized ewe was lifted from its rear, bringing the back two legs up while the head and front two legs are down. Approximately six inches from each teat was clipped and cleaned with a 4% chlorhexidine scrub (Ceva Animal Health Inc., ON, Canada) and 99% isopropyl alcohol. The clipped areas provided a point of entry for the scope on one side and a cannula on the other. A moderate amount of CO2 was introduced into the abdominal cavity through a trocar going into one of the clipped points. The laparoscope was introduced into the cannula to see the ovaries. The ovaries of all open ewes were observed and approved by a veterinarian as reproductively sound and not showing any apparent abnormalities. The cannulas were then removed and the skin was stapled to close the two holes. The animals were gently rolled off the cradle and within five minutes they were relieved from the anesthesia. All utensils were maintained and cleaned in a dilute iodine solution (West Penetone Inc., QC, Canada) between each animal examination.

### Ultrasound diagnosis

All bred ewes were trans-abdominally scanned (Sonosite M-Turbo ultrasound machine, FUJIFILM Sonosite Inc., ON, Canada) for pregnancy and litter detection while standing in a chute at day 50 of gestation by an experienced technician for each province. Certified technicians reported pregnancy as open (no detectable fetus present), single (detection of only one fetus), twins (detection of two fetuses), and triplets or more (detection of more than two fetuses). All ultrasound assessments were reconciled with the actual lambing records from each flock.

### Validation phase sampling and feeding

During the validation phase, ewes were selected from two farms in Alberta (Suffolk and Canadian Arcott crosses at Lakeland College [n = 65], and Suffolk crosses at a private farm [n = 12]) and two farms in Ontario (Rideau Arcotts and Suffolk crossed with Rideau Arcott at private farm one [n = 55], and Dorset and Rideau Arcott crosses at private farm two [n = 111]). Each farm used “typical” Canadian feed rations. In particular, sheep were fed either (1) grass-legume hay mixtures with grain (barley or corn) or (2) corn silage or haylage. Specifically, one farm in Ontario and one farm in Alberta fed silage/haylage, whereas one farm in Alberta and one farm in Ontario fed the hay and grain mix. Based on the discovery phase results, blood was only drawn from all animals at a single timepoint (day 50 of gestation). All ewes were naturally mated to the rams at a ratio of 10:1, none of which were synchronized for estrus. All ewes had their lambing outcome recorded and categorized similar to the discovery phase (i.e., CNT, PRG, SNG, TWN and TRP).

### Blood collection and processing

Blood samples from all ewes of both phases (discovery and validation) were drawn from the jugular vein. Samples were collected using 21-gauge needles (PrecisionGlide®, USA) and vacutainers coated with no anticoagulant (BD Vacutainer, USA) for a maximum volume of 10 mL. Blood samples were kept on ice upon collection for a maximum of 30 min. Samples were then centrifuged (Beckman Coulter, USA) for 30 min at 17,700 rpm at 4 °C. The supernatant serum was then transferred to Eppendorf tubes (Axygen, USA) and snap frozen using liquid nitrogen. Frozen serum samples were labelled and stored at − 80 °C until used for metabolomic analyses.

### Metabolomics experiments

All ewe serum samples were analyzed using nuclear magnetic resonance (NMR) spectroscopy and liquid chromatography tandem mass spectrometry (LC–MS/MS). A thorough description of sample preparation and analysis methods for each platform is provided in Goldansaz et al.23. In brief, for the NMR analysis, all serum samples were filtered using a 3 kDa ultrafiltration device to remove the macromolecules (i.e., proteins and lipoproteins). A total sample volume of 250 µL (including the serum and buffer solution) was introduced to a 700 MHz Avance III (Bruker, USA) spectrometer equipped with a 5 mm HCN Z-gradient pulsed-field gradient cryoprobe. The 1D 1H-NMR spectra were then collected, processed and analyzed using methods previously described and a modified version of the Bayesil automated NMR analysis software package68. For the LC–MS/MS metabolomic analysis, serum samples were analyzed using an in-house quantitative metabolomics kit (called TMIC Prime) run on an Agilent 1260 series UHPLC system (Agilent Technologies, Palo Alto, CA) coupled with an AB SCIEX QTRAP® 4000 mass spectrometer (Sciex Canada, Concord, Canada). A detailed description of the methods, kit design, workflow and data analysis is given in Goldansaz et al.23.

### Statistical analyses

To conduct a standard categorical analysis and identify the relevant serum PLS biomarkers, we categorized the animals into six different groups based on their pregnancy outcome (i.e., CNT, PRG, SNG, TWN, TRP, MLP). Metabolomic datasets from the two platforms were pre-processed and normalized using standard methods available via MetaboAnalyst 4.069. Metabolites that had > 20% missing values were removed from the dataset prior to statistical analyses. Univariate and multivariate statistical analyses, including fold change, student’s t-test, volcano plot analysis, and partial least squares discriminant analysis (PLS-DA) were conducted using MetaboAnalyst. The PLS-DA plot helped visualize the separation of each animal group based on their corresponding serum metabolome, and its significance was verified using permutation testing (n = 1000). The PLS-DA analyses that were significant were also evaluated for the top 15 VIP features, revealing those metabolites that had the most significant contribution to separating the comparison groups. Biomarkers were identified and evaluated using the biomarker module in MetaboAnalyst 4.069. This module automatically selects subsets of statistically significant metabolites (initially identified via PLS-DA analysis and validated by permutation analysis using n = 1000 and p < 0.05). This module then performs a series of logistic regression calculations on the normalized and scaled metabolite concentration values and calculates the ROC curves as well as the AU-ROC values to identify the optimal set of biomarkers. Individual or multiple metabolite profiles with an AU-ROC ≥ 0.70 were considered as candidate biomarkers for each trait. The threshold for statistical significance reported in this manuscript is a p-value < 0.05 and a Benjamini–Hochberg false discovery rate (or Q-value) < 0.05, unless otherwise mentioned. Also, a 0.05 < p-value < 0.10 is referred to as a tendency while, differences with a p-value > 0.10 are referred to as not significant.