Predictive blood biomarkers of sheep pregnancy and litter size

Early detection of sheep pregnancy and prediction of how many lambs a pregnant ewe will deliver affect sheep farmers in a number of ways, most notably with regard to feed management, lambing rate, and sheep/lamb health. The standard practice for direct detection of sheep pregnancy and litter size (PLS) is ultrasonography. However, this approach has a number of limitations. Indirect measurement of PLS using blood biomarkers could offer a simpler, faster and earlier route to PLS detection. Therefore, we undertook a large-scale metabolomics study to identify and validate predictive serum biomarkers of sheep PLS. We conducted a longitudinal experiment that analyzed 131 serum samples over five timepoints (from seven days pre-conception to 70 days post-conception) from six commercial flocks in Alberta and Ontario, Canada. Using LC–MS/MS and NMR, we identified and quantified 107 metabolites in each sample. We also identified three panels of serum metabolite biomarkers that can predict ewe PLS as early as 50 days after breeding. These biomarkers were then validated in separate flocks consisting of 243 animals, yielding areas-under-the-receiver-operating-characteristic-curve (AU-ROC) of 0.81–0.93. The identified biomarkers could lead to the development of a simple, low-cost blood test to measure PLS at an early stage of pregnancy, which could help optimize reproductive management on sheep farms.


Results
The results from our metabolomic studies on sheep PLS are divided into three sections. The first describes the changes detected in serum metabolite levels of ewes during different timepoints of pregnancy. The second (discovery phase) describes the identification of serum-based PLS biomarkers at different stages of pregnancy through pairwise comparisons of pregnant and non-pregnant ewes, as well as via pairwise comparisons of pregnant ewes with different litter sizes (based on pregnancy outcome). The third describes validation or replication of the PLS biomarkers identified at day 50 of gestation in the discovery phase on an independent (hold-out) larger cohort of ewes.
Changes in the serum metabolome of ewes during pregnancy. The first objective of this study was to comprehensively and quantitatively characterize the serum metabolome of ewes from seven days pre-breeding to 70 days post-breeding. The Livestock Metabolome Database (LMDB 12) currently includes 375 compounds assigned to the sheep metabolome, 300 of which were previously reported and quantified in the serum/plasma metabolome of non-pregnant sheep. As there are no published reports regarding the serum metabolome of sheep during gestation, we undertook a targeted, quantitative metabolomic analysis of sheep serum using two analytical platforms, NMR spectroscopy and LC-MS/MS. We were able to identify and quantify 107 metabolites with unique chemical structures in the serum of 131 pregnant/non-pregnant ewes over five different timepoints (the classification of these metabolites based on each platform is provided in Table 1). Details regarding the most significant longitudinal changes and most differentiating metabolites are described below.

Identifying PLS biomarkers via pairwise metabolomic comparisons.
For the discovery phase of the study, the flocks were divided into six different groups based on their pregnancy and litter status (CNT = controls or open non-pregnant, PRG = pregnant, MLP = multiplet, SNG = singlets, TWN = twins, TRP = triplets). The six groups were compared pairwise at each of the five different timepoints (7 days pre-breeding [day −7], and days 0, 35, 50 and 70 post-breeding). In total, 15 different pairwise comparisons were made at each of the five timepoints (75 total comparisons). The outcomes from univariate and multivariate analyses of those comparison groups that yielded significant candidate biomarkers are presented in Tables 2, 3 and 4, respectively.
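The comparison count stated above can be checked with a short enumeration. The following is a minimal Python sketch with illustrative variable names that are not taken from the study. It lists every unordered pair of the six groups and crosses it with the five timepoints. It reproduces only the arithmetic (15 pairs at each of 5 timepoints, 75 comparisons), not the statistical tests themselves.

```python
from itertools import combinations

# Group labels and sampling days as named in the text.
GROUPS = ["CNT", "PRG", "MLP", "SNG", "TWN", "TRP"]
TIMEPOINTS = [-7, 0, 35, 50, 70]  # days relative to breeding


def enumerate_comparisons(groups, timepoints):
    """Return every (day, group_a, group_b) pairwise comparison."""
    pairs = list(combinations(groups, 2))  # unordered pairs, no self-pairs
    return [(day, a, b) for day in timepoints for a, b in pairs]


comparisons = enumerate_comparisons(GROUPS, TIMEPOINTS)
pairs_per_timepoint = len(list(combinations(GROUPS, 2)))
print(pairs_per_timepoint, len(comparisons))
```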
The data show that as ewes progress through gestation, the serum metabolome of pregnant ewes compared to open ewes, as well as pregnant ewes with different litter sizes, significantly diverges. Moreover, within each group, the blood metabolome significantly (p-value < 0.05) differed between each timepoint as determined by two-way ANOVA. Over the five timepoints tested, day 50 and day 70 yielded the most promising results. In particular, the volcano plot and the partial least squares discriminant analysis (PLS-DA) plot identified statistically significant metabolites that differentiated each group within each comparison. T-test results were most significant for the last two timepoints (days 50 and 70; Table 2). Further univariate analysis (Table 3) among all pairwise comparison groups revealed that acetic acid was significantly different between the CNT vs MLP groups from day 35 of gestation. However, acetic acid was only significantly different from day 50 for the CNT vs PRG groups. At day 70 post-breeding, choline was significantly different in all comparison groups except the TWN vs TRP groups. We also observed that comparison of CNT against PRG and MLP at later timepoints of gestation shared the largest number of metabolite similarities among other data sets and comparisons. The PLS-DA results (Table 4) showed that l-lysine and acetic acid were two of the 15 most differentiating metabolites throughout all timepoints of gestation (days 0, 35, 50 and 70) in the CNT vs MLP comparison. Three other metabolites (urea, 3-hydroxybutyric acid, and methanol) were also commonly observed in three of the four post-breeding timepoints (days 35, 50 and 70). Moreover, acetic acid and urea were the two highest scoring VIP metabolites on day 50 and day 70 in both the CNT vs PRG and CNT vs MLP comparisons. This further confirms the trend observed in univariate analyses and underlines how the CNT group, when compared against the PRG and MLP groups, typically shared more metabolic similarities in later pregnancy timepoints.
Table 2. Student's t-test of four comparison groups from the discovery dataset. Statistical analysis using t-test revealed significant (p-value < 0.05) serum metabolites of each comparison at five timepoints during the discovery phase. NS, not significant; CNT, control open ewes; PRG, pregnant ewes; SNG, pregnant ewes that delivered one lamb; TWN, pregnant ewes that delivered two lambs; TRP, pregnant ewes that delivered more than two lambs. Day −7 refers to 7 days prior to initiation of gestation and day 0 is the start of pregnancy.
Temporal comparison of the CNT group against the MLP group at days 0 and 35 identified l-ornithine as a significantly altered metabolite. l-ornithine was found to be significant in all analyses for both timepoints. Acetic acid was another significantly altered metabolite at day 35. At day 50 of gestation, the metabolites that exhibited the greatest difference included acetic acid, l-arginine, tryptophan and carnosine. At day 70, nine other significantly altered metabolites were identified, including urea, l-arginine, choline, glycine, acetic acid, dimethylamine, formate, 3-hydroxybutyric acid, dimethyl sulfone and acetoacetate. In contrast, we did not identify any temporal pattern using univariate or multivariate statistical analyses of the SNG vs TRP groups or the TWN vs TRP groups.
Candidate biomarkers of ewe pregnancy. To identify candidate biomarkers of ewe pregnancy, we compared the CNT ewes against all other pregnant ewes regardless of their litter size (PRG). To seek further confirmation and examine the extremes in terms of litter size, we removed the SNG ewes from the PRG dataset and also compared the CNT and MLP ewes. The advantage of the latter comparison is that the outcome biomarkers could help inform producers not only if the animal is pregnant but also that the ewe is expected to deliver more than one lamb. A detailed summary of the results is presented in Table 5. An additional table (Supplementary Table 1) shows the average individual concentration values (at day 50) for each of the metabolites and conditions mentioned in Tables 2, 3, 4 and 5. A complete list of metabolites identified in sheep serum and their corresponding concentrations at all timepoints reported in this study is also listed in the open access LMDB (the Livestock Metabolome Database; www.lmdb.ca). We identified no statistically useful serum biomarkers until day 35 of gestation when comparing the CNT group with the PRG group. However, at day 50 of the CNT vs PRG comparison, we identified a panel of five metabolites (methanol, l-carnitine, d-glucose, l-arginine, and urea; with an area under the receiver operating characteristic curve (AU-ROC) = 0.76) that could serve as candidate biomarkers for detecting pregnant ewes. At day 70, we identified a panel of two metabolites for ewe pregnancy that had an AU-ROC of close to 1.0 with very high statistical significance (p-value < 0.001). Comparing the CNT and MLP groups, we identified no useful biomarkers at day −7, while the other four timepoints revealed potentially useful biomarkers. The AU-ROC value and statistical significance of the biomarkers improved substantially later in the gestation, i.e., at day 70.
Among the different timepoints assessed, day 50 had the largest panel of biomarkers, and these biomarkers were identical to the candidate biomarkers found at day 50 of the CNT vs PRG comparison. Given the value of detecting PLS at the earliest timepoint in gestation, a logistic regression equation (Eq. 1) was developed for this day 50 panel:
[Eq. 1]
where P is the probability of pregnancy occurring with a cut-off of 0.81. Because the concentrations of the metabolites used in the CNT vs PRG comparison were sum normalized, log transformed and Pareto scaled, the metabolite values used in the equation must be adjusted. These adjustments are provided in Table 6. This same logistic regression equation was later used to predict the pregnancy status of ewes in the validation phase.
Table 4. Partial least squares discriminant analysis (PLS-DA) of four comparison groups from the discovery dataset. Multivariate statistical analysis of the discovery dataset using PLS-DA revealed the top 15 metabolites that significantly (p-value < 0.05) differentiate between the two comparison groups at each timepoint. NS, not significant; CNT, control open ewes; PRG, pregnant ewes; SNG, pregnant ewes that delivered one lamb; TWN, pregnant ewes that delivered two lambs; TRP, pregnant ewes that delivered more than two lambs. Day −7 refers to seven days prior to initiation of gestation and day 0 is the start of pregnancy.
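To illustrate how a logistic regression equation with a probability cut-off is applied to one ewe, the sketch below evaluates the general logistic form P = 1 / (1 + exp(−(b0 + Σ b_i·x_i))) and compares P with the 0.81 cut-off given in the text. The intercept and coefficients are hypothetical placeholders; neither their magnitude nor their sign comes from the study. The inputs are assumed to be already adjusted (normalized, transformed and scaled) as the text requires, and treating P equal to the cut-off as positive is also an assumption.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Evaluate P = 1 / (1 + exp(-z)) with z = intercept + sum(b_i * x_i)."""
    z = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    # Numerically stable form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def is_positive(probability, cutoff):
    """Call the test positive when P reaches the cut-off (assumed inclusive)."""
    return probability >= cutoff


# HYPOTHETICAL intercept and coefficients for the five-metabolite day 50 panel.
PLACEHOLDER_INTERCEPT = 0.5
PLACEHOLDER_COEFFICIENTS = {
    "methanol": 1.0,
    "l-carnitine": 1.0,
    "d-glucose": 1.0,
    "l-arginine": 1.0,
    "urea": 1.0,
}

# Made-up, already-adjusted metabolite values for a single ewe.
example_values = {
    "methanol": 0.2,
    "l-carnitine": -0.1,
    "d-glucose": 0.4,
    "l-arginine": 0.3,
    "urea": -0.5,
}

p = logistic_probability(PLACEHOLDER_INTERCEPT, PLACEHOLDER_COEFFICIENTS, example_values)
print(round(p, 3), is_positive(p, cutoff=0.81))
```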
Candidate biomarkers of ewe litter size. Comparisons were made of CNT vs MLP groups (to identify pregnant ewes that deliver more than one lamb), SNG vs TRP groups (pregnant ewes that deliver a single or more than two lambs) and TWN vs TRP groups (pregnant ewes that deliver twins or more than two lambs). A detailed summary of results is presented in Table 5. Candidate biomarkers were identified at all five timepoints for the SNG vs TRP comparison. This comparison revealed three to four candidate biomarkers at each timepoint with AU-ROC values varying from a low of 0.74 on day 0 to a high of 0.81 on day 70. All biomarkers were statistically significant except for the markers identified for day 35, which only had a statistical tendency. l-carnitine was the most frequently observed candidate biomarker, appearing at days −7, 35 and 50. Since day 50 of gestation was the earliest timepoint to detect pregnancy, this timepoint was used to develop a logistic regression equation (Eq. 2) for the panel of candidate biomarkers (methionine and l-carnitine) of the SNG vs TRP comparison. This equation is given below:
[Eq. 2]
where P is the probability of delivering more than two lambs with a cut-off of 0.70. Because the concentrations of the metabolites used in this study were median normalized, cube root transformed and Pareto scaled, the metabolite values must be adjusted. These adjustments are provided in Table 6.
With regard to the TWN vs TRP group comparison, l-carnitine was also identified as the most frequently recurrent metabolite at all timepoints. For this comparison group, biomarkers at day −7 and day 50 only had a statistical tendency, while other timepoints had statistically significant biomarkers. All AU-ROC values were below 0.80 and most panels consisted of a relatively larger number of metabolites. The candidate biomarkers identified at day 50 were used to develop a logistic regression equation (Eq. 3):
[Eq. 3]
where P is the probability of triplets over twins occurring with a cut-off of 0.57. Because the concentrations of the metabolites used in this study were sum normalized, cube root transformed and auto scaled, the metabolite values used in the equation must be adjusted. These adjustments are provided in Table 6. The above two equations were later used to predict litter size status of pregnant ewes in the validation phase.
Table 5. Receiver operating characteristics (ROC) analysis of the comparison groups in the discovery and validation datasets. Candidate biomarkers were evaluated during all five timepoints of the discovery phase and day 50 of gestation was the best timepoint to reveal candidate biomarkers of ewe PLS. Therefore, biomarker analysis was pursued for only day 50 of gestation in the validation phase. The panels of metabolites that reached an area-under-the-curve (AU-ROC) of at least 0.65 or were significant (p-value < 0.05) were considered as candidate biomarkers in the discovery phase and were confirmed as biomarkers if the AU-ROC and p-value improved in the validation analysis. NS, not significant; NA, biomarker not available; CNT, control open ewes; PRG, pregnant ewes; SNG, pregnant ewes that delivered one lamb; TWN, pregnant ewes that delivered two lambs; TRP, pregnant ewes that delivered more than two lambs. Day −7 refers to seven days prior to initiation of gestation and day 0 is the start of pregnancy.
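The text names several data adjustments (sum or median normalization, log or cube root transformation, Pareto or auto scaling) without spelling them out. The sketch below uses the conventional metabolomics definitions of these steps, which is an assumption, since the exact adjustments used in the study are those of Table 6. Normalization and transformation act on one sample at a time. Scaling acts on one metabolite across samples: mean-centering, then division by the standard deviation (auto) or by its square root (Pareto). The log base is also assumed.

```python
import math
import statistics


def sum_normalize(sample):
    """Divide each concentration by the sample's total concentration."""
    total = sum(sample.values())
    return {name: value / total for name, value in sample.items()}


def median_normalize(sample):
    """Divide each concentration by the sample's median concentration."""
    median = statistics.median(sample.values())
    return {name: value / median for name, value in sample.items()}


def log_transform(sample):
    """Base-10 log of each value (the base is an assumption)."""
    return {name: math.log10(value) for name, value in sample.items()}


def cube_root_transform(sample):
    """Signed cube root of each value."""
    return {
        name: math.copysign(abs(value) ** (1.0 / 3.0), value)
        for name, value in sample.items()
    }


def scale(values, method):
    """Mean-center one metabolite across samples, then divide by SD or sqrt(SD)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if method == "auto":
        divisor = sd
    elif method == "pareto":
        divisor = math.sqrt(sd)
    else:
        raise ValueError("method must be 'auto' or 'pareto'")
    return [(value - mean) / divisor for value in values]


# Made-up concentrations (µM) for one sample, for illustration only.
sample = {"methionine": 30.0, "l-carnitine": 20.0, "valine": 200.0}
adjusted = cube_root_transform(median_normalize(sample))
print(adjusted)
```

Under these assumptions, the pipeline named for the SNG vs TRP panel would chain `median_normalize`, `cube_root_transform` and `scale(..., "pareto")`, while the TWN vs TRP panel would chain `sum_normalize`, `cube_root_transform` and `scale(..., "auto")`.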

Validation phase. Given that we determined the ideal time to assess PLS in ewes via serum metabolomics was at day 50 post-breeding, the sample collection for the validation phase was conducted only at day 50 of gestation. This section describes the validation of the same panel of day 50 candidate biomarkers, and the prediction of the validation dataset using the logistic regression equations developed in the discovery phase. In conducting this validation phase, we looked at approximately twice the number of samples analyzed in the discovery phase from commercial flocks located in different regions and under different management practices (in two of the top sheep producing provinces in Canada, Alberta and Ontario).
Validated biomarkers of ewe pregnancy. Statistical analyses of the validation dataset for the five candidate biomarkers of pregnancy (presented previously) improved the AU-ROC to ≥ 0.90 (Fig. 1) and the p-value to < 0.05 (Table 5). Methanol, l-carnitine, d-glucose, l-arginine, and urea were confirmed to be robust biomarkers to detect ewe pregnancy at day 50 of gestation. Supplementary Fig. 1 shows the boxplots for these five metabolites comparing their normalized/scaled values between pregnant and non-pregnant ewes. The same logistic regression model (Eq. 1) presented for the candidate biomarkers in the discovery phase was used to predict the pregnancy status of the validation dataset. This regression model was successful in making predictions with a sensitivity of 69% and a specificity of 85%.
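Sensitivity and specificity, as reported above, are computed from the predicted status and the true pregnancy outcome. The sketch below shows the calculation on a small made-up set of labels; the data are invented for illustration and are not from the validation cohort.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of truly pregnant ewes detected
    specificity = tn / (tn + fp)  # fraction of open ewes correctly called open
    return sensitivity, specificity


# Made-up example: 4 pregnant ewes (1) and 4 open ewes (0).
true_status = [1, 1, 1, 1, 0, 0, 0, 0]
predicted_status = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(true_status, predicted_status)
print(sens, spec)
```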
Validated biomarkers of ewe litter size. The AU-ROC value for candidate biomarkers (methionine and l-carnitine) of SNG vs TRP improved from 0.78 in the discovery phase to 0.84 in the validation set (Fig. 2). This was accompanied by improved significance from a p-value < 0.05 to a p-value < 0.001 (Table 5). Therefore, methionine and l-carnitine appear to be robust biomarkers of ewe litter size. Supplementary Fig. 2 shows the box plots for these two metabolites comparing their normalized/scaled values between SNG and TRP pregnancies. The same logistic regression model (Eq. 2) developed in the discovery phase to distinguish SNG vs TRP was used in the validation dataset. The regression model was successful in predicting litter size (SNG vs. TRP) with a sensitivity of 56% and a specificity of 91%. The candidate biomarkers (isobutyric acid, l-lactic acid, l-carnitine, valine, tyrosine, and methanol) identified for the TWN vs TRP comparison also reached statistical significance with an improved AU-ROC of 0.81 (Fig. 3). These compounds were confirmed as robust biomarkers of ewe litter size. The same logistic regression model (Eq. 3) was used for the panel of candidate biomarkers of TWN vs TRP comparison groups developed in the discovery phase to predict the validation dataset. Supplementary Fig. 3 shows the box plots for these six metabolites comparing their normalized/scaled values between TWN and TRP pregnancies. This regression model was successful in predicting litter size (TWN vs. TRP) with a sensitivity of 66% and specificity of 85%.
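The AU-ROC values quoted above can be read as the probability that a randomly chosen positive animal (for example a TRP ewe) receives a higher model score than a randomly chosen negative animal, with ties counted as one half. The sketch below computes the AU-ROC directly from that definition on invented scores. It is an illustrative calculation, not the software used in the study.

```python
def au_roc(scores_positive, scores_negative):
    """AU-ROC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_positive) * len(scores_negative))


# Made-up predicted probabilities for two small groups.
example_auc = au_roc([0.9, 0.4], [0.5, 0.1])
print(example_auc)
```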
Biomarkers of pregnancy overlapped with those of the CNT versus MLP comparison groups, indicating that if a ewe tests positive for the panel, not only is she pregnant but she is also expected to carry multiple fetuses. To get a more precise measure of the litter size, further evaluation of the pregnant ewe's blood using the other panels of litter size biomarkers will likely be required. Therefore, if a pregnant ewe tests positive for the triplet biomarker panel (methionine, l-carnitine), the ewe is expected to deliver more than two lambs, while a negative test does not necessarily indicate that the ewe will deliver a single lamb. On the other hand, pregnant ewes that test negative for the twin vs triplet biomarker panel (isobutyric acid, l-lactic acid, l-carnitine, valine, tyrosine, and methanol) are expected to deliver twins.
Figure 2. Receiver operating characteristics (ROC) curve of biomarkers of pregnant ewes with a single or more than two lambs. The comparison of SNG vs TRP groups identified methionine and l-carnitine as significant (p-value < 0.001) biomarkers that would identify ewes that carry a single lamb or those that carry more than two lambs.
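The interpretation rules described above can be written as a small function. This is a hedged sketch: the cut-offs (0.81, 0.70 and 0.57) are those stated for the three logistic regression equations, but the function name, the order in which the panels are read, and the treatment of a probability exactly at the cut-off as positive are assumptions made for illustration.

```python
# Cut-offs stated in the text for the three day 50 panels.
CUTOFFS = {"pregnancy": 0.81, "sng_vs_trp": 0.70, "twn_vs_trp": 0.57}


def interpret_panels(p_pregnancy, p_sng_vs_trp, p_twn_vs_trp):
    """Return plain-language readings for one ewe's three panel probabilities."""
    if p_pregnancy < CUTOFFS["pregnancy"]:
        return ["pregnancy panel negative"]
    notes = ["pregnancy panel positive: pregnant, multiple fetuses expected"]
    if p_sng_vs_trp >= CUTOFFS["sng_vs_trp"]:
        notes.append("SNG vs TRP panel positive: more than two lambs expected")
    else:
        # A negative result does not necessarily indicate a single lamb.
        notes.append("SNG vs TRP panel negative: inconclusive")
    if p_twn_vs_trp < CUTOFFS["twn_vs_trp"]:
        notes.append("TWN vs TRP panel negative: twins expected")
    else:
        notes.append("TWN vs TRP panel positive: more than two lambs expected")
    return notes


for line in interpret_panels(0.90, 0.40, 0.30):
    print(line)
```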

Discussion
Over the past decade, livestock metabolomics research has gained considerable momentum. Currently the number of papers being published on the subject is almost doubling every 2 years. However, sheep metabolomics is still lagging behind the research activities for other livestock species such as cattle and pigs. For this reason, we focused on further characterizing the sheep metabolome and identifying candidate biomarkers associated with production traits of high economic value such as residual feed intake, carcass merit 23 and reproductive performance. In this study, we examined sheep serum using NMR and LC-MS/MS-based metabolomics to identify robust and useful metabolite biomarkers of PLS. The initial step involved profiling the sheep serum metabolome during the first half of pregnancy. In doing so, we identified and quantified a total of 107 serum metabolites. Although no new sheep serum metabolites were identified (after comparison to the data in the LMDB 12), the proportion of quantified sheep serum metabolites in the LMDB was increased from 49 to 52%. Data from this experimental work also add to the reference values obtained from healthy pregnant sheep in the LMDB. Moreover, the study provides quantitative information about the metabolic dynamics of the ewe serum metabolome from seven days prior to breeding to day 70 of gestation. These data are now publicly accessible in the LMDB (www.lmdb.ca).
The central objective of this study was to identify serum metabolite biomarkers for sheep PLS using high-throughput, quantitative metabolomic platforms. As far as we are aware, this is the first study to identify non-hormonal metabolite biomarkers of both pregnancy and litter size, and to provide logistic regression models to predict pregnancy status in domestic sheep. It is important to note, however, that there are other compounds or biomarkers that have shown promise for assessing ewe PLS. These include genes, proteins and metabolites, some of which are described below.
Previously identified PLS biomarkers. Efforts to identify specific gene transcript levels and genetic markers for sheep PLS have been previously described. For example, changes in the expression levels of the interferon-tau-stimulated gene in the thymus 24 and endometrium 25 have been found to signal pregnancy at early gestation. There are also a number of studies on genes responsible for sheep litter size 26 . The Booroola gene, located on ovine chromosome 6, has a major impact on ovulation rate and is a major determining factor for litter size in sheep 27 . This gene has at least 23 different variants. Certain Booroola variants increase follicle sensitivity to the follicle-stimulating hormone, thereby inducing a faster follicle maturation 28 . Moradband et al. 29 found that heterozygotes in the Iranian Baluchi sheep breed had increased the litter size. Ewes that are homozygous for the variant almost double their ovulation rate. However, their lambs have a low survival rate with a lower growth rate and weaning rate 28 . The Booroola gene is associated with the bone morphogenetic protein receptor 1B (BMPR-1B 26 ). Increased blood concentrations of the BMPR-1B protein have been reported to benefit follicular development, yielding better ovulation and increased litter size 30 . A separate study that evaluated proteins in the follicular fluid (FF) of ewes found that the FF of larger follicles compared to smaller follicles had increased glucose and cholesterol concentrations, but lower concentration of triglycerides, lactate, alkaline phosphatase and lactate dehydrogenase 31 . These metabolites and proteins appear to be correlated with ovulation rate, suggesting their relevance to prolific ewes and the litter they carry. In another study, Koch et al. 32 used MS-based proteomics to identify 15 signature proteins from the uterine luminal fluid of ewes as indicators of pregnancy and involved with embryonic growth, immune regulation and nutritional needs. As yet, none of these protein markers have been rigorously validated by ROC curve analysis and none are commercially used in sheep PLS testing.
Another example of a protein biomarker in pregnant ewes is the pregnancy-associated glycoprotein (PAG). The PAG is a placental-secreted factor that is detected in maternal serum upon implantation of the fetus onto the endometrium. This protein can be measured as early as 30 days in gestation 33 , with increasing concentrations as the pregnancy progresses 34 . Pregnancy specific protein B (PSPB) is a form of PAG that is released by the fetus to maintain the corpus luteum 35 . Also, PSPB along with other PAGs increases with increasing number of fetuses carried by the ewe (Pickworth et al. 36 ). However, PSPB is breed-specific (Redden and Passavant 37 ), which limits its application for all sheep breeds. Generally, PAGs are also positively correlated with maternal serum progesterone (P4) levels 34 . In a study by Karen et al. 13 , blood PAG had 93.5% sensitivity for detecting pregnancy at day 22 of gestation; however, their results were skewed by the abnormally low (17%) pregnancy rate of the flock.
In addition to genetic and protein biomarkers of sheep PLS, a number of metabolite biomarkers have also been explored. Progesterone is a promising example of a hormonal metabolite biomarker that could be used for assessing sheep PLS. Progesterone is predominantly produced by the CL at the beginning of gestation and later (day 50 onwards) is produced by the placenta to maintain the pregnancy 34,38 . The concentration of P4 in ewe blood increases over the course of gestation and has been used as an indicator of pregnancy, as well as placental and fetal wellbeing 34 . However, identifying ewe PLS through measurements of P4 concentrations at around days 50-80 of gestation has a sensitivity varying between 65 and 85% and a specificity between 65 and 93% 21,39 . While potentially promising, blood P4 concentrations are not considered sufficiently accurate indicators of non-pregnant ewes 13 and are not useful for differentiating ewes based on litter size 21 . Furthermore, LC-MS-based metabolomic analysis of a panel of eight steroid hormones, including P4, in our own sheep serum samples (n = 94), showed no improvement in biomarker performance when using P4 independently or in combination with non-hormonal metabolites to detect sheep PLS status (unpublished data). Another steroid hormone, estradiol, has also been used for detecting litter size after 50 days into gestation 40 . Despite P4 and estradiol being significant reproductive hormones and associated with ewe PLS, to date there is insufficient evidence and validation based on ROC analysis or regression modeling to make these hormones useful for assessing sheep PLS status 41 .
Other (non-hormonal) metabolites have also been identified as potential pregnancy markers in other livestock species. A recent study of pregnant buffaloes identified five milk metabolites detected by LC-MS on day 18 after artificial insemination as candidate biomarkers of pregnancy 15 . Likewise, in beef cattle, four plasma metabolites were detected by NMR at day 40 of gestation 16 . These reports suggest that measurement of non-hormonal metabolites may serve as an indirect means of pregnancy and/or litter size detection in ruminants.
To date, few studies have reported non-hormonal metabolites associated with sheep PLS. Sun et al. 17 used NMR to investigate pregnant ewe metabolism in relation to in utero fetal growth at four timepoints from day 50 of gestation onwards. They reported 13 serum metabolites that are associated with protein and lipid metabolism of twin-bearing pregnant ewes. In another study using MS-based analysis of FF and ovarian vein serum in the Han sheep breed 42 , a total of eight metabolites (glucose 6-phosphate, glucose 1-phosphate, aspartate, asparagine, glutathione oxidized, cysteine-glutathione disulfide, γ-glutamylglutamine, and 2-hydroxyisobutyrate) were significantly associated with ewe litter size. Another recent metabolomic study using LC-MS/MS revealed that sphingolipid and amino acid metabolism is important for maintaining the uterine environment to increase embryo survival rate 43 . In addition to these studies, there are a few other reports that measured individual metabolites in pregnant sheep 18–20,22 . None of these studies identified or rigorously assessed the reported metabolites as robust PLS biomarkers. Overall, existing data suggest that individual genes, proteins and metabolites may be useful for assessing sheep PLS. However, as of yet, there have been no metabolomic studies that have attempted to rigorously identify and validate a panel of readily accessible non-hormonal metabolite blood biomarkers for assessing sheep PLS.
A common feature of the serum biomarkers presented in this study is that all are detectable by NMR spectroscopy. While the identification and validation of a set of useful sheep PLS biomarker panels was our primary interest in this study (see Table 5), we also believe it is important to provide some biological context and to suggest how some of these metabolites may play a role in sheep pregnancy. Indeed, the biological role of some of these metabolites appears to tie in with the reproductive physiology of sheep. However, some metabolites have not previously been identified as having a role in pregnancy, litter size or gestation and so it is difficult to understand their biological context. The following section further discusses the known biological relevance of each metabolite biomarker identified in this study. It also elaborates on the potential impact that these biomarkers may have for the sheep industry.

Potential biological roles of the PLS biomarkers identified in this study. L-arginine is an essential amino acid that is known to be important for successful pregnancy. At day 50 of gestation, l-arginine was significantly (p-value < 0.05; Table 2) elevated in pregnant ewes (214 ± 85 µM) relative to non-pregnant controls (174 ± 78 µM). Arginine appears to play a role in a number of physiological pathways related to pregnancy. Luther et al. 44 provided pregnant ewes with l-arginine supplementation and observed enhanced ovarian function along with elevated numbers of viable fetuses. The same study identified a direct positive correlation between l-arginine and P4, leading to improved pregnancy maintenance and early embryonic growth. Our results appear consistent with these reports and show that pregnant ewes as well as ewes that delivered more lambs had a higher serum concentration of l-arginine. Furthermore, maternal administration of this amino acid in the later portion of gestation has been shown to increase lamb birth weight, enhance blood flow and increase nutrient transport to the fetus through synthesis of nitric oxide 45,46 . l-arginine also improves pancreatic and brown adipose tissue growth during fetal development 47 , and increases post-partum brown fat storage and the survivability of female lambs 48 . Serum l-arginine is associated with improved post-partum weaning weight and the weaning rate of lambs 49 . Administering this amino acid to prolific ewes further improves the lambing rate by nearly 60%, increases lamb birth weight by over 20% without negatively impacting maternal body weight, and decreases lamb mortality rate at birth by more than 20% 50 . Another metabolite identified as a strong biomarker of litter size was urea. At day 50 of gestation, the average urea concentration was significantly (p-value < 0.001) lower in pregnant ewes (1823 ± 667 µM) compared to open ewes (2518 ± 871 µM). Urea is a source of nitrogen for rumen microbes and is produced through the degradation of amino acids. Elevated blood concentration of urea in ewes seems to reduce conception and pregnancy rate 51 . Likewise, high concentrations of circulating urea have adverse impacts on embryonic development 52 . Our results are in agreement with these findings as pregnant ewes as well as ewes with a greater litter size have a lower concentration of blood urea compared to non-pregnant ewes.
One of the more interesting biomarkers we identified for litter size was methionine. We found that the average methionine serum concentration was significantly lower (28 ± 9 µM, p-value < 0.001) in pregnant ewes that delivered more than two lambs compared to ewes that delivered just one lamb (33 ± 9 µM). Methionine is an essential amino acid that plays an important role in general animal performance 53 , as well as the growth and development of lambs in early life 54 . Methionine is also a methyl group supplier for epigenetic alteration of DNA, especially in late gestation 55 . Indeed, Sinclair and associates 56 reported widespread epigenetic alterations in progeny, mostly male lambs, resulting from restricted supply of dietary methionine to the pregnant dam. Alterations to the genome induced by metabolites such as methionine are responsible for modification of health-related phenotypes, cell growth, host immunity, and protein production 56-59 . l-lactic acid is another biomarker of litter size that is traditionally associated with muscle metabolism. However, during pregnancy its concentration increases with the progression of gestation 60 . Average l-lactic acid concentration was significantly higher (3293 ± 1948 µM, p-value = 0.01) in pregnant ewes that delivered more than two lambs compared to ewes that delivered only two lambs (2432 ± 989 µM). Lactate can be used as an alternative source of energy by the fetal brain 61 . Therefore, a ewe with a higher number of fetuses is expected to have a higher concentration of serum l-lactic acid.
Valine is another biomarker we found to be associated with ewe litter size, and it decreased with increasing number of lambs. The average valine serum concentration was significantly higher (219 ± 74 µM, p-value = 0.007) in TWN versus TRP (191 ± 64 µM) pregnant ewes. This metabolite is a branched-chain amino acid that stimulates protein synthesis in fetal muscle 62,63 . Therefore, ewes that deliver three or more lambs and have an overall higher fetal protein synthesis compared to those that deliver twins are expected to have a higher utilization of this amino acid and lower concentration in the serum. Branched-chain amino acids are also integral to the immune system by supporting the growth of lymphocytes and natural killer cells to remove viral infections 64 . Pregnant ewes are more prone to immune challenges and an increased number of fetuses increases immune vulnerability of the ewe 65,66 . Therefore, ewes that have the largest litter size, i.e., triplets vs twins, are expected to draw more valine from the maternal serum, which aligns with our results.
Comparison to ultrasonography. The current gold standard for sheep PLS assessment is ultrasonography. Ultrasound is mostly used to determine pregnancy status (open vs pregnant). However, certain experienced ultrasound operators can detect the number of fetuses in pregnant ewes as early as approximately 40-45 days of pregnancy and onwards (based on industry data in Canada). In fact, our field observations indicate that most Canadian ultrasound technicians identify litter size as one fetus or more than one. Ultrasound scanning is relatively rapid (2-5 min/ewe) and costs CAD$5-8/ewe (depending on the location of the farm, travel required for the operator to reach the farm, and the number of ewes being scanned). All sheep used in this study were characterized via ultrasound analysis by trained technicians at day 50 of pregnancy.
Using records from 166 ewes with complete data from ultrasound scanning and corresponding pregnancy outcome, we determined that the sensitivity of ultrasound was 0.55, the specificity was 0.70 and the AU-ROC of using ultrasonography for pregnancy detection was 0.65. With regard to ultrasonography results for litter size, we found that for distinguishing SNG vs TRP, the sensitivity was 0.51 while the specificity was 0.18. With regard to distinguishing TWN vs TRP, the sensitivity of ultrasonography was 0.43 while the specificity was 0.18. It is noteworthy that the consistency of ultrasound prediction varied between farms, mainly due to the expertise and experience of the technician, who tended to underestimate singles and triplets while overestimating twins. Comparing our metabolomics results to these ultrasound measurements (Table 7), the metabolite biomarkers performed better than ultrasonography by 24% in terms of AU-ROC, 20% in terms of sensitivity, and 18% in terms of specificity for detecting ewe pregnancy. Likewise, if we compare our predictive biomarker panels for detecting litter size against ultrasonography, metabolite panels performed 9-35% better in terms of sensitivity and nearly 80% better in terms of specificity for predicting litter size. These results indicate serum metabolite measurements are significantly more accurate than ultrasound in detecting and assessing sheep PLS in this study.
In order for any alternative tool to compete with ultrasound for sheep PLS assessment, it would have to be either cheaper, more accurate, more convenient or able to detect PLS at earlier gestational timepoints. The metabolite panels identified in this study are more accurate; however, could they compete with ultrasound on cost? Ultrasound tests cost between CAD$5-8 per ewe, for those producers who can access ultrasound technicians.
Currently, metabolite tests consisting of three or four metabolites conducted on MS instruments can be done for as little as CAD$5 per sample (excluding shipping costs). These costs could be reduced further if testing were optimized or became more widespread. If the metabolite tests could be converted to a handheld device (such as a lateral flow assay or a simple colorimetric test) for pen-side testing, then both the lower cost (perhaps as little as $3 a test) and improved convenience would make these sorts of blood tests very appealing to producers. These biomarkers perform better when it comes to predicting larger litter sizes in pregnant ewes. Even if we assume that these biomarkers perform only comparably to ultrasound, the cost of the blood test would not vary (as it does for ultrasound scanning) with flock size and the geographical location of the farm. This would permit farms with smaller flocks and farms located in remote areas to benefit from blood-based PLS detection. If serum markers could be found that are effective much earlier in gestation (say at day 25 or 35) with a sensitivity or specificity that is comparable to ultrasound, then the potential of a blood test for sheep PLS would be even greater.
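As an illustration of how sensitivity and specificity figures such as those quoted above for ultrasound can be derived by reconciling scan calls with lambing records, the following minimal Python sketch may be helpful. The function name and the eight example records are hypothetical and are not the study's data; the sketch only shows the arithmetic.

```python
def sensitivity_specificity(predicted, actual, positive):
    """Return (sensitivity, specificity), treating `positive` as the positive class.

    `predicted` holds the diagnostic calls (e.g. ultrasound) and `actual`
    holds the reference outcome (e.g. lambing records).
    """
    tp = fn = tn = fp = 0
    for pred, act in zip(predicted, actual):
        if act == positive:
            if pred == positive:
                tp += 1  # true positive: pregnant ewe called pregnant
            else:
                fn += 1  # false negative: pregnant ewe called open
        else:
            if pred == positive:
                fp += 1  # false positive: open ewe called pregnant
            else:
                tn += 1  # true negative: open ewe called open
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example records (NOT data from this study).
ultrasound_calls = ["PRG", "PRG", "CNT", "PRG", "CNT", "CNT", "PRG", "CNT"]
lambing_outcome = ["PRG", "CNT", "CNT", "PRG", "PRG", "CNT", "PRG", "PRG"]

sens, spec = sensitivity_specificity(ultrasound_calls, lambing_outcome, "PRG")
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The same function could be applied to litter size comparisons (for example TWN vs TRP) by restricting the records to the two groups of interest and choosing one of them as the positive class.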

Future prospects.
We have shown that targeted, quantitative metabolomics technologies can be used to discover and validate serum metabolite biomarkers of sheep pregnancy and litter size. Using a large cohort of samples collected from multiple commercial flocks across Canada, we successfully identified four panels of biomarkers that can determine ewe PLS with good accuracy and precision. The performance of these markers appears to exceed that seen with ultrasound measurements within the context of this experiment. Therefore, we believe that if these biomarkers could be further optimized (for high-throughput off-site assays) or translated to hand-held or pen-side tests (similar to the urine-based pregnancy detection kit for women), they could be used to routinely assess PLS in Canadian sheep flocks. We are working on developing a pen-side kit, using the panel of five biomarkers identified and validated in this study, to detect ewe pregnancy 50 days into gestation. If producers require the exact litter size, a second test incorporating the two panels of biomarkers reported here could also be developed. In conclusion, translating these results for on-farm, pen-side use could significantly improve reproduction management and profitability of sheep breeding enterprises.

Methods
All animal procedures were approved by the University of Alberta Animal Care Committee (AUP00002510) and all methods were carried out in accordance with relevant guidelines and regulations. Moreover, all methods associated with animal experiments are in accordance with the ARRIVE guidelines (https://arriveguidelines.org).
Experimental design. The experiments were designed in two phases: (1) a discovery phase to identify candidate serum biomarkers of ewe pregnancy and litter size at the earliest timepoint in gestation, and (2) a validation phase to validate the candidate biomarkers using a sample size approximately two times larger than that used in the discovery phase.
Discovery phase sampling. In the discovery phase, ewes were selected from two farms (Olds College and a private farm) in Alberta, Canada, consisting of Suffolk × Dorset crosses (n = 91) and Rideau Arcott (n = 152) ewes, respectively. Blood was drawn from all animals at five timepoints throughout this phase: seven days prior to exposing the ewes to rams (day −7), day 0 (day of ram turnout for breeding), and days 35, 50 and 70 of gestation (Fig. 4A). These animals were synchronized for estrus and the number of lambs delivered was recorded. Based on the pregnancy outcome of all the animals included in this phase, two broad groups (Fig. 4B) were formed for statistical analyses: controls (CNT; n = 32) composed of non-pregnant, open ewes, and pregnant ewes (PRG) that delivered one or more lambs (n = 99). The CNT animals comprised ewes that were bred and did not deliver any lambs (n = 9) as well as the negative controls (n = 23), which were not exposed to rams. We divided the PRG animals into three subgroups: ewes that delivered a single lamb (SNG; n = 30), ewes that delivered twins (TWN; n = 36) and those that delivered triplets or more (TRP; n = 33). The remaining ewes (n = 112) were not included in the analyses due to poor sample collection, missing data, and/or the producer's decision to cull the animal.

Table 7. Performance comparison of metabolomic biomarkers and ultrasonography. Sensitivity, specificity and the ability to predict sheep PLS are compared between ultrasonography and regression models of blood metabolite biomarkers. Most biomarker panels offer a higher sensitivity and specificity than that of ultrasound diagnosis of PLS. The values calculated for ultrasound are for detecting pregnancy status (CNT vs PRG) and whether the pregnant ewes carry a single fetus or more (SNG vs MLP), whereas the biomarker panels also identify the specific litter size (i.e., SNG, TWN, TRP).
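The outcome-based grouping described for the discovery phase can be sketched in a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, the per-ewe list is rebuilt from the group sizes reported above (with 3 standing in for "three or more" lambs), and the MLP label is assumed here to mean any ewe delivering more than one lamb.

```python
from collections import Counter


def assign_groups(lambs_delivered):
    """Map the number of lambs a ewe delivered to its analysis group labels.

    Assumption: MLP denotes ewes delivering more than one lamb (TWN or TRP).
    """
    if lambs_delivered < 0:
        raise ValueError("number of lambs delivered cannot be negative")
    if lambs_delivered == 0:
        return ["CNT"]                # open ewe (bred but not lambing, or never exposed)
    if lambs_delivered == 1:
        return ["PRG", "SNG"]         # pregnant, single lamb
    if lambs_delivered == 2:
        return ["PRG", "MLP", "TWN"]  # pregnant, twins
    return ["PRG", "MLP", "TRP"]      # pregnant, triplets or more


# Per-ewe outcomes rebuilt from the reported group sizes; the value 3 is a
# stand-in for "three or more" lambs, not an actual record.
outcomes = [0] * 32 + [1] * 30 + [2] * 36 + [3] * 33

group_counts = Counter(label for n in outcomes for label in assign_groups(n))
print(dict(group_counts))
```

Tallying the labels in this way gives a quick consistency check that the subgroup sizes (SNG, TWN, TRP) add up to the PRG total and that CNT plus PRG matches the number of analyzed animals.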
Animal feed. During the discovery phase, the Olds College ewes were group-housed outdoors and fed a ration of mixed grass-alfalfa hay with whole barley grain and a mineral supplement. Ewes at the private farm were group-housed indoors in a climate-controlled barn and fed corn silage with supplemental minerals and vitamins. Initially, it was assumed all animals were pregnant with twins, and the feed rations were formulated using the SheepBytes program (https://www.sheepbytes.ca/) in compliance with National Research Council recommendations 67 . Each ewe received nutrients based on a live weight of 70-75 kg (equivalent to 1.51 Mcal net energy for maintenance) in early gestation.
Estrus synchronization and breeding management. All ewes were synchronized with progesterone-bearing controlled internal drug release devices (CIDRs; Zoetis Canada Inc.) 14 days prior to ram turnout for breeding. To install the CIDRs, ewes were first lined up in the chute. The CIDR was then inserted into the applicator by folding its wings, and the tip of the applicator was gently lubricated to facilitate insertion of the device into the ewe. If the vulva appeared to be dirty, it was cleaned prior to implanting the CIDR. The applicator was then gently inserted into the vagina to release the CIDR. The applicators were disinfected between uses by dipping in a warm water and iodine solution.
Upon CIDR removal, ewes received pregnant mare serum gonadotropin (NOVORMON™, Syntex S.A., Buenos Aires, Argentina) by intramuscular injection in the rump (1 mL/ewe for the prolific Rideau Arcott breed and 2 mL/ewe for the Suffolk × Dorset crosses).
All ewes, except for the CNT group, were then grouped with the breeding rams at a ratio of no more than 10 ewes per ram. Ram turnout at the Alberta private farm location occurred on November 4th, 2017, with ewes lambing between March 29th and April 5th, 2018. Ram turnout at the Olds College location occurred on October

Figure 4. (a) Samples were collected at five timepoints during the discovery phase; Day −7 refers to seven days prior to mating ewes and rams, Day 0 refers to the day of mating, Day 35 refers to 35 days after mating, Day 50 refers to 50 days after mating, and Day 70 refers to 70 days after mating. (b) Experimental groups during the discovery phase included control ewes (CNT) which were not pregnant (n = 32) and pregnant ewes (PRG) with different litter sizes (n = 99). The PRG group consisted of pregnant ewes that delivered one lamb (SNG), pregnant ewes that delivered two lambs (TWN), and pregnant ewes that delivered three or more lambs (TRP).

All delivered lambs were healthy and viable.

Laparoscopic reproductive examination. A subset of the negative controls was examined at day 50 of gestation using laparoscopy to visually assess and confirm ovarian health. Animals were restrained using a cradle and anesthetized by intravenous injection of a combined sedative of 0.6 mg/mL xylazine (Vetoquinol Canada Inc., ON, Canada) and 2 mg/mL ketamine (Vetoquinol Canada Inc., ON, Canada). Once on the cradle, the anesthetized ewe was lifted from its rear so that the two hind legs were raised while the head and two front legs were lowered. An area approximately six inches from each teat was clipped and cleaned with a 4% chlorhexidine scrub (Ceva Animal Health Inc., ON, Canada) and 99% isopropyl alcohol. The clipped areas provided a point of entry for the scope on one side and a cannula on the other. A moderate amount of CO₂ was introduced into the abdominal cavity through a trocar inserted at one of the clipped points. The laparoscope was introduced into the cannula to view the ovaries. The ovaries of all open ewes were examined by a veterinarian and judged to be reproductively sound, with no apparent abnormalities. The cannulas were then removed and the skin was stapled to close the two entry points. The animals were gently rolled off the cradle and recovered from the anesthesia within five minutes. All instruments were kept and cleaned in a dilute iodine solution (West Penetone Inc., QC, Canada) between animal examinations.
Ultrasound diagnosis. All bred ewes were trans-abdominally scanned (Sonosite M-Turbo ultrasound machine, FUJIFILM Sonosite Inc., ON, Canada) for pregnancy and litter size detection while standing in a chute at day 50 of gestation by an experienced technician in each province. Certified technicians reported pregnancy as open (no detectable fetus present), single (detection of only one fetus), twins (detection of two fetuses), and triplets or more (detection of more than two fetuses). All ultrasound assessments were reconciled with the actual lambing records from each flock.
Validation phase sampling and feeding. During the validation phase, ewes were selected from two farms in Alberta (Suffolk and Canadian Arcott crosses at Lakeland College [n = 65], and Suffolk crosses at a private farm [n = 12]) and two farms in Ontario (Rideau Arcotts and Suffolk crossed with Rideau Arcott at private farm one [n = 55], and Dorset and Rideau Arcott crosses at private farm two [n = 111]). Each farm used "typical" Canadian feed rations. In particular, sheep were fed either (1) grass-legume hay mixtures with grain (barley or corn) or (2) corn silage or haylage. Specifically, one farm in each province fed silage/haylage, whereas the other farm in each province fed the hay and grain mix. Based on the discovery phase results, blood was drawn from all animals at a single timepoint only (day 50 of gestation). All ewes were naturally mated to rams at a ratio of 10:1, and none of the ewes were synchronized for estrus. All ewes had their lambing outcome recorded and categorized as in the discovery phase (i.e., CNT, PRG, SNG, TWN and TRP).
Blood collection and processing. Blood samples from all ewes of both phases (discovery and validation) were drawn from the jugular vein. Samples were collected using 21-gauge needles (PrecisionGlide®, USA) and vacutainers containing no anticoagulant (BD Vacutainer, USA), up to a maximum volume of 10 mL. Blood samples were kept on ice upon collection for a maximum of 30 min. Samples were then centrifuged (Beckman Coulter, USA) for 30 min at 17,700 rpm at 4 °C. The supernatant serum was then transferred to Eppendorf tubes (Axygen, USA) and snap frozen using liquid nitrogen. Frozen serum samples were labelled and stored at −80 °C until used for metabolomic analyses.
Metabolomics experiments. All ewe serum samples were analyzed using nuclear magnetic resonance (NMR) spectroscopy and liquid chromatography tandem mass spectrometry (LC-MS/MS). A thorough description of sample preparation and analysis methods for each platform is provided in Goldansaz et al. 23 . In brief, for the NMR analysis, all serum samples were filtered using a 3 kDa ultrafiltration device to remove the macromolecules (i.e., proteins and lipoproteins). A total sample volume of 250 µL (including the serum and buffer solution) was introduced to a 700 MHz Avance III (Bruker, USA) spectrometer equipped with a 5 mm HCN Z-gradient pulsed-field gradient cryoprobe. The 1D ¹H-NMR spectra were then collected, processed and analyzed using methods previously described and a modified version of the Bayesil automated NMR analysis software package 68 . For the LC-MS/MS metabolomic analysis, serum samples were analyzed using an in-house quantitative metabolomics kit (called TMIC Prime) run on an Agilent 1260 series UHPLC system (Agilent Technologies, Palo Alto, CA) coupled with an AB SCIEX QTRAP® 4000 mass spectrometer (Sciex Canada, Concord, Canada). A detailed description of the methods, kit design, workflow and data analysis is given in Goldansaz et al. 23 .

Statistical analyses.
To conduct a standard categorical analysis and identify the relevant serum PLS biomarkers, we categorized the animals into six different groups based on their pregnancy outcome (i.e., CNT, PRG, SNG, TWN, TRP, MLP). Metabolomic datasets from the two platforms were pre-processed and normalized using standard methods available via MetaboAnalyst 4.0 69 . Metabolites that had > 20% missing values were removed from the dataset prior to statistical analyses. Univariate and multivariate statistical analyses, including fold change, Student's t-test, volcano plot analysis, and partial least squares discriminant analysis (PLS-DA), were conducted using MetaboAnalyst. The PLS-DA plot helped visualize the separation of each animal group based on their corresponding serum metabolome, and its significance was verified using permutation testing (n = 1000). The PLS-DA analyses that were significant were also evaluated to identify the top contributors, i.e., those metabolites that had the most significant contribution to separating the comparison groups. Biomarkers were identified and evaluated using the biomarker module in MetaboAnalyst 4.0 69 . This module automatically selects subsets of statistically significant metabolites (initially identified via PLS-DA analysis and validated by permutation analysis using n = 1000 and p < 0.05). It then performs a series of logistic regression calculations on the normalized and scaled metabolite concentration values and calculates the ROC curves as well as the AU-ROC values to identify the optimal set of biomarkers. Individual or multiple metabolite profiles with an AU-ROC ≥ 0.70 were considered candidate biomarkers for each trait. The threshold for statistical significance reported in this manuscript is a p-value < 0.05 and a Benjamini-Hochberg false discovery rate (or Q-value) < 0.05, unless otherwise mentioned. Also, a 0.05 < p-value < 0.10 is referred to as a tendency, whereas differences with a p-value > 0.10 are referred to as not significant.
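To make the decision thresholds described above concrete, the following Python sketch shows one way to compute an AU-ROC, Benjamini-Hochberg Q-values, and the significance labels used in this manuscript. It is a minimal stand-alone illustration and not the MetaboAnalyst implementation; all function names and the example numbers are hypothetical, and the handling of p-values exactly equal to 0.05 or 0.10 is an assumption because the text does not specify it.

```python
def auroc(scores_pos, scores_neg):
    """AU-ROC as the probability that a positive case outscores a negative one.

    Uses the rank-based (Mann-Whitney) equivalence; ties count as half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (Q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone Q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def label_p_value(p):
    """Apply the manuscript's wording; boundary handling is an assumption."""
    if p < 0.05:
        return "significant"
    if p < 0.10:
        return "tendency"
    return "not significant"


# Hypothetical model scores for pregnant (positive) and open (negative) ewes.
auc = auroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
is_candidate = auc >= 0.70  # candidate biomarker threshold from the text

# Hypothetical per-metabolite p-values.
p_vals = [0.001, 0.04, 0.03, 0.20]
q_vals = benjamini_hochberg(p_vals)
labels = [label_p_value(p) for p in p_vals]
print(round(auc, 3), is_candidate, [round(q, 4) for q in q_vals], labels)
```

In practice, a metabolite would be reported as significant only when both its p-value and its Q-value fall below 0.05, in line with the threshold stated above.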