Introduction

Legume seeds have contributed significantly to the human diet since ancient times, being characterized by beneficial nutritional, agricultural, economic, and ecological traits1. Legume seeds are considered a potential source of nutrients including proteins, fibers, vitamins, minerals, and carbohydrates2, in addition to biologically active compounds (antinutrients) which play an important role in disease treatment and/or prevention3. Members of the genera Medicago, Melilotus, Ononis, and Trifolium, family Fabaceae (Leguminosae), have served for generations as forage plants in the Mediterranean region4,5,6. Medicago sativa (Linn.), the most ancient cultivated fodder plant worldwide, is ranked as the fourth most economically important forage crop in North America and temperate regions7.

Since ancient times, Medicago, Melilotus, Ononis, and Trifolium species have been used in traditional medicine. In Chinese and Hindu societies, physicians made a cooling poultice from the seeds of Medicago sativa, commonly known as clover, for the treatment of boils7,8. GC–MS analysis of M. sativa seeds revealed their enrichment in crude protein (33.79%), crude oil (8.11%), squalene, hexadecanoic acid methyl ester, n-hexadecanoic acid, 9,12-octadecadienoic acid methyl ester, 9-octadecenamide, and vitamin E9. Moreover, the inclusion of Medicago sativa seeds in the diet is recommended to normalize serum cholesterol levels in type II hyperlipoproteinemia patients8,10. However, Medicago sativa seeds were found to pose some health hazards, including a systemic lupus erythematosus-like syndrome in female monkeys11, due to their content of canavanine, a nonprotein amino acid and known antinutrient.

Melilotus species (yellow sweet clover) were used by Hippocrates and Dioscorides to treat skin ulcers and abscesses owing to their emollient and anti-edematous effects12, while the Pharaohs used Melilotus tea as an anthelmintic5. Melilotus species are rich in alkaloids, flavonoids, coumarins, triterpenes, and saponins5. GC–MS analysis of the n-hexane extract of M. officinalis seed oil revealed the presence of coumarin, a hepatotoxic compound, at a significant level (8.40%), which argues against the use of this oil for cooking13.

Seeds of Ononis species were used internally and externally in ethnomedicine for centuries owing to their content of biologically valuable isoflavonoids and proanthocyanidins14,15. Ononis natrix seeds possess a high nutritional value in terms of theoretical nutritional parameters owing to their high protein content (37%), amino acid score (112%), protein efficiency ratio (2.8–2.9), and essential amino acid/total amino acid ratio (39%)4. Previous GC–MS analysis of O. natrix seed oil revealed its enrichment in linoleic (33%) and linolenic acids (27%)16.

Trifolium species were traditionally used as expectorants, analgesics (rheumatic aches), antiseptics, and anthelmintics, and for the treatment of constipation, eczema, psoriasis, and lung, nervous, and reproductive system disorders6. Trifolium species are considered a potential source of health-promoting phytochemicals owing to their high content of the flavonoid quercetin and of soyasaponin17. Furthermore, Trifolium seeds were reported to provide the best nutritional values and amino acid composition compared to Medicago and Ononis4.

With the growing world population, the demand for new healthy alternatives to animal proteins and for new forage plants has increased. Yet the safety and nutritional value of Medicago, Melilotus, Ononis, and Trifolium seeds as human food have not been fully explored. Therefore, a comprehensive (integrated) approach was applied for metabolite profiling of 15 leguminous seeds from four different genera (Medicago, Melilotus, Ononis, and Trifolium) using gas chromatography–mass spectrometry (GC–MS) to provide better insight into their primary metabolite content and nutritional traits. Metabolite heterogeneity (diversity) among the different leguminous seeds was assessed using unsupervised and supervised multivariate data analyses, namely principal component analysis (PCA), hierarchical cluster analysis (HCA), and orthogonal partial least squares discriminant analysis (OPLS-DA), to help identify markers of each genus.

Materials and methods

Plant material

The dried legume seeds of different Melilotus, Trifolium, Medicago, and Ononis species were obtained with permission from the Department for Bioarchaeology, Austrian Archaeological Institute (OeAI), Austrian Academy of Sciences (OeAW), Austria (Table 1). The experimental study of the seeds complied with all the appropriate guidelines18. Voucher specimens were kept at the Herbarium of Faculty of Pharmacy, Cairo University, Cairo, Egypt. Each sample was analysed in triplicate to account for biological variation.

Table 1 Sample codes of legume seed species used in this study.

GC–MS analysis of the silylated primary metabolites

The analysis of primary metabolites was conducted as described previously19. In brief, the finely powdered seeds (100 mg) were extracted with methanol and centrifuged at 12,000 rpm for 10 min to remove debris. Samples were sonicated and extracted once, following a previously reported protocol20,21.

Three samples of each seed were analysed using the same conditions to consider the biological variation. The methanol extracts were evaporated under a nitrogen gas stream till dryness. Dried pellet was derivatized using 150 µL of N-methyl-N-(trimethylsilyl)-trifluoroacetamide (MSTFA) and incubated for 45 min at 60 °C. GC/MS analysis was carried out on a Shimadzu GC-17A gas chromatograph that is coupled to Shimadzu QP5050A mass spectrometer, using Rtx-5MS column (30 m length, 0.25 mm inner diameter, and 0.25 μm film thickness).

Validation and quality control of samples for nutrient analysis using GC–MS

Three pooled quality control samples were injected before GC–MS analysis. The pooled quality control samples were injected multiple times during the whole experiment to further ensure the stability and accuracy of the analysis20,21. The relative standard deviation (RSD) of retention time was in the range of 0.05–0.15%. The RSD of peak intensity varied between 2.63 and 8.08%.
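The RSD figures reported above can be reproduced from repeated QC injections with a few lines of code. The following is a minimal sketch, assuming the sample standard deviation (n − 1 denominator) is used; the function name and the example values are hypothetical and are not taken from the study data.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation: sample SD divided by the mean, in percent."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; RSD is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical retention times (min) of one peak in three pooled QC injections.
qc_retention_times = [12.340, 12.350, 12.345]
# Hypothetical peak intensities of the same peak.
qc_peak_intensities = [1.00e6, 1.05e6, 0.97e6]

print(f"RT RSD: {rsd_percent(qc_retention_times):.3f}%")
print(f"Intensity RSD: {rsd_percent(qc_peak_intensities):.2f}%")
```

A low retention-time RSD indicates chromatographic stability, whereas the intensity RSD reflects injection and detector repeatability.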

For metabolites quantification, soluble sugars, free amino acids, organic acids and fatty acids were quantified using standard curves of glucose, glycine, citric and stearic acids and results were expressed as mg/g. Four serial dilutions were prepared from 10 to 600 μg/mL for establishing the standard curves. Calibration curves for glucose, glycine, citric acid and stearic acids displayed 0.9948 correlation coefficient.
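External-standard quantification of this kind can be illustrated with an ordinary least-squares calibration line. The sketch below assumes a linear detector response over the 10–600 μg/mL range; the two intermediate concentration levels, the peak areas, and the function names are hypothetical, since the text states only the range and the number of dilutions.

```python
def fit_calibration(conc, response):
    """Least-squares line response = slope * conc + intercept, plus Pearson r."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    syy = sum((y - mean_y) ** 2 for y in response)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, response))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


def to_concentration(peak_area, slope, intercept):
    """Invert the calibration line to estimate concentration from a peak area."""
    return (peak_area - intercept) / slope


# Hypothetical four-point glucose curve (ug/mL); only the end points 10 and 600
# come from the text, the intermediate levels are assumed for illustration.
standards = [10.0, 50.0, 200.0, 600.0]
areas = [2.1e4, 9.8e4, 4.1e5, 1.19e6]  # hypothetical peak areas

slope, intercept, r = fit_calibration(standards, areas)
print(f"r = {r:.4f}")
print(f"estimated conc = {to_concentration(3.0e5, slope, intercept):.1f} ug/mL")
```

The estimated extract concentration would then be converted to mg/g using the extraction volume and the 100 mg seed weight.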

Metabolites identification and absolute quantification

First, GC/MS peaks were deconvoluted using AMDIS software (https://www.amdis.net). Silylated metabolites were then identified by comparing their retention indices (RI), calculated relative to an n-alkane series (C8–C30), and by matching their mass spectra against the WILEY and NIST library databases and against standards whenever available. Peak abundance was obtained using MS-DIAL software with previously described parameters22. Alcohols, organic acids, fatty acids, soluble sugars, and free amino acids were quantified using the standard curves of glycerol, lactic acid, stearic acid, glucose, and glycine, respectively, and expressed as mg/g. For the standard curves, four serial dilutions were prepared (from 10 to 600 µg/mL). Calibration curves for glucose, glycine, and stearic acid displayed a correlation coefficient of ca. 0.9948 (ref. 21).
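Retention indices relative to an n-alkane ladder are commonly computed by linear interpolation between the two alkanes bracketing the analyte (the van den Dool and Kratz form used for temperature-programmed runs). The text does not state which formula was applied, so the sketch below is an assumption; the function name and the retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at `rt`.

    alkane_rts maps carbon number -> retention time of that n-alkane;
    retention times must increase with carbon number.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    i = bisect_right(times, rt) - 1
    if i == len(times) - 1:  # peak co-elutes with the last alkane
        i -= 1
    n_lo, n_hi = carbons[i], carbons[i + 1]
    t_lo, t_hi = times[i], times[i + 1]
    return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))


# Hypothetical retention times (min) for part of a C8-C30 alkane series.
ladder = {8: 5.0, 9: 7.0, 10: 9.0}
print(linear_retention_index(6.0, ladder))
```

The computed index is then compared with library values for the candidate silylated metabolite alongside the mass-spectral match.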

Multivariate data analysis

Multivariate data analysis (MVDA) was carried out using the unsupervised principal component analysis (PCA) and hierarchical cluster analysis (HCA), in addition to the supervised orthogonal partial least squares-discriminant analysis (OPLS-DA), in SIMCA 14.1 (Umetrics, Umea, Sweden). All variables were mean centered and Pareto scaled. The unsupervised PCA was performed to obtain an overview of the variance of metabolites among the different seed specimens, while the supervised OPLS-DA was implemented to confirm the PCA results and to access detailed information on the differences in metabolite composition among the studied specimens. Chemometric models were assessed using two parameters, R2 and Q2, with the number of permutations set at 200. R2 specifies the model goodness of fit, while Q2 indicates the model predictability. Outliers were detected using DModX (distance to the model), whereas strong outliers in the OPLS-DA plot were detected using Hotelling’s T2. An iterative permutation test was carried out to confirm that the separation among groups was not random.
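Pareto scaling, as named above, mean-centers each variable and divides it by the square root of its standard deviation, which reduces the dominance of abundant metabolites without inflating noise as much as unit-variance scaling. The sketch below follows that textbook definition and is not a reproduction of SIMCA internals; the function name, the use of the sample standard deviation, and the example matrix are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def pareto_scale(matrix):
    """Mean-center each column and divide by the square root of its sample SD.

    matrix: list of rows (samples), each a list of metabolite levels.
    Constant columns are centered only, to avoid division by zero.
    """
    scaled_columns = []
    for column in zip(*matrix):
        m = mean(column)
        sd = stdev(column)
        divisor = sqrt(sd) if sd > 0 else 1.0
        scaled_columns.append([(value - m) / divisor for value in column])
    # Transpose back to samples-by-metabolites layout.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical levels (mg/g) of two metabolites in three samples.
data = [[1.0, 5.0], [3.0, 5.0], [5.0, 5.0]]
for row in pareto_scale(data):
    print(row)
```

The scaled matrix would then be passed to PCA, HCA, or OPLS-DA.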

Enrichment analysis

Enrichment analysis was performed using MetaboAnalyst 5.0 (https://www.metaboanalyst.ca, accessed on 17 September 2023) by annotating KEGG IDs with main-class and “sub-class” metabolite chemical sets.

Results and discussion

The main goal of this study was to evaluate metabolome diversity within less explored legume seed species. The examined samples comprised three Melilotus, four Medicago, four Trifolium, and four Ononis species (Table 1). To assess the biological variance within each sample as well as the analysis conditions, three independent biological specimens were analyzed using GC–MS.

GC/MS-based metabolite profiling

GC–MS analysis was carried out post-silylation to assess seeds’ metabolome in context to its low molecular weight primary metabolites, Fig. 1. About 87 compounds (Table 2) were identified, comprising alcohols, amino acids, aromatics, fatty acids/esters, nitrogenous compounds, organic acids, sugar alcohols, sugars, terpenes, and steroids. The major annotated metabolites among all examined seeds are represented in Fig. 2.

Figure 1

Representative GC–MS chromatograms of TMS derivatives of metabolites in the extracts of “MS” Melilotus segetalis, “MX” Medicago xvaria, “TP” Trifolium pannonic, “OA” Ononis arvensis.

Table 2 Levels of silylated primary metabolites in Melilotus, Trifolium, Medicago, and Ononis seed species.
Figure 2

A bar chart illustrating the concentration of major metabolite classes in Melilotus, Trifolium, Medicago, and Ononis seed species expressed as mg/g. MA, Melilotus albus; MO, Melilotus officinalis; MS, Melilotus segetalis; TP, Trifolium pannonic; TI, Trifolium incarnatu; TM, Trifolium montanu; TA, Trifolium arvense; MDS, Medicago sativa; MDO, Medicago orbiculari; ML, Medicago lupulina; MX, Medicago xvaria; OR, Ononis repens; ON, Ononis natrix; OS, Ononis spinosa; OA, Ononis arvensis.

Fatty acids and acyl esters

Fatty acids and acyl esters predominated in the examined seeds, as is typical of storage organs, and were detected at 6.2 to 64.2 mg/g, except in Trifolium TP and TI, in which sugars were more abundant (26.6 and 27.6 mg/g, respectively, vs. 14.4 and 17.9 mg/g fatty acids and acyl esters, respectively). Lipids were represented mostly by saturated fatty acids (SFA) such as palmitic acid, monounsaturated fatty acids (MUFA) viz. oleic acid, and polyunsaturated fatty acids (PUFA) viz. linoleic and α-linolenic acids. Ononis OS, OA, OR, Trifolium TM, and Medicago MX encompassed the highest levels of fatty acids and acyl esters, ranging from 24.5 to 64.2 mg/g, posing them as candidates for future use in the biofuel industry23, in contrast to the majority of legumes, which are low in fat24. SFA ranged from 5.4 mg/g in Melilotus MO to 30.1 mg/g in OS and were represented mainly by myristic (peak 30), palmitic (peak 33), stearic (peak 38), arachidic (peak 39), behenic (peak 42), and lignoceric acids (peak 43). Unsaturated fatty acids, comprising oleic (peak 35), linoleic (peak 36), and α-linolenic acids (peak 37), were detected at the highest levels in Ononis OA and OS (19.6 and 34.1 mg/g, respectively) and the lowest in Melilotus MO and MS (0.8 and 1.3 mg/g, respectively).

Saturated fatty acids (SFA)

The examined seeds encompassed palmitic and stearic acids as the major SFA; others detected at much lower levels included arachidic, myristic, and behenic acids, in accordance with previous reports16,25,26. Palmitic acid (C16:0) (peak 33) was the major SFA in the examined seeds, reaching its highest levels in OS and OA (10.5 and 9.9 mg/g, respectively) and its lowest in Melilotus MO and MS (1.9 and 2.3 mg/g, respectively). Likewise, monopalmitin (peak 41) was present in Ononis OS at its highest concentration (4 mg/g), followed by Melilotus MA, Ononis OA, and Medicago MDO (1–1.6 mg/g), while it ranged from 0.5 to 0.9 mg/g in other seeds. Although palmitic acid has negative effects on chronic adult ailments, it remains an essential element in membrane, transport, and secretory lipids27. Its average daily intake is ca. 20–30 g, accounting for 8–10 energy%.

Also, high levels of stearic acid (C18:0) (peak 38) were detected in the examined seeds, ranging from 2.1 mg/g in Melilotus MO to 9.2 mg/g in OS. The average daily intake of stearic acid was estimated at 8.1 and 5.4 g, accounting for ca. 91% of the total fat, for men and women, respectively28. Besides imparting the required physical characteristics of solid fat29, stearic acid exerts a hypocholesterolemic potential similar to that of oleic acid30,31. Dietary supplementation with both palmitic and stearic acids increases milk production in cows32,33, suggesting that the examined seeds, especially Ononis OS and OA, represent a healthy fat source for humans and for use as fodder.

Arachidic acid was also detected in Ononis OS at a higher concentration (2.1 mg/g) than in the other seed species (0.08–0.4 mg/g). Likewise, behenic acid and lignoceric acid were detected at markedly higher levels in OS than in other seeds (1.7 and 0.9 mg/g vs. 0.05–0.4 mg/g and 0.04–0.15 mg/g, respectively), suggesting that they could be used as markers to distinguish OS from other Ononis species. In contrast, myristic acid was higher in Trifolium TM, TA, Medicago MDS, and ML (1.1–1.3 mg/g) than in other seed species (0.4–0.8 mg/g).

Unsaturated fatty acids (MUFA and PUFA)

The examined seeds were enriched in oleic, linoleic, and α-linolenic acids in variable amounts, in line with previous reports16,25,26. Oleic acid (peak 35), a monounsaturated ω-9 fatty acid, was the major MUFA in all seeds, ranging from 2.4 to 27.3 mg/g, except in Melilotus MO and MS (0.6 and 0.8 mg/g, respectively). It was detected at the highest level in Ononis OS (27.3 mg/g), followed by OA, Trifolium TM, and Ononis OR (9.9–12.5 mg/g). Such abundance of oleic acid in the latter seed species poses them as healthy functional foods, owing to its several health benefits, e.g., antioxidant34, anti-inflammatory35, hepatoprotective35, and anticancer effects36, besides its potential to lower serum LDL cholesterol37.

ω-3 and ω-6, polyunsaturated fatty acids, are not biosynthesized by humans and must be introduced into the diet38. Linoleic acid (peak 36), ω-6 fatty acid was detected at highest level in Ononis OA, OR, and OS (3.4–4.1 mg/g), while its level ranged from 1 to 2 mg/g in most other seeds. Such enrichment of Ononis species in linoleic acid accentuates their antioxidant39 and anti-inflammatory properties40.

Likewise, α-linolenic acid (peak 37), the major ω-3 fatty acid, was detected at high levels in Ononis species, in addition to Medicago MX, MDO, and Melilotus MA, compared to other seeds (1.2–3.4 vs. 0.1–0.9 mg/g). Both linoleic and α-linolenic acids exhibit antidiabetic41,42 and antihypercholesterolemic effects37.

A diet with a lower ω-6/ω-3 ratio is suggested to reduce the risk of several chronic ailments; e.g., a ratio of 2–3 suppressed inflammation in rheumatoid arthritis patients, and a ratio of 2.5 decreased cell proliferation in colorectal cancer patients43. The ω-6/ω-3 ratio in the investigated seeds ranged from 0.35 to 2.1, suggesting a favorable ω-6/ω-3 ratio. It was lowest in Melilotus MA, MO, and Medicago MDS (0.35–0.42) and higher in Trifolium TP, TI, TM, Ononis OR, and OA (1.1–2.1).

Similarly, the oleic/linoleic acid ratio is important, as a higher ratio increases the resistance of LDL to oxidation and consequently decreases atherosclerosis; in addition, it increases the seeds’ shelf life44. The oleic/linoleic acid ratio in the examined seeds ranged from 2.8 in OR to 14.2 in Melilotus MO. It was higher in Medicago ML, Trifolium TA, Medicago MDS, and Melilotus MO (11.3, 12.1, 12.6, and 14.2, respectively). Such richness in oleic, linoleic, and α-linolenic acids, together with the good ω-6/ω-3 and oleic/linoleic acid ratios of Ononis species (OR, OS, and OA) and also Medicago (MX), suggests that they could be added to the diet to regulate serum cholesterol levels.
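Both nutritional indices discussed above are simple quotients of the measured fatty acid levels. The sketch below assumes that linoleic acid is the only ω-6 and α-linolenic acid the only ω-3 fatty acid contributing, since these are the only ones named in the text; the function name and the input values are hypothetical and do not correspond to any seed in Table 2.

```python
def fatty_acid_ratios(oleic, linoleic, alpha_linolenic):
    """Return (omega-6/omega-3 ratio, oleic/linoleic ratio) from levels in mg/g.

    Assumes linoleic acid is the sole omega-6 and alpha-linolenic acid the
    sole omega-3 fatty acid quantified.
    """
    if linoleic <= 0 or alpha_linolenic <= 0:
        raise ValueError("linoleic and alpha-linolenic levels must be positive")
    return linoleic / alpha_linolenic, oleic / linoleic


# Hypothetical levels (mg/g) for one seed sample.
omega_ratio, oleic_linoleic = fatty_acid_ratios(
    oleic=12.0, linoleic=4.0, alpha_linolenic=2.0
)
print(f"omega-6/omega-3 = {omega_ratio:.2f}, oleic/linoleic = {oleic_linoleic:.2f}")
```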

Sugars and sugar alcohols

Sugar level is important for seeds’ nutritional value and taste, affecting their palatability. Sugars (mono- and disaccharides) predominated in Trifolium TP and TI (26.6 and 27.6 mg/g, respectively) and were the second most abundant class in Trifolium TM, Medicago MX, and Ononis species at levels ranging from 9.7 to 30.6 mg/g, while they were remarkably low in Melilotus and Trifolium TA (0.2–3 mg/g), as shown in the bar chart in Fig. 2.

Two Trifolium species (TP and TI) and three Ononis species (OS, OR, and OA) displayed the highest sugar content (24.8–30.6 mg/g), followed by Medicago (MX) at 16.9 mg/g, while other seed species displayed lower levels ranging from 1.2 to 9.7 mg/g. Melilotus displayed the lowest sugar levels (0.15–1.3 mg/g), especially MO (0.15 mg/g).

Disaccharides, viz. sucrose, gentiobiose, and trehalose, were the most abundant sugar subclass detected in the seeds, albeit with some variations. Sucrose (peak 87) predominated in most of the examined seeds but at different levels, in agreement with previous reports describing sucrose as the major legume seed sugar45. It was present at higher levels in Ononis (OR, OS, OA), Trifolium (TP and TI), and Medicago (MX), ranging from 13.3 to 19.1 mg/g, while Medicago species showed the lowest levels (0.04–0.5 mg/g), suggestive of their lower palatability. Consumption of sucrose, the disaccharide of glucose and fructose, in moderate amounts potentiates insulin release through the occurrence of fructose together with a stimulatory amount of glucose. Consequently, intake of such seeds will not raise the post-prandial glucose level46. Gentiobiose (peak 90), the bitter disaccharide47, was enriched in three Trifolium species viz. TI, TP, and TM compared to others (1.5–6.4 vs. 0.01–0.9 mg/g), while trehalose (peak 88) was remarkably higher in Trifolium TP relative to other seed species (1.4 vs. 0.03–0.7 mg/g).

Compared to disaccharides, low levels of monosaccharides were detected except allofuranose (peak 82) which was detected in Ononis at relatively high levels (0.7–1.8 mg/g).

Fourteen sugar alcohols were identified (peaks 65–78) in the examined seeds, ranging from 0.5 to 8.9 mg/g. Ononis OS, OR, OA, Medicago MX, and Trifolium TI were the highest in sugar alcohols (4.9–8.9 mg/g), while Melilotus MO was the lowest (0.5 mg/g). Sugar alcohols are sweeteners of low glycemic index that add fewer calories to the diet and are endorsed for diabetic patients, in addition to their prebiotic effects48.

Pinitol (peak 71) was exceptionally high in three Ononis species viz. OR, OS, and OA (4–5 mg/g), slightly higher than that reported in other seed legumes in previous studies, e.g., soybean, lentil, and chickpea (3.48, 1.97, and 1.95 mg/g, respectively)49. Also, the pinitol level was higher in Trifolium TI and TP (2.3 and 1.1 mg/g, respectively) than in Trifolium TA and TM (0.3 and 0.4 mg/g, respectively), suggesting that pinitol could be used as a marker to distinguish between Ononis as well as Trifolium species, aside from its several health benefits, e.g., antidiabetic, anti-inflammatory, antioxidant, and cardioprotective effects50. Future studies should target the selective removal of low molecular weight carbohydrates such as mono- and disaccharides to overcome their interference with the legumes’ inositols, and consequently with their bioactivity as functional foods for diabetic patients, and also to decrease their calorie content. Such fractionation can be achieved using yeast treatment49 or ion exchange resins51.

Sorbitol (peak 75) was markedly higher in Medicago MX (2.2 mg/g) than in other examined seeds (0.02–0.4 mg/g). Contrary to other legumes such as soybeans, neither phytic acid nor the raffinose family oligosaccharides were detected in any of the examined seeds. Raffinose oligosaccharides are considered antinutrients as they are responsible for the flatulence effect of legume seeds52. Likewise, myo-inositol, the precursor of phytic acid, was detected at trace amounts in the examined seeds, with higher concentrations in Ononis OA and Medicago MX (0.5 and 0.7 mg/g, respectively). Such absence of raffinose and phytic acid should be further confirmed by examining seeds from different origins and using other techniques. Legumes’ nutritional value is hampered by the substantial number of antinutrients they encompass. In most legumes, oligosaccharides of the raffinose family, led mostly by raffinose, prevail. Since humans lack α-galactosidase, these oligosaccharides remain undigested and undergo anaerobic fermentation by the cecal and colonic bacteria, making them a primary cause of flatulence accompanied by uncomfortable symptoms, i.e., nausea, cramps, diarrhea, and abdominal pain. On the other hand, phytic acid forms insoluble complexes with minerals, decreasing the bioavailability of Fe, Zn, Mg, Ca, Cu, and Mn ions. To decrease antinutrients and improve the bioavailability of nutritional components in the food system, a number of conventional food processing procedures are typically employed, including soaking, germination, fermentation, and cooking52. Phytic acid could be separated using ion exchange chromatography and estimated as described previously53,54, whereas raffinose could be estimated using HPLC or TLC as described53 for QC procedures in food products.

Organic acids/alcohols

Organic acids stimulate pancreatic enzymes’ secretion, induce digestion and absorption of many metabolites, moreover they have a strong bactericidal effect55 and act as preservatives in food56.

Seventeen organic acids were identified in the seeds under investigation (peaks 48–64), detected at levels ranging from 2.9 to 6.3 mg/g and mainly represented by lactic acid and γ-hydroxybutyric acid. Trifolium TI, Medicago MDO, and Trifolium TP displayed the highest organic acid levels at ca. 6 mg/g, accounting for their slightly sour taste. Lactic acid (peak 50) was the major organic acid in all seeds, detected at its highest level of ca. 3 mg/g in Melilotus MS, Trifolium TP, TI, and Medicago MDO. γ-Hydroxybutyric acid (peak 58) was the second major organic acid in the seeds (0.7–1.4 mg/g), with lower levels in Trifolium TM, TA, and Medicago ML (0.7–0.8 mg/g) and higher levels in Trifolium TP, TI, and OS (1.3–1.4 mg/g). It is noteworthy that oxalic acid, the health-hazardous organic acid, was not present in the examined seeds57.

Ononis seeds viz., OS, OA, and OR showed the highest alcohol level (8–9.9 mg/g) compared to other seeds (2.1–6.5 mg/g). Glycerol (peak 3), the sweet triol58, was the major alcohol in examined seeds, especially in OS, OA, and OR (7.4–9 mg/g) exceeding the other seed species (1.4–5.7 mg/g).

Amino acids/nitrogenous compounds

Free amino acids in examined seeds ranged from 5.2 to 12 mg/g. Ononis species (OA, OS, and OR), Medicago species (MX and MDO), and Trifolium species (TP and TI) showed the highest free amino acid level (ca. 9–12 mg/g) posing them as potential nutritive sources, yet their crude protein content should be further investigated. While the other Trifolium species (TA and TM) and Melilotus (MO and MS) were the least enriched (ca. 5.2–5.7 mg/g).

Essential, non-essential, and conditionally essential free amino acids were all detected in the examined seeds. The identified essential amino acids comprised valine, leucine, isoleucine, phenylalanine, threonine, and cysteine, while the conditionally essential amino acids included proline and tyrosine, and the non-essential amino acids encompassed alanine, serine, glycine, aspartic acid, pyroglutamic acid, and glutamic acid. l-Threonine (peak 15) was the major identified amino acid in all seeds (3.4–5.2 mg/g), followed by glycine (peak 17) (1.23–3.25 mg/g), with higher levels in Ononis OA and Trifolium TI (5.2 and 5 mg/g for threonine and 3.3 and 2.9 mg/g for glycine, respectively). Pyroglutamic acid (peak 21), the memory-enhancing amino acid59, was abundant in most seeds, with higher levels in Trifolium TI, TP, Medicago MDO, MX, Ononis OR, OS, and OA ranging from 1 to 2 mg/g. On the other hand, serine (peak 13) was detected in Medicago seeds at relatively higher concentrations than in others (0.3–0.8 mg/g vs. 0.02–0.06 mg/g).

Interestingly, cysteine level (peak 19) in both Trifolium TP and Medicago MX (1.87 and 1.11 mg/g, respectively) was higher than in other seed species (ca. 0.01–0.2 mg/g). As reported in many seed legumes, the examined species except Trifolium TP and Medicago MX were low in sulfur-containing amino acids e.g., cysteine4, pointing out that they may not be sufficient protein sources but should be supplemented with other balanced protein sources60. Moreover, lysine, histidine, tryptophan, and methionine were not detected contrary to previous reports in Medicago, Melilotus, Trifolium, and Ononis4. Hence, additional crude protein profiling with appropriate protein extraction and analysis techniques is recommended to verify their exact protein content. Different techniques could be utilized to assign the entire amino acid composition e.g., GC-FID and GC-IRMS61 and ion-exchange HPLC62.

Nitrogenous compounds were detected in the examined seed species at low levels ranging from 0.14 to 1.9 mg/g. They were mainly represented by nicotinic acid (peak 44) in Trifolium TI, TP, and OA (0.7–1.5 mg/g). Nicotinic acid has positive effects in cases of dyslipidemia as it greatly increases the plasma high-density lipoprotein (HDL) cholesterol levels. It is worth mentioning that none of the antinutrient biogenic amines were detected in the examined seeds e.g., cadaverine, putrescine, tyramine, and tryptamine, indicating their good storage and safety52.

Steroids and tocopherols

Unlike other seed legumes in which β-sitosterol is the most abundant phytosterol, e.g., peas and lentils (1.91 and 1.23 mg/g)63, the examined seeds showed trace amounts except for Ononis OS (0.7 mg/g). Likewise, they showed small amounts of tocopherol (peak 95, 0.01–0.1 mg/g) and squalene (peak 94, 0.08–0.6 mg/g), which were detected at high levels only in Medicago MDS, MDO, and MX (0.8–1.6 mg/g). This implies that the examined seeds, except Medicago MDS, MDO, and MX, are not rich sources of these antioxidants compared to other seeds64. Profiling using LC–MS could, however, provide better insight into the antioxidant potential of these seeds with regard to phenolic content. A few terpenes such as limonene and cineole were detected at trace levels and are likely to contribute to flavour in MDS and OA (0.3 and 0.1 mg/g, respectively).

GC–MS-based multivariate data analysis (MVDA) for the primary metabolites of seeds of Medicago, Melilotus, Ononis, and Trifolium species

Unsupervised multivariate data analysis PCA and HCA of whole dataset

GC–MS based MVDA analysis tools were further employed to assess metabolites variations (differences) among the seeds of Medicago, Melilotus, Ononis, and Trifolium species. The unsupervised HCA and PCA analysis, in addition to the supervised OPLS-DA were employed to assist in species (accessions) distinction and markers identification.

The unsupervised HCA and PCA (Fig. 3) was established for discrimination between seeds of Medicago, Melilotus, Ononis, and Trifolium accessions. The HCA (Fig. 3A) portrayed one main cluster of the Ononis accessions, which could be ascribed to Ononis richness in fatty acids (29.8–64.2 mg/g), sugars (24.8–30.6 mg/g), sugar alcohols (5.6–8.9 mg/g) and free amino acids content (8.7–12 mg/g) as revealed from GC–MS analysis (Table 2). It should be noted that HCA failed to discriminate between all other seeds of Medicago, Melilotus, and Trifolium, as their independent biological replicates were dispersed and overlapped. The generated PCA model (Fig. 3B) accounted for 64% of the total variance, with PC1 and PC2 to account for 52% and 12%, respectively. The PCA model showed partial segregation of T. incarnatu (TI) and T. pannonic (TP) in one cluster in the upper right side and another cluster in the lower right side for O. repens (OR), O. arvensis (OA), and O. spinosa (OS). However, an obvious overlap of the independent biological replicates was observed among O. natrix (ON), T. montanu (TM), T. arvense (TA), and M. albus (MA). The unsupervised PCA loading plot (Fig. 3C) revealed that alcohols (glycerol), fatty acids (linolenic, oleic, palmitic and stearic acids), sugars (β-gentiobiose, sucrose and unknown disaccharide), and sugar alcohols (pinitol) were the markers responsible for such segregation. Hence, PCA model failed to provide clustering of individual seed replicates, except for T. incarnatu (TI) and T. pannonic (TP). Therefore, supervised OPLS-DA analysis was further adopted to minimize variance among replicates for each species to achieve better (species) separation.

Figure 3

GC–MS based HCA and PCA of primary metabolites from all seeds’ specimens. (A) HCA plot. (B) Score plot of PC1 vs. PC2 scores. (C) Loading plot for PC1 & PC2 contributing metabolites and their assignments. The metabolome clusters are located at the distinct positions in two-dimensional space described by two vectors of principal component 1 (PC1) = 52% and PC2 = 12%.

Unsupervised multivariate data analysis PCA and HCA of each genotype separately

Unsupervised PCA models were further constructed for the accessions within the same genotype (genus) each modelled separately to assess the variability and similarities between the accessions and for better identification of markers within each genotype.

The unsupervised PCA score plot for Medicago accessions (Fig. 4A) explained a total variance of 71.4%, with PC1 (58.2%) against PC2 (13.2%), but failed to discriminate between Medicago accessions. Two replicates of M. xvaria (MX) clustered together, while the third replicate overlapped with other Medicago accessions. The PCA loading plot (Fig. 4B) indicated that alcohols (glycerol), fatty acids (palmitic and oleic acids), sugars (sucrose), and sugar alcohols (pinitol and sorbitol) contributed to such segregation.

Figure 4

GC–MS based HCA and PCA of primary metabolites from all Medicago seeds accessions (A) Score plot of PC1 vs. PC2 scores. (B) Loading plot for PC1 & PC2 contributing metabolites and their assignments. The metabolome clusters are located at the distinct positions in two-dimensional space described by two vectors of principal component 1 (PC1) = 58.2% and PC2 = 13.2%.

The unsupervised PCA model for Melilotus accessions explained a total variance of 75.6%, with PC1 (59.1%) versus PC2 (16.5%). The PCA score plot (Fig. 5A) showed segregation of one of the three replicates of M. albus (MA) at the upper right side, while another replicate of M. albus (MA) was segregated far at the lower right side. The other Melilotus accessions, however, clustered and overlapped together. The PCA loading plot (Fig. 5B) indicated that alcohols (glycerol), fatty acids (oleic and palmitic acids), sugars (sucrose), and sugar alcohols (sorbitol) were potential markers for the segregation of M. albus (MA3), while segregation of M. albus (MA2) was assigned to its richness in fatty acids/esters (i.e., stearic acid and 1-monopalmitin) and the sugar acid (arabino-hexanoic acid, 3-deoxy-O-lactone).

Figure 5

GC–MS based PCA of primary metabolites from Melilotus seed accessions. (A) Score plot of PC1 vs. PC2 scores. (B) Loading plot for PC1 & PC2 contributing metabolites and their assignments. The metabolome clusters are located at distinct positions in the two-dimensional space described by two vectors of principal component 1 (PC1) = 59.1% and PC2 = 16.5%.

The unsupervised PCA score plot for Ononis accessions (Fig. 6A) explained a total variance of 78.1%. The PCA score plot showed clear separation of O. natrix (ON) accessions, while the independent biological replicates of O. spinosa (OS), O. repens (OR) and O. arvensis (OA) accessions overlapped and were dispersed. The PCA loading plot (Fig. 6B) indicated that the sugar D-allofuranose contributed to the separation of O. natrix (ON), while glycerol, linolenic acid, sucrose, and pinitol were more enriched in O. repens (OR) accessions.

Figure 6

GC–MS based HCA and PCA of primary metabolites from all Ononis seed accessions. (A) Score plot of PC1 vs. PC2 scores. (B) Loading plot for PC1 & PC2 contributing metabolites and their assignments. The metabolome clusters are located at distinct positions in the two-dimensional space described by two vectors of principal component 1 (PC1) = 63.9% and PC2 = 14.2%.

The unsupervised PCA score plot for Trifolium accessions (S. Fig. 1) portrayed two clusters, one for both T. pannonicum (TP) and T. incarnatum (TI) at the right side, and the other for T. montanum (TM) and T. arvense (TA) at the left side. The PCA score plot explained a total variance of 77.6% and was in agreement with the HCA result (S. Fig. 1a). The PCA loading plot (S. Fig. 1c) demonstrated that the segregation of both T. montanum (TM) and T. arvense (TA) was attributed to their richness in myristic acid. In contrast, the segregation of T. pannonicum (TP) and T. incarnatum (TI) was accounted for by their richness in alcohol (glycerol), fatty acid (oleic acid) and sugars (β-gentiobiose, sucrose, and pinitol).
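For the two-component models above, the quoted total variance is simply the sum of the PC1 and PC2 shares. The short check below uses only the percentages reported in the text and figure captions for Medicago, Melilotus, and Ononis (no PC1/PC2 breakdown is given for Trifolium); the dictionary layout and the tolerance are illustrative choices.

```python
# Explained-variance shares (%) as reported in the text and figure captions.
reported = {
    "Medicago":  {"PC1": 58.2, "PC2": 13.2, "total": 71.4},
    "Melilotus": {"PC1": 59.1, "PC2": 16.5, "total": 75.6},
    "Ononis":    {"PC1": 63.9, "PC2": 14.2, "total": 78.1},
}

def is_consistent(entry, tol=0.05):
    """True when PC1 + PC2 matches the quoted total within rounding."""
    return abs(entry["PC1"] + entry["PC2"] - entry["total"]) <= tol

checks = {genus: is_consistent(entry) for genus, entry in reported.items()}
for genus, ok in checks.items():
    print(genus, "consistent" if ok else "inconsistent")
```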

Supervised multivariate data analysis OPLS-DA

Supervised OPLS-DA (S. Fig. 2A) was performed in an attempt to differentiate between the independent seed replicates and to further identify metabolite markers, although the prediction power of the constructed model was relatively weak (negative value). Nevertheless, the OPLS-DA inner class relationship (S. Fig. 2B) revealed overlap of the O. spinosa (OS) and O. arvensis (OA) independent replicates and their distant segregation.

Another supervised OPLS-DA model (S. Fig. 3) was likewise employed to identify the markers responsible for the segregation (clustering) of the Ononis species, as concluded from the unsupervised HCA and PCA (Fig. 3) and the supervised OPLS-DA inner class relationship (S. Fig. 2B). In this model, Ononis species were modelled in one class against Medicago, Melilotus, and Trifolium species in the other class. The developed model (S. Fig. 3) showed better sample separation, with R2 (88%) and Q2 (78%) indicating high prediction power. The OPLS-DA score plot confirmed the segregation of Ononis species from all other seed accessions. The OPLS-DA score plot (S. Fig. 3B) revealed that alcohols (glycerol), fatty acids (linolenic, palmitic, and oleic acids), sugars (D-allofuranose, sucrose, unknown disaccharide), and sugar alcohols (pinitol) are the main discriminators of Ononis species, confirming the notable differences in their GC–MS based metabolite profiles (Table 2). The developed OPLS-DA model was validated using a permutation test, which confirmed its statistical significance, with a p-value lower than 0.05 (S. Fig. 4).
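The logic of a permutation test can be sketched as follows: class labels are shuffled many times, the separation statistic is recomputed for each shuffle, and the p-value is the fraction of shuffles that separate the classes at least as well as the true labels. This is a simplified stand-in, not the procedure applied to the OPLS-DA model here: the statistic is a plain difference of class means rather than a model fit metric, and the scores, labels, and function names are hypothetical.

```python
import random

def separation(values, labels):
    """Absolute difference between the means of class 1 and class 0."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(a) / len(a) - sum(b) / len(b))

def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Estimate how often shuffled labels separate as well as the true ones."""
    rng = random.Random(seed)
    observed = separation(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if separation(values, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical scores: class 1 stands for Ononis, class 0 for all other genera.
values = [4.1, 3.8, 4.5, 4.0, 1.2, 0.9, 1.5, 1.1, 1.3, 0.8, 1.0, 1.4]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p_value = permutation_p_value(values, labels)
print("permutation p-value:", round(p_value, 4))
```

A p-value below 0.05 in such a test indicates that the observed class separation is unlikely to arise from a random assignment of labels.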

Metabolites enrichment analysis

Enrichment analysis of the metabolites monitored by GC–MS was employed to reveal the most differential pathways in each seed genus, viz. Medicago, Melilotus, Ononis and Trifolium (S. Fig. 5), using the Functional Analysis module of MetaboAnalyst 5.0.
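One common way to score pathway enrichment is over-representation analysis with a hypergeometric test, which asks how likely it is to find at least the observed number of pathway members in a metabolite list by chance. The exact statistic used by the MetaboAnalyst module is not stated here, so the sketch below is an assumption about the general approach rather than a reproduction of it; all counts are hypothetical.

```python
from math import comb

def ora_p_value(background, pathway_size, query_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background   : number of metabolites in the reference set
    pathway_size : number of reference metabolites belonging to the pathway
    query_size   : number of metabolites in the list being tested
    overlap      : number of listed metabolites that belong to the pathway
    """
    upper = min(pathway_size, query_size)
    total = comb(background, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 100 reference metabolites, a 10-member pathway,
# and a list of 20 metabolites of which 6 fall in that pathway.
p_enriched = ora_p_value(background=100, pathway_size=10, query_size=20, overlap=6)
p_expected = ora_p_value(background=100, pathway_size=10, query_size=20, overlap=2)
print("p (6 hits):", p_enriched)
print("p (2 hits):", p_expected)
```

With 20 of 100 metabolites listed, about 2 hits in a 10-member pathway are expected by chance, so 6 hits yield a much smaller p-value than 2 hits.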

The major mapped pathways with the greatest number of differentially expressed genes (DEGs) in the seeds of the Medicago genus included α-linolenic acid/linoleic acid metabolism, amino sugar metabolism, β-alanine metabolism, starch and glucose metabolism, lysine degradation, arachidonic acid metabolism, galactose metabolism, and oxidation of branched chain fatty acids pathways (S. Fig. 5A).

With regard to the seeds of the Melilotus genus, starch and sucrose metabolism, β-alanine metabolism, glycine/serine metabolism, oxidation of branched chain fatty acids, inositol/inositol phosphate metabolism, and phosphatidylinositol phosphate metabolic pathways were the main represented pathways (S. Fig. 5B).

Additionally, the top mapped pathways in the Ononis genus belonged to fatty acids metabolism, oxidation of branched chain fatty acids, α-linolenic acid/linoleic acid metabolism, β-oxidation of long chain fatty acids, and glycine/serine metabolism pathways (S. Fig. 5C).

The Trifolium genus was enriched in galactose metabolism, starch and sucrose metabolism, fatty acids biosynthesis, β-oxidation of very long chain fatty acids, inositol metabolism, inositol phosphate metabolism, phosphatidylinositol phosphate metabolism, and steroid biosynthesis pathways versus the Medicago, Melilotus, and Ononis groups (S. Fig. 5D).

Conclusion

Our results revealed that among the examined legume seeds, viz. Melilotus, Medicago, Ononis, and Trifolium, Ononis seeds (OR, OS and OA) were among the most abundant in fatty acids (29.8–64.2 mg/g), sugars (24.8–30.6 mg/g), sugar alcohols (5.6–8.9 mg/g) and free amino acids (8.7–12 mg/g), while being less enriched in organic acids (4.3–5.4 mg/g), as displayed in the radar plot (Fig. 7), suggesting that they are nutritionally valuable and palatable both for humans and as fodder. In contrast, Melilotus species (MO and MS) were not enriched in fatty acids (6.2–7 mg/g), sugars (0.2–1.3 mg/g), sugar alcohols (0.5–1.1 mg/g), or free amino acids (5.7–6.1 mg/g), suggesting that they are not valuable as potential nutrient sources either for humans or as fodder.

Figure 7

A radar plot illustrating the concentration of major metabolite classes in Melilotus, Trifolium, Medicago, and Ononis seed species expressed as mg/g.

OS was the richest in fatty acids followed by OA and OR. Likewise, OS was the most abundant in sugars followed by TI, TP, OR, and OA (Fig. 7).

Interestingly, OS displayed the highest fatty acid (64.2 mg/g), sugar (30.6 mg/g), sugar alcohol (8.9 mg/g), and alcohol (9.9 mg/g) contents, along with moderate free amino acid (9.6 mg/g) and organic acid (4.9 mg/g) contents compared to all other seeds, suggesting its nutritional value and palatability.

Lacking many essential free amino acids, the examined seeds may not be considered a sufficient protein source, but further studies should be conducted to unveil the total protein content, both as free amino acids and as protein. However, T. pannonicum and M. xvaria are considered the best, with relatively high free amino acid levels in terms of both total and essential amino acids, viz. cysteine and threonine. Further crude protein profiling using an LC/MS platform should provide better insight into the protein content of these seeds.

The fatty acid profiles of Ononis species (OR, OS and OA) and Medicago (MX) revealed their richness in oleic, linoleic, and α-linolenic acids with good ω-6/ω-3 and oleic/linoleic acid ratios, suggesting their potential inclusion in the diet to regulate serum cholesterol levels and prevent atherosclerosis. Aside from this fatty acid profile, the richness of Ononis in sugar alcohols such as pinitol or sorbitol points to a low calorie content. Future studies should now focus on exploring secondary metabolites and their biochemical activities in these seeds.

The aim of the present study was to estimate the content of free amino acids and other low molecular weight primary metabolites, such as free sugars, and their contribution to the nutritional value of the examined seeds using GC/MS. GC/MS has been previously used in other studies to detect the content of primary metabolites in plants, exemplified by free amino acids, sugars and fatty acids1,65. Future studies should now focus on determining crude protein levels using appropriate protein extraction techniques to verify their exact total protein composition.

Chemometric tools succeeded in identifying Ononis metabolite markers belonging to various classes, i.e., alcohols (glycerol), sugars (D-allofuranose), and sugar alcohols (pinitol). The sugar D-allofuranose was identified as a discriminating marker for O. natrix (ON) accessions. Additionally, in Trifolium species, the segregation of T. montanum (TM) and T. arvense (TA) was attributed to their richness in myristic acid. Differentiation between the Medicago, Melilotus, and Trifolium genera was not achieved, nor was discrimination between species of the same genus, suggesting the need for stronger taxonomic markers for their classification, targeting their secondary metabolome using LC/MS. Although we have targeted only accessions of legume seeds, the same approach can be applied in the future to explore factors affecting legume seed metabolite profiles, including seasonal variations, cultivation, and storage conditions. Our study can also provide new details for future taxonomical studies, especially if targeting a larger number of genotypes.