Introduction

Mango (Mangifera indica) is a widely popular tropical fruit belonging to the family Anacardiaceae. It is known as the king of fruits owing to its delicious taste, fragrance and potential nutritional value1. Mango fruits vary in shape, size, flesh and peel color, taste and aroma, all of which depend largely on cultivar type2. The increasing economic potential of mango in the global market3 is manifested by its high production yield, which ranks it as the predominant tropical fruit of the twenty-first century4. Mango ranks 5th among the most cultivated fruit crops worldwide owing to its rich composition of nutrients and phytochemicals5. It is grown in over 90 countries and accounts for ca. 50% of the tropical fruits produced worldwide6. In 2019, world mango production reached 55.9 million tons7. The majority of mango trees are cultivated in Asia, in particular India8, a major exporter yielding 17–23 million tons9. Mango is also widely grown in other countries including China, Indonesia, Thailand and Egypt, together accounting for 80% of total world production1.

As a major fruit crop, mango is represented by 1000 cultivars (cvs.), though only a few of them (30) are cultivated1. Among worldwide sources of mango, Egyptian mango has gained increasing attention among consumers owing to its remarkable flavor and taste10. In our previous publications, the pulp of various cvs. was investigated for its volatiles using solid-phase microextraction coupled with gas chromatography/mass spectrometry (SPME-GC/MS)10, and for its bioactive secondary metabolites, in relation to antioxidant activity, using ultra-performance liquid chromatography coupled to MS (UPLC/MS)11. With the development of functional foods, mango peel powder has been included in bakery products, jellies and pastas owing to its potential antioxidant activity and glycemic index, while mango peel extracts have been incorporated as co-pigments and lipid peroxidation inhibitors12. These effects have yet to be examined for mango kernels, though they appear unlikely, as no carotenoids are reported in kernels in contrast to their richness in fruit pulp and peel13.

In recent years, several spectroscopic techniques, mostly hyphenated ones such as gas or liquid chromatography coupled to mass spectrometry (GC/LC–MS), analysed using multivariate data analyses (MVA), have greatly aided the holistic characterization of the metabolome and the classification of samples in response to different states or phenotypes11,14. MVA is typically employed in unsupervised mode, such as principal component analysis (PCA), or in supervised mode, exemplified by orthogonal partial least-squares discriminant analysis (OPLS-DA), to visualize rich spectral datasets. Both are routinely used for the classification of investigated specimens and for product fingerprinting and authentication for quality control purposes15. Such an approach has increasingly been used for quality control of functional foods, including the determination of freshness, geographical origin and genotype, processing, and/or adulteration11.

Following the potential economic value of mango fruits in the world market, we have previously reported on the use of MVA for the classification of Egyptian mango cvs. Egypt’s total area under mango cultivation has reached 130,000 ha with a total production of 766,128 tons, and mango is regarded as the most important fruit crop cultivated in all Egyptian provinces16. Fourteen cultivars of mango fruits from different localities were subjected to aroma profiling using headspace solid-phase microextraction (SPME) coupled with gas chromatography–mass spectrometry (GC–MS), revealing distinct aroma profiles, especially for premium mango cvs. such as Aweis, which was enriched in terpenes10. More recently, liquid chromatography–mass spectrometry, in comparison to UV fingerprinting, was employed for the classification of mango fruit cvs., targeting their specialized metabolites in the context of different cvs. and/or geographical origins in Egypt11. Both analytical platforms, viz. ultra-performance liquid chromatography–mass spectrometry (UPLC/MS) and UV spectroscopy, allowed sample classification and revealed a higher phenolic content in the premium Aweis cv. concurrent with a potential antioxidant effect11. UPLC/MS led to the identification of 47 peaks belonging to tannins (as gallic acid esters), flavonoids, xanthones, phenolic acids and oxylipids, which was confirmed by UV/Vis fingerprinting showing absorption patterns mostly attributed to galloylated conjugates and phenolic acids.

During mango fruit processing, kernels represent the main by-product, which is typically discarded and constitutes a major biowaste in the mango industry17, though it is rich in several phytochemicals and nutrients. Depending on the mango variety, the kernel accounts for 10–25% of the whole fruit weight18. More than one million tons of mango kernels are produced annually, presenting a valuable product if subjected to the valorization practices increasingly adopted in the food industry18. To maximize valorization and identify potential uses of mango wastes, detailed metabolite characterization is still needed19.

Mango kernels typically encompass bioactive components such as carotenoids, phenolics and ascorbic acid, in addition to macronutrients such as proteins (6–13%), lipids (6–16%) including oleic and stearic acids20, and carbohydrates (58–80%)20. Historically, mango kernels were used in Indian medicine to treat gastric ailments21, as a vermifuge and as an astringent in diarrhea, hemorrhages, bleeding hemorrhoids and cases of gastritis22. Further, bioassays have revealed that mango kernel extract exhibits an antimicrobial effect with potential for use as a food preservative23 and as a source of natural antioxidant additives in food processing17. Mango kernel has been used in the production of mango butter and seed, which are used in functional foods1.

Relevant studies show that the nutritional composition of mango fruit clearly depends on the cultivar type/variety, the origin and climatic conditions of its production locality, and maturity13, whereas much less has been reported for its kernel or leaf as other by-products. It is thus crucial to compare the metabolite composition of mango kernels from several cvs. and origins to identify the best sources for valorization based on detailed chemical composition. This study presents the first metabolomics approach using GC/MS for the classification of mango kernels from Egypt, one of the major producers of this fruit worldwide. Kernels representing different cvs. were collected from trees grown in different regions across Egypt where mango typically grows, targeting their primary metabolite composition using GC/MS analysed with chemometric tools.

Materials and methods

Sample collection

Fresh ripe Egyptian mango fruits (17 samples) were collected from farms in Sharqia (30.7° N, 31.63° E), Suez (29.58° N, 32.33° E), and Ismailia (30.35° N, 32.16° E) provinces in the eastern region of the Nile Delta, in addition to Giza in the western region of the Nile Delta on the west bank of the Nile River (Suppl. Figs. S1 and S2). All samples were cultivated in sandy soil, except for Sharqia cvs., which were grown in mixed sand and clay soil. Irrigation methods varied between drip and spray irrigation. Drip-irrigated mango trees were fertilized as follows. Primary irrigation: 2 kg ammonium sulfate + 1 kg magnesium sulfate. Second irrigation: 2 kg compound fertilizer 19 + 19 + 19. Third irrigation: 2 kg potassium sulfate + 2 L phosphoric acid. Fourth irrigation: 2 kg calcium nitrate. Foliar spraying: to increase the percentage of nodes at the beginning of flowering, coltar (exini compound) was applied at a rate of 75 cm3/100 L of water. When trees showed symptoms of element deficiency, foliar spraying was carried out at the following rates: 200 g sulfate + 200 g chelated zinc + 200 g chelated manganese + 100 g copper sulfate + 100 g magnesium + 50 g borax + 250 g urea (to raise the absorption efficiency) in 600 L of water.

Specimens were authenticated by Dr. Tarek Eissa, October University for Modern Sciences and Arts, and coded according to cv. type and geographical origin. The first letter of each code denotes the origin (S, Suez; Q, Sharqia; I, Ismailia; G, Giza), the second letter denotes the cv. name, and the third letter denotes seed, as shown in Table 1. In addition, the stage of maturity was confirmed by external firmness and color, which differ for each cv. The selected fruits were directly peeled, and the kernels were removed and stored at −20 °C until analysis.

Table 1 List of the collected mango specimens including code, location and description. The first letter of each code denotes the origin: S, Suez; Q, Sharqia; I, Ismailia; G, Giza.

Voucher specimens are kept in the Pharmacognosy Department, Faculty of Pharmacy, Cairo University, under the same codes used in the current study after the addition of the department’s initials and the year of collection. For instance, SAS is kept under the voucher code PG_CU_SAS_2021. All experimental procedures were carried out in accordance with the relevant laws and guidelines, including the appropriate permissions for the collection of plant specimens.

Sample preparation

Dried mango kernels were ground separately using a mortar and pestle under liquid nitrogen. The powdered seed (30 mg) was homogenized with 2.5 mL methanol containing 5 μg/mL xylitol (as internal standard for relative quantification) using a Turrax mixer. To prevent overheating, homogenization was performed at 11,000 rpm for five 20 s periods separated by 1 min rest intervals. The extract was then vortexed vigorously and centrifuged at 3000×g for 30 min to remove debris, and a 100 μL aliquot was taken for chemical analysis. Three biological replicates were analysed for each kernel sample24.
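The extraction figures above (30 mg powder, 2.5 mL methanol, 5 μg/mL xylitol) allow a simple unit conversion from an extract concentration to a content per kernel mass. The following is a minimal sketch of that arithmetic, not the authors' actual workflow; the function names are hypothetical and the assumption that the whole 2.5 mL represents the 30 mg sample is ours.

```python
# Values taken from the sample preparation text.
EXTRACT_VOLUME_ML = 2.5    # methanol volume per extraction
SAMPLE_MASS_MG = 30.0      # powdered kernel mass per extraction
IS_CONC_UG_PER_ML = 5.0    # xylitol internal standard concentration


def relative_response(analyte_area, internal_standard_area):
    """Analyte peak area relative to the internal standard peak area."""
    if internal_standard_area <= 0:
        raise ValueError("internal standard area must be positive")
    return analyte_area / internal_standard_area


def content_ug_per_mg(conc_ug_per_ml):
    """Convert an extract concentration (ug/mL) to content per kernel mass (ug/mg).

    Assumes the full extract volume corresponds to the full sample mass.
    """
    return conc_ug_per_ml * EXTRACT_VOLUME_ML / SAMPLE_MASS_MG


def internal_standard_mass_ug():
    """Total xylitol mass (ug) present in one extract."""
    return IS_CONC_UG_PER_ML * EXTRACT_VOLUME_ML
```

Under these assumptions, a hypothetical extract concentration of 120 μg/mL would correspond to 10 μg per mg of kernel powder.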

GC/MS analysis

Aliquots of the kernel extract (100 μL) were dried under a nitrogen stream. About 150 μL of N-methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA) was used for derivatization for 45 min at 60 °C. Before GC/MS analysis, samples were equilibrated at 28 °C; analysis was performed on a Shimadzu GC-17A gas chromatograph coupled to a Shimadzu QP5050A mass spectrometer. The column was an Rtx-5MS (30 m length, 0.25 mm inner diameter, 0.25 μm film thickness). Injections were made in split mode with a split ratio of 1:15 under the following conditions: injector temperature, 280 °C; column oven temperature, 80 °C for 2 min, then ramped to 315 °C at 5 °C/min and held isothermally at 315 °C for 12 min; carrier gas (He) flow rate, 1 mL/min. The transfer line temperature was set at 280 °C and the ion source temperature at 180 °C. Electron ionization (EI, 70 eV) was used with a scan range of m/z 50–650. AMDIS software (https://www.amdis.net) was used for identification: peaks were first deconvoluted to resolve the silylated metabolites, which were then identified by comparing their retention indices (RI), determined relative to an n-alkane series (C8–C40), and by mass spectral matching to the NIST25 and WILEY library databases and, where possible, to standards, according to a previously reported procedure26.
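Retention indices relative to an n-alkane ladder are commonly computed for temperature-programmed runs with the linear (van den Dool and Kratz) formula, RI = 100 × [n + (t_x − t_n)/(t_(n+1) − t_n)]. The sketch below illustrates that formula only; it is an assumption that this is the variant AMDIS applied here, and the function name and any retention times passed to it are hypothetical.

```python
from bisect import bisect_right
from typing import Dict


def linear_retention_index(rt: float, alkane_rts: Dict[int, float]) -> float:
    """Linear retention index of a peak eluting at `rt` (min).

    `alkane_rts` maps n-alkane carbon number -> retention time (min),
    measured under the same temperature program as the sample.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if len(times) < 2:
        raise ValueError("at least two alkanes are required")
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("alkane retention times must increase with carbon number")
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    # Position of the alkane eluting just before (or together with) the peak.
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    # Interpolate linearly between the two bracketing alkanes.
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))
```

For example, with hypothetical alkane retention times of 22.0 min (C15) and 24.0 min (C16), a peak at 23.0 min lies halfway between them and receives an index of 1550.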

Metabolites identification, quantification and modelling

GC–MS files were converted to netCDF format using the MS Convert option in the Shimadzu program, then to ABF files using ABF Converter (https://www.reifycs.com/AbfConverter/). Data analysis was performed using MS-DIAL software (http://prime.psc.riken.jp/compms/msdial/main.html) with the following parameters: mass range, 0–500 Da; MS1 tolerance for alignment, 0.015 Da; retention time, 0–30 min; minimum peak height, 1000; sigma, 0.7; and accurate mass tolerance (MS), 0.01 Da. Alcohols, organic acids, fatty acids, soluble sugars and free amino acids were quantified using standard curves of glycerol, lactic acid, stearic acid, glucose and glycine, respectively, and expressed as mg/g. For the standard curves, eight serial dilutions were prepared (from 10 to 600 μg/mL) following the conditions cited in Fahmy et al.27. Peak abundances of the identified metabolites were exported for multivariate data analysis and Pareto scaled using SIMCA 14.1 (Umetrics, Umeå, Sweden), in which the data were subjected to principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA). PCA was carried out to show the variance of metabolites among samples, whilst OPLS-DA was used to reveal differences in metabolite composition between sample classes. In addition, Q2 and R2 values, together with permutation testing, were used to assess the performance of the chemometric models; Q2 reflects model predictability and R2 the goodness of fit. Q2 was obtained by k-fold cross-validation, in which the calibration set is divided into seven subsets (sevenfold). The distance to the model (DModX) was calculated to identify outliers, whereas Hotelling's T2 was used to diagnose strong outliers in the OPLS-DA plot24.
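Pareto scaling, as applied before PCA and OPLS-DA, mean-centres each variable and divides it by the square root of its standard deviation, which reduces the dominance of very abundant metabolites (here, sugars) without inflating noise as much as unit-variance scaling. A minimal sketch follows; it uses the sample standard deviation (n − 1), which is an assumption about the convention and may differ in detail from SIMCA's implementation.

```python
from math import sqrt
from statistics import mean, stdev


def pareto_scale(matrix):
    """Pareto-scale a samples x variables table given as a list of rows.

    Each column is mean-centred and divided by the square root of its
    sample standard deviation. Constant columns are centred only.
    """
    scaled_columns = []
    for column in zip(*matrix):
        centre = mean(column)
        spread = stdev(column)
        divisor = sqrt(spread) if spread > 0 else 1.0
        scaled_columns.append([(value - centre) / divisor for value in column])
    # Transpose back to the original samples x variables orientation.
    return [list(row) for row in zip(*scaled_columns)]
```

The function expects at least two samples (rows), since a standard deviation cannot be estimated from one observation.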

Results and discussion

The current study aimed to assess kernel metabolome heterogeneity in several mango cvs., represented by 17 cvs. cultivated in different regions of Egypt (Suez, Sharqia, Giza, and Ismailia) (Table 1). Such a comprehensive metabolite profile could help identify the cvs. most enriched in a given chemical for future valorization purposes. To assess biological variance for each sample under the analysis conditions, three biological replicates were analysed under the same conditions using GC–MS following silylation with MSTFA. MSTFA was used because it reacts with many labile functional groups commonly found in organic compounds (e.g., hydroxyl groups of polar low molecular weight metabolites such as sugars, and amino groups in amino acids) to form more volatile, non-polar trimethylsilyl (TMS)-ether derivatives. Such derivatization reactions have broadened the scope of GC–MS analysis to non-volatile compounds, including sugars, amino acids, and fatty acids28.

A total of 41 peaks were identified in mango kernels, belonging to different classes including sugars (15), fatty acids/esters (6), amino acids (5), sugar alcohols (6), nitrogenous compounds (5), and phenolics and phenolic lipids (4). In addition, trace levels of fatty alcohols, ketones, acids and alcohols were detected, as displayed in Fig. 1. The GC–MS chromatogram shows the major chemical constituents of the most abundant classes in mango kernels (Fig. 2).

Figure 1

Phytochemical constituents of kernels from different mango cvs. originating from four localities in Egypt (Suez, Sharqia, Ismailia and Giza; specimens SZS, SKS, SNS, SAS, SFS, QFS, QRS, QQS, QZS, QAS, QNS, QSS, QGS, QHS, QDS, IMS, and GZS), expressed in µg/mg. Sugars appear as the major metabolite class, followed by fatty acids. For mango codes refer to Table 1.

Figure 2

A representative GC–MS chromatogram showing the major identified constituents of mango kernels collected from different Egyptian regions.

Quantitative variation of the identified metabolites in mango kernels is presented in Table 2. The tabulated data revealed a marked abundance of sugars, particularly in Seedeq and Arnab cvs. The highest sugar levels were detected in cvs. QQS, QRS, and QSS at 290.7, 283 and 122.5 µg/mg, respectively, compared to GZS and IMS at 4 and 5.2 µg/mg, respectively. Seedeq mango is a creamy fruit with no fibre and a distinct sweet, sugary taste and texture that distinguish it from other varieties, making it suitable for the majority of individuals29. Besides, mango can be processed into crystallized sugar, which may be considered a substitute for cane sugar. It is also high in antioxidants and polyphenols, which have health benefits such as improving lipid profiles, stabilizing blood glucose fluctuations, and perhaps acting as an antidiabetic sugar30. Hence, QQS and QRS cvs. are recommended for further investigation as sources of antidiabetic sugar owing to their high content of antioxidants and polyphenols. Interestingly, Aweis and Fons cvs. displayed only small changes in sugar levels based on origin, detected at 14.5 µg/mg in SAS compared with 8.9 µg/mg in QAS, while SFS and QFS contained 10.8 and 7.6 µg/mg, respectively. In addition, Aweis and Fons were found to be richer in fatty acids than sugars, which may enhance their taste and flavor, attributed to the abundance of butyl caprylate, reported here for the first time in Egyptian mango. Consistent with the sugar results, sugar alcohols were most abundant in QQS and QRS cvs. Variation within each metabolite class is examined in the following subsections.

Table 2 Quantities of identified metabolites in mango kernels determined by GC–MS post silylation, expressed as mean µg/mg ± SD (n = 3). Sample codes are explained in Table 1.

Sugars

Sugars were the most abundant class in kernels, represented by mono- and disaccharides ranging from 3.5 to 290.9 µg/mg among cvs. The highest sugar levels were detected in cvs. QQS, QRS, and QSS at 290.9, 283, and 122.5 µg/mg, respectively. Moderate levels were detected in QZS (93.1 µg/mg) and SKS (81 µg/mg), whereas QHS and QGS cvs. showed the lowest level (3.5 µg/mg) (Table 2). Sugar content was thus variable among mango kernels originating from Sharqia province and Suez. This finding agrees with previous literature in which sugars do not represent strong taxonomical or geographical markers, being influenced by agricultural practices independent of location or cv. type26. Notably, the monosaccharides glucose, fructose, and talose were the major sugars, while the disaccharides maltose and melibiose were found at much lower levels in all kernels, suggesting that monosaccharides are the major sugar source in kernels, in agreement with our previous results in mango fruit11. Sucrose, fructose and glucose are the principal sugars in mature and ripe mango31,32; however, only traces of sucrose were detected in kernels.

With regard to cvs. collected from multiple sites in Egypt, Zabdeya, one of the most popular and highly consumed mango types in Egypt, was collected from Suez, Sharqia, and Giza and showed notable differences in sugar levels (4.0–93.1 µg/mg), likely attributable to agricultural practices. The highest total sugar level was detected in kernels from Sharqia (QZS) at 93.1 µg/mg, followed by SZS at 42.5 µg/mg, while kernels collected from Giza (GZS) showed the lowest level at 4.0 µg/mg (Table 2). Such differences in sugar level among kernels collected from different regions were also observed, though to a lesser extent, in Aweis and Fons kernels, which showed only slight variation based on origin: 14.6 µg/mg in SAS versus 8.9 µg/mg in QAS, and 10.8 and 7.6 µg/mg in SFS and QFS, respectively.
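The contrast between Zabdeya and the two premium cvs. can be made concrete by comparing the max/min ratio and the coefficient of variation of total sugars across localities. The sketch below uses only the values quoted in this paragraph; the two summary statistics are our illustrative choice, not an analysis reported by the authors.

```python
from statistics import mean, stdev

# Total sugar levels (ug/mg) quoted in the text, per cultivar and specimen code.
sugar_levels = {
    "Zabdeya": {"QZS": 93.1, "SZS": 42.5, "GZS": 4.0},
    "Aweis": {"SAS": 14.6, "QAS": 8.9},
    "Fons": {"SFS": 10.8, "QFS": 7.6},
}


def fold_range(values):
    """Ratio of the highest to the lowest value."""
    return max(values) / min(values)


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


summary = {
    cultivar: (
        fold_range(list(levels.values())),
        coefficient_of_variation(list(levels.values())),
    )
    for cultivar, levels in sugar_levels.items()
}
```

With these figures, Zabdeya spans a more than 20-fold range across localities, whereas Aweis and Fons each vary by less than 2-fold.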

Sugar alcohols

Sugar alcohols are receiving increasing attention in dietary nutrition and health as low-calorie sweeteners in bakery, beverage and confectionery products33. These sugars are not readily absorbed and thus provide fewer calories than table sugars34. In mango kernels, sugar alcohols were detected in all cvs. at levels ranging from 1.0 to 38.1 µg/mg, with the highest levels found in Sharqia province, represented by QQS and QRS cvs. at 38.1 and 29.1 µg/mg, respectively. Moderate levels were detected in the Suez samples SKS, SNS, and SZS, ranging from 6 to 9.5 µg/mg.

Interestingly, the premium mango types Fons and Aweis, cultivated in both Suez and Sharqia, showed almost comparable sugar alcohol levels at 1.3 and 1.6 µg/mg for SFS and QFS, respectively, and 2.0 µg/mg for both SAS and QAS. In contrast, Zabdeya from the three locations Suez, Sharqia, and Giza (SZS, QZS, and GZS) showed clear variation in sugar alcohol levels, detected at 6.0, 9.6, and 1.2 µg/mg, respectively (Table 2). This finding is consistent with the sugar levels and suggests that sugar alcohols could be affected by origin.

Among sugar alcohols, 1,5-anhydro-d-glucitol was detected at its highest level (2.7 µg/mg) in QQS versus trace levels in all other kernels. Few reports have investigated the health impact of 1,5-anhydro-d-glucitol, a rare saccharide, and little is known about its actions in vivo35. 1,5-Anhydro-d-glucitol showed competitive inhibition of trehalase and trehalose phosphorylase, likely attributable to its structural similarity to d-glucose36, aside from its antidiabetic action as a low-calorie sugar37.

Major sugar alcohols detected in kernels included ribitol, iditol, pinitol, and myo-inositol, with myo-inositol the major form, detected at 20.3 µg/mg in QQS cv. followed by QRS at 11.6 µg/mg, in accordance with a previous report of its richness in mango fruit38. Myo-inositol is of interest as a low-calorie sugar alcohol, aside from its role in normal cell growth and survival and in the development and function of peripheral nerves39. The richness of QRS in myo-inositol was also observed for ribitol, at 10.1 µg/mg compared to other cvs. (0.1–3.7 µg/mg) (Table 2), posing this cv. as a potential source of sugar alcohols.

Fatty acids/esters

Fatty acids represented the second most abundant class in kernels, as expected given that mango fruits are enriched in fats11. No major differences were observed in the fatty acid profile among the selected specimens based on either cv. type or locality. Total fatty acids were detected in all cvs. at levels ranging from 14.3 to 18.8 µg/mg, with the highest level in QSS (18.8 µg/mg) and the lowest in QGS (14.3 µg/mg) (Table 2). Mango peels and kernels are regarded as rich in lipids and a good source of fatty acids, some of which have the potential to be exploited in the food industry13.

With regard to newly reported lipid species in mango kernel, the fatty acid ester butyl caprylate was detected for the first time as a major form in most cvs., ranging from 11.7 to 14.4 µg/mg (Table 2). This compound was previously identified as a main volatile constituent of the mango aroma profile and displays potential repellent activity against insect pests40, adding to fruit shelf life. It also possesses pleasant flavor and fragrance features, posing QSS, SZS, SNS, QAS, and QDS cvs. as the richest sources of this natural flavoring agent41. Based on these results, aroma profiling of mango kernels using more sensitive techniques such as SPME should be considered. In comparison, non-volatile monoglycerides exemplified by 1-monopalmitin were detected at lower levels in most cvs. (2–3 µg/mg), versus trace levels of methyl palmitate and palmitic, linoleic, and stearic acids. These results are not in accordance with previously reported data in which linoleic, stearic, palmitic, and oleic acids were the major fatty acids in mango fruit13, suggesting a lipid profile in kernels different from that of the fruit, which has yet to be compared for mango from other origins. It should be noted that high levels of stearic, oleic, and palmitic acids enable mango kernel fat to be employed as a cocoa butter substitute6 and add to its nutritive and health properties. QRS and QQS cvs. were found to be the richest in these fatty acids.

Phenolics/phenolic lipids

Phenolics and phenolic lipids are well known as potential antioxidants in food products, aside from their several health benefits42. The highest level of phenolics and phenolic lipids was detected in the QQS specimen at 18.3 µg/mg, followed by QZS, QDS and SKS cvs. at 10.8, 10.6, and 10.1 µg/mg, respectively. Quantitative differences in phenolics were detected among cvs. from Sharqia province, with the highest level found in QQS (18.4 µg/mg) versus the lowest in QGS (4.2 µg/mg). Phenolics likewise varied within Suez province, with a greater amount in SKS cv. (10.1 µg/mg) compared to SNS (1.7 µg/mg) (Table 2). Additionally, comparable levels of phenolics were detected in Aweis kernels collected from different geographical regions (Suez and Sharqia), at ca. 8.4 µg/mg for both QAS and SAS, suggesting that phenolics provide better markers for cvs. than sugars. Likewise, comparable levels of total phenolics were detected within Zabdeya and within Fons cvs. from different origins, i.e., at 8–10 µg/mg in QZS, SZS, and GZS (Sharqia, Suez and Giza) versus 4–6 µg/mg in SFS and QFS. These results confirm that specialized metabolites present better markers for the classification of cvs. than primary metabolites such as sugars, as they are not affected by regional habitat, in accordance with our previous results in other foods26.

It should also be noted that four phenolic lipids were detected in mango kernels for the first time, including phenol (3-heptadecenyl)-(cardanol), phenol(3-heptadecenyl)-(ginkgol), 1-(2,3-dimethoxyphenyl)ethanol, and 3-(heptadecadienyl)phenol. Ginkgol and cardanol were the major components in all cvs., ranging from 0.1 to 13.5 µg/mg (Table 2), and are likely to contribute to mango kernel shelf life considering their potential antimicrobial actions43. In that regard, QQS should be assessed for its antimicrobial action against food-borne pathogens considering its rich cardanol content (13.5 µg/mg). Examination of the potential health benefits of these phenolics should now follow to identify functional food or other uses for these mango kernels based on such metabolite profiling results.

Amino acid/nitrogenous

Amino acids are formed through various metabolic processes during fruit ripening44 and were detected in all cvs. at comparable levels ranging from 6.3 to 9.3 µg/mg. The amounts detected in Sharqia cvs. ranged from 6.3 to 8.4 µg/mg, comparable to those in Suez cvs. at 7.8–9.3 µg/mg. Likewise, amino acids in the accessions collected from Ismailia and Giza, represented by IMS and GZS, were at similar levels of 8.5 and 9.1 µg/mg, respectively. Hence, amino acid content does not appear to be affected by cv. or origin in this study. Profiling of mango kernels revealed 5 major amino acids and nitrogenous compounds, exemplified by sarcosine ethyl ester, nicotinic acid and L-threonine, with sarcosine methyl ester as the major component, ranging from 3.3 to 4.7 µg/mg. Sarcosine has recently been recognized for its CNS effects against depression and inflammation in the brain45, in addition to the management of schizophrenia and Alzheimer’s disease46,47.

Miscellaneous

In addition to the aforementioned classes, other chemicals were detected at trace levels, including fatty alcohols, ketones and acids. Fatty alcohols were represented by two components, behenic alcohol and 1-docosanol, detected at their highest level in the QRS sample at 2.0 µg/mg, while traces of ketones and acids were identified in all examined cvs. Behenic alcohol was the major fatty alcohol in all tested kernels, detected at 1.3 µg/mg as presented in Table 2.

Multivariate data analysis (MVA) of mango kernels in context to its cv. and/or geographical origin

MVA, including principal component analysis (PCA), hierarchical cluster analysis (HCA), and orthogonal partial least squares (OPLS) analysis, was further employed to classify specimens in an untargeted manner (Fig. 3).

Figure 3

GC–MS based principal component analysis of different mango cvs. The metabolome clusters are located at distinct positions described by the two vectors PC1 (86%) and PC2 (4.3%). (A) Score plot of PC1 versus PC2 scores. (B) Loading plot for PC1 and PC2 with contributing mass peaks and their assignments. (C) HCA dendrogram of mango cultivars based on group average cluster analysis using GC–MS. For mango codes refer to Table 1.

Unsupervised PCA and HCA data analysis of different cvs of mango’s kernels

PCA is a useful modelling tool for assessing relative variability among cvs. from different origins and for assessing method reproducibility48. Biological replicates of each cv. clustered together, with replicates more or less superimposable, suggesting low biological variance within each specimen. The model described by principal components PC1 and PC2 accounted for 90% of the total variance (Fig. 3A). The PCA score plot indicated that QRS, QSS, and QZS specimens, all from Sharqia province, were well spaced and positioned on the right (positive) side of PC1, with QQS and some of the QRS samples clearly appearing as outliers. The left (negative) side of PC1 showed an overlap between the remaining cvs., including GZS, SZS, SNS, and IMS. The close clustering of samples on the negative side of PC1, without significant segregation, indicates similar metabolite profiles in these cvs. Inspection of the loading plot (Fig. 3B) revealed that sugars mostly accounted for cv. segregation. For instance, talose, fructose, glucose and maltose were abundant in QRS, SKS and QSS cvs.
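The share of variance carried by PC1 follows from the largest eigenvalue of the covariance matrix of the (scaled) data divided by the total variance. The sketch below extracts the first component by power iteration in plain Python; it is an illustration under our own assumptions (covariance-based PCA, a fixed starting vector, hypothetical function names) and is not the NIPALS-type routine used by SIMCA.

```python
from math import sqrt


def center(matrix):
    """Subtract each column mean from a samples x variables table."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def covariance(centered):
    """Sample covariance matrix (n - 1 denominator) of centred data."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def first_component(cov, iterations=500):
    """Dominant eigenvector and eigenvalue by power iteration.

    The starting vector is arbitrary; convergence fails only if it is
    exactly orthogonal to the dominant eigenvector.
    """
    p = len(cov)
    vec = [float(i + 1) for i in range(p)]
    norm = sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return vec, eigenvalue


def pca_first_pc(matrix):
    """Return PC1 scores, PC1 loadings and the fraction of variance explained."""
    centered = center(matrix)
    cov = covariance(centered)
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    if total_variance == 0.0:
        raise ValueError("data have no variance")
    loadings, eigenvalue = first_component(cov)
    scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]
    return scores, loadings, eigenvalue / total_variance
```

Variables with large loadings on PC1 (sugars, in this dataset) are those driving the separation of samples along that axis in the score plot.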

HCA was implemented to display cv. classification in an intuitive graphical form (Fig. 3C). Three clusters were observed in the HCA dendrogram, with only Seedeq cv. from Sharqia province (QQS) clustered in a separate group (group 1), whilst 4 cvs. originating from Suez and Sharqia, viz. QZS, QSS, SNS, and SKS, were clustered in group 2. The remaining cvs. from different localities were all combined in group 3, suggesting more or less similar metabolic composition, in agreement with the PCA results (Fig. 3A).
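Group-average clustering repeatedly merges the two clusters whose members have the smallest mean pairwise distance, and cutting the process at three clusters mirrors the three groups read from the dendrogram. A minimal sketch is given below; the Euclidean distance and the function names are assumptions, since the text specifies only the group-average linkage.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(samples, n_clusters):
    """Agglomerative clustering with group-average linkage.

    `samples` maps a specimen label to its feature vector.
    Returns a list of sets of labels, one set per cluster.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [{label} for label in samples]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean distance over all cross-cluster pairs of specimens.
                pairs = [euclidean(samples[a], samples[b])
                         for a in clusters[i] for b in clusters[j]]
                distance = sum(pairs) / len(pairs)
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        _, i, j = best
        clusters[i] = clusters[i] | clusters[j]
        del clusters[j]
    return clusters
```

A specimen with a very distinct profile, as QQS is here, remains in its own cluster until late in the merging sequence, which is why it appears as a single-member group.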

Unsupervised PCA of cvs. based on origin Sharqia versus Suez

We further assessed whether a clear separation among cvs. could be observed based on geographical origin. Mango kernels were grouped by locality, with those from Sharqia denoted in blue and those from Suez in green. Whilst most cvs. from Suez were segregated on the right side with positive PC1 values, some samples overlapped with kernels from Sharqia on the negative side along PC1 (Fig. 4A), representing 86.7% of the total variance, suggesting that the geographical origin of the trees cannot be readily inferred from the kernels in this dataset.

Figure 4

GC–MS based principal component analysis of mango cvs. originating from Suez (green) and Sharqia (blue). The metabolome clusters are located at distinct positions described by the two vectors PC1 and PC2, with a total variance of 91.1%. (A) Score plot of PC1 versus PC2 scores. (B) Loading plot for PC1 and PC2 with contributing mass peaks and their assignments. For mango codes refer to Table 1.

The corresponding loading plot (Fig. 4B) revealed that samples on the right side were more enriched in sugars, viz. fructose, talose and arabinose, whilst sarcosine methyl ester was more abundant in cvs. on the left side of PC1.

PCA of Zabdeya from the three localities

To determine whether a single cv. can be clearly distinguished by location, kernels of Zabdeya cv. from three different localities (Suez, Sharqia and Giza) were modelled using unsupervised PCA. Figure 5 shows the PCA score (A) and loading (B) plots for Zabdeya cv. from the three locations, with PC1 and PC2 covering 93.6% of the total variance. Figure 5A revealed acceptable segregation of the cv. based on locality. The loading plot (Fig. 5B) showed that Zabdeya from Sharqia was rich in sugars, i.e., glucose, talose, and fructose, in agreement with the PCA results (Fig. 3) showing that kernels from this region are rich in sugars. In contrast, sarcosine methyl and ethyl esters predominated in kernels of other origins.

Figure 5

(A) Score plot of PC1 (81.6%) versus PC2 (12%) of Zabdeya kernels from different localities, viz. GZS, QZS, and SZS. (B) Loading plot of PC1 and PC2 of Zabdeya. For mango codes refer to Table 1.

Supervised OPLS of Aweis and Fons against other cvs

Considering that both Aweis and Fons are regarded as premium mango cvs. in the Egyptian market with respect to their fruit composition, we used supervised MVA to examine whether their kernels likewise have a unique metabolite profile. Aweis kernels from different origins were modelled as one class group against all other kernel cvs. (Fig. 6A), with no clear separation of all its kernel specimens from the other specimens; the corresponding S-plot (Fig. 6B) pointed to sugars, i.e., fructose, talose, and glucose, as the main contributing variables. On the other hand, Fons cv. was chemically distinct from the other cvs., with clear separation observed (Fig. 6C). The S-plot derived from Fons cv. against other mangos revealed that sugars, i.e., fructose, talose, and glucose, accounted for the discrimination of the other cvs. (Fig. 6D). Thus, data analyses confirmed the previous quantification results in Table 2 and showed that the premium mango cvs. "Aweis and Fons" had lower sugar content, particularly of glucose, fructose, and talose. This may point to variations in the nutrients found in the edible part of the mango fruit, i.e., the pulp. It should be noted, though, that sugars are not potential markers for cvs., as they are detected across specimens and are regulated by several other factors.
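As a rough illustration of how an S-plot ranks variables, the sketch below computes the two S-plot axes, i.e., the covariance and the correlation of each variable with a single predictive score. As a loudly stated simplification, the score is taken from the first component of a plain PLS fit against a 0/1 class vector, not from a full OPLS-DA model with orthogonal filtering as used in this study. The function name `s_plot_coordinates`, the variable names, and the toy values are hypothetical.

```python
import numpy as np

def s_plot_coordinates(X, y):
    """Return (covariance, correlation) of each variable with one predictive score.

    Assumption: the score t is the first PLS component (weights proportional to
    Xc.T @ yc). This is a stand-in for the OPLS predictive component, not OPLS itself.
    """
    Xc = X - X.mean(axis=0)                 # mean-centre variables
    yc = y - y.mean()                       # mean-centre class vector
    w = Xc.T @ yc
    w = w / np.linalg.norm(w)               # unit-length weight vector
    t = Xc @ w                              # predictive score, one value per sample
    n = X.shape[0]
    cov = Xc.T @ t / (n - 1)                # S-plot x-axis: magnitude of contribution
    corr = cov / (Xc.std(axis=0, ddof=1) * t.std(ddof=1))  # S-plot y-axis: reliability
    return cov, corr

# Hypothetical toy data (made-up values, NOT measurements from this study):
# the last three samples form the class modelled against all others (y = 1).
variables = ["sugar_1", "sugar_2", "other_metabolite"]
X = np.array([
    [9.0, 7.5, 1.0],
    [8.5, 7.0, 1.2],
    [9.5, 8.0, 0.8],
    [3.0, 2.5, 4.0],
    [2.5, 2.0, 4.5],
    [3.5, 3.0, 3.8],
])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

cov, corr = s_plot_coordinates(X, y)
for name, c, r in zip(variables, cov, corr):
    print(f"{name}: covariance = {c:+.2f}, correlation = {r:+.2f}")
```

With this coding, variables with strongly negative covariance and correlation are those enriched in the other class (y = 0), which mirrors how sugars can account for the discrimination of the other cvs. rather than of the modelled cv.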

Figure 6

(A) OPLS-DA score plot and (B) loading S-plot of Aweis mango kernels. (C) OPLS-DA score plot and (D) loading S-plot of Fons mango kernels. Each cv. was modelled one at a time against the other cvs.

Conclusion

Metabolite heterogeneity in the nutrient and volatile profiles of the discarded kernel part of 17 mango cvs. originating from three localities across Egypt (e.g., Suez, Sharqia, Ismailia, and Giza) was comprehensively investigated herein for the first time via holistic untargeted GC–MS-based profiling. A total of 41 constituents were identified, belonging to sugars, fatty acids/esters, amino acids, and sugar alcohols as the major metabolite classes. Data analysis revealed that cvs. from Sharqia province showed variable sugar content. Among these, QQS, QRS, and QSS were the richest in sugars, detected at 297.1, 288.4 and 123.5 µg/mg, respectively. A similar pattern was observed for sugar alcohols, which were detected at higher levels in QQS and QRS cvs., at 31.7 and 23.7 µg/mg, respectively, than in kernels from other localities. Major sugar alcohols included ribitol, iditol, pinitol, and myo-inositol, with the highest ribitol level detected in QRS cv., posing it as a potential source of sugar alcohols. In terms of novel chemicals detected in mango kernels, butyl caprylate was detected for the first time at 11.7–14.4 µg/mg. Other newly reported phenolics included ginkgol and cardanol in selected specimens at 0.1–13.5 µg/mg. These phytochemicals play a significant role in enhancing the shelf life of mango kernels owing to their potential antimicrobial action. In this regard, QQS should be assessed for its antimicrobial effect owing to its rich cardanol content (13.5 µg/mg). Therefore, given the chemicals characterized herein, mango kernels are highly recommended for consideration in nutraceutical preparations and, for example, as a substitute for sugar cane and as a food preservative. MVA data confirmed the quantification results and showed that the premium mango cvs. "Aweis and Fons" were less enriched in sugars such as glucose, fructose, and talose. This might indicate nutritional differences from the fruit pulps.
Such comprehensive metabolite profiling could assist in identifying the cvs. most enriched in a given chemical for future valorization purposes, and likewise in targeting specialized metabolites for analysis using LC–MS, which is better suited for profiling this class of metabolites.