Introduction

Olive oil (OO) is one of the oldest and most essential edible oils commercially traded in human history. Olive oils are commonly classified as extra-virgin olive oils (EVOOs), virgin olive oils (VOOs), or olive oils mixed with refined olive oils (refined OOs) (Fig. 1A), depending on, among other factors, their fatty acid (FA) profiles and trace compounds (e.g. concentration of free fatty acids (FFA) or acid value (AV)1,2,3,4, phenolic compounds5). FAs are predominantly defined by their saturation levels (i.e. saturated fatty acids (SAFA), monounsaturated fatty acids (MUFA), and polyunsaturated fatty acids (PUFA)) (Fig. 1B). The FFA content is influenced by a number of phytosanitary factors and extraction processes6,7,8. As a consequence of variation in processing (e.g. poor olive quality or an inadequate extraction process), structural breakdown of triacylglycerols may occur (due to, for example, hydrolysis induced by high temperature and moisture9), resulting in an increase in the final acidity of the oils3,4 (Fig. 1C, D).

Fig. 1: Identification and classification of olive oils using a home-built NMR-based PoC.

A Olive oil production process. The mechanical processes (e.g. cold press) used in the extraction of olive oils, particularly in the separation phase (e.g. filtration, refining), play a major role in determining the amount of free fatty acids in OOs, and thus the final grading of the product (e.g. EVOOs, VOOs, and refined OOs). EVOO must be obtained exclusively by mechanical extraction procedures and must have an acid value (AV) of less than 0.8%. The acid values of VOOs and refined OOs must be below 2.0% and 1.0%, respectively. B In OOs, fatty acids are predominantly in the form of triacylglycerols and are defined by their saturation level (i.e. saturated, monounsaturated, polyunsaturated). FA composition varies greatly with the type of vegetable oil and the region of origin of the starting product. OOs are dominated by oleic acid (a monounsaturated FA), similarly to peanut oil. Palm oils are dominated by saturated FAs, in contrast to sunflower and corn oils, which are dominated by polyunsaturated FAs. C Olive oil composition is dominated by triacylglycerols (e.g. triolein), with free fatty acids and ‘trace’ compounds (e.g. vitamin E and other bioactive molecules) in smaller amounts. Fatty acids are characterized by their saturation level (n) and by the properties of the hydrocarbon chain (mainly its length). For n = 0, FAs are known as SAFAs (saturated FAs). For n = 1 and n > 1, FAs are known as MUFAs (monounsaturated) and PUFAs (polyunsaturated), respectively. An increase in the free fatty acid pool (i.e. acidity) occurs due to the hydrolysis of triacylglycerol esters. D A bird's-eye view of the relationship between the NMR-based traits, physicochemical properties, and OO grading (EVOO, VOO, and refined OOs). The classification of OOs is influenced by the FA and FFA profiles, among other factors.
E NMR-based traits originate from the relaxation dynamics of proton nuclear spins, which reflect the composite effect of triacylglycerols, free fatty acids, trace compounds, and the overall environment of the protons. F The concept of the integrated intelligent machine proposed in this work. The developed NMR-based PoC consists of a portable commercial console, a home-built detection coil circuit, and a palm-sized permanent magnet (B = 0.5 T). For high-throughput analysis, a microcapillary tube designed to slot into the NMR detection coil holds a minute sample (e.g. a single drop of oil). The entire assay completes in less than 5 min. The NMR measurements were carried out in a single-blinded manner on each oil.

The high demand for OO comes from its multiple nutritional benefits and its irreplaceable organoleptic properties10,11. Olive oil is one of the most frequently adulterated food products owing to its high consumer appeal and large profit margin12,13. The highly desired and expensive EVOO is frequently diluted with cheaper oils, leading to indirect economic consequences and health concerns. Hence, olive oil is subject to rigorous quality regulations, with its standard characteristics set by tight legislation.

Laboratory-based methods, such as chromatography13,14,15,16, spectroscopy12,17,18,19,20,21, or DNA analysis13,22,23, have been extensively developed to reduce cases of adulteration. High-field nuclear magnetic resonance (NMR) spectroscopy has also been proposed as an effective method for authentication, quality control, and detection of adulteration of the oils. High-field NMR, however, has a number of drawbacks, such as the requirement for large, dedicated laboratory facilities with costly cryogens, complicated pre-analysis steps, and the need for a highly specialized workforce17,24,25. None of the above-mentioned detection methods is simple to use, requires minimal sample preparation, and offers a short turn-around time.

We have recently demonstrated that two-dimensional time-domain NMR can be used to classify edible oils based on their physicochemical composition (e.g. saturation levels) with much higher accuracy than conventional methods26. The low-field NMR-based point-of-care (PoC)27,28,29,30,31,32 analysis is based on pairing the longitudinal (T1) and transverse (T2) relaxation times, which significantly improves the sensitivity and specificity of detection. It works on the rationale that the cumulative characteristics of each dimension form a specific and unique signature, in a way similar to the radiomics technique developed in the field of radiology.

In this work, we demonstrate that NMR-based phenotypic traits in the time domain (at the molecular level) can be used for classifying OOs. Using just a single droplet and a benchtop-sized NMR system33,34,35, olive oils can be rapidly classified (into EVOOs, VOOs, or refined OOs) in a non-destructive manner (i.e. label-free and without sample pre-treatment). The subtle differences in physicochemical composition and molecular microenvironment of the olive oils induce substantial changes in the relaxation mechanism in the time-domain NMR regime (Fig. 1E, F). With the aid of machine learning, classification using T1 and T2 relaxation reached AUC = 0.95, much higher than the current gold standards, near-infrared spectroscopy (NIRS, AUC = 0.84) and ultraviolet-visible spectroscopy (UV-Vis, AUC = 0.73) (Table 1), with comparable or marginally better performance in the identification of regions of origin (Table 2). In addition, the proposed NMR-based detection method is much cheaper per assay, user-friendly, and can be used at the point of detection (Table 3). This work demonstrates the spirit of combining an (old-fashioned) machine with (new-wave) machine learning to produce an ′intelligent machine′30,36,37, an attractive scientific solution for the food science community.

Table 1 Classification of olive oils using the Receiver Operating Characteristic analysis.
Table 2 Receiver Operating Characteristic analysis for regions of origin.
Table 3 Qualitative performance of the NMR-based PoC against conventional methods (UV-Vis, NIR).

Results

Rapid identification and characterization of olive oils with NMR-based PoC

To demonstrate an industrial application, we used the proposed technique to distinguish authentic EVOOs from VOOs and refined OOs (Fig. 2). The relaxometry measurements and acidity determination (details in Methods) were performed on thirty-six OOs (i.e. 21 EVOOs, 8 VOOs, and 7 refined OOs) without disclosing the manufacturer's label or country of origin. For each sample, the relaxation measurements were carried out using five different samplings, with the refined OOs serving as the control experiment.

Fig. 2: Rapid identification and characterization of olive oil using NMR-based traits.

A Two-dimensional mapping of EVOOs (green), VOOs (blue), and refined OOs (red) in the T1-T2 magnetic state diagram. A wide variety of olive oils (i.e. 21 EVOOs, 8 VOOs, and 7 refined OOs) commercially available from different manufacturers were purchased off-the-shelf in Braga, Portugal, or through online platforms. Each data point represents one sampling, with a total of 360 samplings collected. The mean T1 and T2 values and the acid value are denoted below. B Average T1 and T2 relaxation times (ms) for the different types of OOs. Statistical significance was assessed using unpaired two-tailed Student's t-tests (P < 0.005) (details in Supplementary Fig. 1). C Rapid classification of OOs using the NMR-based traits in the form of clustering analysis. The hierarchical clustering was constructed based on the Euclidean distance between the averaged measurements per sample (details in Supplementary Table 1). The quantitative linkages (e.g. inter- and intra-cluster similarity) are shown as a heatmap. D ROC curves for NMR-based traits (red), NIRS (blue), and UV-Vis (grey), calculated from a number of supervised models (details in Supplementary Fig. 2). The error zones represent the 99% confidence bands.

A two-dimensional T1-T2 magnetic state diagram was used to map the clustering of the oils based on their composite intrinsic relaxation properties, thereby forming a calibration standard. The mean (T2, T1) coordinates for the EVOOs, VOOs, and refined OOs were (150.5 ms, 168.0 ms), (153.2 ms, 174.4 ms), and (146.3 ms, 162.8 ms), respectively (Fig. 2A and details in Supplementary Fig. 1). The oil types were significantly clustered (P < 0.005), indicating that the intra-type variation between samplings was much smaller than the inter-type variation of the OOs (Fig. 2B). The detailed breakdown for each commercial brand is shown as a heatmap (Fig. 2C). In addition, the receiver operating characteristic (ROC) analysis (Fig. 2D and Supplementary Fig. 2) indicated that the relaxometry measures have excellent detection sensitivity and specificity, with an area under the curve (AUC) of 0.95, compared with 0.84 for NIRS and 0.73 for UV-Vis (Table 1).

Identification of OO based on the regions of origin

We demonstrated the feasibility of using the proposed NMR analysis to identify OOs by their countries (or regions) of origin. Apart from genotypic variation, the variation in phenotypic traits is governed by a number of factors, such as migration drift (e.g. diversification and domestication events)38 and abiotic stress (e.g. local climate, soil conditions)39,40. To identify the regions of origin, a data subset encompassing four European regions (i.e. 3 oils from Greece, 4 from Italy, 9 from Portugal, and 5 from Spain) was mapped using the two-dimensional T1-T2 magnetic state diagram (Fig. 3; details of each oil in Supplementary Fig. 3 and Supplementary Table 1).

Fig. 3: Identification of the regions of origin of OO.

EVOO samples were studied based on their regions of origin. Off-the-shelf EVOO samples originated from different European regions (i.e. 9 from Portugal, 5 from Spain, 4 from Italy, 3 from Greece) according to their labelling. Pair-wise two-dimensional mapping of EVOO origins in the T1-T2 magnetic state diagram, compared using unpaired two-tailed Student's t-tests (details in Supplementary Fig. 3). The sensitivity and specificity for each pair of regions were evaluated using receiver operating characteristic (ROC) analysis. The pair-wise AUCs ranged from 0.6 to 0.9.

The mean T1 relaxation times were (166.3, 166.7, 168.9, and 168.9) ms and the mean T2 relaxation times were (147.7, 150.1, 150.2, and 151.0) ms for (Greece, Spain, Italy, Portugal), respectively (Fig. 3, and details in Supplementary Fig. 3). The overall region-based identification with the NMR technique gave AUC = 0.71, comparable to NIRS (AUC = 0.70) and UV-Vis (AUC = 0.69) (details in Table 2). Interestingly, when a pair-wise comparison matrix (i.e. pair-wise ROC-AUC evaluation) is built from the NMR-based traits (e.g. T1, T2, A-ratio), it resembles the geographical orientation (Fig. 4A). For example, the AUCs for Greece-Italy (0.74), Greece-Spain (0.84), and Greece-Portugal (0.89) increase with geographical distance, as shown in the heatmap (Fig. 4B). The neighbouring pairs, Spain-Portugal (the Iberian region) and Italy-Greece, displayed stronger similarities, with AUCs of 0.69 and 0.74, respectively. This is to be expected, as neighbouring countries are likely to have a much higher rate of species exchange owing to their geographical proximity. The detailed results for each purchased oil carried distinct information on its location (Fig. 4C and D).

Fig. 4: NMR-based traits in identification of the regions of origin.

A The NMR-based phylogenetic tree was built from the AUC distance matrix (details in Fig. 3) using the neighbour-joining algorithm (Supplementary Fig. 6), which splits the NMR-based traits into three main regions (i.e. Iberian, Italy, Greece). The Iberian region comprises Portugal and Spain, which share a land border. The proposed NMR relaxation-based traits (vertical legend: AUC of 0.25) are in agreement with the geographical orientation (legend: 300 km per cm). Higher similarities are expected between closely related species. Neighbouring countries are expected to have higher species exchange and gene flow owing to their geographical proximity. Similarities fade with, for example, geographical distance. B A summary of the AUCs between countries evaluated using ROC analysis (details in Fig. 3). C The detailed analysis of each commercial brand, represented as a heatmap. Hierarchical clustering was constructed based on the Euclidean distance (between T1 relaxation, T2 relaxation, and A-ratio) of the averaged measurements per sample (Supplementary Table 1). D T1-T2 magnetic state diagram in the form of box plots for the regions of origin.

Limit of detection of NMR-based traits technique

We evaluated the limit of detection of the NMR-based traits by mixing sunflower oil into a selected EVOO to mimic cases of adulteration. For each sample, the relaxation measurements were conducted in duplicate using five different samplings, covering from 0% (as control) to 100% of OO in the mixed edible oil (Supplementary Fig. 4). The T1-T2 magnetic state diagram shows a linear relation (r² = 0.93) between the NMR-based traits and the concentration of sunflower oil (PUFA-rich) mixed into the EVOO (MUFA-rich), an effect attributed to the decrease in saturation level. The (T2, T1) coordinates were (188.3, 202.9) ms and (155.3, 174.6) ms for the sunflower oil and the EVOO (controls), respectively. The limit of detection of the NMR-based traits was approximately 1%, comparable to NIRS (1%) and much better than UV-Vis spectroscopy (5%) (details in Supplementary Fig. 4).
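As an illustration of how the reported linear relation could be used, the sketch below projects a measured (T2, T1) point onto the straight line joining the two endpoint coordinates quoted above. It assumes the traits shift linearly with the mixing fraction and that the coordinates are in ms; the function name `sunflower_fraction` is hypothetical and this is not the calibration procedure used in the study.

```python
# Endpoint (T2, T1) coordinates quoted in the text, assumed to be in ms.
EVOO = (155.3, 174.6)        # 0% sunflower oil (control)
SUNFLOWER = (188.3, 202.9)   # 100% sunflower oil


def sunflower_fraction(t2_ms, t1_ms, pure=EVOO, adulterant=SUNFLOWER):
    """Estimate the sunflower fraction (0..1) of a measured (T2, T1) point.

    Assumption: the NMR traits move linearly from the EVOO point to the
    sunflower point as the mixing fraction increases.
    """
    dx = adulterant[0] - pure[0]
    dy = adulterant[1] - pure[1]
    # Scalar projection of the measured point onto the mixing line.
    frac = ((t2_ms - pure[0]) * dx + (t1_ms - pure[1]) * dy) / (dx * dx + dy * dy)
    # Clip so that noisy points beyond either endpoint stay within 0..1.
    return min(1.0, max(0.0, frac))
```

With these endpoints, a point halfway along the line, such as (171.8, 188.75), maps to a fraction of about 0.5. Resolving a 1% admixture in practice depends on the measurement scatter, which this sketch does not model.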

Discussion

We report an NMR-based point-of-care technology for fast, label-free, and distinctive OO profiling and quality assurance, which can be used to deter attempts at adulteration. The NMR-based phenotypic traits represent the intrinsic molecular relaxation dynamics (or molecular mobility) arising from the composite effect of the FA profile (e.g. saturation level) and the concentration of FFA (e.g. acid value). First, although OOs consist predominantly of monounsaturated fat (more than 70%), we found in this work that the overall saturation level (e.g. increasing PUFA/MUFA ratio, lower SAFA content) has a profound impact on the NMR traits (details in Supplementary Fig. 5). Second, we observed that the FFA concentration has a direct effect on the NMR-based phenotypic traits. We hypothesize that both act through a similar mechanism: the saturation level and the FFA concentration disrupt the packing41,42 ‘efficiency’ (i.e. weakening of van der Waals forces), which alters the molecular mobility and hence introduces much longer relaxation pathways (i.e. longer T1 and T2). This is in agreement with the recent work reported by Cistola43.

Conventional chromatography-based techniques are slow and require multiple complicated sample preparation steps and expensive laboratory equipment, while spectroscopic techniques (e.g. vibrational or Raman spectroscopy) require complicated chemometric analysis for in-depth data interpretation, in comparison to the proposed NMR-based detection method and other state-of-the-art technologies (refer to the SWOT-like Table 3). Interpreting the information derived from the analytical instrument is one of the major challenges faced by food scientists during the identification and classification of pure and adulterated food samples. With the introduction of the EU Protected Designation of Origin registration and its equivalents in other geographical locations, rapid classification (preferably in a non-destructive manner) of EVOOs will be invaluable to industry and regulatory agencies alike.

On the other hand, the proposed NMR-based technology provides rapid, precise, low-cost, label-free, and accurate analysis for grading olive oil quality using NMR-based phenotypic traits in the time domain. The central hypothesis of radiomics is that tissue characteristics and pathology can be decoded by examining the textural features in medical images. Similarly, clustering NMR works on the rationale that the cumulative characteristics of each dimension form a specific and unique signature (‘molecular fingerprint’), which is extremely powerful for rapid and accurate classification of OOs. In addition, with the introduction of machine learning, it is now inexpensive to process large datasets in almost real time, opening the door to intelligent machines that can interpret data with much higher sensitivity and specificity.

Methods

Details and sample preparation of the OOs

The OOs analyzed were cooking oils bought locally in Braga, Portugal, or purchased online (e.g. international brands). The commercial brand names are disclosed in Supplementary Table 2. No further processing was carried out before the NMR measurements or any other measurements.

NMR measurements and parameters

The 1H magnetic resonance measurements of olive oils were acquired at a resonance frequency of 21.7579 MHz using a portable permanent magnet (Metrolab Instruments, Switzerland), B0 = 0.5 T, and a benchtop-type console (Kea Magritek, New Zealand). A temperature controller maintained the measurement chamber at 30 °C. The T1 and T2 relaxation times were acquired using standard inversion recovery (IR) and Carr-Purcell-Meiboom-Gill (CPMG) pulse sequences, respectively. The experimental parameters were echo time = 200 μs, number of echoes = 10,000, and signal averaging = 32. A recycle delay of 2 s was set between experiments to allow all spins to return to thermal equilibrium. (T1 relaxation, T2 relaxation) measurements were carried out on commercial EVOOs, VOOs, and refined OOs. NMR measurements were performed blind on each oil ten times, giving a total of 360 points for olive oil type classification and 210 points for origin assessment. The clustering NMR methodology uses a pair of relaxation times (T2, T1) for each object (oils in this case) to construct a (pseudo) two-dimensional map (Figs. 2A and 3).
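The text does not state how T1 and T2 were extracted from the IR and CPMG data. As a minimal sketch, assuming mono-exponential behaviour, noise-free positive amplitudes, and a known equilibrium magnetization M0, both time constants can be recovered with a log-linear least-squares fit. The function names below are hypothetical.

```python
import math


def fit_decay_time(times_ms, signal):
    """Fit ln(signal) = const - t / T by least squares; return T in ms."""
    logs = [math.log(s) for s in signal]  # requires strictly positive values
    n = len(times_ms)
    mean_t = sum(times_ms) / n
    mean_l = sum(logs) / n
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_ms, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_ms)
    # The fitted slope is sxy / sxx = -1 / T.
    return -sxx / sxy


def t2_from_cpmg(echo_amplitudes, echo_time_ms=0.2):
    """T2 from a CPMG echo train, assuming S(k) = S0 * exp(-k * TE / T2)."""
    times = [(k + 1) * echo_time_ms for k in range(len(echo_amplitudes))]
    return fit_decay_time(times, echo_amplitudes)


def t1_from_inversion_recovery(delays_ms, magnetization, m0):
    """T1 from inversion recovery, assuming M(t) = M0 * (1 - 2 * exp(-t / T1))."""
    # (M0 - M) / (2 * M0) = exp(-t / T1), so the same log-linear fit applies.
    decay = [(m0 - m) / (2.0 * m0) for m in magnetization]
    return fit_decay_time(delays_ms, decay)
```

The default echo time of 0.2 ms matches the 200 μs quoted above. With real data the late echoes of a 10,000-echo train fall into the noise floor, so the train would need truncating (or a non-linear fit) before taking logarithms.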

UV-Vis and NIR measurements and detection

UV-Vis measurements were performed on a SHIMADZU UV-2550 spectrophotometer (Kyoto, Kyoto, Japan), while a PerkinElmer LAMBDA 950 instrument was used for NIR measurements. All samples were measured in matched 1 cm path length quartz or optical glass cells, with an empty cell as the reference. UV-Vis spectra were measured over the 200–800 nm spectral range at 1 nm spectral resolution, while NIR spectra were obtained over 500–2200 nm in 5 nm steps. Spike removal algorithms44 were applied to the NIR spectra (cut-off = 6, threshold = 10). Every sample was measured three times and the mean values were taken as representative.

Acid value measurements

The acid value was determined following the EN ISO 660:2009 protocol (ref. 45) for oleic acid quantification. Briefly, 10 mL of edible oil was weighed and diluted in 20 mL of ethanol (φ = 99%) with a small amount of phenolphthalein. Titration with 0.1 mol/L potassium hydroxide (KOH) was carried out under magnetic stirring until a slight colour change appeared (and persisted for more than 10 s). Measurements were performed twice per sample. The acid value, defined as the amount of KOH (in mg) required to neutralize 1 g of the substance, was calculated from the amount of KOH consumed by each sample using the following formula:

$$w_{AV} = \frac{56.1 \times cV}{m} \tag{1}$$

where, c is the exact concentration of the standard KOH solution (mol/L), V the volume of KOH added (mL), and m the mass (g) of the test portion. Acidity, or the free fatty acid content, can be estimated by:

$$w_{\mathrm{FFA}} = \frac{VcM}{10 \times m} \approx 0.5 \times w_{AV} \tag{2}$$

wherein, M is the molar mass (g/mol) of the predominant fatty acid in the edible oil, in this case oleic acid (282.47 g/mol).
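The two formulas can be checked numerically with a short sketch. The function names and the titration values in the usage note are hypothetical, chosen only to show the arithmetic; the constants 56.1 and 282.47 g/mol are those given in the text.

```python
KOH_MOLAR_MASS = 56.1       # g/mol, the constant in Eq. (1)
OLEIC_MOLAR_MASS = 282.47   # g/mol, as given in the text


def acid_value(c_mol_per_l, v_ml, m_g):
    """Eq. (1): acid value in mg of KOH per g of oil."""
    return KOH_MOLAR_MASS * c_mol_per_l * v_ml / m_g


def ffa_percent(c_mol_per_l, v_ml, m_g, molar_mass=OLEIC_MOLAR_MASS):
    """Eq. (2): free fatty acid content in % (w/w), expressed as oleic acid."""
    return v_ml * c_mol_per_l * molar_mass / (10.0 * m_g)
```

For a hypothetical titration that consumes 1.0 mL of 0.1 mol/L KOH for a 10 g test portion, Eq. (1) gives an acid value of 0.561 and Eq. (2) gives an acidity of about 0.282%. Their ratio is 282.47 / (10 × 56.1) ≈ 0.5035, which is the origin of the factor 0.5 in Eq. (2).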

Machine learning algorithm and workflows

Using statistical software and programming languages (e.g. Orange 3.1.2 (ref. 46) or R), the raw datasets were processed using supervised and unsupervised learning techniques. The machine learning algorithms were written and run on a personal laptop (Intel Core Pentium i7 CPU @ 2.70 GHz, 8.00 GB RAM). Once the machine learning model was built, all tasks ran simultaneously and were typically completed in less than 1 min. Using unsupervised learning, the relationship between objects was rapidly constructed using clustering analysis (e.g. hierarchical clustering), and the quantitative linkages (e.g. inter-/intra-cluster similarity) were shown on a dendrogram and a heatmap. Supervised learning models (i.e. Neural Network, kNN, Logistic Regression, Naïve Bayes, and Random Forest) were trained on the datasets, and the model with the highest accuracy was chosen to predict the object classification (e.g. oil classification).
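To make the idea of classifying an oil from its position in the T1-T2 map concrete, the sketch below uses a nearest-centroid rule with Euclidean distance. This is a deliberately simpler stand-in, not one of the supervised models listed above, and it takes the mean coordinates quoted in the Results, read here as (T2, T1) in ms, which is an assumption. The name `classify` is hypothetical.

```python
import math

# Mean coordinates per grade quoted in the Results, read as (T2, T1) in ms.
CENTROIDS = {
    "EVOO": (150.5, 168.0),
    "VOO": (153.2, 174.4),
    "refined OO": (146.3, 162.8),
}


def classify(t2_ms, t1_ms, centroids=CENTROIDS):
    """Return the grade whose centroid is nearest in Euclidean distance."""
    return min(
        centroids,
        key=lambda grade: math.dist((t2_ms, t1_ms), centroids[grade]),
    )
```

For example, `classify(150.0, 168.5)` returns `"EVOO"`. A trained kNN or Random Forest model would also use the spread of the individual samplings, which a centroid rule ignores.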

Statistical analysis

A separation between any two groups was considered statistically significant when the criterion (P < 0.5) was met, and was otherwise denoted as non-significant (n.s.). Student's unpaired t-test was used throughout this study. One-tailed or two-tailed tests were used as indicated in the figure captions. OriginPro 8 (OriginLab) was used for all graph plotting.
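For reference, the test statistic of Student's unpaired t-test can be sketched as follows, assuming the classical equal-variance (pooled) form; the text does not say whether a pooled or Welch variant was used. The function name is hypothetical, and the P value would then be read from the t distribution with the returned degrees of freedom, for example with a statistics package.

```python
import math
import statistics


def student_t_unpaired(a, b):
    """Return (t, degrees of freedom) for an unpaired, equal-variance t-test."""
    na, nb = len(a), len(b)
    # Pooled variance from the two unbiased sample variances.
    pooled = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    # Standard error of the difference between the two group means.
    se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, na + nb - 2
```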

Receiver operating characteristic

ROC analyses were used to evaluate the specificity and sensitivity of the diagnostic techniques. Various supervised models were used for the ROC tests, namely kNN, Logistic Regression, Naïve Bayes, Neural Network, and Random Forest. A power function, y = ax^b, was fitted throughout the study. Iterations were run with the Levenberg–Marquardt algorithm until a chi-squared tolerance of 10^−9 was achieved. The AUC of the final fitted function was compared with the actual averaged AUC from all supervised models (details in Supplementary Fig. 2).
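Two quantities from this paragraph can be sketched in a few lines. The first function computes an empirical AUC directly from classifier scores, using the standard interpretation of the AUC as the probability that a positive sample outscores a negative one. The second gives the area under a fitted power function, under the assumption that x is the false positive rate and y the true positive rate on the interval 0 to 1, in which case the integral of a·x^b is a / (b + 1) for b > −1. Both function names are hypothetical, and neither reproduces the Levenberg–Marquardt fit itself.

```python
def roc_auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def power_curve_auc(a, b):
    """Area under y = a * x**b for 0 <= x <= 1, valid for b > -1."""
    return a / (b + 1.0)
```

For instance, with positive scores (0.9, 0.8, 0.4) and negative scores (0.7, 0.3, 0.2), eight of the nine pairs are ranked correctly, so the empirical AUC is 8/9. A diagonal ROC curve (a = 1, b = 1) gives the chance-level area of 0.5.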

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.