Introduction

Hyperspectral Imaging (HSI) is a valuable non-destructive technology for online monitoring and high-throughput screening, with huge potential for automation1,2,3,4. HSI, specifically in the Near InfraRed (NIR) range, is both fast and non-destructive, which makes it excellently suited to real-time "online" identification of polymers3,5. The combination of HSI as a powerful tool for the optical sorting of plastics with discriminant analysis allows automatic sorting in real time under industrial conditions6 into new, well-characterized circular feedstock to replace virgin polymers. The high throughput required to process the enormous amount of plastic waste puts strong demands on the computational time and storage resources available to support the choice of an adequate stream for every waste element: data analysis, processing, and decision-making should be as fast as possible7. Early implementations of NIR hyperspectral-imaging-based cascade detection, exemplified by industry leaders like Steinert, Tomra, and Pellenc, were already able to sort post-consumer plastic packaging waste specifically, and innovations in this technology have been continuously ongoing8,9,10. The considerable promise of HSI, with its challenging combination of multivariate spectral and spatially resolved information, has made multivariate methods essential to extract hidden chemical information11. Deep learning is indeed widely applied in the field of waste management, and a vast number of commercial products are available on the market, such as waste stream analyzers and vision systems for robotic sorting. When adopting deep learning for hyperspectral imaging, however, such models require more effort in crafting a high-quality training dataset on the one hand, while on the other hand generally providing less interpretability.

Chemometrics, which translates abstract spectroscopic fingerprints into the polymer composition of each waste element at high throughput, plays an indispensable role in the analysis of plastics during the recycling process. It has proven invaluable through variable selection12, Multivariate Curve Resolution (MCR)13, and classification algorithms that mix chemical domain knowledge with data-driven machine learning. It is vital in tackling the complexities presented by challenging objects, particularly multilayers, as it enables the analysis and interpretation of the intricate spectral information embedded within these composite structures. However, the size of hyperspectral data sets in both the spatial and spectral dimensions may considerably limit the speed of information extraction. The spectral and spatial redundancy of HSI data, on the other hand, enables a considerable data reduction, lowering the resource demands sufficiently for high-throughput applications in the circular economy14,15.

Non-negative matrix factorization (NMF), with its inherent non-negativity property, originates in signal/image processing, with some reports in analytical chemistry, and is largely mathematically equivalent to Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS)16,17, a cornerstone of chemometrics. The NMF algorithm has been evaluated in chemistry, e.g., for the resolution of overlapped GC–MS spectra and for the deconvolution of GC × GC data sets18. NMF has also been used to classify complex mixtures based on extracted features and to analyze time-resolved optical waveguide absorption spectroscopy data15, and it recovers pure spectra and their concentration contribution maps from Raman HSI images19. Another study applied an NMF filter to triboluminescence (TL) data traces of active pharmaceutical ingredients20, simultaneously recovering both photon arrival times and the instrument impulse response function, which demonstrates its considerable value for recovering chemical information from HSI data. Zushi proposed an NMF-based spectral deconvolution approach, coupled with the web platform GC mixture21, that utilizes a faster multiplicative update method instead of the traditional projection step. This approach is highly advantageous for analyzing large mass spectrometry imaging datasets due to its improved speed22.

Plastics commonly occur as monolayer materials composed of a single polymer species or as multilayer materials made up of several polymers. With a single spectral profile, it is possible to identify the polymer composition of monolayer objects23. When it comes to identifying the polymer composition of waste streams, however, the objects encountered are often much more complex than monolayer materials. These objects can be multilayered or composed of multiple polymer species, or even coated with labels made of different polymers. Furthermore, an object may be made up of known polymers in an unknown composition. In the case of multilayer plastics, the ratio of the plastics used can also vary significantly, e.g., 70/30 or 50/50. This adds significant complexity to the classification process, which is where data unmixing comes into play.

Data unmixing is a valuable tool that can handle these issues effectively. However, when sorting the objects, the decision to translate a model observation into a specific engineered material stream may have to be made on a small subset of the complete object. Reducing the data using a convex hull is therefore an excellent innovation in curve resolution, as it can shrink the data to less than one percent of its original size. For handling large HSIs, data compression/size reduction coupled with rapid data analysis tools has top priority: unnecessary pixels (data rows) and wavelengths (data columns) should be jointly removed, and the remainder is essential for characterization. In this study, all unnecessary information is removed and the reduced data is subjected to further decomposition by NMF. The proposed strategy is designed to address scenarios in which novel, previously unaccounted-for materials are introduced into the sorting process. In such instances, two primary scenarios warrant consideration. Partially known materials: when an object comprises a blend of a known polymer and an unidentified material, it is not classified solely within the category of the known material; instead, it is directed into multilayer streams containing the known polymer. This ensures the proper identification and separation of the known component, even in the presence of unidentified materials. New types of polymers: in the event of entirely new polymer types being introduced into packaging for the first time, the method demonstrates adaptability; the model can be updated and extended to encompass the characteristics of the new material, facilitating the effective handling of previously unmodeled materials within the sorting process.

Materials and methods

A brief description of Non-Negative Matrix Factorization

Nonnegative Matrix Factorization (NMF) deconvolutes a matrix, \({\mathbf{R}}_{IJ\times K}\), into the product of matrices \({\mathbf{W}}_{IJ\times n}\) and \({\mathbf{H}}_{n \times K}^{{\text{T}}}\) with an intrinsic non-negativity property as a minimal constraint. I and J represent the number of spatial pixels, while K represents the number of variables in the case of a hyperspectral image.

$${\mathbf{R}}_{IJ\times K}={\mathbf{W}}_{IJ\times n}{\mathbf{H}}_{n\times K}^{{\text{T}}}+{\mathbf{E}}_{IJ\times K}$$
(1)

To formalize this, NMF optimizes the following cost function:

$${{\text{min}}}_{(\mathbf{W}\ge 0, \mathbf{H}\ge 0)} \Vert {\mathbf{R}}_{IJ\times K}-{\mathbf{W}}_{IJ\times n}{\mathbf{H}}_{n\times K}^{{\text{T}}}\Vert_F $$
(2)

Different algorithms have been introduced in the literature to calculate the component matrices; the multiplicative update rule introduced by Lee and Seung is simple to implement5. The update rules are:

$${\mathbf{H}}_{n\times K}^{{\text{T}}} \leftarrow {\mathbf{H}}_{n\times K}^{{\text{T}}}\frac{({\mathbf{W}}_{IJ\times n}^{{\text{T}}}{\mathbf{R}}_{IJ\times K})}{({\mathbf{W}}_{IJ\times n}^{{\text{T}}}{\mathbf{W}}_{IJ\times n}{\mathbf{H}}_{n\times K}^{{\text{T}}})}$$
(3)

and

$${\mathbf{W}}_{IJ\times n} \leftarrow {\mathbf{W}}_{IJ\times n}\frac{({\mathbf{R}}_{IJ\times K}{\mathbf{H}}_{K\times n})}{({\mathbf{W}}_{IJ\times n}{\mathbf{H}}_{n\times K}^{{\text{T}}}{\mathbf{H}}_{K\times n})}$$
(4)

The multiplications and divisions in Eqs. (3) and (4) are element-wise. These update rules preserve the non-negativity of \({\mathbf{W}}_{IJ\times n}\) and \({\mathbf{H}}_{K\times n}\) when \({\mathbf{R}}_{IJ\times K}\) is element-wise non-negative.
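The update rules above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the iteration count, the random initialization, and the small `eps` guarding against division by zero are assumptions.

```python
import numpy as np

def nmf_multiplicative(R, n, n_iter=500, eps=1e-12, seed=0):
    """Lee-Seung multiplicative updates for R (IJ x K) ~ W (IJ x n) @ H.T.

    W holds the concentration contributions and H (K x n) the spectral
    profiles; both stay element-wise non-negative throughout (Eqs. 3-4)."""
    rng = np.random.default_rng(seed)
    IJ, K = R.shape
    W = rng.random((IJ, n)) + eps
    H = rng.random((K, n)) + eps
    for _ in range(n_iter):
        # Eq. (3), written for H = (H^T)^T: H <- H * (R^T W) / (H W^T W)
        H *= (R.T @ W) / (H @ (W.T @ W) + eps)
        # Eq. (4): W <- W * (R H) / (W H^T H)
        W *= (R @ H) / (W @ (H.T @ H) + eps)
    return W, H
```

Because each update multiplies the current factor element-wise by a ratio of non-negative matrices, non-negativity is preserved automatically and the Frobenius cost of Eq. (2) is non-increasing.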

Approach

For any bilinear data set, a minimal number of rows and columns carries the most informative and independent part of the data24. Consequently, in HSIs, essential pixels and essential wavelengths suffice to extract the pure contribution maps and spectral profiles of all components. The main steps of this approach for plastic characterization are visualized in Fig. 1 and summarized as:

  1.

    Object detection: A first-order spectrum with K channels is recorded for every pixel in an I by J scene into a data cube, \(\mathop {\widetilde{\mathbf{\underline {R}}}} \nolimits_{I \times J \times K}\). To analyze this cube for plastic characterization, object detection is needed. In this work, the pixels belonging to each object were detected by a correlation-growing algorithm. Object detection thus decomposes \(\mathop {\widetilde{\mathbf{\underline {R}}}} \nolimits_{I \times J \times K}\) into several cubes, as many as the number of objects, each containing the information of one object.
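The correlation-growing step can be illustrated with a minimal sketch. The exact algorithm used in this work is not specified here, so the 4-connectivity, the Pearson-correlation criterion, and the threshold value are assumptions.

```python
import numpy as np
from collections import deque

def grow_object(cube, seed, threshold=0.99):
    """Grow one object from a seed pixel: 4-connected neighbours whose
    spectrum correlates with the seed spectrum above `threshold` are added.

    cube: I x J x K hyperspectral array; seed: (row, col) inside the object."""
    I, J, K = cube.shape
    ref = cube[seed]                       # reference spectrum of the seed
    mask = np.zeros((I, J), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = i + di, j + dj
            if 0 <= r < I and 0 <= c < J and not mask[r, c]:
                corr = np.corrcoef(ref, cube[r, c])[0, 1]
                if corr > threshold:
                    mask[r, c] = True
                    queue.append((r, c))
    return mask  # True for pixels belonging to the seed's object
```

Running this once per seed (e.g., per local intensity maximum on the mean image) yields one pixel mask, and hence one sub-cube, per object.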

  2.

    Essential information extraction: This step starts by unfolding each data cube (resulting from the previous step) into a matrix (\({\mathbf{R}}_{IJ\times K}\)) with each row a pixel. The next step is calculating the most informative pixels (Essential Spectral Pixels, ESPs) and variables/wavelengths (Essential Spatial Variables, ESVs) for each object, based on the convexity property in the normalized abstract row and column spaces. ESPs/ESVs are the smallest set of points needed to generate the whole data in a convex way in the abstract score space. Once ESPs/ESVs are identified for all objects separately, all other measured pixels are removed and the reduced data is passed to the next step. The left/right eigenvectors of \({\mathbf{R}}_{IJ\times K}\) can be calculated using SVD, \({\mathbf{R}}_{IJ\times K}\) = \({\mathbf{U}}_{IJ,n}\) \({\mathbf{D}}_{n,n}\) \({\mathbf{V}}_{n,K}^{{\text{T}}}+ {\mathbf{E}}_{IJ,K}\), where \({\mathbf{U}}_{IJ,n}\) and \({\mathbf{V}}_{n,K}^{{\text{T}}}\) are the left and right eigenvectors, and \({\mathbf{D}}_{n,n}\) and \({\mathbf{E}}_{IJ,K}\) contain the singular values and residuals, respectively. In addition, n is the number of factors, set to a maximum of five because most objects contain fewer than five types of polymers/materials. The MATLAB function "convhulln" can calculate the convex set of \({\mathbf{U}}_{IJ,n}\) and \({\mathbf{V}}_{n,K}^{{\text{T}}}\) and so identify the ESPs/ESVs. After joint selection of essential pixels and essential variables, \({\mathbf{R}}_{IJ\times K}\) turns into \({\mathbf{R}}_{{p}_{ESP}\times {K}_{ESV}}\). This step is explained in detail in previous work24.
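This step can be sketched as follows, using SciPy's qhull-based `ConvexHull` in place of MATLAB's "convhulln". The closure (sum-to-one) normalization, the scaling of the scores by the singular values, and the joggle option are assumptions of this sketch; the exact normalization is described in ref. 24.

```python
import numpy as np
from scipy.spatial import ConvexHull  # qhull-based hull in the score space

def essential_indices(R, n=3):
    """Return indices of essential rows: ESPs when R is pixels x wavelengths,
    ESVs when applied to R.T.

    Rows are normalised to unit sum, projected onto the first n singular
    directions, and the convex-hull vertices of the scores are kept."""
    Rn = R / (R.sum(axis=1, keepdims=True) + 1e-12)  # closure normalisation
    U, s, Vt = np.linalg.svd(Rn, full_matrices=False)
    scores = U[:, :n] * s[:n]                        # abstract score space
    # "QJ" joggles the input so near-degenerate score clouds do not fail
    hull = ConvexHull(scores, qhull_options="QJ")
    return np.sort(hull.vertices)

# Joint reduction of a pixels-by-wavelengths matrix R:
#   esp = essential_indices(R, n); esv = essential_indices(R.T, n)
#   R_reduced = R[np.ix_(esp, esv)]
```

The hull vertices are exactly the smallest set of points that generate all other rows as convex combinations in the score space, which is why the discarded rows carry no independent information.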

  3.

    Data decomposition: The reduced data sets for all objects, \({\mathbf{R}}_{ESPs\times ESVs}\), can be analyzed by NMF in parallel with multiplicative updates to calculate the reduced concentration contribution maps, \({\mathbf{W}}_{ESPs\times n}^{r}\), and reduced spectral profiles, \({\mathbf{H}}_{ESVs\times n}^{r}\), using Eqs. (3) and (4). Using least squares, the full concentration contribution maps, \({\mathbf{W}}_{IJ\times n}\), and full spectral profiles, \({\mathbf{H}}_{K\times n}\), can be produced from the reduced versions through Eqs. (5) and (6). This step needs \({\mathbf{R}}_{ESPs\times K}\) and \({\mathbf{R}}_{IJ\times ESVs}\), which are the one-mode reduced data in the row and column directions, respectively. Finally, the columns of \({\mathbf{W}}_{IJ\times n}\) are reshaped to generate the full concentration contribution maps.

    $${\mathbf{W}}_{IJ\times n}={\mathbf{R}}_{IJ\times ESVs}\,pinv\left({\left({\mathbf{H}}_{ESVs\times n}^{r}\right)}^{{\text{T}}}\right)$$
    (5)
    $${\mathbf{H}}_{K\times n}={\left(pinv\left({\mathbf{W}}_{ESPs\times n}^{r}\right){\mathbf{R}}_{ESPs\times K}\right)}^{{\text{T}}}$$
    (6)
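Equations (5) and (6) amount to two pseudoinverse products. A minimal NumPy sketch (the variable names are ours):

```python
import numpy as np

def expand_nmf(R_rows, R_cols, Wr, Hr):
    """Recover full-size NMF factors from the reduced solution (Eqs. 5-6).

    R_rows: all pixels x essential wavelengths (R_IJxESVs)
    R_cols: essential pixels x all wavelengths (R_ESPsxK)
    Wr: reduced contributions (ESPs x n); Hr: reduced spectra (ESVs x n)."""
    W_full = R_rows @ np.linalg.pinv(Hr.T)    # Eq. (5): IJ x n
    H_full = (np.linalg.pinv(Wr) @ R_cols).T  # Eq. (6): K x n
    return W_full, H_full
```

When the reduced factors have full rank n, the pseudoinverses act as exact one-sided inverses, so the full maps are recovered without re-running NMF on the complete data.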
  4.

    Decision making: The matrix \({\mathbf{H}}_{K\times n}\) consists of the pure spectral profiles of the components, which can be used for identification by comparison against a reference library. In some cases, however, \({\mathbf{W}}_{IJ\times n}\) carries complementary information for identification, as it contains characteristic information about the composition of unknown objects. The sum of squares of the elements in each column of \({\mathbf{W}}_{IJ\times n}\) represents the variance of the signal contributed by each polymer in an unknown object. This variance can be used to differentiate between mono-material and multi-material objects using a statistical F-test, which is commonly employed to compare the variances of two populations. To conduct the F-test, the variance of each column of \({\mathbf{W}}_{IJ\times n}\) is calculated first, and each value is then divided by the noise variance. A significance level of P < 0.05 is used to determine whether the variance is statistically significant. If the object contains only one layer of material, only one type of polymer will pass the F-test; if it is a multilayer object, multiple types of polymer will pass. Finally, it should be emphasized that all computations and reported times in this work were obtained on a laptop (Intel(R) Core(TM) i7-10850H CPU @ 2.70 GHz) and can be improved dramatically for industrial purposes with better hardware. The algorithm exploits the computational capabilities of a standard computer system, eliminating the reliance on specialized hardware such as GPUs or high-performance computing clusters. The selection process of this approach, as explained in step 2, represents a delicate balance whose aim is to protect essential information from loss.
As hyperspectral imaging applications continue to expand, addressing challenges related to data storage and efficient analysis becomes increasingly crucial. Our contribution lies in the development of an innovative approach that optimizes data reduction, streamlining the process and making it well-suited for real-time applications such as industrial plastic sorting, especially in the presence of multilayer and multicomponent packages.
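The variance-based F-test of the decision-making step can be sketched as follows. The degrees of freedom and the way the noise variance is estimated are not fully specified in the text, so the choices below (degrees of freedom equal to the number of pixels, a user-supplied noise variance) are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import f as f_dist

def detect_layers(W, noise_var, alpha=0.05):
    """Flag which of the n components are significantly present in an object.

    Each column of W (pixels x n) is tested: its mean square (signal
    variance) is divided by an independent noise-variance estimate and
    compared with the F critical value at significance level alpha."""
    m = W.shape[0]
    signal_var = np.sum(W**2, axis=0) / m   # per-component signal variance
    F = signal_var / noise_var
    F_crit = f_dist.ppf(1 - alpha, m, m)    # assumed degrees of freedom
    return F > F_crit                        # True -> component present

# A monolayer object passes the test for one component only; a multilayer
# object passes for several.
```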

Figure 1

Graphical illustration of the approach based on essential pixel/wavelength selection and NMF. The hyperspectral data, with its 3D structure, first undergoes object detection. The selected parts of the data corresponding to the objects are then unfolded into matrices to calculate the ESPs/ESVs. Finally, the reduced data sets are analyzed by NMF, and the full contribution maps and spectral profiles are retrieved. In the last step, an F-test supports the decision making for monolayer/multilayer plastic sorting.

Data description

Simulated data set

A hyperspectral image was simulated to visualize the effect of essential pixel and variable selection using a convex polytope and further unmixing by NMF. The concentration contribution maps (re-folded concentration profiles) and pure spectral profiles are shown in Fig. 2. The simulated hyperspectral data set has dimensions of 253 × 186 pixels by 141 variables, giving an unfolded two-way data matrix of 47,058 pixels by 141 pseudo-spectral channels. Despite the simplicity of the simulation, care was taken to avoid pure pixels or selective spectral channels, which corresponds to a non-trivial situation for NMF analysis. For this purpose, small random numbers were added to the pure contribution maps. In this case, eight objects are on the hypothetical conveyor belt, made of polypropylene (PP), polyethylene (PE), and polyethylene terephthalate (PET). Five objects are monolayers and three are multilayers. Figure 2 presents the pure contribution maps and spectral profiles of PP, PE, and PET.

Figure 2

The three-component simulated HSI data set: concentration distribution maps and pure spectral profiles.

Experimental hyperspectral images of plastics

A collection of monolayer and multilayer objects made from PP (polypropylene), PS (polystyrene), PET (polyethylene terephthalate), and polypropylene on top of polyethylene (multilayer), all with known compositions, was gathered from packaging waste. The characteristic information of these objects was known in advance. To collect data, the objects were randomly placed on a conveyor belt, and two HSI measurements were recorded in the 900–1700 nm range. The resulting data sets are presented in Fig. 3.

Figure 3

The experimental cases are visualized in (a) and (b).

Results and discussion

To illustrate the effect of essential spectral pixel (ESP) and/or essential spectral variable (ESV) selection on plastic sorting using NMF, the results obtained on the simulated and real HSI data sets are discussed in detail.

The procedure starts with object detection for the simulated case, as it does not need any data pre-processing. Then, the ESPs and ESVs are selected using a convex hull on the normalized abstract spaces of the data set for each object. The first five principal components were used to generate the abstract spaces. The selected essential pixels and wavelengths are presented in Fig. 4a and b. In Fig. 4, the left and right panels contain the mean image and the spectral data of the simulated case, with the selected ESPs and ESVs shown as white crosses and red lines. The raw data matrix of the simulated case, composed of 57,078 rows (pixels) and 173 variables, was reduced to eight matrices by object detection. For each object, two or three ESPs and a few ESVs are selected, as shown in Fig. 4a,b. In total, only 0.04% of the data remains for the whole scene; that is, only 0.04 percent of the data is essential/sufficient for the analysis.

Figure 4

The selected ESPs and ESVs for the simulation case are indicated as white crosses and red lines, respectively.

Later, the reduced data sets were analyzed by NMF, yielding the results W and H. For better visualization of the full contribution maps, W was converted to binary matrices and visualized in Fig. 5. The total calculation time for NMF analysis of the reduced data sets was nearly 0.001 s, about 1% of the time required for the same calculations on all pixels and wavelengths.

Figure 5

The full contribution maps of all components in the simulated data are visualized. Each color denotes a specific composition: blue, green, red, aqua, purple, and brown are used for PP, PE, PET, PP/PE (multilayer), PP/PET, and PE/PET, respectively.

In addition, strategic data pre-processing is essential prior to data analysis to linearize the data and remove artifacts, thereby optimally aligning the data and the NMF with the Beer–Lambert–Bouguer law. Figure 6 presents the effect of each pre-processing step on the shape of the spectral data for an example object. Figure 6a shows the recorded raw HSI of an arbitrary object. Some spectral profiles, which reflected the light, must be removed first; they can be recognized by the slope of the profiles in the 1100–1500 nm range, where the slope is zero (all saturated).

Figure 6

Depicts the data set corresponding to the first unknown object, visualized after undergoing various pre-treatment steps. These steps include object detection and removal of pixels affected by signal saturation, baseline correction of the negative logarithm of the data, and denoising using Singular Value Decomposition.

Figure 6b presents the normalized data after removing the saturated spectral profiles. The next step (Fig. 6c) shows the data after baseline correction using asymmetric least squares on the negative logarithm of the data set. Finally, the data set is reconstructed with a few principal components for denoising. The corrected data is presented in Fig. 6d.
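The asymmetric least squares baseline correction used above can be sketched as follows (the Eilers-Boelens formulation); the values of λ, p, and the iteration count are typical defaults, not necessarily those used in this work.

```python
import numpy as np

def asls_baseline(y, lam=1e4, p=0.01, n_iter=10):
    """Asymmetric least squares baseline: a smooth curve is fitted under the
    signal by iteratively down-weighting points that lie above it.

    Dense linear algebra is used for clarity; production HSI code would use
    sparse matrices for the second-difference penalty."""
    K = len(y)
    D = np.diff(np.eye(K), 2, axis=0)   # second-difference operator (K-2 x K)
    w = np.ones(K)
    for _ in range(n_iter):
        # solve (diag(w) + lam * D^T D) z = w * y for the baseline z
        z = np.linalg.solve(np.diag(w) + lam * (D.T @ D), w * y)
        w = np.where(y > z, p, 1 - p)   # asymmetric re-weighting
    return z

# Applied channel-wise to -log(R) before the NMF step, per the text above.
```

Subtracting the returned baseline leaves the absorption peaks on a near-zero background, which is what aligns the data with the Beer-Lambert-Bouguer model assumed by NMF.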

Once the data has been preprocessed, it undergoes object detection, as detailed in the theory section. This process yields smaller data sets, each corresponding to a single object and containing the relevant information. In the current scenario, object detection yields five data sets (corresponding to the five objects on the conveyor belt). The crux of the proposed algorithm lies in selecting the appropriate ESPs and ESVs based on the convex polytope of the data in the abstract spaces. Figures 7 and 8 illustrate the chosen ESPs and ESVs for both real cases, denoted by white crosses and red vertical lines. Finally, the size of the data is decreased to 48 × 12 and 149 × 45, from the original dimensions of 144,000 × 173.

Figure 7

Depicts the mean image and spectral profile of the first experimental case, presented in (a) and (b) respectively. The chosen ESPs and ESVs are denoted by white crosses and red lines.

Figure 8

Depicts the mean image and spectral profile of the second experimental case, presented in (a) and (b) respectively. The chosen ESPs and ESVs are denoted by white crosses and red lines.

The reduced data sets were analyzed by NMF. The final resulting maps, which identify the composition of each object, are presented in Fig. 9. To generate these single contribution maps, one non-negative least-squares step was necessary, followed by binary map construction. In the resulting images, each color denotes a specific composition, as indicated at the top of the figure. For comparison, the full data sets were also analyzed by NMF, which took almost 220 s, whereas NMF on the reduced data sets took 0.003 s.

Figure 9

The full contribution maps of all components in both experimental data sets are visualized in (a) and (b). Each color denotes a specific composition: cyan, blue, purple, and red are used for PS, PP, PP/PE (multilayer), and PET, respectively.

Conclusion

In this study, we present a pioneering application of our earlier research on the selection of essential pixels and variables within hyperspectral imaging (HSI) datasets. This application holds significant promise, particularly in the context of industrial plastic sorting, where it addresses the challenge of characterizing complex multicomponent and multilayer plastics. Building upon our previous findings, which demonstrated the efficacy of choosing Essential Spectral Pixels (ESPs) and Essential Spatial Variables (ESVs) to minimize redundancy and streamline data analysis, we deploy these principles in an entirely novel context: hyperspectral imaging for plastic sorting. Implementing variable and pixel selection algorithms significantly enhances computational efficiency and material detection capabilities. Striking a balance between computational speed and information retention is vital; our proposed method carefully preserves the most informative pixels and variables, reducing the risk of information loss. Focusing on information-rich elements improves the precision and accuracy of material detection, holding promise for advancements in HSI data analysis across various applications.

While conventional plastic recycling typically relies on RGB color sorting, our method leverages hyperspectral imaging for deeper insight into plastic characteristics, especially in complex multicomponent and multilayer scenarios. Nowadays, HSIs are used in plastic sorting with the aim of plastic-type identification, so coupling data size reduction (a logical step in big HSI analysis) with a fast algorithm is promising. The benefit of data reduction is not limited to the analysis: it can also be exploited in the data recording scheme. Considering the wide NIR domain, recording a few selected wavelengths rather than the whole range can be dramatically advantageous. In addition, NMF is suggested to decompose the reduced data in plastic sorting. This procedure can be extended to analyze data from complementary domains, such as remote sensing; plastic sorting is simply the case on which we have demonstrated the procedure (Supplementary Information).