Hyperspectral signature-band extraction and learning: an example of sugar content prediction of Syzygium samarangense

This study proposes a method to extract the signature bands from the deep learning models of multispectral data converted from the hyperspectral data. The signature bands with two deep-learning models were further used to predict the sugar content of the Syzygium samarangense. Firstly, the hyperspectral data with the bandwidths lower than 2.5 nm were converted to the spectral data with multiple bandwidths higher than 2.5 nm to simulate the multispectral data. The convolution neural network (CNN) and the feedforward neural network (FNN) used these spectral data to predict the sugar content of the Syzygium samarangense and obtained the lowest mean absolute error (MAE) of 0.400° Brix and 0.408° Brix, respectively. Secondly, the absolute mean of the integrated gradient method was used to extract multiple signature bands from the CNN and FNN models for sugariness prediction. A total of thirty sets of six signature bands were selected from the CNN and FNN models, which were trained by using the spectral data with five bandwidths in the visible (VIS), visible to near-infrared (VISNIR), and visible to short-waved infrared (VISWIR) wavelengths ranging from 400 to 700 nm, 400 to 1000 nm, and 400 to 1700 nm. Lastly, these signature-band data were used to train the CNN and FNN models for sugar content prediction. The FNN model using VISWIR signature bands with a bandwidth of ± 12.5 nm had a minimum MAE of 0.390°Brix compared to the others. The CNN model using VISWIR signature bands with a bandwidth of ± 10 nm had the lowest MAE of 0.549° Brix compared to the other CNN models. The MAEs of the models with only six spectral bands were even better than those with tens or hundreds of spectral bands. These results reveal that six signature bands have the potential to be used in a small and compact multispectral device to predict the sugar content of the Syzygium samarangense.

come from different trees or positions on the tree, climate conditions, and different cultivation methods.Measuring the sugar content of each fruit non-invasive and destructively would bring at least two benefits.Cultivation methods could be optimized by observing the sugar content distribution of the fruits in each tree or a single tree on the farm without destroying these fruits.The quality of the fruits can be further graded by non-destructively measuring the sugar content to increase the economic value of fruits.Thus, the better-quality fruits can have a higher price, while the lower-quality fruits can be processed as canned or preserved food.
Hyperspectral imaging (HSI) systems collecting both reflectance spectrum and image data of a sample can be developed as a non-destructive measurement [5][6][7] .The deterministic methods for hyperspectral analysis include the Multiple Linear Regression (MLR), Principal Component Regression (PCR), and Partial Least Square Regression (PLSR).Peris et al. 8 applied the MLR with near-infrared spectral data to measure the solids content of peaches.Mendoza et al. 9 used the PLSR on hyperspectral scattering data to detect apple fruit firmness.Liang et al. 10 detected the zebra chip disease in potatoes using the PLSR with near-infrared (NIR) spectroscopy.Kemps et al. 11 assessed the concentration of anthocyanins, polyphenols, and the sugar content in grapes by PLSR.Chuang et al. 12 proposed the Independent Component Analysis (ICA) and PLSR with NIR spectral data to quantify the sugar content of wax apples.Viegas et al. 13 suggested that the total anthocyanin content (TAC) and the total phenolic compounds (TPC) could be determined by the PLR with NIR spectral data.In addition to the deterministic methods, deep learning methods have been developed to predict the quality of fruits.The convolution neural network (CNN) and the support vector machine (SVM) were proposed to detect the ripeness of strawberries 14 and the maturity of citrus 15 .Furthermore, Tu et al. 16 used the region-based CNN-based model to identify and detect the number of passion fruit in orchards.Fajardo and Whelan 17 detected the fruit in orchards using CNNbased models.Marani et al. 18 obtained high accuracy of grape bunch segmentation with a deep neural network.
HSI and multispectral imaging (MSI) systems act as the important instruments to conduct non-destructively analysis and detection.HSI systems measure hundreds to thousands of spectral bands with submicron-level resolution but have the highest measurement duration in all spectral systems.Contrarily, multispectral imaging (MSI) systems detect fewer bands but with a shorter measurement duration and smaller volume compared to those of HSI systems.Thus, the HSI systems are suitable to collect high-resolution hyperspectral data for spectral analysis.A dozen or several signature bands can be extracted from the hyperspectral data and can be applied in a portable MSI system for rapid and massive detection.
Several studies developed the spectral band selection or signature-band extraction based on the deep learning models, as listed in Table 1.To improve the computation duration, Zhan et al. 19 eliminated the redundant bands by the distance density.To find the importance of the corresponding bands in a model, Darling et al. 20 applied the Frobenius norm to obtain the value of each row vector delivering a contribution vector by getting the trained weight matrix.Mou et al. 21proposed an unsupervised deep reinforcement technique for hyperspectral band selection.Elkholy et al. 22 proposed a deep-encoder-based unsupervised hyperspectral band selection method to perform classification.Cai et al. 23 reduced redundant bands with contribution map-based CNN.
The deep learning models with hyperspectral data were successfully used to predict the sugar content of the Syzygium samarangense with the MAE of 0.5°Brix in our previous study 24 .The results indicated that the deep learning models with hyperspectral data were beneficial in conducting non-destructive prediction for the sugar content of fruits.If the sugar content can be predicted by several spectral bands, the detection time can be greatly shortened.How to extract the signature bands from hyperspectral data to conduct a rapid and accurate sugar content prediction is a critical issue that is focused in this study.To deal with this issue, this paper presents a new method to extract six signature bands from the deep learning models trained with spectral data.The six signature bands were further utilized to train new deep-learning models to perform the sugar content prediction.The models using six signature bands had the mean absolute errors (MAEs) lower than 0.5°Brix when predicting the sugar content.Thus, the resulting performance of those models using six signature bands could be as good as that of the commercial Brix meters.Furthermore, the performance of the models using the signature bands with different bandwidths in different spectral ranges was evaluated.Firstly, the bandwidth of signature bands can be optimized for sugar content prediction because the bandwidth of spectral bands measured by an MSI system can be adjusted.Secondly, the spectral range of an MSI system is associated with the sensor equipped on this system.For example, the spectral response of a silicon-based complementary metal-oxide-semiconductor sensor with and without an infrared filter covers the visible (VIS) spectrum ranging from 400 to 700 nm and the visible to near-infrared (VISNIR) spectrum ranging from 400 to 1000 nm, respectively.The spectral response of an indium gallium arsenide sensor covers the short-wave infrared (SWIR) ranging from 900 to 2500 nm.Thus, which spectral range and bandwidth of signature bands can be employed to effectively identify the sugar content is a novel exploration in this paper.Hyperspectral data collection.The hyperspectral data collection was conducted in a dark room and performed by the coaxial heterogeneous HSI system 25 .This system can concurrently acquire the VIS and SWIR spectral data of the wavelength ranging from 400 to 1700 nm.The raw hyperspectral image is three-dimensional data (W, L, Λ), where W is image width, L is image length, Λ is the number of spectral bands.The spectral data were divided into four wavelength ranges for spectral analysis, including 400-1700 nm, 400-700 nm, 400-1000 nm, and 900-1700 nm where the band numbers of the hyperspectral data in these four spectral ranges are 1367, 575, 1053, and 424, respectively.
Destructive sugar content measurement.After the hyperspectral data collection, the wax apple slices were squeezed to extract their juice.The sugar contents of these juice were measured by a commercial refractometer ATAGO PAL-1.Organisation Internationale de métrologie légale 26 recommended that the refractometers can be used to determine the sugar content of fruit juices.The refractometer provides the index of °Brix, which correlates to total soluble solids (TSS) concentration.The TSS of the juice is mainly composed of glucose, fructose, and sucrose.Thus, TSS concentration in juice can represent its sugar content.

Spectral calibration and denoising.
To eliminate the effect of the dark offset and system response, the raw hyperspectral data, I s (x, y, λ), minus dark, I D (x, y, λ), and divided by the spectrum of a standard calibration whiteboard, I(x, y, λ).Thus, the reflectance R(x, y, λ) could be derived from Eq. (1), written as The noise of the reflectance data was reduced by adopting a Savitzky-Golay filter, written as (1) R x, y, i = I S x, y, i − I D (x, y, i ) I w x, y, i − I D (x, y, i ) .www.nature.com/scientificreports/Data augmentation.The bottom part of wax apples has the highest sugar content.Thus, the bottom slices were used for sugar content modeling and prediction.A total of 1034 slices were recruited for spectral analysis.
The image width and length of the slices are approximately 30-100 pixels and 70-160 pixels, respectively.An area of 20 by 20 pixels in each slice was used.Thus, the hyperspectral cubic data of each slice had the size of 20 × 20 × 1367.These three-dimensional data were the input of the CNN model.For the FNN model, 20 by 20 values in each band were averaged.Thus, the three-dimensional data were transferred to the one-dimension array with the size of 1 × 1367.Each slice is seen as an individual sample which is measured with a °Brix value.The °Brix of the samples mostly ranged between 8 to 14.The samples were divided into five groups to observe the regression results of sugar content according to their °Brix values.The taste of fruits below the °Brix of 10 is not actually sweet.Thus, the first group contained the samples whose °Brix value was lower than 10.The second, third, and fourth groups contained the samples whose °Brix value was between 10 to 11, 11 to 12, and 12 to 13, respectively.The fifth group contained the samples whose °Brix value was larger than 13.Basically, 218 samples in most of °Brix levels were randomly selected for sugar content regression and divided into training, validation, and test sets, as shown in Table 2.The sample numbers of training, validation, and test sets were 131, 44, and 43, respectively.

Proposed methods
This study aims to extract the signature bands from hyperspectral data for rapid sugar content prediction.The bandwidth of the hyperspectral data measured by the coaxial heterogeneous HSI system is nanometer to subnanometer.In contrast, the multispectral data measured by an MSI system is approximately greater than 5 nm.As shown in Fig. 2, block A introduces that the hyperspectral data are converted to spectral data with multiple bandwidths to simulate multispectral data.The spectral data are used to create the CNN and FNN-based sugar content prediction models.The models are trained, verified, and tested.Block B illustrates that these models are used to extract the signature bands by the absolute mean of the integrated gradients method.Block C depicts that the CNN-based and FNN-based models are trained, verified, and tested with signature-band data for sugar content prediction.
Modeling using multispectral data.Hyperspectral data conversion.In the coaxial heterogeneous HSI system, the VIS spectrometer has a spectral resolution of ~ 0.5 nm in the wavelength ranging from 400 to 1000 nm; the SWIR spectrometer has a spectral resolution of ~ 2.5 nm in the wavelength ranging from 900 to 1700 nm.The hyperspectral data were converted to spectral data, R M (x,y,λ c ), which is written as where R M (x,y,λ c ) is the mean of the integrated intensity of the central band, λ c ; the central band λ c ranges from 400 to 1700 nm with an interval of bandwidth, w.
A total of six sets of spectral data were converted from hyperspectral data according to the six bandwidths, w , of ± 2.5 nm, ± 5 nm, ± 7.5 nm, ± 10 nm, ± 12.5 nm, and ± 15 nm.Each set was grouped into four subsets based on the spectral range of 400-1700 nm, 400-1000 nm, 400-700 nm, and 900-1700 nm, as shown in Table 3.
We apply the exponential learning rate decay method in Eq. ( 4) with the initial learning rate lr of the CNN and FNN models.The exponential learning rate decay method is written as where lr is the learning rate, and k is the decaying rate; j is the iteration number of the epoch.The performance of the CNN and FNN models for sugar content prediction was evaluated by the mean absolute error, MAE, which is defined as where y i is the prediction value of the sugar content, y is the true value of the sugar content, and n is the sample size.The goodness-of-fit of the CNN and FNN models was assessed by the R-squared value which is defined by where y is the mean of the prediction values.An R 2 value of zero indicates that the regression of a model doesn't fit the data properly, while an R 2 value of 1 indicates that the regression of the model fits perfectly.
CNN modeling and verification using multispectral data.The architecture of the CNN model for modeling multispectral data is shown in Fig. 3.The input of this model is three-dimensional data (W, L, Λ).The number  where R is the input data, and M is the number of intervals along the path from the baseline from R ′ to R. Most deep learning frameworks could efficiently perform the calculation of the gradient.
The MIG of the CNN models is a three-dimensional matrix.The three-dimensional MIG is averaged along the width, W, and length, L. Afterward, a one-dimensional array, Score, with the data points corresponding to the bands, λ c , is obtained, as defined in Eq. (8).The MIG of the FNN models is a one-dimensional matrix.The Score of the FNN is the absolute of the MIG.The Score represents the contribution score of bands in a deep learning model.The significance of a band in a model is positively correlated to the Score of this band.
The band with the first highest Score is selected for the first signature band.The second signature band is selected according to the band which has the second highest Score and is out of the ± 20 nm range of the first selected band, and so on.
Modeling using signature bands.The hyperspectral data of the signature bands are resampled into the spectral data with the bandwidths of ± 25 nm, ± 20 nm, ± 15 nm, ± 10 nm, and ± 5 nm.The spectral data are used to train new CNN and FNN models for sugar content prediction.The input data of the CNN and FNN models were three-dimensional hyperspectral and one-dimensional spectral data of the signature bands, respectively.The band number of a signature-band set is six.
,  The architecture of the CNN model for modeling resampled spectral data is shown in

Experimental results
Results of the modeling using multispectral data.Modeling with the spectral data in different wavelength ranges might have various results; thus, the spectral range was divided into four spectral ranges, including VIS (400-700 nm), VISNIR (400-1000 nm), SWIR (900-1700 nm), and VISWIR (400-1700 nm).The hyperparameters of the CNN and FNN models are shown in Tables 4 and 5.The CNN and FNN models using spectral data of six bandwidths assumed the same ten hyperparameters and nine hyperparameters, respectively.The exponential learning rate decay method is applied with the initial learning rate, and the decaying rate k is 1 × 10 -6 .Adam optimizer was also applied.In the CNN model, the regularization parameter is 1 × 10 -10 for L2 regularization.The prediction results of the CNN and FNN models using multispectral data with six bandwidths in four spectral ranges are shown in Table 6.The CNN and FNN models using the spectral data of ± 2.5 nm bandwidth had the worst prediction performance compared to those using the spectral data with the other bandwidths.Furthermore, the CNN and FNN models using the spectra in the SWIR spectral range had the worst prediction performance compared to those using the spectra in the other spectral ranges.Thus, the spectral data of ± 2.5 nm bandwidth and the spectra in the SWIR spectral range were excluded from the signature-band extraction.

Results of signature-band extraction.
Thirty deep learning models were trained by the spectral data with five bandwidths in three spectral ranges.The five bandwidths included ± 5 nm, ± 7.5 nm, ± 10 nm, ± 12.5 n m, and ± 15 nm, and the three spectral ranges were 400-700 nm, 400-1000 nm, and 400-1700 nm.The contribution of the bands used by each model was assessed by the Score, which is the absolute of the MIG, according to Eqs. (7) and (8).The top six spectral bands with the highest Score values of each model are listed in Tables 7, 8, 9, 10 and 11.Each entry in these tables includes the spectral band and its MIG value.The absolute MIG values of the FNN model using spectral data with a bandwidth of ± 5 nm in a spectral range of 400-700 nm are shown in Fig. 7.The six signature bands in this FNN model were selected according to the criteria introduced in subsection B of section III and marked in the bars with a slash texture.www.nature.com/scientificreports/Results of modeling using signature bands.Each model trained by the spectral data with a bandwidth in a spectral range had one set of the six signature bands with the six highest absolute MIGs, as listed in Tables 7, 8, 9, 10 and 11.Thus, a total of thirty sets of spectral bands associated with the bandwidths of ± 25 nm, ± 20 nm , ± 15 nm, ± 10 nm, and ± 5 nm in three spectral ranges of 400-1700 nm, 400-1000 nm, and 400-700 nm were selected as signature bands.The performance of the CNN and FNN models trained using these thirty sets of signature bands was further evaluated.The eight hyperparameters of the CNN and FNN models trained by these signature bands with the lowest MAE in each spectral range are shown in Tables 12 and 13, respectively.The exponential learning rate decay method is applied with the initial learning rate, and the decaying rate k is 1 × 10 -6 .Adam optimizer was also applied.In the CNN model, the regularization parameter is 1 × 10 -10 for L2 regularization.
The sugar content prediction results of the CNN and FNN models using the signature bands are shown in Table 14.The FNN model using the VISWIR signature bands with a bandwidth of ± 12.5 nm had a minimum MAE of 0.390° Brix compared to the other FNN models.The CNN model using the VISWIR signature bands with a bandwidth of ± 10 nm had the lowest MAE of 0.549° Brix compared to the other CNN models.The MAEs of the FNN models were significantly lower than those of the CNN models.Table 6.Sugar content prediction results of the convolution neural network and feedforward neural network models using the multispectral data of six bandwidths in four spectral ranges.MAE denotes the mean absolute error of the predictions.R 2 denotes the R-squared values of the models.Bold values denote the best results of the sugar content prediction in the CNN and FNN models.www.nature.com/scientificreports/one-dimension, respectively.The minimum MAE of the CNN and FNN models using six signature bands were 0.549°Brix and 0.390°Brix.Both results were less than 0.55°Brix; and close to or better than that of the CNN and FNN models using hundreds of spectral bands or thousands of spectral bands.The performance of the FNN models was almost better than that of the CNN models in the three spectral ranges of 400-1700 nm, 400-1000 nm, and 400-700 nm.The correlation between two adjacent bands is reduced in spectral data compared to hyperspectral data; the CNN model tends to find the correlation between two spectral bands; this might be why the performance of the CNN model was worse than that of the FNN model.The results reveal that the FNN model with one-dimensional input data might perform better on the sugar content prediction than the CNN model with three-dimensional input data.Furthermore, the CNN and the FNN models using only six signature bands have a high potential to predict the sugar content of wax apples.These six signature bands could be used in an MSI system to non-destructively and rapidly predict the sugar content of the wax apples in the future.The spectral data was not overlapped in the spectrum when it was converted from the hyperspectral data in this study.However, the spectrum of each spectral band could overlap.Spectral data with overlapped spectrums could be considered for signature-band extraction.Furthermore, the spectral bands used in an MSI system could have different bandwidths.Thus, the signature bands might be chosen from the bands with different bandwidths.These two variables greatly increase the complexity of the band extraction.In the future, an artificial intelligence model may perform signature-band extraction from spectral data with different bandwidths or overlapped spectrums.

Figure 1 .
Figure 1.(a) An intact wax apple marked with cut lines and (b) slices of three wax apples fixed on a flat plate for hyperspectral measurements.

Figure 2 .
Figure 2. Procedure of the proposed method for signature-band extraction.

Fig. 5 .
The input of this model is three-dimensional data (W, L, Λ).The number of bands, Λ, was six.The CNN model consists of two Conv-ReLu-BN layers with 64 kernel filters, two Conv-ReLU-BN layers with 64 kernel filters, one Average Pooling layer, two FC-ReLU-BN-DP layers, and the output layers with the rectified linear unit (ReLU) as the activation function.Because the number of bands, Λ, diminished to 6, some layers were reduced and the Average Pooling instead of Max-Pooling for the smaller size resampled data was used.The architecture of the FNN model for modeling multispectral data is shown in Fig. 6.The input of the model is one-dimensional data with a size of six bands.The FNN model consists of 4 FC-ReLU-BN layers.The final output Y is the sugar content prediction result of the model.The FNN model is similar to the proposed CNN model; the weights with the smallest RMSLE are recorded for the test datasets.

Figure 3 .
Figure 3.The proposed convolution neural network model with the input of three-dimensional spectral data for sugar content prediction.

Figure 4 .
Figure 4.The proposed feedforward neural network model with the input of one-dimensional spectral data for sugar content prediction.

Figure 5 .
Figure 5.The proposed convolution neural network model with the input of three-dimensional multispectral data for sugar content prediction.

Figure 6 .
Figure 6.The proposed feedforward neural network model with the input of one-dimensional signature-band data for sugar content prediction.

Figure 7 .
Figure 7. Absolute mean of the integrated gradients of the feedforward neural network model with a bandwidth of ± 5 nm in a spectral range of 400-700 nm.The top six Score ranked bands are shown by the bars with slash texture and selected as the signature bands, as listed in Table 7.The blue bars represent that the MIG value of the bands is positive.The red bars represent that the MIG value of the bands is negative.

Table 1 .
Studies on signature-band extraction or selection with deep learning methods.

Table 2 .
Some statistics of experimental materials, including the mean and standard deviation in each °Brix interval of samples and the sample numbers of training, validation, and test sets in each data type and spectral range.STD denotes the standard deviation of °Brix values.

Table 3 .
Band numbers of hyperspectral data and multispectral data with six bandwidths in four spectral ranges.

Table 4 .
Selected hyperparameters of the convolution neural network model with multispectral data.

Table 5 .
Selected hyperparameters of the feedforward neural network model with multispectral data.

Table 7 .
Top six Score ranked spectral bands associated with the convolution neural network and feedforward neural network models using the spectral data with a bandwidth of ± 5 nm in three spectral ranges.The Socre is the absolute mean of the integrated gradients.

Table 8 .
Top six Score ranked spectral bands associated with the convolution neural network and feedforward neural network models using the spectral data with a bandwidth of ± 7.5 nm in three spectral ranges.The Socre is the absolute mean of the integrated gradients.

Table 9 .
Top six Score ranked spectral bands associated with the convolution neural network and feedforward neural network models using the spectral data with a bandwidth of ± 10 nm in three spectral ranges.The Socre is the absolute mean of the integrated gradients.

Table 10 .
Top six Score ranked spectral bands associated with the convolution neural network and feedforward neural network models using the spectral data with a bandwidth of ± 12.5 nm in three spectral ranges.The Socre is the absolute mean of the integrated gradients.

Table 11 .
Top six Score ranked spectral bands associated with the convolution neural network and feedforward neural network models using the spectral data with a bandwidth of ± 15 nm in three spectral ranges.The Socre is the absolute mean of the integrated gradients.

Table 12 .
Selected hyperparameters of the convolution neural network models using the signature bands with the lowest prediction error in each spectral range.