Fully connected-convolutional (FC-CNN) neural network based on hyperspectral images for rapid identification of P. ginseng growth years

Chen, Xingfeng; Du, Hejuan; Liu, Yun; Shi, Tingting; Li, Jiaguo; Liu, Jun; Zhao, Limin; Liu, Shu

doi:10.1038/s41598-024-57904-3


Article
Open access
Published: 26 March 2024

Xingfeng Chen^1,2,
Hejuan Du³,
Yun Liu⁴,
Tingting Shi²,
Jiaguo Li¹,
Jun Liu¹,
Limin Zhao¹ &
…
Shu Liu⁵

Scientific Reports volume 14, Article number: 7209 (2024)

An Author Correction to this article was published on 17 April 2024

This article has been updated

Abstract

P. ginseng is a precious traditional Chinese functional food, which is used for both medicinal and food purposes, and has various effects such as immunomodulation, anti-tumor and anti-oxidation. The growth year of P. ginseng has an important impact on its medicinal and economic values. Fast and nondestructive identification of the growth year of P. ginseng is crucial for its quality evaluation. In this paper, we propose a FC-CNN network that incorporates spectral and spatial features of hyperspectral images to characterize P. ginseng from different growth years. The importance ranking of the spectra was obtained using the random forest method for optimal band selection. Based on the hyperspectral reflectance data of P. ginseng after radiometric calibration and the images of the best five VNIR bands and five SWIR bands selected, the year-by-year identification of P. ginseng age and its identification experiments for food and medicinal purposes were conducted, and the FC-CNN network and its FCNN and CNN branch networks were tested and compared in terms of their effectiveness in the identification of P. ginseng growth years. It has been experimentally verified that the best year-by-year recognition was achieved by utilizing images from five visible and near-infrared important bands and all spectral curves, and the recognition accuracy of food and medicinal use reached 100%. The FC-CNN network is significantly better than its branching model in the effect of edible and medicinal identification. The results show that for P. ginseng growth year identification, VNIR images have much more useful information than SWIR images. Meanwhile, the FC-CNN network utilizing the spectral and spatial features of hyperspectral images is an effective method for the identification of P. ginseng growth year.

Introduction

Panax ginseng (family Araliaceae) is an herb that is very popular all over the world^1,2,3,4. This species is a good source of several bioactive components (e.g., phenols, proteins, alkaloids, vitamins, ginsenosides, and amino acids)^5,6. Ginsenosides are the main bioactive compounds in P. ginseng, which have capabilities to inhibit ROS (reactive oxygen species) and the production of nitric oxide, and also to maintain blood circulation^{7,8,9,10,11,12}. More than 100 ginsenosides have been reported from different plant parts of the species (i.e., rhizomes, roots, stems, leaves, fruits, and flowers), showing a variety of therapeutic effects including anti-inflammatory, anti-allergic, anti-cancer, and anti-diabetic^{13,14,15,16,17,18,19,20,21,22,23}. Previous studies have reported the pharmacological and physiological significance of the species, which has been traditionally consumed to enhance physical fitness, improve endurance, provide energy, etc. The production of wild P. ginseng is low, and the current large supply of P. ginseng is dominated by garden-cultivated P. ginseng⁶. In China, cultivated P. ginseng that is 5 years old or younger is used as food, and cultivated P. ginseng that is more than 5 years old is used for medicinal purposes²⁴. The accumulation of active ingredients differs between P. ginseng of different growth years, and the medicinal and economic values vary greatly, so it is very important to identify the growth years of P. ginseng by reliable technical methods.

The traditional manual identification of the growth year of P. ginseng is by observing the appearance characteristics such as roots, fibrous roots, and rhizomes^25,26, which is time-consuming, laborious, and highly subjective. Therefore, it is more objective to utilize scientific instrumental methods to identify the growth year of P. ginseng. For example, microscopic identification can identify the growth age of P. ginseng by the content of calcium oxalate clusters in the rhizome²⁷, and mass spectrometry²⁸ and liquid chromatography^26,29 can identify the growth age of P. ginseng by the active ingredients such as saponins. However, these methods are time-consuming and destructive, requiring complex pre-treatment and skilled operators. In addition, these methods are only applicable to laboratory conditions and are inefficient, making them difficult to carry out on a large scale.

Hyperspectral imaging (HSI), as a non-chemical and non-destructive technique that acquires one-dimensional spectral information and two-dimensional image information, offers significant advantages in the comprehensive analysis of samples. In the past few years, HSI has received increasing attention for quality assessment and species classification in the fields of agriculture^30,31,32, food³³ and traditional Chinese functional foods^34,35. In a study on the classification of traditional Chinese functional foods using hyperspectral imaging, Ru et al.³⁶ proposed a data fusion method in the visible-near-infrared and short-wave infrared spectral ranges and obtained a 97.3% accuracy for the classification of the geographic origin of Atractylodes macrocephala. Xia et al.³⁷ investigated the effect of different wavelength selection methods on the identification of different sources of Japanese maitake, with an optimal accuracy of 99.1%. HSI provides three-dimensional information including spectrum, space and radiation. However, most of the current research methods only utilize its spectral features and do not make full use of the spatial information of its images; in addition, the radiometric information is not sufficiently utilized because the methods of relative and absolute radiometric correction are imperfect³⁸.

Effective analysis of the huge amount of data acquired from hyperspectral imaging is a great challenge that hinders its application. Currently, machine learning methods and deep learning methods have been developed and are considered ideal for processing and analyzing hyperspectral images. Given the unique self-learning capability and excellent performance of neural network methods, they have been widely welcomed by researchers and have been applied to the processing of spectral and hyperspectral images^39,40 and remote sensing data.

The above studies are useful for identifying the year of growth of P. ginseng. However, these studies could not learn both spectral and spatial features of HSI in one model. To deal with this multidimensional and multistep classification problem, we propose a fused Fully Connected Neural Network (FCNN) and Convolutional neural network (CNN) model to synthesize HSI spectral and spatial features, named FC-CNN network, aiming at exploring the feasibility of utilizing hyperspectral images of two different spectral ranges, neural networks, and data fusion to discriminate the year of P. ginseng growth.

Materials

Sample collection and preparation

The collection of ginseng plant material was carried out according to institutional, national, and international guidelines. All methodologies were carried out in accordance with relevant institutional, national and international guidelines. Ginseng samples were professionally certified and provided by Professor Zhang Xiaobo of the China Academy of Chinese Medical Sciences.

A total of 84 P. ginseng samples were collected from Jilin, Liaoning, Heilongjiang provinces, covering all main production regions in China, and their growth years ranged from 1 to 7 years. The sample numbers corresponding to different growth years of 1 to 7 years are 17, 11, 12, 12, 8, 12, 12.

Hyperspectral image acquisition

Hyperspectral images were acquired by a hyperspectral imaging system, which consists of HySpex VNIR-1800 and SWIR-384 hyperspectral cameras (Fig. S1). The HySpex VNIR-1800 and SWIR-384 hyperspectral cameras are push broom instruments that collect spectral data in the 400–1000 nm range and 930–2500 nm range, respectively.

A platform integrated the two hyperspectral cameras. Two 150 W illumination systems were used on the platform. This light source features a DC regulation scheme that provides stable light output intensity, eliminating output fluctuations caused by alternating current or changes in line voltage. A conveyor belt driven by a stepper motor moved the P. ginseng samples. Imaging was carried out in a darkroom against a black background. A whiteboard with Lambertian reflectance was scanned together with the P. ginseng samples for absolute and relative radiometric correction.

A raw hyperspectral image acquired by the hyperspectral imaging system includes 288 bands for VNIR image and 108 bands for SWIR image, respectively. In total, the data size of one P. ginseng sample is about 2 gigabytes. Figure S2 shows the true color image in the VNIR band and the false color composite image in the SWIR band. The image in Fig. S2 clearly shows the shape and texture features of P. ginseng sample, as well as the flare noise at the edge of the whiteboard. Figures S1 and S2 are included in the supplementary materials.

Methodology

This section describes the methodological system for P. ginseng growth year identification using image-spectral information from hyperspectral images, as shown in Fig. 1. The method system mainly consists of three steps: (1) obtaining P. ginseng spectral information: pre-processing the original hyperspectral data to obtain P. ginseng hyperspectral reflectance images and reflectance spectral curves; (2) obtaining P. ginseng image information: extracting the several bands with the greatest importance to form a new P. ginseng image through band optimization; (3) constructing a model for P. ginseng growth year identification: using the image-spectral information and fusing two deep learning algorithms, FCNN and CNN, to construct a P. ginseng growth year identification model and verify its accuracy.

P. ginseng spectrum information

The acquisition of P. ginseng spectral information includes three data processing processes: (1) image segmentation; (2) radiometric calibration; and (3) reflectance spectrum calculation.

In the first step: P. ginseng samples and white plates are extracted from the original hyperspectral images. In the second step: the original hyperspectral images of P. ginseng samples are converted from digital number (DN) to surface reflectance. In the third step: the average reflectance profile of each band is calculated for each P. ginseng sample.

Image segmentation

The segmentation of P. ginseng samples, whiteboard and background is crucial for accurately extracting spectral information. Firstly, each P. ginseng sample was defined as a P. ginseng region of interest (GROI), and the corresponding whiteboard was defined as a whiteboard region of interest (WROI). Then, a mask was built by binarizing the gray-scale image at 622 nm for VNIR images and at 1597 nm for SWIR images. The mask was used to extract the GROI and WROI, as shown in Fig. S3 (included in the supplementary materials). Pixel-wise spectra within each ROI were extracted. The head-to-tail spectra with high random noise levels were first removed.
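The masking step can be illustrated with a minimal NumPy sketch. It assumes the hyperspectral cube is stored as a `(rows, cols, bands)` array and that a fixed threshold separates bright objects from the black background; the function names, the band index and the threshold are hypothetical, since the text does not state how the binarization threshold was chosen. The sketch only separates foreground from background and does not distinguish the sample from the whiteboard.

```python
import numpy as np

def binarize_band(cube, band_index, threshold):
    """Return a boolean mask that is True where the chosen band exceeds the threshold."""
    gray = cube[:, :, band_index]
    return gray > threshold

def roi_spectra(cube, mask):
    """Collect the spectra of all masked pixels as an (n_pixels, n_bands) array."""
    return cube[mask]

# Toy 4 x 4 image with 3 bands: dark background and a bright 2 x 2 object.
cube = np.zeros((4, 4, 3))
cube[1:3, 1:3, :] = [0.4, 0.5, 0.6]

# band_index and threshold are illustrative values, not taken from the paper.
mask = binarize_band(cube, band_index=1, threshold=0.2)
spectra = roi_spectra(cube, mask)
```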

Radiometric calibration

The purpose of radiometric calibration is to eliminate the interference of the sensor itself and convert the DN value recorded in the original image into the true reflectance data, which is an indispensable step in P. ginseng hyperspectral image analysis. In this study, a radiometric calibration method for P. ginseng hyperspectral images was defined based on the principle of hyperspectral imaging system and the characteristics of the original hyperspectral images.

Firstly, based on the Lambertian reflectance properties of the whiteboard and the fact that its reflectance is stable over the imaging time (3 days), the whiteboard can be used as a standard reference for absolute radiometric calibrations. Also, the edge pixels of the whiteboard are eliminated due to the flare in the image.

Secondly, as can be seen in Fig. S2, the pixels of each P. ginseng sample acquired by the hyperspectral imaging system are under different illumination and geometric conditions along the scanning direction. In addition, due to the illumination and optical lens differences along the width direction, the columns of the image differ in brightness: the center columns are bright and the edge columns are dark.

Finally, since the relative positions of GROI and WROI in the image are fixed, the reflectance of GROI is defined as:

$$\rho_{\lambda(i,j)} = \frac{DN_{GROI(i,j)}}{E\left(DN_{WROI(:,j)}\right)}$$

(1)

where $\rho_{\lambda}$ represents the reflectance at wavelength $\lambda$, $i,j$ are the row and column of a pixel in the GROI, $E(DN_{WROI(:,j)})$ is the mean of all DNs in column $j$ of the whiteboard, and $DN_{GROI(i,j)}$ is the DN located at $(i,j)$ in the GROI.

When the DNs of GROI were processed column by column using Eq. (1), the reflectance images of the P. ginseng samples were obtained after absolute and relative radiometric calibrations.
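As an illustration of Eq. (1), the column-by-column calibration of a single band might be sketched as follows. The sketch assumes that the GROI and WROI arrays cover the same image columns and that the whiteboard is treated as a reference with reflectance close to 1, as Eq. (1) implies; the function name and the toy DN values are hypothetical.

```python
import numpy as np

def calibrate_band(dn_groi, dn_wroi):
    """Apply Eq. (1) to one band: divide each GROI column by the mean whiteboard DN of that column."""
    white_col_mean = dn_wroi.mean(axis=0)   # E(DN_WROI(:, j)), one value per column j
    return dn_groi / white_col_mean         # broadcasts the column means over all rows i

# Toy DNs: the middle column is brighter, mimicking uneven illumination across the width.
dn_wroi = np.array([[100.0, 200.0, 100.0],
                    [100.0, 200.0, 100.0]])
dn_groi = np.array([[50.0, 100.0, 50.0],
                    [25.0, 50.0, 25.0]])

reflectance = calibrate_band(dn_groi, dn_wroi)
```

In this toy case the brightness difference between columns cancels out, which is the relative correction that the column-wise division is meant to provide.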

Reflectance spectral curves

The reflectance of each pixel above the GROI in the hyperspectral image of P. ginseng obtained by the hyperspectral imaging system differs due to the different lighting conditions at different angles of the P. ginseng sample surface during the imaging process. It is not rigorous to calculate the average hyperspectral reflectance profile of GROI by just selecting a certain number of sample points randomly on the surface of GROI. In this study, a method to calculate the hyperspectral reflectance curves of P. ginseng was proposed.

The hyperspectral reflectance images of the P. ginseng samples after the previous radiometric calibration process included 288 bands of VNIR images and 108 bands of SWIR images, for a total of 396 bands. A separate GROI image is generated for each band, i.e., a single-band GROI image, and the average reflectance value of the GROI on that band is found using Eq. (2). By analogy, 396 average reflectance values can be found. The 396 average reflectance values are plotted into a curve, which is the hyperspectral reflectance curve of GROI.

The average reflectance of single-band GROI images was calculated as

$$\rho_{\lambda I} = \frac{{\sum \rho_{{\lambda \left( {:,:} \right)}} }}{{N_{GROI} }}$$

(2)

where ${\rho }_{\lambda I}$ is the average reflectance of a single-band GROI image with wavelength $\lambda$. On the right side of Eq. (2), the numerator represents the sum of reflectance of all pixels and the denominator represents the number of pixels in the GROI image. By sequentially processing the 396 bands of the P. ginseng hyperspectral images, one corresponding average reflectance spectral curve can be obtained for each P. ginseng sample.
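Eq. (2) amounts to averaging over all GROI pixels in every band. A minimal NumPy sketch, assuming a calibrated reflectance cube of shape `(rows, cols, bands)` and a boolean GROI mask (both names are hypothetical), could look like this:

```python
import numpy as np

def mean_spectrum(reflectance_cube, groi_mask):
    """Eq. (2): sum the reflectance of all GROI pixels per band and divide by N_GROI."""
    pixels = reflectance_cube[groi_mask]          # shape (N_GROI, n_bands)
    return pixels.sum(axis=0) / pixels.shape[0]   # one average value per band

# Toy cube with 2 bands; only the top row belongs to the GROI.
cube = np.zeros((2, 2, 2))
cube[0, 0] = [0.2, 0.4]
cube[0, 1] = [0.4, 0.6]
mask = np.array([[True, True],
                 [False, False]])

curve = mean_spectrum(cube, mask)
```

For the data described here, the resulting vector would have 396 entries, one per band.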

P. ginseng image information

In addition to spectral information, P. ginseng hyperspectral images also include information such as shape and texture, which need to be obtained using images. Since the P. ginseng hyperspectral image contains 396 bands, if the image composed of all the bands is involved in deep learning, it will inevitably affect the computational efficiency because of the large amount of data, thus the hyperspectral image information needs to be processed for band optimization.

In this study, 288 bands of VNIR images and 108 bands of SWIR images were ranked by random forest (RF) algorithm for band importance, respectively. Based on the ranking results, the top 5 bands with the highest importance were selected and the corresponding VNIR and SWIR images were synthesized as image data for P. ginseng growth year identification, respectively. After band optimization of the images, the data size of each GROI is compressed from 2 GB to 8 MB.
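The selection of the top-ranked bands can be sketched independently of how the importance scores are produced. The snippet below assumes that a vector of per-band importances is already available (for example from the `feature_importances_` attribute of scikit-learn's `RandomForestClassifier`, which is an assumption, since the text does not name the implementation used); the function name and the toy values are hypothetical.

```python
import numpy as np

def top_k_bands(importances, wavelengths, k=5):
    """Return the wavelengths of the k most important bands, sorted by wavelength."""
    order = np.argsort(importances)[::-1][:k]   # indices of the k largest importances
    return sorted(wavelengths[i] for i in order)

# Toy example with five bands; the values are illustrative only.
importances = np.array([0.10, 0.50, 0.05, 0.30, 0.05])
wavelengths = [500, 550, 600, 650, 700]

selected = top_k_bands(importances, wavelengths, k=2)
```

Applying the same function separately to the VNIR and SWIR importance vectors would give one set of five bands for each camera.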

Model construction for P. ginseng growth year identification

FC-CNN model

Figure 2 presents the framework of the FC-CNN model, which is comprised of two main branches: an FCNN-based spectral extractor, and a CNN-based image extractor. These two branches were finally concatenated in the fully connected layer, and the year of P. ginseng growth was finally output by “softmax” layer.

The hyperspectral reflectance data of P. ginseng samples with different growth years were input to the FCNN-based spectral extractor, as depicted in the left of Fig. 2, which extracted the spectral features of P. ginseng with different growth years. Note that the FCNN-based spectral extractor only exploits the spectral information. Meanwhile, the hyperspectral image data of P. ginseng with different growth years after band optimization were input to the CNN-based image extractor, as depicted in the right of Fig. 2, which extracted the image features of P. ginseng with different growth years. It is worth mentioning that not all of the five VNIR and five SWIR bands, which were preferentially selected by Section “P. ginseng image information”, were used in the P. ginseng image feature extraction, and the specific bands involved in the calculation were based on the comparison of multiple experimental results.

The innovation of our model is two-fold: (1) The spectral extractor and image extractor handle spectral information and image information, respectively, leading to an effective and interpretable prediction result. The FCNN-based spectral extractor is fed with hyperspectral reflectance data of P. ginseng of different growth years, which lets it capture the spectral sensitivity to the P. ginseng growth year. Meanwhile, the features of the image extractor are combined with those of the spectral extractor. (2) We selected the 10 most important bands (5 VNIR and 5 SWIR) based on band importance to compose the image data, instead of using data from all bands, which averted over-fitting and mitigated the computational challenge.

Model architecture

The FCNN is designed as the spectral extractor, whose input is the hyperspectral reflectance. This branch of the FC-CNN model is composed of layer groups, each including a fully connected layer, an activation layer, a batch normalization layer and a dropout layer. The architecture is shown in Table 1.

Table 1 The architecture of FCNN-based spectral extractor.

The CNN is designed for extracting spatial information. This branch is composed of convolution, max pooling and drop out layers. The architecture is shown in Table 2.

Table 2 The architecture of CNN-based image extractor.

After two branches, all features extracted from spectrum and image are combined with a concatenate layer. Then, the combined features are processed by a FCNN to the output. The architecture of the last FCNN part is shown in Table 3.

Table 3 The architecture of the combining FCNN to output.
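The fusion step (concatenation followed by a classifier with a softmax output) can be illustrated with a small NumPy forward pass. This is only a sketch under stated assumptions: the feature sizes (8 and 16), the random weights and the single dense layer are hypothetical stand-ins, and the actual layer configuration is the one given in Tables 1 to 3.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fuse_and_classify(spectral_feat, image_feat, weights, bias):
    """Concatenate the two branch outputs and map them to class probabilities."""
    fused = np.concatenate([spectral_feat, image_feat])
    return softmax(weights @ fused + bias)

rng = np.random.default_rng(0)
spectral_feat = rng.normal(size=8)      # hypothetical output of the FCNN branch
image_feat = rng.normal(size=16)        # hypothetical output of the CNN branch
weights = rng.normal(size=(7, 24))      # 7 classes for growth years 1 to 7
bias = np.zeros(7)

probs = fuse_and_classify(spectral_feat, image_feat, weights, bias)
predicted_year = int(np.argmax(probs)) + 1
```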

Evaluation metric

For the evaluation of the identification of the year of P. ginseng growth, the accuracy (the ratio of the number of correct predictions to all predictions) based on the confusion matrix is used as an evaluation metric. The formula for quantitative assessment is as follows:

$$Accuracy = \frac{TP + TN}{{TP + FP + FN + TN}}{ }$$

(3)

where TP is the number of correctly classified positive samples, FP is the number of samples misclassified as positive, TN is the number of correctly classified negative samples, and FN is the number of samples misclassified as negative.
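For a confusion matrix, Eq. (3) is the sum of the diagonal divided by the sum of all entries, which also covers the seven-class case. The following sketch uses a hypothetical helper name and toy counts, not results from the paper.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correctly classified samples (diagonal) / all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy binary confusion matrix (rows: true class, columns: predicted class).
# The diagonal holds TP and TN; the off-diagonal entries hold FN and FP.
binary = [[8, 2],
          [1, 6]]

acc = accuracy_from_confusion(binary)   # (8 + 6) / 17
```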

Loss function

In this study, mean absolute error (MAE) is chosen as the loss function for the multi-class classification task. The MAE loss function compares the predicted class with the target class for each sample, and its expression is as follows:

$$MAE = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left| {ys_{i} - y_{i} } \right|{ }$$

(4)

where $N$ is the number of samples, ${ys}_{i}$ is the $i$ th actual value, and ${y}_{i}$ is the $i$ th predictive value.
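Eq. (4) can be written directly in a few lines of Python. The sketch applies it to plain lists of numbers; how the class labels are encoded before the loss is computed (for example as one-hot targets against softmax outputs) is not stated in the text, so the toy values below are purely illustrative.

```python
def mean_absolute_error(actual, predicted):
    """Eq. (4): average of |ys_i - y_i| over all N values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [0.0, 1.0, 0.0]
predicted = [0.1, 0.7, 0.2]

loss = mean_absolute_error(actual, predicted)   # (0.1 + 0.3 + 0.2) / 3
```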

Results

P. ginseng hyperspectral reflectance

Firstly, the reflectance of the 84 P. ginseng samples was obtained by radiometrically correcting the hyperspectral images of the P. ginseng samples according to Eq. (1). Secondly, the average reflectance of each sample in 396 bands was calculated according to Eq. (2) and plotted as a reflectance spectral curve. In order to study the spectral characteristics of P. ginseng in different growth years, the reflectance spectral curves of 84 P. ginseng samples were categorized year by year according to 1–7 years, as shown in Fig. 3.

The reflectance of P. ginseng samples from different years was discretely distributed over a wide range of 0.15–0.7. Therefore, there is no obvious reflectance threshold to identify the growth year of P. ginseng. However, the spectral curves of P. ginseng samples from different years showed a discontinuity near the wavelength of 1000 nm, which was caused by the different irradiation angles of the VNIR and SWIR hyperspectral cameras. By comparing the reflectance curves of P. ginseng from seven years, it was found that the difference between the reflectance of VNIR and SWIR at wavelength 1000 nm gradually decreased with the increase of the year of P. ginseng growth. The reflectance profiles of P. ginseng of different years varied with its molecular and chemical properties¹.

In the reflectance curves of P. ginseng samples of the same year, in the range of VNIR wavelengths, the differences in reflectance between different samples increased with the longer wavelengths, and the reflectance of all P. ginseng samples reached the peak at 800 nm, and the differences in reflectance between different samples were also the largest; in the range of SWIR wavelengths, the differences in reflectance between different samples were smaller with the longer wavelengths. The reflectance spectral curves of all P. ginseng samples of different years were input into the FCNN branch of the FC-CNN model as spectral data for the identification of the year of P. ginseng growth.

Important bands selection

As can be seen from the previous section, in the wavelength range of 400–2500 nm, VNIR (400–1000 nm) has 288 spectral bands, and SWIR (1000–2500 nm) has 108 spectral bands. To test the importance of each spectral band for the identification, the training and test datasets were selected randomly 10 times. Random forest is a machine learning method that uses multiple weak decision trees to vote on the output, and it can provide the importance of each input spectral band. It contains many decision trees, each representing a distinct instance of the classification of the data input into the random forest model^41,42.

According to 10 randomized experiments, the importance of each band is obtained by ordering the importance of the bands in the RF algorithm, as shown in Fig. 4. The sum of the importance of the bands contained in each of the VNIR and SWIR accounted for 89.4% and 10.6% of the total importance of all bands, respectively. This shows that the VNIR band is more important than the SWIR band in P. ginseng growth year identification.
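The share of total importance attributed to each camera can be computed as sketched below. The sketch assumes that the importances from the repeated runs are averaged before the shares are taken, which the text does not state explicitly; the function name and the toy values are hypothetical.

```python
import numpy as np

def group_shares(importance_runs, n_vnir):
    """Average importances over runs, then return the VNIR and SWIR shares of the total."""
    mean_importance = np.mean(importance_runs, axis=0)
    total = mean_importance.sum()
    vnir_share = mean_importance[:n_vnir].sum() / total   # first n_vnir entries are VNIR bands
    return vnir_share, 1.0 - vnir_share

# Two toy runs over four bands (two VNIR bands followed by two SWIR bands).
runs = np.array([[0.3, 0.5, 0.1, 0.1],
                 [0.5, 0.3, 0.1, 0.1]])

vnir_share, swir_share = group_shares(runs, n_vnir=2)
```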

Based on the results of the importance of each band shown in Fig. 4, the five VNIR bands and five SWIR bands with the largest importance values were selected. Among these 10 bands, the images generated using different combinations of bands were input into the CNN branch of the FC-CNN model as image data for P. ginseng growth year recognition. Among them, the five selected VNIR bands corresponded to center wavelengths of 517 nm, 559 nm, 649 nm, 766 nm, and 898 nm, while the five selected SWIR bands corresponded to center wavelengths of 1142 nm, 1280 nm, 1412 nm, 1895 nm, and 2495 nm, respectively.

Year-by-year identification

According to the structure of the FC-CNN model proposed in this paper, three groups of year-by-year P. ginseng growth recognition experiments, named Test 1, Test 2 and Test 3, were established. Each group of experiments was randomly trained and validated 10 times. In Test 1, the input data were P. ginseng images generated from the layer stacking of five VNIR bands and 84 P. ginseng reflectance spectral curves; in Test 2, the input data were P. ginseng images generated from the layer stacking of five SWIR bands and 84 P. ginseng reflectance spectral curves; in Test 3, the input data were P. ginseng images generated by a layer stacking of three VNIR bands and two SWIR bands and 84 P. ginseng reflectance spectral curves. These data were fed into the FC-CNN model containing both FCNN and CNN branches, and the model output the classification results for the seven years of P. ginseng. Across the 10 validation tests, it can be guaranteed that the P. ginseng samples cover 7 different years. After 1800 iterations of each experiment, the value of the loss function gradually decreases until it converges towards 0, and the descending curve of the loss function gradually stabilizes.
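One way to make sure that every growth year is represented in a random split is to split within each class. The following sketch is an assumption about how such a split could be done, not a description of the procedure used here; the function name, the test fraction of 0.2 and the seed are hypothetical, while the per-year sample counts are those reported in the Materials section.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, seed):
    """Split sample indices so that every class contributes at least one test sample."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test

# Sample counts per growth year as reported for the 84 samples.
counts = {1: 17, 2: 11, 3: 12, 4: 12, 5: 8, 6: 12, 7: 12}
labels = [year for year, n in counts.items() for _ in range(n)]

# test_fraction and seed are illustrative choices.
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)
```

Repeating the call with ten different seeds would give ten random splits that each cover all seven years.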

Table 4 shows the results of the 10 tests for the three sets of trials. The average accuracies of the three sets of tests are, in descending order, Test 1, Test 3, and Test 2. This indicates that in the identification of the year of P. ginseng growth, the recognition is worst when the image information contains only SWIR bands; when the image information is a layer stacking of three VNIR bands and two SWIR bands, the average accuracy of the recognition is 3.6% higher than that of only SWIR bands; and when the image information contains only VNIR bands, the average accuracy of the recognition is another 3.5% higher than that of the layer stacking of three VNIR bands and two SWIR bands. This further verifies that the image information of VNIR bands is much more important than that of SWIR bands in P. ginseng growth year recognition.

Table 4 Validation results of FC-CNN model with 3 sets of different input data.

In the 10 randomized experiments of Test 1, the highest and lowest accuracies of P. ginseng growth year recognition by the FC-CNN model proposed in this paper were 82.3% and 52.9%, respectively, with an average accuracy of 71.2% (Table 4).

The 10 confusion matrices obtained from each of the 10 randomized trials are shown in Fig. 5. The results showed that the year prediction error rate was higher for 2-, 3- and 5-year old P. ginseng. Most of the misclassified P. ginseng growth years are susceptible to overestimation, which may be explained by the fact that the 84 samples were divided into seven categories and the sample size may be insufficient. Increasing the number of samples is necessary to improve the accuracy of P. ginseng growth year identification. The significance of the P. ginseng growth year-by-year identification experiment is not only that it can test the ability of the FC-CNN model to identify the growth year of P. ginseng, but more importantly, the ability of the model to carry out the identification of P. ginseng as a food or medicine.

Identification of food and medicinal uses

P. ginseng can be categorized into food (1–5 years old) and medicinal (6–7 years old) according to the year of growth. In this paper, 7 sets of comparative experiments were designed to validate the FC-CNN model's ability to recognize P. ginseng for food and medicinal use, and the results are shown in Table 5.

Table 5 Results of food and medicinal identification of P. ginseng using different models.
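The regrouping of the seven growth years into the two use classes follows directly from the 1–5 and 6–7 year ranges given above. A short sketch (the function name is hypothetical; the per-year counts are those reported in the Materials section) shows the mapping and the resulting class sizes:

```python
def use_category(growth_year):
    """Map a growth year (1 to 7) to its use class: food for 1-5, medicinal for 6-7."""
    if not 1 <= growth_year <= 7:
        raise ValueError("growth year must be between 1 and 7")
    return "food" if growth_year <= 5 else "medicinal"

# Sample counts per growth year as reported for the 84 samples.
counts = {1: 17, 2: 11, 3: 12, 4: 12, 5: 8, 6: 12, 7: 12}

totals = {"food": 0, "medicinal": 0}
for year, n in counts.items():
    totals[use_category(year)] += n
```

With the reported counts, the two classes are unbalanced (60 food samples against 24 medicinal samples), which is worth keeping in mind when reading binary accuracy figures.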

Before the FC-CNN experiments, a CNN model taking only regular RGB input (the 622 nm, 546 nm and 443 nm spectral bands) was tested, because RGB is the most popular image format and the easiest to obtain in the VNIR range. The RGB-CNN model achieved an accuracy of 82.3%, which was worse than the HSI-FCNN model alone, as shown in Table 5, Model 1. To reach a higher application accuracy, the HSI spectra were used jointly with images of the VNIR bands in the FC-CNN model. The accuracy of P. ginseng food and medicinal identification varies as images with different numbers of optimal VNIR bands are fed into the FC-CNN model, as shown in Table 5. As the number of optimal VNIR bands increases, the amount of image information that the FC-CNN model can learn increases, and so does the recognition accuracy of P. ginseng for food and medicinal purposes. The computing time also grows with the amount of data. When the number of optimal VNIR bands is increased from 4 to 5 and 6, the recognition accuracy of P. ginseng for food and medicinal purposes reaches 100%. Overall, Model 6 is the best choice for food and medicinal identification.

Discussion

Compared with other commonly used identification methods for expensive Chinese Materia Medica, FC-CNN can fulfil the requirements of high efficiency, rapidity and easy operation. At the same time, it does not damage P. ginseng, and professionals without a traditional Chinese medicine background can also quickly identify the year of P. ginseng growth.

(1) Previous methods for identifying Chinese Materia Medica in terms of species, quality, and growth year, such as microscopic identification²⁷, mass spectrometry²⁸, ultraviolet detection⁴³, and high-performance liquid chromatography⁴⁴, involve slicing, grinding, and liquid extraction, which can damage expensive Chinese Materia Medica and greatly limit large-scale identification. The FC-CNN model does not even need to touch the Chinese Materia Medica themselves; it only utilizes the spectral and spatial information of their images, so it causes no damage to the material being identified and does not affect its reuse.
(2) Mass spectrometry^28,45 requires a large mass spectrometer, chemical reagents, and so on, which are beyond the capabilities of non-specialized personnel. It requires professional technicians to operate in specialized laboratories. The hardware measurement system and software method of the FC-CNN model can be operated by people without a traditional Chinese medicine or chemistry background.
(3) Chromatography^45,46 requires a lot of pre-processing and consumes a lot of time. With the support of GPU-accelerated deep learning algorithms, the recognition waiting time of the FC-CNN method can be less than 1 s, and there is still room for acceleration, which makes real-time recognition possible.

Different species contain different chemical compositions, and the characteristic spectra between them are easier to find, which are generally reflected in the absorption or reflection peaks at fixed wavelengths. On the other hand, P. ginsengs with different growth years belong to the same species, and the chemical composition contents may be different, but it is difficult to find the characteristic spectra reflecting the differences of P. ginseng years in the spectra. For the lack of characteristic spectra and differences in spectral information, it is more appropriate to choose the neural network method with strong nonlinear fitting ability to fit the relationship between spectral—image information and growth year⁴⁷. A total of 396 absolute reflectance values were given by VNIR and SWIR in the wavelength range of 400–2500 nm, which were used to characterize the spectra of different P. ginseng samples; at the same time, the images of the optimal wavelength bands selected according to the Random Forest method were used to characterize the spatial features of different P. ginseng samples. With the support of the neural network method, the rich spectral and image information can help to ensure the high recognition accuracy of P. ginseng growth year.

Many studies have identified Chinese Materia Medica using spectral methods based on visible/near-infrared and short-wave infrared data, but few have combined images with spectra. Because of the large data volume, the important role of hyperspectral image information has been neglected. The FC-CNN model proposed in this paper therefore draws on more information and achieves higher identification accuracy than methods that use only spectral information. In addition, the FC-CNN model compresses the data by selecting only the important bands, which addresses the problem of the large volume of network input data.
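The band-selection step can be sketched as follows: given one importance score per band, keep the five highest-ranked VNIR bands and the five highest-ranked SWIR bands, and compute the fraction of band images that remains. The importance scores below are synthetic, the evenly spaced wavelength grid is a stand-in, and the 1000 nm VNIR/SWIR boundary is an assumption for illustration; in practice the scores would come from a fitted random forest, and the function names here are hypothetical.

```python
def select_top_bands(wavelengths_nm, importances, k=5, split_nm=1000.0):
    """Return indices of the k most important VNIR and SWIR bands.

    split_nm is an assumed VNIR/SWIR boundary, not a value from the paper.
    """
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    vnir = [i for i in ranked if wavelengths_nm[i] < split_nm][:k]
    swir = [i for i in ranked if wavelengths_nm[i] >= split_nm][:k]
    return vnir, swir


def kept_fraction(n_selected, n_total):
    """Fraction of band images retained after selection."""
    return n_selected / n_total


# Synthetic stand-in data: 396 evenly spaced bands over 400-2500 nm.
n_bands = 396
wavelengths = [400.0 + i * (2500.0 - 400.0) / (n_bands - 1)
               for i in range(n_bands)]
# Made-up importance scores; a real run would use random forest importances.
scores = [((i * 37) % 101) / 101.0 for i in range(n_bands)]

vnir_idx, swir_idx = select_top_bands(wavelengths, scores)
print(vnir_idx, swir_idx)
print(kept_fraction(len(vnir_idx), n_bands))
```

Keeping 5 band images out of 396 retains 5/396 of the image data, roughly 1.3%, while the full 396-value spectrum is still available to the spectral branch.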

Hyperspectral imaging equipment is expensive, which is a major obstacle to the adoption of the FC-CNN model across the P. ginseng plantation, trade, medicine, and catering industry chain. However, the FC-CNN results indicate that VNIR and SWIR can be combined in the spectral dimension independently of imaging: only multispectral imaging in a few VNIR bands is needed, and full-spectrum imaging is not required. This provides a basis and a model reference for simplifying hardware manufacturing and reducing cost. The accuracy and application range of the FC-CNN model are limited by the selection of training samples. Beyond the available data, more samples are needed for testing, for further method modeling and design research, and for developing systems that support practical applications.

Conclusion

In this paper, 84 P. ginseng samples with growth years ranging from 1 to 7 years were used. After a series of data processing steps, the optimal VNIR and SWIR bands extracted by random forest (RF) were combined with the spectral and spatial information of the hyperspectral images to build an FC-CNN model that fuses FCNN and CNN algorithms for identifying ginseng growth years. The model was validated, and the following conclusions were obtained.

(1)
VNIR images carry more useful information than SWIR images, and the FC-CNN model can ignore SWIR image information in P. ginseng growth year recognition. This provides an effective reference for simplifying the parameters of other network models for P. ginseng growth year recognition.
(2)
In the P. ginseng food and medicinal identification experiments, which used a ginseng age of 5 years as the reference, the identification accuracy of the FC-CNN model reached 100% using only the image information of the 5 optimal VNIR bands and the reflectance spectral information of all bands. Image information of all bands is therefore not required to identify P. ginseng for food and medicinal use. This indicates that the method can greatly reduce the amount of data involved in the computation while maintaining high identification accuracy. Although RGB imaging is the lowest-cost hardware solution, the RGB-CNN model cannot achieve high accuracy. HSI information needs to be used jointly, and a multispectral imaging capability of at least 5 bands is needed.
(3)
The FC-CNN model proposed in this paper recognizes P. ginseng of different growth years well, and its recognition of food and medicinal P. ginseng is superior to that of the FCNN branch model, which uses only spectral information, and the CNN branch model, which uses only image information.
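The relationship between year-by-year identification and food/medicinal identification can be illustrated with a small sketch that maps growth years to a binary use class around the 5-year reference and scores both tasks. The direction of the threshold (5 years or younger treated as food, older as medicinal), the function names, and the predicted years are assumptions for illustration only; they are not results from the paper.

```python
def use_label(growth_year, threshold=5):
    """Map a growth year to a binary use class.

    Assumption: years <= threshold count as "food", older roots as "medicinal".
    """
    return "food" if growth_year <= threshold else "medicinal"


def accuracy(true_labels, predicted_labels):
    """Share of samples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


true_years = [1, 2, 3, 4, 5, 6, 7]
predicted_years = [1, 3, 3, 4, 4, 7, 7]  # made-up predictions

year_acc = accuracy(true_years, predicted_years)
use_acc = accuracy([use_label(y) for y in true_years],
                   [use_label(y) for y in predicted_years])
print(year_acc, use_acc)
```

In this toy case three year labels are wrong, yet every error stays on the same side of the threshold, so the binary task is scored perfectly. This is one reason a two-class food/medicinal task can reach higher accuracy than the seven-class year-by-year task.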

The FC-CNN model proposed in this paper makes full use of the spectral and image information of hyperspectral images to identify P. ginseng growth year quickly, non-destructively, and with high accuracy. The method requires neither specialized laboratories nor professional technicians with a background in traditional Chinese medicine. In addition to identifying P. ginseng growth year and food or medicinal use, the FC-CNN model provides a fast, non-destructive reference method for identifying the species, origin, year, and active ingredients of other Chinese herbs. The neural network algorithm is limited by sample selection, and its identification mechanism is still unclear. Future research will therefore combine the chemical composition of P. ginseng with hyperspectral data to identify growth year and to analyze the identification mechanism.

Data availability

The data that support the findings of this study are available from the corresponding author upon request.

Change history

17 April 2024
A Correction to this paper has been published: https://doi.org/10.1038/s41598-024-59638-8

References

Woo, Y., Cho, C., Kim, H., Yang, J. & Seong, K. Classification of cultivation area of ginseng by near infrared spectroscopy and ICP-AES. Microchem. J. 73, 299–306 (2002).
Ang-Lee, M. K., Moss, J. & Yuan, C.-S. Herbal medicines and perioperative care. Jama 286, 208–216 (2001).
Attele, A. S., Wu, J. A. & Yuan, C.-S. Ginseng pharmacology: Multiple constituents and multiple actions. Biochem. Pharmacol. 58, 1685–1693 (1999).
Christensen, L. P., Jensen, M. & Kidmose, U. Simultaneous determination of ginsenosides and polyacetylenes in American ginseng root (Panax Quinquefolium L.) by high-performance liquid chromatography. J. Agric. Food Chem. 54, 8995–9003 (2006).
Dai, Y.-L. et al. Comparing eight types of ginsenosides in ginseng of Different plant ages and regions using RRLC-Q-TOF MS/MS. J. Ginseng Res. 44, 205–214 (2020).
Xu, X. et al. Identification of mountain-cultivated ginseng and cultivated ginseng using UPLC/Oa-TOF MSE with a multivariate statistical sample-profiling strategy. J. Ginseng Res. 40, 344–350 (2016).
Kim, S.-K. & Park, J. H. Trends in ginseng research in 2010. J. Ginseng Res. 35, 389 (2011).
Lee, D.-H. et al. Inhibitory effects of total saponin from Korean red ginseng via vasodilator-stimulated phosphoprotein-Ser157 phosphorylation on thrombin-induced platelet aggregation. J. Ginseng Res. 37, 176 (2013).
Siddiqi, M. H. et al. Ginseng saponins and the treatment of osteoporosis: Mini literature review. J. Ginseng Res. 37, 261 (2013).
Kang, K. S. et al. Heat-processed Panax ginseng and diabetic renal damage: Active components and action mechanism. J. Ginseng Res. 37, 379 (2013).
Lee, S. et al. Protective effect of ginsenoside Re on acute gastric mucosal lesion induced by compound 48/80. J. Ginseng Res. 38, 89–96 (2014).
Lee, C. H. & Kim, J.-H. A Review on the medicinal potentials of ginseng and ginsenosides on cardiovascular diseases. J. Ginseng Res. 38, 161–166 (2014).
Xie, H.-P., Jiang, J.-H., Chen, Z.-Q., Shen, G.-L. & Yu, R.-Q. Chemometric classification of traditional Chinese medicines by their geographical origins using near-infrared reflectance spectra. Anal. Sci. 22, 1111–1116 (2006).
Shi, W., Wang, Y., Li, J., Zhang, H. & Ding, L. Investigation of ginsenosides in different parts and ages of Panax ginseng. Food Chem. 102, 664–668 (2007).
Choi, K. Botanical characteristics, pharmacological effects and medicinal components of Korean Panax ginseng CA Meyer. Acta Pharmacologica Sinica 29, 1109–1118 (2008).
Xie, J.-T. et al. Anti-diabetic effect of ginsenoside Re in Ob/Ob mice. Biochim. Biophys. Acta (BBA) Mol. Basis Dis. 1740, 319–325 (2005).
Metori, K., Furutsu, M. & Takahashi, S. The preventive effect of ginseng with Du-Zhong leaf of protein metabolism in aging. Biol. Pharm. Bull. 20, 237–242 (1997).
Jiang, B. et al. Antidepressant-like effects of ginsenoside Rg1 are due to activation of the BDNF signalling pathway and neurogenesis in the hippocampus. Br. J. Pharmacol. 166, 1872–1887 (2012).
Block, K. I. & Mead, M. N. Immune system effects of echinacea, ginseng, and astragalus: A review. Integr. Cancer Ther. 2, 247–267 (2003).
Jia, L. & Zhao, Y. Current evaluation of the millennium phytomedicine-ginseng (I): Etymology, pharmacognosy, phytochemistry, Market and Regulations. Curr. Med. Chem. 16, 2475–2484 (2009).
Angelova, N. et al. Recent methodology in the phytochemical analysis of ginseng. Phytochem. Anal. Int. J. Plant Chem. Biochem. Tech. 19, 2–16 (2008).
Chang-Xiao, L. & Pei-Gen, X. Recent advances on ginseng research in China. J. Ethnopharmacol. 36, 27–38 (1992).
Lu, J.-M., Yao, Q. & Chen, C. Ginseng compounds: An update on their molecular mechanisms and medical applications. Curr. Vasc. Pharmacol. 7, 293–302 (2009).
Announcement on Approval of Ginseng (Artificial Cultivation) as a New Resource Food (No. 17 of 2012). http://www.nhc.gov.cn/sps/s7891/201209/e94e15f2d9384b6795597ff2b101b2f1.shtml (accessed on 1 July 2022).
Chen, H., Tan, C. & Lin, Z. Identification of ginseng according to geographical origin by near-infrared spectroscopy and pattern recognition. Vib. Spectrosc. 110, 103149 (2020).
Pisano, P. L., Silva, M. F. & Olivieri, A. C. Anthocyanins as markers for the classification of Argentinean wines according to botanical and geographical origin. Chemometric modeling of liquid chromatography-mass spectrometry data. Food Chem. 175, 174–180 (2015).
Zhao, Z., Liang, Z. & Ping, G. Macroscopic identification of Chinese medicinal materials: Traditional experiences and modern understanding. J. Ethnopharmacol. 134, 556–564 (2011).
Bai, H. et al. Localization of ginsenosides in Panax ginseng with different age by matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry imaging. J. Chromatogr. B 1026, 263–271 (2016).
Yang, Y. et al. Localization of constituents for determining the age and parts of ginseng through ultraperfomance liquid chromatography quadrupole/time of flight-mass spectrometry combined with desorption electrospray ionization mass spectrometry imaging. J. Pharm. Biomed. Anal. 193, 113722 (2021).
Feng, L., Wu, B., Zhu, S., He, Y. & Zhang, C. Application of visible/infrared spectroscopy and hyperspectral imaging with machine learning techniques for identifying food varieties and geographical origins. Front. Nutr. 8, 680357 (2021).
Oerke, E.-C., Leucker, M. & Steiner, U. Sensory assessment of Cercospora beticola sporulation for phenotyping the partial disease resistance of sugar beet genotypes. Plant Methods 15, 1–12 (2019).
Zhao, Y. et al. Application of hyperspectral imaging and chemometrics for variety classification of maize seeds. RSC Adv. 8, 1337–1345 (2018).
Wu, N. et al. Variety identification of oat seeds using hyperspectral imaging: investigating the representation ability of deep convolutional neural network. RSC Adv. 9, 12635–12644 (2019).
Liu, Y. et al. Non-destructive detection of flos lonicerae treated by sulfur fumigation based on hyperspectral imaging. J. Food Meas. Charact. 12, 2809–2818 (2018).
He, J. et al. Nondestructive determination and visualization of quality attributes in fresh and dry Chrysanthemum morifolium using near-infrared hyperspectral imaging. Appl. Sci. 9, 1959 (2019).
Ru, C., Li, Z. & Tang, R. A hyperspectral imaging approach for classifying geographical origins of rhizoma atractylodis macrocephalae using the fusion of spectrum-image in VNIR and SWIR ranges (VNIR-SWIR-FuSI). Sensors 19, 2045 (2019).
Xia, Z., Zhang, C., Weng, H., Nie, P. & He, Y. Sensitive wavelengths selection in identification of ophiopogon japonicus based on near-infrared hyperspectral imaging technology. Int. J. Anal. Chem. (2017).
Chen, X. et al. Joint retrieval of the aerosol fine mode fraction and optical depth using MODIS spectral reflectance over Northern and Eastern China: Artificial neural network method. Remote Sens. Environ. 249, 112006 (2020).
Sellami, A., Abbes, A. B., Barra, V. & Farah, I. R. Fused 3-D spectral-spatial deep neural networks and spectral clustering for hyperspectral image classification. Pattern Recognit. Lett. 138, 594–600 (2020).
Mou, L. & Zhu, X. X. Learning to pay attention on spectral domain: A spectral attention module-based convolutional network for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 58, 110–122 (2019).
Zhou, W. et al. Hyperspectral inversion of soil heavy metals in three-river source region based on random forest model. Catena 202, 105222 (2021).
Zhao, L. et al. Hyperspectral identification of ginseng growth years and spectral importance analysis based on random forest. Appl. Sci. 12, 5852 (2022).
Li, W. & Fitzloff, J. F. HPLC determination of ginsenosides content in ginseng dietary supplements using ultraviolet detection. J. Liq. Chromatogr. Relat. Technol. 25, 2485–2500 (2002).
Shangguan, D. et al. New method for high-performance liquid chromatographic separation and fluorescence detection of ginsenosides. J. Chromatogr. A 910, 367–372 (2001).
Li, W. et al. Use of high-performance liquid chromatography-tandem mass spectrometry to distinguish Panax ginseng CA Meyer (Asian Ginseng) and Panax Quinquefolius L. (North American Ginseng). Anal. Chem. 72, 5417–5422 (2000).
Schulten, H.-R. & Soldati, F. Identification of ginsenosides from Panax ginseng in fractions obtained by high-performance liquid chromatography by field Desorption mass spectrometry, multiple internal reflection infrared spectroscopy and thin-layer chromatography. J. Chromatogr. A 212, 37–49 (1981).
Jayapal, P. K. et al. Analysis of RGB plant images to identify root rot disease in Korean ginseng plants using deep learning. Appl. Sci. 12, 2489 (2022).


Acknowledgements

This work was supported by the Xizang Autonomous Region Project for Local Scientific and Technological Development Guided by the Chinese Central Government (Grant No. XZ202202YD0030C), the Scientific and Technological Innovation Project of China Academy of Chinese Medical Sciences (Grant No. CI2021A03902, CICI2021B013), the National Natural Science Foundation of China (Grant No. 82003901), and the National Key R&D Program of China (Grant No. 2023YFC3504000). The authors would like to thank Professor Xiaobo Zhang from Chinese Academy of Chinese Medical Sciences for his professional accreditation and for providing the P. ginseng samples.

Author information

Authors and Affiliations

Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China
Xingfeng Chen, Jiaguo Li, Jun Liu & Limin Zhao
State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medica, Chinese Academy of Chinese Medical Sciences, Beijing, 100700, China
Xingfeng Chen & Tingting Shi
The School of Information Engineering, Xizang Minzu University, Xianyang, 712089, China
Hejuan Du
The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang, 050000, China
Yun Liu
Jilin Provincial Key Laboratory of Chinese Medicine Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, 130022, China
Shu Liu

Contributions

X.C., H.D. and T.S.: Conceptualization. X.C. and T.S.: Formal analysis. X.C. and H.D.: Funding acquisition. H.D.: Investigation. X.C. and T.S.: Methodology. J.Liu: Project administration. S.L.: Resources. J.Li: Supervision. X.C.: Validation. X.C. and L.Z.: Writing (original draft). X.C. and Y.L.: Writing (review & editing).

Corresponding author

Correspondence to Tingting Shi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this Article was revised: The original version of this Article omitted an affiliation for Xingfeng Chen. Their correct affiliations are listed in the correction notice.

Supplementary Information

Supplementary Figures.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Chen, X., Du, H., Liu, Y. et al. Fully connected-convolutional (FC-CNN) neural network based on hyperspectral images for rapid identification of P. ginseng growth years. Sci Rep 14, 7209 (2024). https://doi.org/10.1038/s41598-024-57904-3


Received: 16 December 2023
Accepted: 22 March 2024
Published: 26 March 2024
DOI: https://doi.org/10.1038/s41598-024-57904-3
