Interactive machine learning for soybean seed and seedling quality classification

New computer vision solutions combined with artificial intelligence algorithms can help recognize patterns in biological images, reducing subjectivity and optimizing the analysis process. The aim of this study was to propose an approach based on interactive and traditional machine learning methods to classify soybean seeds and seedlings according to their appearance and physiological potential. In addition, we correlated the appearance of seeds to their physiological performance. Images of soybean seeds and seedlings were used to develop models using low-cost approaches and free-access software. The models developed showed high performance, with overall accuracy reaching 0.94 for seeds and seedling classification. The high precision of the models that were developed based on interactive and traditional machine learning demonstrated that the method can easily be used to classify soybean seeds according to their appearance, as well as to classify soybean seedling vigor quickly and non-subjectively. The appearance of soybean seeds is strongly correlated with their physiological performance.

Scientific RepoRtS | (2020) 10:11267 | https://doi.org/10.1038/s41598-020-68273-y www.nature.com/scientificreports/ high-throughput screening for quantifying thrips damage 12 , quantification and spatial analysis of features in histological images of rodent brains 13 , and others. However, there is still no information on application of interactive machine learning via Ilastik in plant science and, in particular, in studies on seeds and seedlings.
Considering that the use of image analysis in seed science technology has led to considerable advances in seed quality assessment 14 , the Ilastik software could be integrated in this image analysis, which would make seed quality assessment even more efficient. Thus, the aim of this study was to propose an approach based on interactive and traditional machine learning methods to classify soybean seeds and seedlings according to their appearance and physiological potential. In addition, we correlated the appearance of seeds with their physiological performance.

Materials and methods
plant material and image acquisition. Seven commercial soybean seed lots from production in the 2019/2020 crop season were acquired, from which 700 seeds were selected. The seeds were divided into seven groups according to physical, health, and physiological aspects. The groups were composed of mechanically damaged seeds, seeds infested by fungi, and green seeds (with low chlorophyll degradation). Initially, images of the seeds were obtained from an Epson Perfection V800 scanner. The seeds were tested for germination capacity by placing them on rolls of germination paper moistened with water in the amount of 2.5 times the dry weight of the paper and kept in a germinator (Mangelsdorf) at 25 °C for three days. The seedlings produced were individually evaluated through image analysis to determine seed vigor. Images of the seedlings were acquired through an HP Scanjet 200 scanner fastened in an inverted position within an aluminum box. Both seed and seeding images had 300 dpi resolution.
image pre-processing and segmentation. Image pre-processing and segmentation were performed with the Ilastik software 4 using the pixel classification tool. The images were semantically segmented, and two segmentation classes were defined: "seed or seedling" and "background". The features of the seeds or seedlings were based on color and pixel intensity and border and texture descriptors, computed as pre-smoothed filters with a sigma ranging from 0.3 to 1. To create the probability maps, the pixels belonging to the regions of each class were selected by painting brushstrokes of different colors directly on the input data. A Random Forest classifier was then used for pixel classification (Ilastik standard). The probability that the pixel belonged to the semantic segmentation classes ("seed/seedling" or "background") was estimated for each pixel of the image. The trained classifier was applied to all images, individual seeds and seedlings, in batch mode, and the probability maps were exported for each image.
Seed appearance and physiological quality classification. In this study, we developed independent classifications for soybean seed appearance and physiological quality, considering seedling growth and the nongerminated seed data. The methods used for both classification procedures were similar; therefore, the descriptions were presented only once.
Seed appearance classes. Seeds were classified into seven different classes, based on visual inspection of each individual seed: (1) high-quality seed (HQS)-seed nearly round, firm, with smooth skin, and a single color; (2) mechanically kneaded seed (KNS)-seeds with irregular surfaces and surface depressions due to mechanical damage; (3) purple stained seed (PSS)-seeds with pink to light or dark purple discoloration, with size ranging from a small spot to covering the entire seed coat; (4) broken seed (BRS)-seeds with poor structural integrity; (5) seed coat tear (SCT)-seeds showing coat ruptures; (6) moisture damaged seed (MDS)seeds showing wrinkles in the region opposite the hilum; (7) greenish seed (GRS)-seeds with a greenish seed coat or cotyledon. physiological quality classes. The seedlings were placed in two classes: (1) vigorous seedlings (VSD)morphologically healthy seedlings showing vigorous growth of the hypocotyl and roots; and (2) weak seedlings (WSD)-seedlings showing absence, underdevelopment, or deformation of some essential structure (cotyledons, primordial leaves, and roots). In addition, the class (3) non-germinated seeds (NGS)-seeds unable to germinate after three days under suitable conditions-was established.
Interactive classification of seed appearance and vigor. Interactive classification was performed with the Ilastik software, using the inbuilt object classification tool. This step was the main human task in the application of interactive machine learning. First, the training images were input (10% of the total images were used) and their respective probability maps were obtained in the pre-processing stage. Then 103 variables 15 , all the features available in the Ilastik software, were calculated for each seed and seedling, including convex-hull, skeleton-based shape descriptors, and property intensity statistics. Finally, we trained the classifier by selecting known individual classes. This process was performed by clicking on the reference objects of the different classes. Immediately, the classification was made. That way, real-time classification allowed us to track the classification errors and to correct them. This approach differs from traditional machine learning, which requires a lot of effort to detect point flaws in classification and often requires significant amounts of training data 6 . It is important to highlight that, in this study, the human effort of performing the classifications was not evaluated, but it comprised only the initial steps, and once the classification was completed, the classifier could be used automatically.
Then, the trained classifier was applied to the entire set of images in batch mode. The descriptors and the individual prediction maps were exported in CSV and PNG file formats, respectively. This approach did not Scientific RepoRtS | (2020) 10:11267 | https://doi.org/10.1038/s41598-020-68273-y www.nature.com/scientificreports/ require hardware with a high level of processing capacity. We performed all classifications on an Intel CoreCPU i5-42000 @ 1.60 GHz and 4 GB of RAM.
External algorithms for seed appearance and physiological quality classification. The descriptors generated by Ilastik for each seed and seedling were used to develop classification models using traditional machine learning techniques based on three methods: Linear Discriminant Analysis (LDA), Random Forest (RF), and Support Vector Machine (SVM). The data were divided into training sets and validation sets, in the proportion of 70% and 30%, respectively. The three classifiers were developed with the R 3.6.3 software (R Core Team, 2019). For LDA, the Mass (https ://cran.r-proje ct.org/web/packa ges/MASS/index .html) package was used. For RF models, the cforest function of the partykit package (https ://www.rdocu menta tion.org/packa ges/party kit) was used, with 500 decision trees and default hyperparameters. For SVM, the caret (https ://cran.r-proje ct.org/web/packa ges/caret /index .html) package was used, with a radial kernel and default parameters.
Model validation. The models developed with the interactive method using the Ilastik software were validated using 90% of the data set. These data had not been used for training. The external models developed were validated through cross-validation (10-folds) and through an independent validation set that had not been used before. They were evaluated based on True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) data. The Accuracy, Kappa, Precision, Sensitivity, and Specificity metrics were calculated: Association of seed appearance and physiological quality. In order to understand the relationship between seed appearance and seed physiological quality, we made an association between seed classes and their respective seedling classes. Seedling length was measured and the data were used to obtain the growth, uniformity, and vigor indices and root length using the Vigor-S software 16 . Multivariate principal component analysis was applied to these data, adopting the seed classes as individuals and the seedling vigor indices as vectors.

Results
Machine learning models for seed classification. We developed and compared four different models to classify soybean seeds based on relevant visible aspects using a high-performance tool. The model based on interactive machine learning showed an overall accuracy of 0.83 (Table 1). The BRS and SCT classes had the highest hit rates, with sensitivity greater than 0.97. In contrast, the HQS and MDS classes had lower individual accuracy, with a higher rate of false-positives and false-negatives. At least 21% of HQS class seeds were confused with MDS class seeds, while the confusion increased to 30% in the opposite direction. The seed descriptors generated by the Ilastik software were tested applying three different machine learning methods ( Table 2). In general, high performance was observed for all three models developed. The LDA method stood out, achieving an accuracy of 0.93 in the cross-validation set and 0.94 in the independent validation set, and a Kappa coefficient above 0.91. In independent validation, the classes SCT and BRS had the highest hit rates in the LDA, RF, and SVM models.
Machine learning models for physiological quality classification. The physiological data were used for seed classification into three classes according to seed germination capacity and seedling growth: vigorous seedlings (VSD), weak seedlings (WSD), and non-germinated seeds (NGS). Four models were developed to that end-the first based on interactive machine learning and the others using the traditional machine learning approach to analyze the data generated by the Ilastik software. Note that although the data set in this study had 700 seeds, only 600 were used for development and evaluation of these models. The BRS class was excluded since it is not usually used to assess seed vigor.
A high precision classification model within the interactive Ilastik machine learning method was used ( Table 3). The model achieved average accuracy of 0.97, and values above 0.92 for kappa, precision, sensitivity, and specificity metrics for the VSD class. The highest rate of false-negatives was found in the WSD class and the lowest rate in the VSD class.
Relationship between seed appearance and seedling growth. The seeds exhibited different physiological performance according to their appearance (Fig. 2). In multivariate principal component analysis, the physiological quality vectors were positioned to the right of the ordering diagram, indicating that individuals located in negative scores of PC1 had significantly lower values for these variables (Fig. 2a). In more detail, HQS and SCT classes showed higher values of seedling length and higher vigor, uniformity, and growth indices (Fig. 2a). For those classes, all seeds germinated and generated a high proportion of vigorous seedlings (Fig. 2b). In addition, in the PSS and MDS classes, the seeds generated mainly vigorous seedlings and weak seedlings. The MDS also had a high proportion of non-germinated seeds. For the KNS class, the seeds generated mostly weak seedlings or did not germinate. Finally, the seeds of the GRS class mostly did not germinate. The seedling appearance that predominated in each seed class is shown in Fig. 2c.

Discussion
The quality of soybean seed has a direct impact on its market price and affects seedling establishment in the field 7 . Visual-based inspection methods are currently used by laboratories to precisely evaluate soybean seed quality, including seedling evaluations. However, these methods are subjective, inconsistent, time-consuming, and usually destructive 3 . In this study, we present an approach based on interactive and traditional machine learning methods to classify soybean seeds and seedlings automatically, efficiently, and without costly resources.
The proposed methods were highly satisfactory. The accuracies found in discriminating seeds in their different appearance classes were high, ranging from 0.92 to 0.99 for the interactive machine learning model and greater than 0.90 of overall accuracy in the independent validation set for external classification (Tables 1, 2). These results indicate high potential for practical application of these methods. In this study, we used several seed lots, which is a good first step toward generalization, and it raises hope concerning possible generalization for industrial purposes. The models developed can easily be inserted in the quality control programs of seed companies or seed analysis labs, or even in studies in research centers. We are aware that this approach does not entirely dispense with human effort, as some manual tests are required to meet ISTA standards 17 , but it can be used in the process of sorting lots regarding seed quality and identifying reasons for loss of quality rapidly and accurately. It is noteworthy that this is a pioneering study using interactive machine learning methods for classifying soybean seeds according to their appearance. Table 1. Confusion matrices and metrics of the interactive classification of soybean seeds according to their visual appearance. The Random Forest classifier was applied and 10% of total images were used for training. a In the columns are the true seed classes, and in the rows are the estimated classes.  In previous studies, researchers have tried to identify damage to soybean seeds using different techniques. Lin et al. described the representation of jointly multi-modal bag-of-feature (JMBoF) to inspect the appearance quality of post-harvest dry soybean seeds using images obtained in the visible spectrum. The proposed algorithm reached an accuracy of 82% in the test set. Mahajan et al. used RGB and X-ray images to assess the physical purity, viability, and vigor of soybean seeds. The classifier developed that was based on a neural network achieved an accuracy of 91%. Momin et al. developed a machine vision system to detect impurities in samples of harvested soybeans and achieved accuracy ranging from 75 to 98%. Liu et al. developed a system to identify and eliminate damaged soybean seeds using computer vision, image-processing technologies, neural networks, and automated mechanical control. The system showed an average recognition accuracy of 97%. Several other techniques used to assess soybean seed quality are Raman spectroscopy 18,19 , near-infrared spectroscopy 20,21 , and nuclear magnetic resonance 22,23 . Table 4. The number of seedlings classified correctly in each class and metrics for external classification using Ilastik descriptors of soybean seeds and seedlings according to their physiological quality.

Method Class
Training set  The machine learning methods applied in this study were efficient for classification of seed physiological quality. We found overall accuracy of 0.94 in the interactive method and accuracy of up to 0.94 in the external classification using the independent validation set for RF (Tables 3, 4). This technique opens new perspectives for rapid analyses of soybean seed vigor and enables soybean seedling phenotyping for genetic studies. Other software programs have been developed to analyze soybean seed quality, mainly considering the length of seedlings, such as SVIS 24 , VIGOR-S 16 , SAPL 25 , and GroundEye (https ://www.tbit.com.br/). However, some software programs are commercial products, which restricts their use. Here, we present a free and efficient alternative for seed/seedling analysis that can be used by many researchers, laboratories, and institutions interested in application of a highly efficient method for vigor classification of soybean seeds and seedlings.

Cross-validation
A summary of the steps of the interactive method applied to soybean seed and seedling classification is shown in Fig. 3. Image acquisition is made on a blue background to facilitate segmentation and to improve the ROI probability map. The identification of individual seeds and seedlings can be seen immediately after the probability map. Prediction of the classification of each seed or seedling is shown by the color of the respective groups.
The models developed can be improved to deal with more classes, and the Ilastik software provides all the tools necessary for this. The entire workflow of Ilastik (generic segmentation resources, nonlinear classifiers, probabilistic graphical models) is wrapped in an intuitive interface for interactive classifier training and postprocessing of segmentation and classification algorithms 4 .
The machine learning algorithms applied to the external classification in this study have been used for complex data sets of plant phenotyping 26 . Methods such as LDA, RF, and SVM have been successfully used in plant science. The LDA is a popular machine learning algorithm that increases the distance between classes and reduces separability within the class, linearly combining features and creating limit estimates. The germination and vigor of Jatropha curcas seeds were accurately predicted using the LDA classifier combined with morphometric and tissue integrity resources obtained from X-ray images of seeds 14 . The RF, for its part, is a non-linear classifier formed by many decision trees. It generates the final classification based on a voting system for each tree, and in the end, the algorithm selects the most voted class 27,28 . In maize production areas in China, an RF-based classifier was developed using high-resolution remote-sensing images. The classifier was able to differentiate fields of seed and grain production and identify maize varieties 29 . By 4, the interactive machine learning method of Ilastik uses an RF classifier with 100 trees for pixel classification 4 . Lastly, SVM is a supervised algorithm that projects data into a higher dimensional resource space and identifies a hyperplane to separate classes with the most significant possible margin. The benefit of a hyperplane is its robustness to extreme values, considerably reducing false classifications 30 . In addition, in a recent study on individual plants in Arabidopsis thaliana under different levels of various abiotic stresses, an SVM algorithm using microRNA concentrations as input resources was able to predict plant stress with high precision (R 2 = 0.96) 31 . These studies show the great potential of machine learning techniques. www.nature.com/scientificreports/ Although our results were very promising, some limitations were noticed. The combined presence of different classes in the same individual seed is common in soybean. For example, a seed can be greenish and mechanically kneaded. This may have negative implications for classification if the objective is to accurately quantify individual damages. Another limitation is related to the acquisition method. Although the use of 2D scanners and cameras are the most accessible and straightforward approach to obtain images, the 2D image does not cover all the seed faces. In our study, this limitation caused confusion between the HQS and MDS classes (Table 1) since the moisture damage in some cases was not very evident. In future studies, 3D images could be applied.
Genetic and environmental factors regulate seed appearance, which is strongly correlated with seed physiological performance, as shown in the present study (Fig. 2). Interestingly, seeds with ruptured seed coats showed high vigor, which was similar to the high-quality seeds. It is believed that these ruptures are caused by genetic factors associated with environmental conditions during seed maturation. Although it is very common, there are few reports on the effect of this trait on seed physiological quality 32 .
Furthermore, the mechanically kneaded seeds resulted in a high number of non-germinated seeds and weak seedlings (Fig. 2b). This damage is caused by mechanical impacts during seed harvesting or processing. Depending on the location of the damage, the seeds may generate an abnormal seedling or not germinate at all. Seeds with extremely low moisture content are more susceptible to breakage during mechanical operations at harvest and in processing. These seeds were in the BRS class, which had practically no germinated seeds.
Purple stained seeds generally resulted in weak seedlings (Fig. 2b). This characteristic is mainly caused by the fungi Cercospora kikuchii. Previous reports showed a weak correlation between this characteristic and seed physiological quality, though it does compromise seed marketing.
Environmental conditions during seed production and mainly after seed maturation can lead to moisture damage. These seeds generally have low seed quality 33 , which leads to weak seedlings and a significant number of non-germinated seeds, as observed in the present study (Fig. 2b). Furthermore, high temperatures and water deficit during seed maturation can lead to green seed formation. These seeds have a low rate of seed germination (Fig. 2b).
Fast assessment of seed quality is essential for the seed industry. Rapid decision making regarding disposal or destination of seed lots saves time and resources. Thus, tools that can accurately evaluate seed lots and identify seeds of low physiological quality is of great importance. This study showed that it is possible to classify seeds according to their appearance, and these characteristics are strongly correlated with their physiological potential. Therefore, the proposed approach has potential for application in soybean seed lot classification for non-destructive, non-subjective, and fast screening of seeds. In addition, this method was able to classify soybean seedling vigor effectively and precisely. The interactive machine learning method for classification of soybean seeds from their appearance is highly accurate. This approach effectively identifies seeds with damage and classifies seedlings in vigor levels. The use of LDA, RF, and SVM algorithms is recommended for classifying soybean seeds and seedlings based on data generated with the Ilastik software. Soybean seeds with changes in chlorophyll degradation, fungal stains, and mechanical damage have low physiological quality.