Image processing techniques to estimate weight and morphological parameters for selected wheat refractions

The geometric and color features of agricultural materials, along with related physical properties, are critical for characterizing and expressing their physical quality. Experiments were conducted to characterize physical attributes such as size, shape, color and texture, and then to work out the relationship between manual observations and image processing estimates of weight and volume for four wheat refractions, i.e. sound, damaged, shriveled and broken grains of wheat variety PBW 725. A flatbed scanner was used to acquire the images, a digital image processing method was used to process them, and the output of the image analysis was compared with actual measurements taken using a digital vernier caliper. A linear relationship was observed between the axial dimensions of the refractions obtained by manual measurement and by the image processing method, with R² in the range of 0.798–0.947. The individual kernel weight and thousand grain weight of the refractions were in the range of 0.021–0.045 g and 12.56–46.32 g respectively. Another linear relationship was found between individual kernel weight and the projected area estimated using the image processing methodology, with R² in the range of 0.841–0.920. The sphericity of the refractions varied in the range of 0.52–0.71. Analysis of the captured images suggests an ellipsoid shape with convex geometry, and the same observation was recorded by physical measurement. A linear relationship was observed between the volume of the refractions derived from measured dimensions and that calculated from the image, with R² in the range of 0.845–0.945. Various color and gray-level co-occurrence matrix (GLCM) texture features were extracted from the acquired images using the open-source Python programming language and the OpenCV library; these features can be exploited by different machine learning and deep learning algorithms to properly classify these refractions.

Many researchers have developed methods to elucidate varietal dissimilarities in grains using machine vision systems based on seed size and colour characteristics of the sample. These techniques have been used to identify and classify food grains based on kernel variety, type, refractions and insect infestation. A classification methodology was given to segregate kernels like wheat, oats and rye based on color and texture parameters 3 .
A rapid assessment of head rice yield was conducted using computer vision systems (CVS) as a tool 4,5. Relationships describing the whiteness of milled rice were presented that can improve rice quality. In another study, the bran layer area on the surface of rice was determined using digital image analysis 6. Some studies have also shown various techniques to classify products into different grades based on variations in kernel size and shape 7. Rice kernels were classified into short, long, slender, round and bold grades using a support vector machine classifier, with the length, width and their ratio for individual seeds as the classification criteria. Based on relative differences in kernel sizes, the proposed technique could also differentiate head rice from brokens and brewers 8. In another study, the visual grading of soybean was described using image analysis techniques, taking size uniformity as the classification criterion 9. A few studies have been carried out to identify whole and broken fractions in wheat, rice, corn and soybeans [10][11][12][13]. Additionally, a method was developed using machine vision and image processing to detect the presence of fissures in rice kernels, which result in lower head rice yields during milling 14. A few researchers studied the effect of changing imaging backgrounds (mostly white and black) for better object identification 15,16. Classification criteria for barley, wheat and rice were developed based on grain shape variations for better pattern recognition using image processing 17,18. Some studies were based on variations in kernel colour to differentiate between different pulses 19,20. Accuracies of varietal classification as high as 99% have been reported, with processing times of a few seconds 21. However, it is well known that seed colour changes in certain lentils due to oxidation reactions.
Therefore, colour as the sole feature would not provide accurate classification for pattern recognition. Many studies have reported varietal classification of grains with high accuracies by extracting morphological [22][23][24], colour 25,26 and textural features 27 and their combinations 28,29. A combination of colour, morphological and textural features was used for classification of dockage and foreign matter 30. A classification methodology was reported for identifying insect-damaged wheat kernels in bulk 31,32, along with identification of adult insect pests using captured images 33. Some studies suggested the use of variations in colour space to identify discoloured and chalky kernels by machine vision systems 34. A classification model based on multivariate discriminant analysis was developed to identify fungal damage in soybean kernels using the Red, Green, Blue (RGB) color space, with acceptable levels of accuracy 35. A handheld device was developed for classification of Indian basmati rice into healthy and discoloured kernels 36. Further, a methodology was proposed for identifying the maturity of paddy by studying variations in its colour, using RGB colour space features as the classification criteria 37. The current study is based on extracting quantifiable information from digital images for different quality refractions of wheat grains 38.
The weight, geometric and color parameters of kernels are important for assessing the quality of grains. The present study focuses on estimating the weight, geometric and color parameters of different refractions of wheat through image processing techniques and on developing relationships between the images and the physical parameters of these refractions.

Materials and methods
The samples of wheat variety PBW 725 were collected from the experimental fields of Punjab Agricultural University, Ludhiana. The procedures adopted in the study comply with the guidelines of Indian Standard IS 4333 Part 1 (1996). The kernels were grouped into four categories: sound grains, damaged grains, shriveled grains and broken grains. One hundred randomly selected kernels from each category were manually placed on the scanner. The images were acquired using a flatbed scanner (Canon CanoScan 5600F) with a 6-line color CCD and a white fluorescent light source. All captured images were in 24-bit color format at a resolution of 200 dots per inch (dpi). A black background was used for all images because it gives more contrast with the wheat color and nullifies the shadow effect. The images were saved as uncompressed bitmap images (bmp format). The computer algorithm was developed using the open-source Python language and the OpenCV library to extract size, shape, color and textural parameters for each kernel (Fig. 1). First, the color picture (BGR format) was converted into three 8-bit grayscale images, i.e. the blue, green and red bands. After a low-pass Gaussian filter of 5 × 5 kernel size was applied to remove small discrepancies in the image, background segmentation was performed using the OpenCV threshold function with the combined flag of binary and Otsu thresholding 39. Otsu's method is an automatic thresholding technique developed to segment the desired object, i.e. the whole sound wheat grain in this case, from the background. For the other refractions, i.e. damaged, shriveled and broken grains, the binary thresholding algorithm was used for background segmentation 40. The threshold value was selected based on the results of the histogram analysis and was invariable for the same environmental conditions.
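For the damaged, shriveled and broken grain refractions, gray levels below 37, 74 and 35 respectively dominated the image area and the histogram took the shape of a normal distribution, so these values were used as thresholds for proper segmentation. The next step uses a mask and a bitwise operation to suppress all other pixels that do not lie in the described range. The binary thresholding operation is given by the equation below, in which thresh is the threshold value and maxval is the value assigned to pixels above the threshold:
$$\mathrm{dst}(x,y) = \begin{cases} \mathrm{maxval}, & \mathrm{src}(x,y) > \mathrm{thresh} \\ 0, & \text{otherwise} \end{cases}$$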
where src(x,y) and dst(x,y) refer to the intensity of pixel (x,y) in the source and destination images respectively. Later, edge detection of each kernel was accomplished by detecting contours in each binary image of grain using the OpenCV findContours function with the CHAIN_APPROX_NONE flag, which stores all boundary points without compression. The size, shape, color and texture features were extracted from each contour (cnt) and the whole data set was saved in a csv/html file. For the other dimensional measurements of the extracted image, the minimum bounding rectangle (MBR) of each contour was determined. The MBR (Feret diameter) is the rectangle with the smallest area that encloses the desired contour. The two dimensional measurements of the kernels were taken as the width and height of the MBR in this study. The computer algorithm was programmed in Python version 3.7 using the OpenCV 4, NumPy, Scikit-Image and SciPy scientific computing libraries 41. The OpenCV image processing library 42 was used for reading and pre-processing of images, color space conversions and extraction of image features based on size, shape and color. The SciPy library 43 was used to extract statistical features such as mean, standard deviation, skewness and kurtosis from the images. The Scikit-Image image processing library 44 was used to extract gray-level co-occurrence matrix (GLCM) texture features from the images. All algorithm development steps and experiments were conducted on a personal computer (Intel Core 2 Duo i7 2.70 GHz with 6 Gbytes of RAM).
The physical properties like axial dimensions, equivalent diameter, sphericity, roundness and weight of each kernel were estimated. Weight of each kernel was measured using a sensitive electronic weighing scale (Citizen CY220, USA) with an accuracy of 0.001 g. A digital vernier caliper (make Mitutoyo) having 0.01 mm least count was used to determine the axial dimensions of each kernel.
The arithmetic mean diameter (D_a) and geometric mean diameter (D_g) of the kernels were calculated by assuming a spherical shape for the grain 45:
$$D_a = \frac{L + W + T}{3}$$
$$D_g = (L\,W\,T)^{1/3}$$
where: L, W, T are length, width and thickness of the sample (mm) respectively.
The sphericity (Sp) can be described as the ratio of the geometric mean diameter to the major dimension of the kernel, while roundness indicates the sharpness of the corners and helps to distinguish spherical, oblate, regular and oblong shapes. Both were determined using the formulas given in 45.
Thousand grain weight was computed by weighing 100 kernels on the electronic weighing scale (Citizen CY220, USA; least count 0.001 g) and multiplying the mass by a factor of 10 to estimate the mass of 1000 kernels. It is usually expressed in g per 1000 kernels 46,47.
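The mean diameters, sphericity and thousand grain weight defined above reduce to a few lines of arithmetic. The sketch below follows the definitions as worded in the text (sphericity taken as D_g divided by the length L); the function names and the example dimensions are illustrative only, and roundness is omitted because its formula is not reproduced here.

```python
def arithmetic_mean_diameter(length, width, thickness):
    """D_a = (L + W + T) / 3, all in mm."""
    return (length + width + thickness) / 3.0

def geometric_mean_diameter(length, width, thickness):
    """D_g = (L * W * T) ** (1/3), all in mm."""
    return (length * width * thickness) ** (1.0 / 3.0)

def sphericity(length, width, thickness):
    """Ratio of the geometric mean diameter to the major dimension (length)."""
    return geometric_mean_diameter(length, width, thickness) / length

def thousand_grain_weight(weight_of_100_kernels_g):
    """Scale the mass of 100 kernels by a factor of 10."""
    return weight_of_100_kernels_g * 10.0

# Illustrative kernel dimensions in mm (not a measured sample).
L, W, T = 6.20, 3.43, 2.87
example_sphericity = sphericity(L, W, T)
print(round(arithmetic_mean_diameter(L, W, T), 3),
      round(geometric_mean_diameter(L, W, T), 3),
      round(example_sphericity, 3))
```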
The surface area is one of the significant physical properties of the kernels and can be related to respiration rate, colour evaluation and heat transfer studies in heating and cooling processes. It was calculated by the following relation 48 .
where Sa is the surface area in mm² and B is the lateral geometric mean diameter.
The principal dimensions of length and width were used to calculate the volume of the different kernels, considering them as prolate and oblate spheroids; the formulas for prolate and oblate spheroids were used in the calculation. The shape of the kernels was also estimated using aspect and ellipsoid ratios. Aspect ratio (A.R.) and ellipsoid ratio (E.R.) have been defined 49,50 in terms of a, b and c, the major, intermediate and minor diameters, respectively (mm).
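As a hedged illustration of the volume calculation, the sketch below uses the textbook volume formulas for spheroids of revolution expressed in full diameters: V = (π/6)·L·W² for a prolate spheroid and V = (π/6)·L²·W for an oblate spheroid. These are assumed here because the exact expressions used in the study are not reproduced in the text; the function names are hypothetical.

```python
import math

def prolate_spheroid_volume(length, width):
    """Volume (mm^3) of a prolate spheroid rotated about its major axis.

    Assumes full diameters: V = (pi / 6) * L * W**2.
    """
    return math.pi / 6.0 * length * width ** 2

def oblate_spheroid_volume(length, width):
    """Volume (mm^3) of an oblate spheroid rotated about its minor axis.

    Assumes full diameters: V = (pi / 6) * L**2 * W.
    """
    return math.pi / 6.0 * length ** 2 * width

# Illustrative dimensions in mm (not a measured sample).
v_prolate = prolate_spheroid_volume(6.20, 3.43)
v_oblate = oblate_spheroid_volume(6.20, 3.43)
print(round(v_prolate, 2), round(v_oblate, 2))
```

Both expressions collapse to the sphere volume π·d³/6 when length and width are equal, which is a convenient sanity check.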
Feature extraction using image processing. A total of 30 features related to size and shape (15 features), color (9 features) and texture (6 features) of the selected kernels were estimated using the Python software. A list of the features extracted using the OpenCV library is given in Table 1. All the functions in the library are loaded into the algorithm by importing the 'cv2' module. A flow diagram for feature extraction from the acquired images is given in Fig. 2. One hundred segmented images of each refraction, extracted using the image processing software, are shown in Fig. 3.
Size and shape estimation. One hundred kernels of different sizes from each refraction, i.e. sound grains, damaged grains, shriveled grains and broken grains, were chosen from a given lot based on visual appearance. The shape and size of the selected kernels were computed using the open-source Python software. The parameters relating to the size and shape of the object were broadly selected to describe the geometric make-up of the kernels. Features such as minimum bounding rectangle (MBR), area (A), perimeter (P), solidity, minor diameter (m) and major diameter (M) were estimated from the image analysis. These features were extracted using different functions available in the OpenCV library (Table 1). The features derived from the image analysis are given in Table 2. A brief explanation of these parameters is as follows:
Area. It is the calibrated area of the image. It is measured as the number of pixels in the image and converted into mm² by multiplying by a calibration factor. The calibration factor was calculated on the basis of the scanning resolution of 200 dpi.
Major and minor axis. These are the major and minor axes of the minimum bounding rectangle (MBR) in pixels, converted into mm using the calibration factor.
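The calibration factor follows directly from the scanning resolution: at 200 dpi one pixel spans 25.4/200 = 0.127 mm, so one pixel covers 0.127² ≈ 0.0161 mm². A minimal sketch of the conversion is given below; the function names are illustrative and the pixel values in the example are made up.

```python
MM_PER_INCH = 25.4

def mm_per_pixel(dpi=200):
    """Linear calibration factor: side length of one pixel in mm."""
    return MM_PER_INCH / dpi

def pixels_to_mm(length_px, dpi=200):
    """Convert a length measured in pixels (e.g. an MBR side) to mm."""
    return length_px * mm_per_pixel(dpi)

def pixel_area_to_mm2(area_px, dpi=200):
    """Convert an area measured as a pixel count to mm^2."""
    return area_px * mm_per_pixel(dpi) ** 2

# Hypothetical measurements: a 48-pixel major axis and a 900-pixel area.
major_axis_mm = pixels_to_mm(48)
area_mm2 = pixel_area_to_mm2(900)
print(round(major_axis_mm, 3), round(area_mm2, 3))
```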
Volume. The volume of the kernels is worked out from the major and minor axes using the formulas for prolate and oblate spheroid objects.
Bounding rectangle fill. It signifies how closely an image shape matches a rectangular pattern. It is expressed as the ratio of the pixel count of the shape to the area of the MBR, worked out by multiplying its height and width. Its value is one for a completely rectangular shape.
Bounding rectangle to perimeter. It is the ratio of the perimeter of the region of interest of the image to the perimeter of the bounding rectangle. For a convex shape, the interior angle between adjacent segments of the outline is always less than 180 degrees and its value is one.
Solidity. It is the ratio of the contour area to its convex hull area. The convex hull is the smallest convex set of pixels enclosing the shape. The solidity value varies from 0 to 1.
Equivalent diameter (or Feret diameter). It is the diameter of a circular object that has the same area as the kernel.
Circulation factor. It is expressed as the diameter of the circle with a perimeter equal to the perimeter of the kernel.
Compactness. It is the ratio of the area of the object to the area of a square and describes the resemblance of the object to a square shape. When the seed is a circle, compactness is equal to 1; for a square seed it is π/4 ≈ 0.785; and as the value approaches zero, it indicates an increasingly elongated polygon.
Elongation. It is the difference between the lengths of the major and minor axes of the best-fit ellipse divided by the sum of their lengths. It is zero for a circle and approaches one for a long and narrow ellipse.
Aspect ratio. It is the ratio of the height to the width of the MBR. The log10 of this ratio gives a symmetric measure of aspect ratio.
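Several of the derived descriptors above can be computed from a handful of measured quantities. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, and compactness is assumed to be 4πA/P², an expression chosen because it reproduces the values quoted in the text (1 for a circle, π/4 for a square) rather than because the study confirms it.

```python
import math

def shape_descriptors(area, perimeter, major, minor, mbr_height, mbr_width):
    """Derived shape features from basic contour measurements (consistent units)."""
    return {
        # Diameter of the circle with the same area as the kernel.
        "equivalent_diameter": math.sqrt(4.0 * area / math.pi),
        # Diameter of the circle with the same perimeter as the kernel.
        "circulation_factor": perimeter / math.pi,
        # Assumed definition: 4*pi*A / P**2 (1 for a circle, pi/4 for a square).
        "compactness": 4.0 * math.pi * area / perimeter ** 2,
        # (major - minor) / (major + minor): 0 for a circle, near 1 when narrow.
        "elongation": (major - minor) / (major + minor),
        # Fraction of the bounding rectangle covered by the shape.
        "bounding_rectangle_fill": area / (mbr_height * mbr_width),
        # Symmetric measure of aspect ratio.
        "log_aspect_ratio": math.log10(mbr_height / mbr_width),
    }

# Sanity check on an ideal circle of radius 5 (arbitrary units).
r = 5.0
circle = shape_descriptors(math.pi * r ** 2, 2 * math.pi * r, 2 * r, 2 * r, 2 * r, 2 * r)
print({name: round(value, 3) for name, value in circle.items()})
```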
Color and texture estimation. The color analysis was carried out for each kernel. The 24-bit captured images were processed in the open-source Python image processing software. All the images were imported into the Python integrated development environment (IDE) as three-dimensional arrays of size [x, y, 3] (where x denotes rows, y denotes columns and 3 denotes the red, green and blue channels). For each category of refractions, red (R), green (G) and blue (B) color densities were extracted with the OpenCV library in Python. Each of the three channels consists of a two-dimensional array of 'x' rows and 'y' columns. Since the imported files are 24-bit color images, each of the three separated channels is an 8-bit grayscale image with integer values ranging between 0 and 255. The H, S and V channels were also separated by converting the original image to an HSV image using the OpenCV cvtColor function with the COLOR_BGR2HSV conversion code. The standard Royal Horticultural Society Color Chart (RHSCC) was selected to describe the nearest color of the scanned kernels [51][52][53]. An algorithm was developed to find the mean color distance (sRGB values) between the RHSCC and the scanned images using the Euclidean distance formula after converting the sRGB values into NumPy arrays in Python. The extracted color features are given in Table 3. Among the different bases possible for color representation, the Hue-Saturation-Value (HSV) model is well suited to this problem, since it is closer to human color perception than RGB.
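The nearest-color search described above amounts to a minimum Euclidean distance lookup in sRGB space. The following sketch shows one way this could be done with NumPy. The chart entries are hypothetical placeholders, not actual RHSCC codes or values, and `nearest_chart_color` is an illustrative name rather than a function from the study.

```python
import numpy as np

def nearest_chart_color(mean_rgb, chart):
    """Return (code, distance) of the chart entry closest to mean_rgb.

    `chart` maps a color code to an (R, G, B) triple; distance is Euclidean.
    """
    codes = list(chart)
    refs = np.array([chart[code] for code in codes], dtype=float)
    dists = np.linalg.norm(refs - np.asarray(mean_rgb, dtype=float), axis=1)
    idx = int(np.argmin(dists))
    return codes[idx], float(dists[idx])

# Hypothetical reference chart (placeholder codes and sRGB values).
example_chart = {
    "chip_A": (150, 120, 80),
    "chip_B": (200, 170, 120),
    "chip_C": (90, 70, 50),
}

# Mean sRGB of a kernel, e.g. averaged over the pixels inside its contour.
best_code, best_dist = nearest_chart_color((148, 121, 79), example_chart)
print(best_code, round(best_dist, 2))
```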
Texture features based on the gray-level co-occurrence matrix (GLCM) were extracted using the greycomatrix function from the skimage library. The GLCM texture features include contrast, dissimilarity, homogeneity, angular second moment (ASM), energy and correlation (Table 4). The statistical analysis was carried out using the descriptive statistics tools available in Microsoft Excel 2007.
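To make the GLCM features concrete, the sketch below builds a co-occurrence matrix for a single offset (distance 1, angle 0) in plain NumPy and derives five of the six features from it. It is a simplified stand-in for the skimage routine used in the study, not a reproduction of it; correlation is omitted for brevity and the helper names are hypothetical. Note that recent scikit-image releases spell the function `graycomatrix`.

```python
import numpy as np

def glcm_horizontal(gray, levels=256):
    """Normalised GLCM for pixel pairs (i, j) with j one step to the right of i."""
    gray = np.asarray(gray)
    left = gray[:, :-1].ravel()
    right = gray[:, 1:].ravel()
    glcm = np.zeros((levels, levels), dtype=float)
    # Count each (left, right) gray-level pair, including repeated pairs.
    np.add.at(glcm, (left, right), 1.0)
    return glcm / glcm.sum()

def glcm_features(p):
    """Contrast, dissimilarity, homogeneity, ASM and energy of a normalised GLCM."""
    i, j = np.indices(p.shape)
    diff = i - j
    asm = float(np.sum(p ** 2))
    return {
        "contrast": float(np.sum(p * diff ** 2)),
        "dissimilarity": float(np.sum(p * np.abs(diff))),
        "homogeneity": float(np.sum(p / (1.0 + diff ** 2))),
        "ASM": asm,
        "energy": asm ** 0.5,
    }

# Small 4-level test image.
test_image = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1],
                       [0, 2, 2, 2],
                       [2, 2, 3, 3]])
features = glcm_features(glcm_horizontal(test_image, levels=4))
print({name: round(value, 4) for name, value in features.items()})
```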

Results and discussion
Physical characteristics of the selected refractions. Table 5 gives the important axial dimensions of the selected refractions of wheat. The three average axial dimensions vary between 4.91 and 6.20 mm in length, 2.47 and 3.43 mm in width and 2.24 and 2.87 mm in thickness. All the refractions are geometrically convex, with both aspect ratio and ellipsoid ratio greater than one, and are best described as ellipsoid rather than spherical. Values of the ratios L_avg/T_avg and W_avg/T_avg greater than one indicate a scalene type of ellipsoid. The other physical characteristics describing the weight and shape features of the refractions are presented in Table 6. The thousand grain weight of the selected refractions varies between 12.56 and 46.32 g. The roundness ratio and sphericity values indicate the ellipsoid shape of all the refractions. The average single kernel weight of the refractions varied from 0.021 to 0.045 g. The information regarding the surface area and the volume of prolate and oblate spheroid shapes is very important in differentiating and classifying different types of kernel refractions.
Table 4. GLCM texture features extracted from image analysis. P[i,j,d,theta] is a 4-D ndarray gray-level co-occurrence matrix and represents a histogram of co-occurring grayscale values at a given offset over an image. The value P[i,j,d,theta] is the number of times that gray level j occurs at a distance d and at an angle theta from gray level i. The syntax of the greycomatrix function is given below; the default values were used for these parameters. P[i,j,d,theta] = skimage.feature.texture.greycomatrix(image, distances, angles, levels = 256, symmetric = False, normed = False).

Characteristics of size and shape of wheat refractions by image processing. The relationship between the length and width of the selected refractions and the number of pixels obtained from the image analysis is linear, with R² in the range of 0.81-0.95 and 0.80-0.92 respectively (Figs. 4,5). Hence, the images of refractions acquired with a flatbed scanner at 200 dpi resolution can be used to estimate these dimensions. The important parameters describing the shape and size of the refractions through their images are presented in Table 7. The bounding rectangle fill and bounding rectangle to perimeter values indicate that the calibrated area of the image may be considered an outward-bulging shape without any indent, i.e. a convex geometry, with a resemblance to a rectangle of 73-78%. The compactness values for the refractions lie in the range of 0.67-0.77, so the calibrated areas cannot be assumed to be square. The elongation parameter of the refractions varies between 0.2 and 0.44, indicating that the shape of sound, damaged and shriveled grains is better considered elliptical in comparison with brokens. The ratio of the area evaluated by counting the pixels in the image to the area of the MBR containing the image of the refractions lies in the range of 0.74-0.77, indicating the effectiveness of image acquisition and processing (Table 7). Hence, along with size, image processing can be used to provide useful information about the shape of the refractions.
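The linear relationships and R² values reported in this section can be obtained with an ordinary least-squares fit. A minimal sketch is given below, assuming paired arrays of image-derived and manually measured values; the data in the example are synthetic and serve only to show the calculation, and `linear_fit_r2` is an illustrative name.

```python
import numpy as np

def linear_fit_r2(x, y):
    """Fit y = slope * x + intercept by least squares; return (slope, intercept, R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)

# Synthetic example: length in pixels versus caliper length in mm, with noise.
rng = np.random.default_rng(0)
length_px = rng.uniform(35, 50, size=100)
length_mm = 0.127 * length_px + rng.normal(0.0, 0.15, size=100)

slope, intercept, r2 = linear_fit_r2(length_px, length_mm)
print(round(slope, 3), round(intercept, 3), round(r2, 3))
```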
Weight estimation of selected wheat refractions. The weight of the kernels is an important parameter currently used in physical quality evaluation of kernels. It indicates the soundness and wholesomeness of the kernel. Using machine vision technology, a relationship was established between the weight and volume of the kernel and the calibrated area estimated from the acquired images (Fig. 6).
A plot of weight against calibrated area suggests a linear relationship, with R² varying from 0.84 to 0.92. Sound grains showed good correlation, followed by shriveled grains. Considering prolate spheroid shapes of the kernels, a plot of the volume estimated using Mohsenin's equivalent diameter against calibrated area suggested a linear relationship, with R² varying from 0.84 to 0.94 (Fig. 7). Here, shriveled and sound grains showed good correlation. Thus, a machine vision system has good potential for application in online quality estimation during wheat handling, packaging and storage operations.
Characteristics of color and texture features of wheat refractions by image processing. The red, green and blue channels were separated from the 24-bit captured images of all wheat refractions. The mean and normalized differential indexes (NDI) for the red, green and blue channels were calculated (Table 8). The shriveled grains have the maximum values for all three channels, while the broken grains have the lowest values. Variation in color is subject to a number of factors, and an exact match is not possible when images are acquired with two different systems. So, an algorithm was developed to find the closest match between the color of the acquired images and standard charts. The closest color codes between the standard RHSCC and the scanned images were calculated as 148A, 165A, 199B and N199A for sound grains, damaged grains, shriveled grains and broken grains respectively (Table 8). The values of NDI_rg, NDI_rb and NDI_gb were calculated for all the refractions. The maximum values of NDI_rg, NDI_rb and NDI_gb correspond to shriveled grains, broken grains and sound grains respectively, whereas the minimum value of NDI_rg corresponds to sound grains and the minimum values of NDI_rb and NDI_gb correspond to shriveled grains (Table 8).
The HSV values for all refractions were also estimated by the algorithm using the OpenCV library 42, and the values are shown in Table 8. This information can be very helpful in classifying the different refractions. The GLCM texture features were extracted from the images using the skimage library. The values of texture features such as contrast, dissimilarity, homogeneity, angular second moment (ASM), energy and correlation are given in Table 8. The contrast, dissimilarity and energy values are highest for broken grains, at 2269.7, 31.61 and 0.04 respectively, and lowest for sound grains. The homogeneity value is highest for sound grains and lowest for shriveled grains. The correlation value varies from 0.49 to 0.65 and is highest for shriveled grains and lowest for broken grains. These color and texture features clearly vary between the refractions and should be helpful in classifying and identifying these kernels using different machine learning and deep learning algorithms.

Conclusion
An image processing technique was used to estimate the weight, geometric and color parameters of four different types of wheat refractions with the help of a flatbed scanner. The open-source Python image processing software was used to obtain their size, shape, volume, color and texture features. A linear relationship was observed between the axial dimensions of the refractions obtained by manual measurement and by the image processing method, with R² in the range of 0.798-0.947. The individual kernel weight and thousand grain weight of the refractions were in the range of 0.021-0.045 g and 12.56-46.32 g respectively. Another linear relationship was found between individual kernel weight and the projected area estimated using the image processing methodology, with R² in the range of 0.841-0.920. The sphericity of the refractions varied in the range of 0.52-0.71. A linear relationship was observed between the volume of the refractions derived from measured dimensions and that calculated from the image, with R² in the range of 0.845-0.945. Further, nine color and six GLCM texture features were also estimated; these can be exploited by different machine learning and deep learning algorithms to properly classify these refractions for accurate identification of kernel condition.