A novel approach for nitrogen diagnosis of wheat canopies digital images by mobile phones based on histogram

The accurate and nondestructive assessment of leaf nitrogen (N) is very important for N management in winter wheat fields. Mobile phones are now being used as an additional N diagnostic tool. To overcome the drawbacks of traditional digital camera diagnostic methods, a histogram-based method was proposed and compared with the traditional methods. Here, the field N level of six different wheat cultivars was assessed to obtain canopy images, leaf N content, and yield. The stability and accuracy of the index histogram and index mean value of the canopy images in different wheat cultivars were compared based on their correlation with leaf N and yield, following which the best diagnosis and prediction model was selected using the neural network model. The results showed that N application significantly affected the leaf N content and yield of wheat, as well as the hue of the canopy images and plant coverage. Compared with the mean value of the canopy image color parameters, the histogram could reflect both the crop coverage and the overall color information. The histogram thus had a high linear correlation with leaf N content and yield and a relatively stable correlation across different growth stages. Peak b of the histogram changed with the increase in leaf N content during the reviving stage of wheat. The histogram of the canopy image color parameters had a good correlation with leaf N content and yield. Through the neural network training and estimation model, the root mean square error (RMSE) and the mean absolute percentage error (MAPE) of the estimated and measured values of leaf N content and yield were smaller for the index histogram (0.465, 9.65%, and 465.12, 5.5% respectively) than the index mean value of the canopy images (0.526, 12.53% and 593.52, 7.83% respectively), suggesting a good fit for the index histogram image color and robustness in estimating N content and yield. 
Hence, the use of the histogram model with a smartphone has great potential application in N diagnosis and prediction for wheat and other cereal crops.

Measurement items and methods. Determination of the N concentration of the plants and leaves: five plants from each plot were sampled and then separated into leaves and stems. The samples were pre-dried at 105 °C for 30 min and then dried to a constant mass at 70 °C in an oven to determine the dry mass (DM). The plant materials were ground to pass through a 2-mm mesh screen, and aliquots were ground for further analyses. The samples were digested with H₂SO₄ and H₂O₂, and the total N concentration of the digested samples was determined using an automated continuous flow analyzer (Seal, Norderstedt, Germany). At harvest, a 5 m² area of wheat was harvested manually in the production area of each plot, and the yield was determined after drying.
Crop canopy image acquisition method. Images were acquired at the reviving stage and the jointing stage of winter wheat. Photographs were taken between 12:00 and 14:00 on cloudless days. Smartphones were used to obtain the wheat canopy images. The shooting height was 1.2 m above the ground.
Crop canopy image processing and color parameter design. Mobile digital images have three color channels: R, G, and B. These capture the light reflected by the vegetation canopy in three bands (red, green, and blue), which is directly related to the absorption characteristics of the vegetation. Chlorophyll is the most important pigment in wheat and strongly absorbs blue and red light but absorbs less green light 30. Among the color indices, G/R is the ratio of the green channel image to the red channel image. The higher the green channel, the lusher the vegetation, and the smaller the red channel, the more chlorophyll is present. G/R can therefore highlight vegetation information 31. The G/B ratio is similar to G/R, in that the higher the ratio, the more chlorophyll the vegetation has 32. In R/(R + G + B), the greater the proportion of red light in the entire image, the lower the chlorophyll content of the vegetation 32, while in G/(R + G + B), the higher the green channel, the lusher the vegetation 33. In B/(R + G + B), the larger the blue light channel, the less the vegetation 34, and (G−R)/(R + G + B) is the difference between the green and red channels relative to the sum of the three channels and thus highlights the vegetation information and reflects the plant status.
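The color indices listed above are simple per-pixel ratios of the three channels. The following Python sketch shows one way they could be computed with NumPy. It is an illustration only: the study used ENVI software, and the function name `color_indices`, the small constant `eps`, and the two-pixel toy image are assumptions introduced here.

```python
import numpy as np

def color_indices(rgb):
    """Return per-pixel color index images for an H x W x 3 RGB array."""
    rgb = np.asarray(rgb, dtype=np.float64)  # avoid uint8 overflow in sums
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    total = r + g + b
    eps = 1e-9  # assumption: guards against division by zero on black pixels
    return {
        "G/R": g / (r + eps),
        "G/B": g / (b + eps),
        "R/(R+G+B)": r / (total + eps),
        "G/(R+G+B)": g / (total + eps),
        "B/(R+G+B)": b / (total + eps),
        "(G-R)/(R+G+B)": (g - r) / (total + eps),
    }

# Hypothetical 1 x 2 image: a greenish "leaf" pixel and a brownish "soil" pixel.
toy = np.array([[[60, 120, 40], [120, 100, 80]]], dtype=np.uint8)
idx = color_indices(toy)
print(idx["(G-R)/(R+G+B)"])  # leaf pixel is positive, soil pixel is negative
```

In this toy case the vegetation pixel and the soil pixel fall on opposite sides of zero for (G−R)/(R + G + B), which is the kind of separation the index is meant to provide.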
The color index image is converted into an independent variable in two ways. The first method takes the mean value of the vegetation component in the index image (index image mean value, IIMV) as the model independent variable to construct a single-variable diagnostic approach; the second method calculates the histogram of the color index image (index image histogram, IIH). Correlation analysis between the histogram curves and leaf N concentration and yield was used to select the curve regions that were highly correlated with these variables. The selected histogram regions were set as independent variables, and a multi-variable diagnostic approach was constructed. The color index, the mean value of the index image, and the histogram of the index image were calculated with ENVI software. The specific histogram-based method used in this study is shown in Fig. 1.
Neural network diagnostic model design. To compare the adaptability of IIMV and IIH to different wheat cultivars, a neural network was used to combine the data of the six cultivars. The relationship between the filtered IIH segment and its dependent variable, i.e., the diagnostic factor, cannot be easily explained by a simple linear relationship. Therefore, an artificial neural network was used for fitting. A neural network model is a highly nonlinear model that performs well in complex nonlinear problems 35. A multilayer perceptron neural network was created and trained with the neural network fitting tool of MATLAB (R2015b, MathWorks, USA). The network had one hidden layer with 10 neurons and one output layer with one neuron. The transfer functions of the two layers were sigmoid and linear, respectively. The Levenberg-Marquardt backpropagation algorithm was used for weight optimization. The independent variables near the peak value (Fig. 2a) were selected from the histogram of the color parameters of the canopy image for neural network training.
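The histogram-based selection of independent variables described above (compute the histogram of each index image, then keep the histogram regions that correlate strongly with leaf N content or yield) can be sketched as follows. This is a minimal illustration on synthetic pixel values, not the ENVI workflow used in the study; the bin count, value range, correlation threshold of 0.8, and both function names are assumptions.

```python
import numpy as np

def index_histogram(index_image, bins=20, value_range=(-0.2, 0.4)):
    """Normalized histogram (pixel fractions per bin) of a color index image."""
    counts, edges = np.histogram(np.ravel(index_image), bins=bins, range=value_range)
    total = max(int(counts.sum()), 1)  # avoid division by zero for empty input
    return counts / total, edges

def correlated_bins(histograms, target, threshold=0.8):
    """Pearson r between each histogram bin and the target; keep |r| >= threshold."""
    h = np.asarray(histograms, dtype=float)   # n_samples x n_bins
    y = np.asarray(target, dtype=float)       # n_samples
    hc = h - h.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((hc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = np.divide(hc.T @ yc, denom, out=np.zeros(h.shape[1]), where=denom > 0)
    return np.flatnonzero(np.abs(r) >= threshold), r

# Synthetic stand-in data: the vegetation fraction grows with leaf N content.
rng = np.random.default_rng(0)
leaf_n = np.linspace(2.0, 4.0, 12)
hists = []
for n in leaf_n:
    n_veg = int(round((0.3 + 0.2 * (n - 2.0)) * 2000))
    pixels = np.concatenate([
        rng.normal(0.20, 0.03, n_veg),          # vegetation pixels (right peak)
        rng.normal(-0.05, 0.02, 2000 - n_veg),  # soil pixels (left peak)
    ])
    hists.append(index_histogram(pixels)[0])
hists = np.array(hists)

selected, r = correlated_bins(hists, leaf_n)
print("selected bin indices:", selected)
```

The selected bins would then serve as the multi-variable input to the diagnostic model, in contrast to the single mean value used by IIMV.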
For comparative analysis, the same neural network model was trained and fitted with the mean value of the image color parameters as the independent variable, and the optimal result was then selected.
In the artificial neural network simulation, after repeated comparisons and trials, LM (Levenberg-Marquardt) neural net fitting in MATLAB was used as the training method. Canopy images showing the performance of winter wheat in the field were collected and used, in conjunction with the optimal histogram section information corresponding to the optimal color index, as the input variables of the neural network. The leaf N content and yield corresponding to each image were used as the output variables. The training results were obtained by fitting the data of the reviving stage and jointing stage with the leaf N content and yield. There were 90 samples in each stage, i.e., 15 samples per cultivar per stage, with three images for each N-fertilizer level. Sixty samples were used for training, and the other 30 samples were used to test the trained network to ensure the stability and accuracy of the training. The results produced by neural network learning were used as part of the system knowledge to achieve a non-destructive N nutrition diagnosis of wheat.
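The network structure described above (one hidden layer with 10 sigmoid neurons, one linear output neuron, 60 training and 30 test samples) can be illustrated with a small NumPy sketch. Two points are assumptions rather than the study's method: plain gradient descent is used here in place of the Levenberg-Marquardt algorithm of the MATLAB fitting tool, and the inputs and targets are synthetic stand-ins for the selected histogram sections and the measured leaf N content.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, y, hidden=10, lr=0.1, epochs=5000, seed=0):
    """Fit a one-hidden-layer MLP (sigmoid hidden units, linear output).

    Assumption: full-batch gradient descent on 0.5 * mean squared error,
    swapped in for Levenberg-Marquardt to keep the sketch short.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)                 # hidden activations, n x hidden
        err = H @ W2 + b2 - y                    # prediction error, n
        dH = np.outer(err, W2) * H * (1.0 - H)   # backpropagated error
        W2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean()
        W1 -= lr * (X.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(X @ W1 + b1) @ W2 + b2

# Synthetic stand-in for 90 samples with 5 selected histogram bins each.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(90, 5))
y = 2.0 + 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(0.0, 0.05, size=90)

order = rng.permutation(90)
train, test = order[:60], order[60:]             # 60 training, 30 test samples
params = train_mlp(X[train], y[train])
pred = predict(params, X[test])

rmse_test = float(np.sqrt(np.mean((pred - y[test]) ** 2)))
baseline = float(np.std(y[test]))                # error of predicting the mean
print(f"test RMSE {rmse_test:.3f} vs. mean-predictor baseline {baseline:.3f}")
```

Comparing the test error with a mean-only baseline is a simple check that the network has learned a relationship rather than just the average of the targets.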

Effects of N fertilizer application on the N nutrition and yield of different wheat cultivars.
From the RGB images of the canopies (Fig. 3), it is obvious that wheat plants grown without N application were yellowish-green and grew less vigorously than N-fertilized plants of the same cultivar, while the plants grown at the various N application rates did not differ visually in these parameters. At the reviving stage, N application significantly increased the N content of the wheat leaves (Table 1). Irrespective of the cultivar, grain yield was maximum for plants that were N-fertilized at 120 and 180 kg ha−1 (Table 1). However, there were some minor differences between the yield responses of the cultivars, i.e., the plants of HY198 performed well even without additional N, while those of Zhongmai 1 (ZM1) and Ping'an 8 (PA8) had maximal yields at N240 and N360, respectively. The results suggest that appropriate N application can increase wheat yield, but that too much N application may decrease yield. Tables 2 and 3 show the mean values of the different color parameters under the different N fertilizer treatments during the reviving stage and jointing stage, respectively. The R, G, and B light values were in the order of G > B > R during the reviving stage and G > R > B during the jointing stage. The mean values of the color parameters G/R and [(G−R)/(R + G + B)] were relatively stable across the wheat growth stages. In the different cultivars, the mean values of the color parameters of the treatment without N fertilizer were significantly different from those of the N-fertilized treatments.

Mean value of the canopy image color parameters and its relationship with leaf N content and yield.
The correlation coefficients between the mean value of the color parameters and leaf N content and yield of the six cultivars were unstable (Fig. 4). At the reviving stage, the correlation coefficients of Huayu 198 (HY198), Xinong 979 (XN979), and PA8 with leaf N content and yield were relatively high and stable. The absolute values of the correlation coefficients of XN979 with yield and leaf N content remained at 0.459-0.694 and 0.745-0.889, respectively. At the jointing stage, the mean value color parameters G/B and NBI of PA8 had the highest correlation coefficients with leaf N content and yield. The correlation coefficients of G/B with yield and leaf N content were −0.555 and −0.774, respectively. In addition, from the reviving stage to the jointing stage, the correlation between the mean value of the color parameters and the leaf N content and yield of the six cultivars showed a downward trend.

Histogram of canopy image color parameters and its relationship with leaf N content and yield.
There was a close relationship between the histogram of the color parameters and leaf N content. In the histogram of [(G−R)/(R + G + B)], as an example (Fig. 2a,b), there are two peaks that correspond to the dark region and the light region in the [(G−R)/(R + G + B)] image. Peak a is on the left in the histogram, indicating that this peak represents the proportion of soil pixels in the image. In contrast, peak b, the right peak, represents the light part of the image, which is the crop in the image (Fig. 2a,b). The histogram curves indicated that with the increase in leaf N content, peak b in the histogram increased and peak a decreased. This trend indicates that the better the crop growth, the larger the proportion of vegetation in the image, with a relative decrease in bare soil. The histogram can not only express the overall color information of the vegetation (peak b displacement) but also show the growth information of the vegetation leaves (the relative height of peak
Table 1. Leaf N content and yield of wheat in different wheat cultivars. N0, N120, N180, N240, and N360 represent N application rates of 0, 120, 180, 240, and 360 kg N ha−1, respectively. CV is the coefficient of variation. Different lowercase letters indicate statistically significant differences between different treatments at the P < 0.05 level.

www.nature.com/scientificreports/
Comparison of index image mean value and index image histogram on leaf N content and yield expression ability. The correlation coefficients of IIMV and IIH with leaf N content and yield were plotted as scatterplots to compare the expression ability of IIMV and IIH (Fig. 6). It can be seen from Fig. 6a,b that at the reviving stage, the correlation coefficients between IIH and leaf N content and yield were mostly higher than those of IIMV. The scatter points in the graph were all concentrated above the 1:1 line at the jointing stage (Fig. 6c,d). When the correlation coefficient of IIMV was low, the correlation coefficient between IIH and leaf N content and yield was about 0.6. Therefore, IIH has a better ability to express leaf N content and yield.
Application of a neural network model to compare index image mean value and index image histogram. Leaf N content and yield were predicted from IIMV and IIH using a multilayer perceptron (MLP) neural network model. Comparing the estimated and measured values for IIMV and IIH, it can be seen that most of the values were concentrated near the 1:1 line (Figs. 7, 8). After analyzing the error of the neural network, it was concluded that the RMSE and MAPE of IIMV were smaller in the training dataset during the reviving stage (Table 4). This shows that the IIMV training data had a smaller dispersion during the reviving stage of wheat, and the results are thus better. Combining the training dataset and the validation dataset, it is evident that the yield estimation results of IIH were better in the reviving stage (Fig. 8d). However, the leaf N content estimated from IIH had a smaller prediction error and dispersion at the jointing stage. The MAPE and RMSE of IIH were lower than those of IIMV, and the results showed that IIH had a better application effect for different wheat cultivars.
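The two error measures used in this comparison follow their standard definitions: RMSE is the square root of the mean squared difference between estimated and measured values, and MAPE is the mean absolute difference relative to the measured value, expressed in percent. A minimal Python sketch is given below; the numbers in the example are hypothetical and are not data from this study.

```python
import numpy as np

def rmse(measured, estimated):
    """Root mean square error between measured and estimated values."""
    m = np.asarray(measured, dtype=float)
    e = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((e - m) ** 2)))

def mape(measured, estimated):
    """Mean absolute percentage error (%); measured values must be nonzero."""
    m = np.asarray(measured, dtype=float)
    e = np.asarray(estimated, dtype=float)
    return float(np.mean(np.abs((e - m) / m)) * 100.0)

# Hypothetical leaf N values (not from the paper).
measured = [2.0, 4.0]
estimated = [2.2, 3.6]
print(rmse(measured, estimated), mape(measured, estimated))
```

Because RMSE carries the unit of the measured quantity while MAPE is relative, the two measures together allow leaf N content and yield errors to be compared on a common footing.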

Discussion
Differences in the images of different wheat cultivars. Differences in the N supply of crops largely affect their leaf N content, which, in turn, results in variations in chlorophyll content and thus leaf color and ultimately plant growth. In contrast to ground hyperspectral devices, multispectral unmanned aerial vehicle sensors, and other high accuracy optical sensors, the diagnosis of N nutrition by a mobile phone camera has the advantages of being convenient and low cost, and has thus gradually attracted the attention of scholars both locally and globally 36-39. However, the mobile phone camera is not designed for scientific research. Its radiometric sensitivity, spectral response, and signal-to-noise ratio are insufficient in comparison with the precision optical instruments mentioned above. Without narrow spectral bands and radiometric accuracy, the color indexes developed using remote sensing techniques can only reflect the basic RGB color characteristics of the winter wheat canopy when they are derived from a mobile phone camera. Other factors such as leaf coverage, which is also an important indicator of N nutrition, are neglected when IIMV is used. Thus, the different plant growth statuses at different growth stages will directly affect the performance of IIMV models. If a specific model needs to be built for each wheat cultivar, the mobile phone diagnostic method will be difficult to use at a large scale. Previous studies have indicated that the correlation between canopy color information and chlorophyll content differs greatly for different wheat cultivars (Aozao 8, Hengguan 35, Xinmai 19, Puzhan 4110, Yumai 49-198, and Zhengmai 366) 40. It can be seen from the images (Fig. 3) that there are some differences among different cultivars at the same stage and under the same treatment. This research supports those results based on comparisons of six wheat cultivars.
The correlation between the mean value of the image color parameters at the reviving stage and the leaf N content and yield of HY198, XN979, and PA8 was greater than that of the other cultivars, while at the jointing stage the correlation of PA8 was higher than that of the other cultivars. The differences between different wheat cultivars may result from three aspects. First, different wheat cultivars have very different leaf color characteristics, and different cultivars may have different canopy structures 11. Furthermore, changes in the color of wheat leaves can occur when crops suffer from pests and diseases, or when drought occurs. Second, the plant type and height of the different wheat cultivars may cause different reflection curves of the canopy spectrum in the visible light region, which is related to the acquisition of the color parameters of the canopy images 41. Third, the growth stage of wheat also affects its dry mass, nutrient accumulation, and phenotypic characteristics, leading to canopy image differences and thus affecting the stability and accuracy of the model 42. For example, in the jointing stage of wheat, wheat growth and nutrient transport are more active, and N accumulation in the plant is in flux, which may affect the accuracy of leaf N concentration measurements and the stability of the canopy image color parameters. In conclusion, the different responses of canopy image color to the wheat cultivars are a disadvantage of the IIMV models.
Possibility of applying IIH to different mobile phones. Different mobile phones vary in spectral response, exposure, color tuning, etc. The robustness of the proposed algorithm across different mobile phones is therefore important. To verify the application effect of IIH with different mobile phone cameras, we set up an independent experiment. In one quadrat, winter wheat canopy images were captured with an Apple phone and a Meizu phone. Unlike the experiment described in the previous sections, a total of 4 samples were collected in this experiment. The histograms of the images captured by the different mobile phones were calculated and input into the trained network. The MAPE and RMSE obtained with the Apple phone and the Meizu phone are compared in Table 5. As can be seen from Table 5, the error values of the Apple mobile phone are smaller than those of the Meizu mobile phone, but the performance difference is not obvious. Interestingly, when estimating yield, the result of the Meizu phone is even slightly better than that of the Apple phone. Therefore, based on the current experiment, the proposed IIH algorithm has great potential for application to different brands of mobile phones. In the future, a thorough comparison between different mobile phone brands, and even between different camera series of the same brand, is still necessary to make this technique available in everyday life.
Advantage of the IIH model. At present, the stability of the diagnosis model needs to be addressed in N nutrition diagnosis using a mobile phone camera. By taking canopy images with a mobile phone camera, Xia et al. 42 found that the visible atmospherically resistant index (VARI) was significantly correlated with the traditional diagnostic indicators SPAD value and stem base nitrate at the jointing stage of wheat. Guo et al. 4, however, found that [G/(R + G + B)] had a strong correlation with the leaf N content of maize and thus established a maize leaf N detection model. In this study, the results indicated that the single-variable leaf N content and yield estimation models based on IIMV were not stable. A possible reason for this instability is that the mean value of the image color parameters can only reflect the leaf color differences. The plant growth status, which is also an important indicator of nutritional status, cannot be captured by a single variable. In contrast, the method based on IIH could capture both the color information and the leaf coverage status. The correlation coefficients between the color parameter histograms and the leaf N content and yield of the various cultivars were relatively stable and were significantly higher than those of the mean value. With more information, a multi-variable model could be built based on IIH. The experimental results indicated that the IIH multi-variable model could yield stable estimation results owing to the strong nonlinear mapping ability of the neural network algorithm.
From the histogram of the canopy image color parameters based on [(G−R)/(R + G + B)], it can be seen that the height of peak b increased with increasing leaf N content at the reviving stage of wheat, and that the two peaks had different heights in the different cultivars. Based on the correlation between the canopy color parameter histogram and leaf N content and yield, it is also evident that, compared with IIMV, the difference between cultivars is relatively small. The neural network model was used to test the IIMV model and the IIH model. The vegetation color information and vegetation growth information contained in the IIH model could reduce the differences between the different cultivars. Therefore, for N nutrition diagnosis across multiple cultivars, it is recommended that the IIH model, built from the canopy color parameter histogram, is selected. Because digital photos make nutrition diagnosis more convenient, their use for crop N nutrition diagnosis has broad prospects. This paper focuses on explaining the advantages of the histogram in nutrition diagnosis, but there are still many problems to be solved. Different brands of mobile phones have different cameras, which differ in color temperature, tone, and response band. Therefore, further research is needed to explore the stability and robustness of this method when applied to different mobile phone brands. In addition, some alternative techniques such as deep learning neural networks 43 and the energy curve image thresholding technique 44 also have great prospects in digital image processing. Combined with these techniques, the proposed diagnosis framework may be more effective and applicable in the future.

Conclusions
RGB images of the canopies of six wheat cultivars grown at different N supplies were taken with a smartphone camera during the reviving stage and the jointing stage. From the obtained results, the following conclusions were drawn.
1. The histogram of the color parameters of the wheat canopy images contained abundant information on the growth status of wheat and sufficiently displayed the overall color information and leaf growth. The IIH multi-variable model had a higher correlation with leaf N content and wheat yield than IIMV.
2. In the neural network model, the histogram of the color parameters of the canopy image also produced satisfactory results, and its estimation accuracy and error were better than those of the parameter mean value method.
3. The histogram of the color parameters of the canopy image combined with the neural network model has strong application potential in the use of mobile phones for the N nutrition diagnosis of wheat.
4. Further study is needed on image analysis technology and deep learning neural network technology, and on the effectiveness of image thresholding based on the energy curve.

Data availability
The datasets used and analysed during the current study are available from the corresponding author on reasonable request. All data generated or analysed during this study are included in this published article [and its supplementary information files]. Source data are provided with this paper.