Introduction

Rice is a staple food for more than 60 percent of the world's population1. In recent years, hybrids of self-pollinating species such as wheat (Triticum aestivum L.), rice (Oryza sativa L.) and barley (Hordeum vulgare L.) have received increasing attention2. Hybrid rice production technology has become widespread in several countries. Heterosis occurs when the F1 generation outperforms its parents in yield, panicle size, grains per panicle, and branch number. According to Virmani et al., the degree of heterosis in rice depends on the extent of genetic divergence between the parents; Indica × Japonica crosses show the highest levels of heterosis3. Many researchers have studied hybrid performance and heterosis for paddy weight per plant and its component traits4,5,6,7.

Hybrid performance is governed by a series of complicated factors. Pollination method, genomic variation, genetic background, and adaptation all play a role in this complex biological process. Other important variables include the inheritance of the target trait, the mating design of the experiment, plant architecture, and panicle characteristics such as tiller number and panicle branching8. The ultimate goal of plant breeding is to create high-yielding varieties that boost agricultural productivity and satisfy the requirements of a growing human population. Hybrid breeding has proven to be an effective strategy for yield improvement, and the choice of parents is the foundation of hybrid rice breeding. Although hybrid-breeding efforts have been remarkably successful, selecting promising hybrids has so far relied largely on trial and error, and finding ideal matches between chosen parents is largely a matter of chance. One of the essential steps in hybrid variety development is therefore selecting parents with the best heterotic combination for the expression of heterosis. The primary difficulty in hybrid breeding is predicting the success of future crosses from available information. Identifying high-yielding hybrids is costly, so predictive yield methods would help in the selection of better rice inbred lines9.

The traditional approach to selecting superior hybrid combinations involves testing a vast number of line combinations10. Testing and selecting superior inbred lines for their combining ability in hybrid production takes considerable time and effort. As more inbred lines are examined, the number of possible hybrid combinations rises rapidly, creating many practical challenges for comprehensive yield trials. As a result, the ability to correctly predict hybrid performance from the performance of the inbred lines must be established10. Scientists have long been interested in estimating hybrid crop yields. Under an additive-dominance genetic model, the best single-cross predictors and selection based on double-cross estimates have been evaluated experimentally by comparing different amounts of experimental error variance and different types of hybrids11. Jenkins11 proposed four techniques for predicting double-cross performance, three of which were based on single-cross data. Researchers have examined predicted double-cross values in maize for different predictors, although they lacked double-cross data for assessing the techniques. Eberhart12 based his methods primarily on what would now be called a fixed sampling plan. Single-cross performance has been predicted using best linear unbiased prediction based on (i) restriction fragment length polymorphism (RFLP) data from the parental inbreds and (ii) yield data from a related set of single crosses9. According to Bernardo (1994), parental RFLP data and the yields of related hybrids could be used to estimate single-cross yield9. A relationship between marker polymorphism and hybrid performance was also found in rice crosses involving diverse germplasm4. Maize hybrid performance has likewise been predicted with best linear unbiased prediction (BLUP)13, using data on existing hybrids and their pedigree relationships with untested hybrids. The efficacy of BLUP was evaluated for predicting yield, grain moisture, and stalk and root lodging on a large scale, providing strong evidence that BLUP can routinely identify superior single crosses before field testing14. In addition, BLUP based on trait data alone (T-BLUP) was compared with BLUP based on combined trait and marker data (TM-BLUP), and the results indicated that TM-BLUP was more effective for predicting single-cross performance and population breeding values15.

Rice hybrid performance has been estimated using genomic best linear unbiased prediction16. Studies have shown that unbalanced designs may benefit from mRNA transcription profiles combined with ridge-regression models, even when resources are scarce and transcription profiling is restricted to a subset of genes17. Heterosis has been evaluated with DNA markers, and genetic distances were found to have a substantial impact on the degree of association, owing to differences in genetic inheritance and in measurement8. RFLP markers have been used to analyze the relationship between sorghum hybrid performance and parental molecular genetic variability, with the aim of exploiting that relationship to predict hybrid performance18.

Artificial neural networks (ANN) are now being utilized in a variety of studies. In such applications, an ANN with specific inputs and outputs identifies the relationship between each set of inputs and the associated outputs19. The multilayer perceptron (MLP) is a machine learning technique used in prediction applications. An MLP is composed of basic perceptrons organized into an input layer, an output layer, and one or more hidden layers; the number of neurons in each layer depends on the problem. MLPs are trained with supervised learning. During training, inputs are fed into the network and the network's outputs are compared with the desired outputs. The difference between the actual and desired outputs produces an error signal, and the purpose of network training is to reduce this error signal. Error minimization is accomplished by adjusting the network weights, with the necessary calculations performed by the learning algorithm. The back-propagation learning rule is used in the majority of cases: once the output layer has been computed, the weights of the layers are modified to reduce the error.
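To make the error-minimization step concrete, the following MATLAB sketch performs one forward pass and one back-propagation weight update for a hypothetical single-hidden-layer perceptron with a tansig hidden layer and a linear output; all sizes, data and the learning rate are arbitrary illustrative assumptions, not the networks used in this study.

% Minimal sketch of one backpropagation update for a single-hidden-layer MLP.
% All sizes, data and the learning rate are illustrative assumptions.
rng(1);
nIn = 4; nHidden = 10;                 % hypothetical layer sizes
x  = rand(nIn, 1);                     % one input sample
t  = rand;                             % desired (target) output

W1 = randn(nHidden, nIn); b1 = zeros(nHidden, 1);   % hidden-layer weights
W2 = randn(1, nHidden);   b2 = 0;                   % output-layer weights
lr = 0.01;                                          % learning rate

% Forward pass: tansig hidden layer, linear output layer
z1 = W1*x + b1;
a1 = 2 ./ (1 + exp(-2*z1)) - 1;        % tansig activation
y  = W2*a1 + b2;

% Error signal and gradients of the squared error
e      = y - t;
delta1 = (W2' * e) .* (1 - a1.^2);     % derivative of tansig is 1 - a1.^2

% Gradient-descent update that reduces the error signal
W2 = W2 - lr * (e * a1');    b2 = b2 - lr * e;
W1 = W1 - lr * (delta1*x');  b1 = b1 - lr * delta1;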

According to Jang, the Fuzzy Inference System (FIS) represents uncertainty in classification and prediction problems20. The Adaptive Neuro-Fuzzy Inference System (ANFIS) uses the Takagi–Sugeno inference scheme and is trained in four stages: fuzzification of the inputs, definition of the knowledge database, rule processing, and finally defuzzification of the outputs. The input layer of ANFIS forwards the inputs and membership functions (MFs) to the following layer. The MFs in the second layer map the input data into the range [0, 1]; various kinds of MFs, such as triangular, Gaussian, and bell-shaped MFs, can be used in this phase. In the rule layer, which is the third layer, each node matches the preconditions of the fuzzy rules and calculates the normalized weights. The output values arising from the inference of the rules are produced by defuzzification in the fourth layer. In FIS training, two learning techniques are commonly used: back-propagation and the hybrid method (a combination of back-propagation and least-squares approaches). Training establishes the relationship between the input and output variables to determine the optimal distribution of the MFs. The last choice made during development of the ANFIS model is the type of output MF, which can be either constant or linear; both types were tested to obtain the best results. The hybrid-learning algorithm was used in this research.
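As a small illustration of the fuzzification layer, the MATLAB sketch below evaluates the three membership-function shapes mentioned above (triangular, Gaussian and bell-shaped) with standard Fuzzy Logic Toolbox functions; the parameter values are arbitrary examples rather than the MFs fitted by the ANFIS models of this study.

% Illustrative evaluation of three membership-function shapes used in the
% fuzzification layer; all parameter values are arbitrary examples.
x = linspace(0, 10, 101);          % universe of discourse for one input
muTri   = trimf(x,   [2 5 8]);     % triangular MF (left foot, peak, right foot)
muGauss = gaussmf(x, [1.5 5]);     % Gaussian MF (sigma, centre)
muBell  = gbellmf(x, [2 4 5]);     % generalized bell MF (a, b, c)
plot(x, muTri, x, muGauss, x, muBell);
legend('triangular', 'Gaussian', 'bell-shaped');
ylabel('membership degree in [0, 1]');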

The support vector machine (SVM) performs well in a variety of classification and prediction problems21. SVM offers benefits such as good generalization performance, lower susceptibility to local minima, and greater robustness to increasing model complexity. Unlike the ANN, SVM provides excellent generalization on prediction and classification problems.

To predict hybrid performance, researchers have suggested different predictors, such as genomic markers22,23, transcriptome profiles24,25,26, metabolomic markers16 and phenomic markers27 of parental inbred lines, all of which require such data for performance prediction. In this project, however, hybrid performance was estimated from cross-parental characteristics with the aid of artificial intelligence (AI). The aim of this study was to estimate hybrid yield from parental characteristics using ANN, ANFIS and SVM models and nine Iranian rice hybrids. To achieve this goal, we present the AI_HIB method.

Results and discussion

Feature selection

The phenotypic characteristics of a plant are described by many variables that do not all have the same effect or importance in predicting yield. It is therefore necessary to identify the important variables and eliminate the additional ones that may reduce the accuracy of the prediction models. The Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) algorithm were used to select the most important features for each cross. The selected features were used for the subsequent analysis and prediction of hybrid performance. To determine which features had the greatest impact on the predictions, their selection frequency was calculated across all crosses.

When the GA algorithm was used to select the features, Grain Yield (GY), Panicle Length (PL), Plant Height (HE) and Flag Leaf Area (FLA), with frequencies of 7, 6, 5 and 5, respectively, had the highest presence at the input of the ANN structure (Table 1). When the PSO algorithm was used, GY, Days to Flowering (DF), Flag Leaf Length (FLL), FLA and Plant Biomass (BI) had the highest frequencies (7, 6, 6, 6 and 5, respectively).

Table 1 Frequency of presence in the models (feature selection).

In the case of the ANFIS model, the GA algorithm selected GY, PL, FLA and HE with frequencies of 6, 6, 6 and 5, respectively. When the PSO algorithm was used, GY, FLA, HE, FLL and BI had frequencies of 6, 6, 5, 5 and 5, respectively, at the input of the ANFIS structure. With this feature selection method, UFP was not included in the input structure for any of the hybrids.

For the SVM analysis, when the GA algorithm was used to select the features, GY, PL, FLA, HE and BI, with frequencies of 6, 6, 6, 5 and 5, respectively, had the highest presence at the input of the SVM structure. When the PSO algorithm was used, GY, FLL, HE, FLA and BI had the highest frequencies (6, 6, 5, 5 and 5, respectively). In all modeling methods of this study and with both feature selection algorithms, Unfertile Panicle Number (UFP), Primary Branches Number (PBN), Panicle Exertion (PE), Flag Leaf Width (FLW) and Filled Grain Number (FGN) had the lowest frequencies.

AI_HIB_ANN: prediction of hybrid grain yield using ANN

Nine crosses were created among Taromahalli (TAM), Khazar (KHZ), Spidroud (SPD), Gharib (GHB), IR28, Ahlamitarum (AHM), and Shahpasand (SHP). The best set of ANN inputs was determined by the GA and PSO algorithms. The results of training, validation and testing of ANNs with different structures and learning algorithms are summarized in Table 2. The ANN trained with the four inputs GY, PL, FLA, and BI had the lowest test MSE and belonged to TAM × SHP (Table 2). Comparison of the statistical parameters of network performance, including MSE and the coefficient of determination (R2), showed that an MLP with a 4–34–1 structure trained with the Levenberg–Marquardt algorithm predicted the hybrid yield of TAM × SHP from the GA-selected features with MSE values of 0.00076, 0.00110 and 0.00114 for training, validation and testing, respectively. The corresponding values for the PSO-selected features (with a 4–31–1 structure) were 0.00094, 0.00142 and 0.00126. To verify that these results also hold for the general data set, the models were fitted to all hybrids, and the results confirmed the power of AI in estimating hybrid performance. The MSE values of this network during the training, validation and testing steps are shown in Fig. 1 (GA algorithm) and Fig. 2 (PSO algorithm). To avoid over-fitting the network as the validation MSE began to rise, network training was stopped after 77 (GA) and 80 (PSO) epochs. Given the low MSE and R2 values above 96% for all hybrids, it can be concluded that the neural network trained with these data has a good ability to predict hybrid performance. Graphs of the network gradient, the adaptive parameter µ and the number of validation failures at each iteration during training are shown in Fig. 3 (GA algorithm) and Fig. 4 (PSO algorithm). At the end of training, the gradient, parameter µ and number of validation failures were 0.282, 0.001 and 6, respectively, for the GA algorithm, and 0.705, 0.001 and 6, respectively, for the PSO algorithm. The highest MSE belonged to AHM × SPD.

Table 2 Results of the ANN in predicting hybrid rice yield from the parents' features.
Figure 1
figure 1

Performance of the optimal ANN on the general data trained with GA-selected features, expressed as MSE. The training MSE (blue line) is lower than the validation (green line) and test (red line) MSE over 77 epochs. Training was stopped after 77 epochs, when the best validation MSE was reached, to avoid overfitting.

Figure 2
figure 2

Performance of the optimal ANN on the general data trained with PSO-selected features, expressed as MSE. The training MSE (blue line) is close to the validation MSE (green line) and lower than the test MSE (red line) over 86 epochs. Training was stopped after epoch 80, when the best validation MSE was reached, to avoid overfitting.

The predicted hybrid grain yield was regressed on the actual hybrid grain yield and assessed with a goodness-of-fit test. Despite differences among the test MSE values obtained for the hybrids, the R2 between predicted and actual values exceeded 96% for all of them, as shown in Fig. 5 (GA algorithm) and Fig. 6 (PSO algorithm). The goodness-of-fit test revealed no differences between the actual and estimated data. Although the feature selection algorithms introduced a specific set of attributes into the prediction model for each hybrid, the results showed that robust, highly accurate estimates can be made for all types of hybrids.

Figure 3
figure 3

Control values during neural network training: (A) gradient values at each iteration of the training phase, (B) µ values at each iteration, and (C) validation-check values at each iteration, using the GA feature selection algorithm.

Figure 4
figure 4

Control values during neural network training: (A) gradient values at each iteration of the training phase, (B) µ values at each iteration, and (C) validation-check values at each iteration, using the PSO feature selection algorithm.

Hybrid grain yield prediction has attracted a great deal of interest. According to Westhues and Schrag (2017), integrating genomic and transcriptomic data is effective for predicting important agronomic properties of hybrid maize28. Wang et al. compared the predictive abilities of all combinations of three omic data types using eight conventional prediction approaches29 and concluded that integrating metabolomic and genomic data usually gives the best prediction in rice. Hybrid prediction based on metabolomic and genomic data has progressed; however, making it as accurate, simple, and accessible as possible remains a challenge. Previous omic-based predictions of hybrid performance have concentrated mainly on transcriptomic, genomic, and metabolomic data, while the phenotypic information of the parents (the phenome) has been overlooked. Yet phenotypes are the core of crop breeding, and experienced breeders can, to some degree, guess the performance of hybrids from the phenotypes of their parents30. Several studies have addressed the prediction of hybrid yield31,32, though it is still unclear whether AI approaches can enhance hybrid prediction.

According to these results, neural network methods can predict hybrid performance from the parents' characteristics. The linear regression between the actual values and those predicted by the neural network in the test stage, using the GA and PSO feature selection algorithms, is presented in Figs. 5 and 6.

Figure 5
figure 5

Linear regression between the actual values and the values predicted by the neural network in the training stage, using the GA feature selection algorithm. Data points are shown as small circles, with the regression line fitted to them.

Figure 6
figure 6

Linear regression between the actual values and the values predicted by the neural network in the training stage, using the PSO feature selection algorithm. Data points are shown as small circles, with the regression line fitted to them.

ANN can explore nonlinear associations in the input data set. With suitable learning algorithms, an appropriate topology and correct connection weights between neurons, neural networks can be trained to approximate any function representing the dependence of the outputs on the inputs33. Neural networks have several benefits, such as requiring less formal statistical training, the capability of implicitly detecting complex nonlinear relations between independent and dependent variables, the ability to discover all possible interactions between predictor variables, and the availability of multiple training algorithms34.

AI_HIB_SVM: prediction of hybrid grain yield using SVM

In the SVM method, nine groups of inputs were defined. Evaluation of the statistics of the SVM models showed that the most accurate estimate of hybrid performance belonged to the cross SHP × GHB. Both GA and PSO selected the features GY, HE, PL, PBN and FLL for this cross. The final model converged with a minimum mean squared error of 0.0065 based on the GA algorithm (Table 3); at the point of convergence, the values of Box Constraint, Kernel Scale and Epsilon were 0.0011, 0.3020, and 0.0090, respectively. Based on the PSO algorithm, the final model converged with a minimum mean squared error of 0.0059, with Box Constraint, Kernel Scale and Epsilon values of 0.0011, 0.302, and 0.0090, respectively, at the convergence point. The corresponding coefficients of determination were 99.76% and 99.77%. Evaluation of the other hybrids and the general data showed that the highest mean squared error and the lowest coefficient of determination belonged to the hybrid TAM × KHZ. The estimated performance of the hybrids for the general data was also higher than 90%. No significant differences were found between the predicted and observed hybrid grain yield for any hybrid or for the general data. The results showed that by measuring grain yield, days to flowering, plant height, filled grain number and primary branch number, hybrid yield can be predicted with more than 93% accuracy.

Table 3 Results of the SVM in predicting hybrid rice yield from the parents' features.

SVMs have been used successfully in various research areas. They are based on structural risk minimization rather than the empirical risk minimization of the ANN. Using empirical risk minimization can cause overfitting, since the solution may be trapped at a local minimum, whereas structural risk minimization simultaneously minimizes model complexity and empirical error. The SVM's generalization ability for regression or classification problems can therefore be exploited in many disciplines35. SVM is a very convenient method for predicting dependent variables in various sciences, for example wall parameters in through-wall radar imaging36, wafer yield37, aqueous solubility38, drag coefficient39, conceptual cost estimation in construction projects40, evapotranspiration41, iron concentration42 and total organic carbon43. Thus, based on the SVM method, only four attributes need to be measured to obtain the best estimate of hybrid performance from the parents' features.

AI_HIB_ANFIS: prediction of hybrid grain yield using ANFIS

Nine groups of inputs (belonging to the different hybrids) were defined in ANFIS. The most accurate estimate of hybrid performance belonged to the cross SHP × GHB, for which GY, HE, PL, PBN and FLL were selected by both the GA and PSO algorithms. The final model converged with test mean squared errors of 0.002621 and 0.002663 based on the GA and PSO algorithms, respectively (Table 4), and with training mean squared errors of 0.000894 and 0.000888, respectively. The accuracy of the estimates for this cross was very close to that of the TAM × SHP cross, and the coefficients of determination for both were 99.90%. Evaluation of the other hybrids and the general data showed that the highest mean squared error and the lowest coefficient of determination belonged to the hybrid TAM × KHZ. The estimated performance of the hybrids for the general data was also higher than 99%. No significant differences were found between the predicted and observed hybrid grain yield for any hybrid or for the general data. The results showed that by measuring the number of primary branches, number of filled grains, plant height, days to flowering and grain yield per plant, hybrid yield can be predicted with more than 99% accuracy. The same features were proposed based on the SVM model. ANFIS performance in training and testing with the GA and PSO feature selection algorithms is presented in Figs. 7 and 8, and the fuzzy rules used in ANFIS training with the GA and PSO feature selection algorithms are presented in Figs. 9 and 10. The ANFIS structure consists of 5 inputs, one output and 9 rules for both the GA- and PSO-selected feature sets. The input membership functions and the step-size chart of the trained ANFIS network were also inspected for the GA feature selection algorithm.

Table 4 Results of the ANFIS in predicting hybrid rice yield from the parents' features.
Figure 7
figure 7

ANFIS performance trained with GA-selected features. The training error (blue line), expressed as MSE over the training epochs, is lower than the testing error (red line).

Figure 8
figure 8

ANFIS performance trained with PSO-selected features. The training error (blue line), expressed as MSE over the training epochs, is lower than the testing error (red line).

Figure 9
figure 9

Fuzzy rules used in ANFIS training with the GA feature selection algorithm. The ANFIS structure consists of 5 inputs, one output and 9 rules. The last row of the output column represents the final calculated ANFIS output.

Figure 10
figure 10

Fuzzy rules used in ANFIS training with the PSO feature selection algorithm. The ANFIS structure consists of 5 inputs, one output and 9 rules. The last row of the output column represents the final calculated ANFIS output.

ANFIS is an adaptive network that combines a neural network topology with fuzzy logic. It comprises the features of both methods and removes some of the disadvantages of using them alone. ANFIS operates similarly to a feed-forward back-propagation network: the consequent parameters are calculated in the forward pass, whereas the premise parameters are determined in the backward pass. The neural part of the system supports two learning methods, hybrid learning and back-propagation learning. Only a zero- or first-order Sugeno inference system or a Tsukamoto inference system can be used in the fuzzy part44,45.

The ANFIS technique has been widely used in various sciences, for example for the number of foreign visitors46, outdoor temperature soft sensors47, acid solvent solubility in supercritical CO248, solar radiation49, roadheader performance from Schmidt hammer rebound values50, degree of polymerization from dissolved gas analysis and oil characteristics51, PCUs at different levels of service52, housing demand53 and evapotranspiration54. Although the ANFIS method has been used in various sciences to predict dependent variables, this study is the first report of predicting hybrid performance with this technique.

The methods proposed in this study can be used by breeders to predict hybrid seed yield. This type of machine learning method is useful for decision makers in two ways. First, because only the phenotypic characteristics of inbred lines are used to develop the model, it helps breeders reduce costs by reducing the number of hybrid breeding trials. Second, the breeding process will no longer be time consuming because there is no need to wait for the results of field experiments. This information can be obtained very quickly from the model.

Conclusion

Hybrid grain production technology is one of the most important parts of plant breeding. The biggest challenge in hybrid breeding is how to predict the performance of future crosses based on existing hybrids. Prediction of hybrid performance has long been a subject of research by plant breeders. Identifying high-yielding hybrids is expensive, and methods for predicting hybrid yield would facilitate the identification of superior rice inbred lines9.

Unlike conventional rice breeding (inbreeding following two-way or three-way crosses and release), hybrid rice breeding aims to increase grain yield by exploiting the heterosis phenomenon. Extensive field evaluations are required to estimate hybrid rice yield, which makes predicting hybrid performance from the parental line phenotype an important strategy. Phenotyping a wide range of hybrids is a fundamental step in predicting unobserved hybrids, and using AI is a promising strategy for avoiding the high cost of testing every hybrid produced.

Conventional selection of superior hybrid combinations involves testing large numbers of line combinations8. Testing and selecting superior inbred lines for their combining ability in hybrid production demands a great amount of effort. When a high number of inbred lines are tested, the possible number of hybrid combinations to be evaluated is tremendously high, which poses many practical difficulties in conducting extensive yield tests. Therefore, the ability to accurately predict hybrid performance from the performance of inbred lines needs to be developed10.

To compare the three prediction methods, their coefficients of determination were compared. The t values for the comparisons of AI_HIB_ANN vs. AI_HIB_SVM, AI_HIB_ANN vs. AI_HIB_ANFIS, and AI_HIB_SVM vs. AI_HIB_ANFIS were −0.01082, −0.02038 and −0.00957, respectively, with corresponding P values of 0.102, 0.001 and 0.263. Because the means of AI_HIB_SVM (98.00%) and AI_HIB_ANFIS (98.95%) were higher than that of AI_HIB_ANN (96.92%), we recommend the AI_HIB_SVM and AI_HIB_ANFIS methods for predicting hybrid performance.

The ANN method needs a lot of data for training and learning. Moreover, the correlation between the inputs and output is very crucial for better performance of the ANN. In addition, the weight and bias of the hidden layer and output layer need to be properly tuned during the training period to get better performance55.

On the other hand, the adaptive ANFIS is a hybrid system with the benefits of both ANN and the fuzzy system. Therefore, the ANFIS performs better than the ANN for prediction. ANFIS has the capability of fast learning, effective handling of uncertainty and imprecision56.

The SVM method is accurate and capable of minimizing the over-fitting problem. SVM can provide predictions from a limited amount of information and is particularly useful when its parameters are optimized by other intelligent methods. Compared with other traditional machine learning methods, SVM possesses stronger generalization performance. When used for regression forecasting, SVM has the advantage of avoiding local optima compared to other nonlinear prediction models. SVM is therefore a viable alternative to ANN in hybrid yield prediction due to its stability and good performance. SVM shows strong resistance to over-fitting and high generalization performance, mainly because it maps the input vector into a high-dimensional feature space through reproducing kernels57.

Table 4 can be consulted to determine which parental features should be measured to predict hybrid performance. As noted, the GY, HE, PL and FLA attributes had the highest presence rates in the feature selection algorithms; it is therefore suggested that these features be measured in the parents.

In this study, for the first time, hybrid performance was estimated from cross-parental characteristics with the help of AI. All methods proposed so far for estimating hybrid performance require laboratory costs and specialized personnel, whereas the AI-based methods discussed in this study only require measuring some of the parents' features. With the help of AI, a limited number of promising crosses can easily be selected from the large number of possible combinations between inbred lines. This achievement can contribute significantly to the success of rice breeders.

Whether the models obtained in this research can be used under different environmental conditions remains to be verified. We tried to minimize environmental effects by increasing the number of crosses within a single environment. More research is needed to develop global models and make them usable in various environments. Therefore, it is suggested that these experiments be repeated in different locations with different environmental conditions, and we recommend that the model presented in this article be tested in other environments to assess its generality.

Material and methods

Field considerations

Location of field experiment

The experiments were performed at Gonbad Kavous University. The experimental field is located at 37° 17′ N latitude, 55° 12′ E longitude, and 45 m above sea level. The soil of the experimental plots had a silty clay loam (Si.Cl.L) texture; some physical and chemical properties of the soil are presented in Table 5. The experiments were conducted during June 2017 and 2018.

Table 5 Some physical and chemical properties of the soil.

Crossing the genotypes and agricultural operations

Seven rice cultivars were selected: TAM, SPD, KHZ, GHB, IR28, AHM, and SHP. All cultivars were grown under completely isolated conditions and were fully pure. They belong to the Indica group; however, they differ significantly in Mn, Fe, Zn, and protein content58, blast disease reaction59, drought tolerance60, and agronomic properties61,62. TAM, AHM, GHB, and SHP are traditional Iranian cultivars with taller plants, lower tiller number, low to medium yield, lodging susceptibility, low biomass, and lower amylose content. In contrast, SPD, KHZ, and IR28 are improved cultivars. SPD was developed from the Damsiah/IR8 cross at the Rice Research Institute of Iran (RRII). KHZ, also an improved cultivar, was developed from the TNAU7456/IR36 cross at RRII. IR28 was derived from the biparental cross IR833-6-2-1-1///IR1561-149-1//IR24*4/O. nivara at the International Rice Research Institute. SPD, KHZ, and IR28 differ significantly from the landrace cultivars in morphological properties, abiotic and biotic stress tolerance, and quality features (Tables 6, 7). Hence, landrace and improved cultivars were crossed. The crosses were designed so that the parents differed greatly in agronomic and quality properties, making the hybrids superior. The population was developed from plant genetic materials under Gonbad Kavous University's license. All methods were performed in accordance with relevant guidelines and regulations.

Table 6 Parents' attributes.
Table 7 Hybrids' attributes.

The present work was performed on 9 crosses. KHZ and TAM were crossed first: 150 seeds of TAM (the male parent) and 150 seeds of KHZ (the female parent) were planted as single seedlings, with one KHZ plant placed next to a TAM plant. Half of the KHZ main panicle was emasculated and pollinated by TAM; the other halves of the paternal and maternal main panicles were selfed. The seeds of the selfed plants and of the first generation of their crosses were planted in the second year in 1-m rows as single seedlings. Ultimately, 5 plants were selected from each row to record the traits. The same procedure was followed for the other crosses (AHM × SPD, GHB × KHZ, IR28 × GHB, IR28 × TAM, SHP × GHB, SHP × SPD, TAM × KHZ, and TAM × SHP).

A transplanting spacing of 25 × 25 cm was used, and 30-day-old seedlings were transplanted with one plant per hill. After transplanting, a 3-inch water depth was maintained until seven days before harvest. Urea fertilizer was applied three times, for a total of 200 kg/ha: 25% at field preparation, 50% at 40 days after transplanting, and 25% before the flowering stage. Insects, diseases, and weeds were thoroughly controlled until harvest.

Features recording

In all trials, GY, UFP, HE, DF, PE, PL, FGN, PBN, FLL, FLW, FLA, and BI were recorded according to the standard evaluation system63. The evaluation method and the growth stage at which each trait was recorded are presented in Table 8.

Table 8 Features, measurement method and growth stage of recording.

Mathematical considerations

Data processing

The data were divided into two parts: the inputs of the AI models were the averages of the parents' attributes, and hybrid grain yield was used as the target. The data were randomized and normalized to improve the performance of the AI models. All analyses were performed in the MATLAB programming environment (https://www.mathworks.com) using its built-in functions from the Machine Learning and Deep Learning Toolbox (The MathWorks, Natick, MA, USA, 2020) and the Fuzzy Logic Toolbox (The MathWorks, Natick, MA, USA, 2020).
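The MATLAB sketch below illustrates this preprocessing step; parentMeans and hybridGY are hypothetical placeholder names for the averaged parental features and the measured hybrid grain yields, and the random values stand in for the field data, which are not reproduced here.

% Minimal preprocessing sketch: randomize the sample order and normalize
% inputs and target. parentMeans/hybridGY are placeholders for the real data.
parentMeans = rand(40, 12);            % placeholder: samples x parental features
hybridGY    = rand(40, 1);             % placeholder: hybrid grain yield target

idx = randperm(size(parentMeans, 1));  % randomize sample order
X   = parentMeans(idx, :);
y   = hybridGY(idx);

[Xn, psX] = mapminmax(X');             % normalize each feature to [-1, 1]
[yn, psY] = mapminmax(y');             % normalize the target; psY reverses it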

Feature selection

The GA and PSO algorithms were used to determine the features most strongly affecting hybrid performance. GA is a population-based evolutionary algorithm that searches a simulated population to find optimal parameter values. GA starts the optimization process by generating an initial population of random solutions to the problem. This population is repeatedly evaluated by the fitness function and evolved to minimize or maximize the objective. The main operators in GA are crossover and mutation: crossover combines solutions during optimization and is the main tool for exploring the search space, while mutation changes some solutions substantially and encourages broad exploration of the search space (Fig. 11).

Figure 11
figure 11

Genetic algorithm65. This flowchart represents the five main steps of the GA: building the initial population, selection, crossover, mutation, and reproduction. GA starts the optimization process by generating an initial population of random solutions to the problem. Crossover combines solutions during optimization, while mutation changes some solutions substantially and encourages broad exploration of the search space.
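Following the GA flow in Fig. 11, the MATLAB sketch below wraps the ga solver around a binary feature-inclusion mask. The data (Xall, t) are random placeholders, and olsMSE, an ordinary-least-squares scoring function, is a hypothetical stand-in for the model-based subset scoring used in this study.

% Sketch of GA-based wrapper feature selection with a 0/1 inclusion mask.
rng(1);
Xall = rand(40, 12);                   % placeholder: 40 samples x 12 features
t    = rand(40, 1);                    % placeholder target (hybrid grain yield)

% Hypothetical subset score: resubstitution MSE of an ordinary least-squares fit
olsMSE = @(Xs, y) mean((y - [Xs, ones(size(Xs,1),1)] * ...
                            ([Xs, ones(size(Xs,1),1)] \ y)).^2);
fitFcn = @(mask) olsMSE(Xall(:, logical(round(mask))), t);

nFeat = size(Xall, 2);
opts  = optimoptions('ga', 'PopulationSize', 50, ...
                     'MaxGenerations', 100, 'Display', 'off');
mask  = ga(fitFcn, nFeat, [], [], [], [], ...
           zeros(1, nFeat), ones(1, nFeat), [], 1:nFeat, opts);  % integer 0/1 genes
selectedGA = find(mask);               % indices of the retained features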

The PSO method was developed by observing the collective behavior of flocks of birds and schools of fish64. PSO is an evolutionary random search method, related to evolutionary programming and GA, that converges toward an optimal solution. In the PSO algorithm, each candidate solution is called a particle. Particles move through the n-dimensional search space over time according to their own velocities and the information they have gathered. Each particle updates its direction according to the best position it has found itself (Pbest) and the best position found by its neighbors (Gbest) (Fig. 12).

Figure 12
figure 12

PSO algorithm66. Particles move through the multidimensional search space according to their own velocities and accumulated information, updating their directions over time toward the best location (the optimal solution).
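A companion sketch for PSO-based selection is given below; it reuses the placeholder data (Xall, t, nFeat) and the hypothetical olsMSE scorer from the GA sketch above. Because particleswarm optimizes continuous positions, each position is thresholded at 0.5 to obtain the 0/1 feature mask.

% PSO-based feature selection: continuous particle positions are thresholded
% at 0.5 to form the binary inclusion mask before scoring.
psoFcn = @(pos) olsMSE(Xall(:, pos >= 0.5), t);
opts   = optimoptions('particleswarm', 'SwarmSize', 50, ...
                      'MaxIterations', 100, 'Display', 'off');
pos    = particleswarm(psoFcn, nFeat, zeros(1, nFeat), ones(1, nFeat), opts);
selectedPSO = find(pos >= 0.5);        % indices of the retained features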

Prediction of hybrid grain yield based on ANN

Prediction of hybrid yield from the measured parental features was performed with a multilayer perceptron (MLP) neural network (Fig. 13). This network has three layers of neurons. The number of neurons in the first layer equals the number of elements in the network input, and one neuron was used in the output layer. All inputs were applied to the network simultaneously, and the weights and thresholds were adjusted afterwards. The sigmoid transfer function (1) and the linear transfer function (2) were used in the hidden layer and the output layer, respectively. Figures 14 and 15 present the diagrams of these functions.

Figure 13
figure 13

Structure of the MLP. An MLP consists of input, hidden and output layers, each containing neurons connected to those of the adjacent layers.

Figure 14
figure 14

Sigmoid transfer function.

Figure 15
figure 15

Linear transfer function.

$$a=\frac{2}{1+\mathrm{exp}\left(-2\times n\right)}-1$$
(1)
$$a=purelin \left(n\right)=n$$
(2)

where n is the input of the neuron and \(a\) is its output.

To train the network, different back-propagation training algorithms were investigated. All networks were then trained with these algorithms using MATLAB R2020b.

The selected network has various parameters that must be set when the network is applied. These include the number of training epochs and the performance goal. In this study, the number of epochs was set to 200. The goal is the mean squared error (MSE) value at which the training algorithm stops; to achieve the best result, this value was set to zero. To calculate the network output, data that were unknown to the network and not provided during training were used. In addition, 70% of the data were used for training, 10% for validation, and 20% for testing the neural network.
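A minimal MATLAB sketch of this configuration is shown below, assuming the toolbox workflow described in this section: a tansig hidden layer, a purelin output, Levenberg–Marquardt training, at most 200 epochs, a goal of zero and a 70/10/20 split. The data are placeholders, and the 34 hidden neurons follow the example reported for TAM × SHP with GA-selected inputs.

% Sketch of the MLP configuration described above; Xn/yn are placeholder
% normalized inputs (features x samples) and targets.
Xn = rand(4, 40);  yn = rand(1, 40);

net = feedforwardnet(34, 'trainlm');    % one hidden layer, Levenberg-Marquardt
net.layers{1}.transferFcn  = 'tansig';  % sigmoid (Eq. 1) in the hidden layer
net.layers{2}.transferFcn  = 'purelin'; % linear (Eq. 2) in the output layer
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.10;
net.divideParam.testRatio  = 0.20;
net.trainParam.epochs      = 200;       % maximum number of epochs
net.trainParam.goal        = 0;         % MSE goal

[net, tr] = train(net, Xn, yn);         % stops early when validation MSE rises
yTest     = net(Xn(:, tr.testInd));     % predictions on the unseen test subset
testMSE   = perform(net, yn(tr.testInd), yTest);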

To evaluate the efficiency of the neural network in predicting product performance, MSE value was used:

$$MSE=\frac{\sum {({y}_{t}^{\wedge }-{y}_{t})}^{2}}{n}$$
(3)

In this relationship, \({y}_{t}\), \({y}_{t}^{\wedge }\) and n are the actual observed value (the actual hybrid grain yield), the value predicted by the model (the neural network), and the number of observations, respectively. We named the method that estimates hybrid performance with the neural network AI_HIB_ANN.

Prediction of hybrid grain yield based on SVM

The support vector machine, like the ANN, is a data-driven algorithm. These methods belong to the group of supervised learning methods used for classification, regression, prediction and clustering problems. As in an ANN, problem solving in an SVM is divided into two stages: training and testing (validation). The method was developed from computational learning theory67. Unlike other AI methods, which minimize the empirical (computational) error, SVM takes the functional (structural) risk as its objective and seeks its optimal value. The support vector regression model can map the problem into a higher-dimensional space using the kernel method. In a two-dimensional space there are infinitely many lines that separate the data of two classes; the training points closest to the hyperplane are called support vectors. The optimal separating hyperplane is the one with the maximum margin between the two classes, i.e., the margin C attains its maximum value67. According to the basics of analytic geometry,

$$C= \frac{2}{\Vert W\Vert }$$

So the maximum value of C is obtained when ||W|| has its lowest value. The general equation of the optimal hyperplane is as follows:

$${W}^{T}x+b=0$$

Some data points may not lie in the correct region of their class; in other words, a point may cross its own class boundary and fall within another class. If this degree of violation is denoted by ξ, the optimization problem becomes finding w such that the following expression is minimized:

$$Min \, \frac{1}{2} {\Vert W\Vert }^{2}+C \sum_{i}{\xi }_{i}$$

Parameter C is the penalty coefficient, and its optimal value may be obtained by trial or through optimization algorithms. When the data are not linearly separable, the separating hyperplane for the nonlinear case is obtained by introducing a "kernel function", which maps the data from the nonlinear space into a linear one.

The common form of Support Vector Regression (SVR) is ε-SVR. For the training data set \(X=\left\{{x}_{i},{y}_{i}\right\}, i=1,2,\dots ,n\), the approximation is done by finding a function f(x) that deviates from the target function g(x) by no more than ε (i.e. \(\left|f\left(x\right)-g(x)\right|<\varepsilon\)). By applying a map \(\Phi : {R}^{q}\to {R}^{{q}^{\prime}}\), with \({q}^{\prime}\ge q\), to the data set, the ε-SVR problem is written as:

$${\mathrm{min}}_{\alpha ,{\alpha }^{*}}=\frac{1}{2}\sum _{i,j=1}^{n}\left({\alpha }_{i}-{\alpha }_{i}^{*}\right)\left({\alpha }_{j}-{\alpha }_{j}^{*}\right)K\left({x}_{i},{x}_{j}\right)+\varepsilon \sum _{i=1}^{n}\left({\alpha }_{i}+{\alpha }_{i}^{*}\right)-\sum _{i=1}^{n}{y}_{i}\left({\alpha }_{i}-{\alpha }_{i}^{*}\right)$$
(4)
$$subject \, to \left\{\begin{array}{c}{\sum }_{i=1}^{n}\left({\alpha }_{i}-{\alpha }_{i}^{*}\right)=0\\ 0\le {\alpha }_{i},{\alpha }_{i}^{*}\le C\end{array}\right.$$
(5)

where C is the user tuned parameter, and K is the kernel function. The kernel function on two vectors v and z is defined as:

$$K\left(v,z\right)=\langle \Phi \left(v\right),\Phi \left(z\right)\rangle$$
(6)

The kernel function enables the transformation of the input space into a high-dimensional feature space in which the linear SVR algorithm can be applied. In regression problems, the Gaussian kernel is the most common choice:

$$K\left({x}_{i},{x}_{j}\right)={\mathrm{e}}^{-\gamma \left({x}_{i}-{x}_{j}\right)\cdot \left({x}_{i}-{x}_{j}\right)}$$
(7)

After the training step, the SVR function f(x) can be evaluated as follows:

$${y}_{i}=f \left(x\right)=\sum_{i=1}^{l}{w}_{i}K\left({x}_{i},x\right)+b$$
(8)

where x is the input vector, K is the kernel function, l is the number of training samples and \({w}_{i}=({\alpha }_{i}-{\alpha }_{i}^{*})\) is the weight. The vectors \({x}_{i}\) corresponding to nonzero \({w}_{i}\) are called the support vectors (SV). The weights are usually calculated by transforming the SVR optimization problem into its dual form, a constrained quadratic problem, and applying quadratic programming.
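For reference, the MATLAB sketch below fits a Gaussian-kernel ε-SVR with the hyperparameters named in the Results (box constraint, kernel scale and ε) tuned by the toolbox's built-in Bayesian optimizer; the data are placeholders and the tuning procedure is an assumption, not the exact optimization used in this study.

% Sketch of Gaussian-kernel epsilon-SVR; Xp/yp are placeholder data
% (samples in rows, parental features in columns).
Xp = rand(40, 5);  yp = rand(40, 1);

mdl = fitrsvm(Xp, yp, ...
    'KernelFunction', 'gaussian', ...
    'Standardize', true, ...
    'OptimizeHyperparameters', {'BoxConstraint', 'KernelScale', 'Epsilon'}, ...
    'HyperparameterOptimizationOptions', struct('ShowPlots', false, 'Verbose', 0));
yHat = predict(mdl, Xp);               % in-sample predictions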

We named the hybrid performance estimation method based on the SVM model AI_HIB_SVM.

Prediction of hybrid grain yield based on ANFIS

Modeling with ANFIS involves two stages, training and testing, using the experimental data. Of the data, 55% were used for network training and 45% for testing the trained networks to determine prediction accuracy; thus, the networks were tested with data other than the training data. Determining the number of rules and the type of membership function is highly important. To find the best network, networks with different numbers of rules and different membership functions were created in the MATLAB R2020b environment using the Fuzzy Logic Toolbox. Network training continued until the RMSE goal was reached or the number of epochs exceeded 100. Since the RMSE goal was the same for all networks, their performance could be compared. Finally, the performance of the networks in the test phase was compared, and the best network was selected based on its prediction accuracy in that phase. We named the hybrid performance estimation method based on the ANFIS model AI_HIB_ANFIS.
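A MATLAB sketch of this workflow is given below, assuming the 55/45 split and up to 100 epochs of hybrid learning described above; the data are placeholders, and the use of subtractive clustering to build a compact initial FIS is an illustrative assumption that does not reproduce the 9-rule structure reported in the Results.

% Sketch of the ANFIS workflow: initial Sugeno FIS from the training data,
% then hybrid-learning tuning with the test set used as validation data.
Xp  = rand(40, 5);  yp = rand(40, 1);  % placeholder data (samples in rows)
nTr = round(0.55 * size(Xp, 1));
trainData = [Xp(1:nTr, :)     yp(1:nTr)];
testData  = [Xp(nTr+1:end, :) yp(nTr+1:end)];

genOpt  = genfisOptions('SubtractiveClustering', 'ClusterInfluenceRange', 0.5);
initFis = genfis(trainData(:, 1:end-1), trainData(:, end), genOpt);

tuneOpt = anfisOptions('InitialFIS', initFis, 'EpochNumber', 100, ...
    'OptimizationMethod', 1, ...       % 1 = hybrid learning, 0 = backpropagation
    'ValidationData', testData, 'DisplayANFISInformation', 0);
[fis, trainErr, ~, chkFis, chkErr] = anfis(trainData, tuneOpt);

yTest = evalfis(chkFis, testData(:, 1:end-1));  % predictions on the test set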

In the neural network and ANFIS models, validation data sets were used to prevent overfitting. The main purpose of the validation data is to assess the ability of the model to predict the output values for unseen inputs and to prevent overfitting. When the model is being trained properly, the validation error decreases; if overfitting starts, the validation error suddenly increases and the training process is stopped. In the SVM model, fivefold cross-validation was used. In k-fold cross-validation, the data are randomly divided into k equal parts; each time, k − 1 parts are used to train the model while the remaining part is used to evaluate model performance. This process is repeated until each part has been used exactly once as a test set. After k-fold cross-validation, each data point has one observed output and one predicted output, the predicted output being the value calculated when the data point was in the test set. Therefore, in our opinion, the model presented in this article can be used in the real world.
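As a sketch of the fivefold cross-validation described above for the SVM model, the lines below assume mdl is a fitted fitrsvm model such as the one in the SVR sketch given earlier.

% Fivefold cross-validation of a fitted SVM regression model.
cvMdl = crossval(mdl, 'KFold', 5);     % random partition into 5 folds
yCV   = kfoldPredict(cvMdl);           % out-of-fold prediction for each sample
cvMSE = kfoldLoss(cvMdl);              % cross-validated mean squared error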