Soft-sensor modeling for l-lysine fermentation process based on hybrid ICS-MLSSVM

The l-lysine fermentation process is a complex, nonlinear, dynamic biochemical reaction process with multiple inputs and multiple outputs, and the relationships between its state variables are themselves nonlinear and dynamic. Some key variables that directly reflect the quality of the fermentation cannot be measured online in real time, which greatly limits the application of advanced control technology in biochemical processes. This work introduces a hybrid ICS-MLSSVM soft-sensor modeling method to realize the online detection of key biochemical variables (cell concentration, substrate concentration, product concentration) of the l-lysine fermentation process. First, a multi-output least squares support vector machine regressor (MLSSVM) model is constructed based on the multi-input and multi-output characteristics of the l-lysine fermentation process. Then, the important parameters (γ, λ, σ) of the MLSSVM model are optimized using the Improved Cuckoo Search (ICS) optimization algorithm. Finally, the hybrid ICS-MLSSVM soft-sensor model is developed using the optimized parameter values, and the key biochemical variables of the l-lysine fermentation process are estimated online.
The simulation results confirm that the proposed regression model can accurately predict the key biochemical variables. Furthermore, the hybrid ICS-MLSSVM soft-sensor model is better than the MLSSVM soft-sensor model based on standard CS (CS-MLSSVM), particle swarm optimization (PSO) algorithm (PSO-MLSSVM) and genetic algorithm (GA-MLSSVM) in prediction accuracy and adaptability.

l-Lysine is the second most produced amino acid globally and is used in animal feed, pharmaceuticals, cosmetics, the food industry and many other everyday applications. The estimated global market is 2.2 million tons and is growing at a rate of 10% per year 1 . To meet the increased demand, researchers are looking for ways to increase production without increasing plant capacity, which is time consuming and expensive. One of the best ways to increase productivity is to monitor and control the product concentration in real time; product concentration is the most intuitive indicator of fermentation quality (the higher the product concentration, the better the quality). Excessive accumulation of product in the reactor causes catabolic repression or osmotic stress for the bacteria 2 . Similarly, cell concentration and substrate concentration are paramount variables for increasing the output yield 3,4 . Cell concentration reflects the number of bacterial cells, and substrate concentration reflects the growth and reproduction status of the bacteria, which is closely related to fermentation metabolism and directly affects the final formation of the product. Measuring these key variables is necessary to control and optimize the fermentation process in real time and so enhance productivity. However, it is hard to measure cell, substrate and product concentration in real time during fermentation because of the highly time-varying, non-linear and uncertain nature of the process [5][6][7] .
Many costly offline analysis methods are used to measure these key biochemical variables, such as the dry weight method, direct staining method, optical density method and cell counting method. These methods suffer from large time delays, complex operations, large measurement errors and a high infection rate, and so cannot meet the requirements of real-time optimization control: they cannot reflect the current state of the fermentation process in time, and they cannot follow the real-time dynamics of the l-lysine fermentation process. Soft-sensor technology has been introduced to solve these problems [8][9][10] ; it constructs inferential mathematical models that predict real-time values of unmeasurable variables from easily measurable variables 11 . Reported results show that soft-sensor technology can effectively improve real-time process monitoring and fermentation product quality. The successful implementation of these virtual sensors has, to a large extent, revolutionized the fermentation industry.
Scientific Reports | (2020) 10:11630 | https://doi.org/10.1038/s41598-020-68081-4 | www.nature.com/scientificreports/
In soft-sensor modeling of the fermentation process, lab-scale data of inputs (easily measurable in real-time using physical sensors) and outputs (cannot be measured in real-time) is collected offline. A non-linear mapping function between inputs and outputs is constructed using some well known data-driven prediction models. Researchers have proposed many data-driven methods for the soft-sensor technology in fermentation. Liu et al. exploited artificial neural network (ANN) to build a soft-sensor for measuring the key variables of marine alkaline protease MP fermentation process 12 . Chong et al. used support vector machine (SVM) to model penicillin fermentation process and results proved that SVM is better than ANN modelling methods 13 . Sang et al. have proposed a model based on least square SVM (LSSVM) to estimate biomass concentration to facilitate on-line monitoring 14 . As computational time complexity of SVM increases with the increase of the size of the dataset, LSSVM solved the curse of dimensionality limitation and it is less dependent on the size of the dataset which has good generalization ability as compared to radial basis function (RBF) neural network 15,16 .
At present, the traditional LSSVM (a multi-input single-output model) has proved useful in many everyday applications, but the standard formulation of the algorithm cannot efficiently handle multi-output regression problems. In general, the outputs are assumed to be mutually independent, and a separate model is constructed for each output. Because this traditional regression model can only predict a single output, the useful information about the temporal correlation between different outputs is neglected, which costs time and lowers prediction accuracy. To solve these problems, researchers have designed multi-output regression algorithms, such as multi-output least squares support vector machine regressor models, and have shown that multi-output models are more effective than single-output models 17,18 . In addition, multi-output models are simple and computationally inexpensive 19 .
Meanwhile, optimization algorithms are employed to optimize the important parameters of data-driven models to increase the prediction accuracy. Chen et al. have used Particle Swarm Optimization (PSO) to optimize the weights and threshold of ANN instead of Back Propagation (BP) because of its inherent problems 20 . Robles et al. presented a method to choose the regularization and kernel parameters of SVM using PSO 21 . Genetic Algorithm (GA) is introduced to optimize SVM parameters 22 . Similarly, many other metaheuristic algorithms like Cuckoo Search (CS) 23 , Ant Colony Optimization (ACO) 24 , and Artificial Bee Colony (ABC) 25 have been used in many industrial applications.
In this work, a novel multi-output least squares support vector machine (MLSSVM) regressor model is introduced to construct the soft-sensor model of the l-lysine fermentation process. The single-output LSSVM model is an improved form of SVM (which overcomes the possible overfitting of ANN), has a lower time cost and provides efficient generalization ability 26,27 . However, for multi-output problems, the correlation information among outputs must be used for accurate modeling. Hence, MLSSVM is employed for the multi-output soft-sensor model of l-lysine fermentation; it uses the correlation information among outputs to find a non-linear mapping function between multiple inputs and multiple outputs. Furthermore, the selection of the MLSSVM model parameters is important for the prediction accuracy of the model. Thus, a metaheuristic optimization algorithm with good local and global search ability and fast convergence should be selected to choose the best model parameters. In multi-peak optimization, the CS algorithm has been reported to obtain the optimal solution more reliably than the PSO, GA, Differential Evolution (DE) and ABC algorithms 28,29 . However, the local search ability and convergence speed of CS need improvement because two of its parameters, the probability ' p a ' and the step size ' α ', have fixed values 30,31 . To overcome these problems and improve the prediction ability, the optimum parameters of MLSSVM are selected using an Improved Cuckoo Search (ICS) optimization algorithm, which addresses the above-mentioned problems of the CS algorithm and provides optimum MLSSVM parameters because of its improved local and global search ability.
The proposed hybrid ICS-MLSSVM regression method is also compared with MLSSVM optimized by standard CS (CS-MLSSVM), PSO (PSO-MLSSVM) and GA (GA-MLSSVM); the comparison shows that ICS-MLSSVM outperforms CS-MLSSVM, PSO-MLSSVM and GA-MLSSVM in terms of prediction accuracy and adaptability. Although CS, PSO and GA have been very successful in many applications, every optimization problem has a new, unknown search space. According to the "no free lunch" (NFL) theorem 32 , an optimization algorithm that is successful on a particular set of optimization problems may not be successful on others. In our case, ICS proved more capable than CS, PSO and GA of avoiding locally optimal solutions and finding the best global solution.

Methods
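Single-output Least Squares Support Vector Machine (LSSVM). Suykens et al. proposed LSSVM 33 by introducing equality constraints in place of the inequality constraints of SVM 34,35 , so that the convex quadratic programming (QP) problem is converted into a linear system of equations. The basic modeling principle is as follows. Suppose there are l training examples {(x_i, y_i) | i = 1, 2, ..., l}, where x_i ∈ R^n is an input vector and y_i ∈ R is the corresponding output. LSSVM learns a mapping function between inputs and outputs defined as:

$$f(x) = \omega^{T}\varphi(x) + b \tag{1}$$

The optimization problem for regression LSSVM is as follows:

$$\min_{\omega, b, \xi} J(\omega, \xi) = \frac{1}{2}\omega^{T}\omega + \frac{g}{2}\sum_{i=1}^{l}\xi_i^{2} \quad \text{s.t.} \quad y_i = \omega^{T}\varphi(x_i) + b + \xi_i, \; i = 1, 2, \ldots, l \tag{2}$$

where ω is the weight vector; g ∈ R+ is the penalty parameter; ξ_i is the error variable; b is the bias; ϕ(·) is a mapping to a high-dimensional space. The Lagrange method is used to solve this problem:

$$L(\omega, b, \xi, \alpha) = J(\omega, \xi) - \sum_{i=1}^{l}\alpha_i\left(\omega^{T}\varphi(x_i) + b + \xi_i - y_i\right) \tag{3}$$

where α_i is a Lagrange multiplier. According to the KKT (Karush-Kuhn-Tucker) conditions, the problem is transformed into the linear system:

$$\begin{bmatrix} 0 & 1_l^{T} \\ 1_l & K + g^{-1} I_l \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \tag{4}$$

where y = [y_1, y_2, ..., y_l]^T; 1_l = [1, 1, ..., 1]^T; I_l is the identity matrix of order l; α = [α_1, α_2, ..., α_l]^T; K is the kernel matrix, whose kernel function satisfies Mercer's condition:

$$K_{ij} = K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j) \tag{5}$$

This study uses the RBF kernel function because of its strong generalization ability and performance 36 :

$$K(x, x_i) = \exp\left(-\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}}\right) \tag{6}$$

where σ is the kernel function width. Finally, the LSSVM function is estimated as:

$$f(x) = \sum_{i=1}^{l}\alpha_i K(x, x_i) + b \tag{7}$$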
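The single-output LSSVM described above reduces training to one linear solve. The following is a minimal NumPy sketch under stated assumptions: the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`), the toy sine data, the values g = 100 and σ = 1, and the RBF convention exp(−‖x − x_i‖² / (2σ²)) are all illustrative choices, not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix, K[i, j] = exp(-||A[i] - B[j]||^2 / (2 sigma^2)) (assumed convention)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, g, sigma):
    """Solve the (l+1) x (l+1) KKT linear system for the bias b and multipliers alpha."""
    l = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    M = np.zeros((l + 1, l + 1))
    M[0, 1:] = 1.0                      # first row:    [0, 1_l^T]
    M[1:, 0] = 1.0                      # first column: [0; 1_l]
    M[1:, 1:] = K + np.eye(l) / g       # K + I / g
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]              # b, alpha

def lssvm_predict(X_train, alpha, b, sigma, X_new):
    """f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Toy data (hypothetical): a noisy sine curve.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 6.0, 40)[:, None]
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(40)

b, alpha = lssvm_fit(X, y, g=100.0, sigma=1.0)
y_hat = lssvm_predict(X, alpha, b, 1.0, X)
```

On the training points the fitted values differ from the targets by exactly α_i / g, which is the error variable ξ_i of the primal problem; a larger g therefore forces a closer fit to the training data.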

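Multi-output Least Squares Support Vector Machine (MLSSVM). The l-lysine fermentation process is a complex system because the bacteria continuously take up substances from the external environment into the cells, obtain energy for survival through a series of biochemical reactions, and expel metabolites. During the fermentation process, the growth of bacteria and the formation of products are not parallel. In a specific bioreactor, the relationship between biological growth and process control, environmental impact, reactor characteristics, etc. is intricate. This forms a complex multi-input/multi-output nonlinear system. Considering these nonlinear multiple-input multiple-output (MIMO) characteristics, the traditional LSSVM method needs to be extended to multi-output problems. The single-output regression LSSVM formulation can be readily extended to a multiple-output MLSSVM; Xu et al. designed an MLSSVM model that exploits correlation information among outputs 37 . MLSSVM aims at finding a mapping function between the multi-input and multi-output spaces, and thus considers the correlation information between different outputs. Given a set of examples {(x_i, y_i) | i = 1, 2, ..., l}, where x_i ∈ R^n is an input vector and y_i ∈ R^m is the corresponding output vector, MLSSVM finds a non-linear mapping R^n → R^m. It finds W = (w_1, w_2, ..., w_m) ∈ R^{n_h×m} and b = (b_1, b_2, ..., b_m)^T ∈ R^m by solving the following optimization problem:

$$\min_{W, b} J(W, \Xi) = \frac{1}{2}\,\mathrm{trace}(W^{T}W) + \frac{\gamma}{2}\,\mathrm{trace}(\Xi^{T}\Xi) \quad \text{s.t.} \quad Y = Z^{T}W + \mathrm{repmatrix}(b^{T}, l, 1) + \Xi \tag{8}$$

where Y ∈ R^{l×m} is the output matrix whose i-th row is y_i^T, Ξ = (ξ_1, ξ_2, ..., ξ_m) ∈ R^{l×m} is the matrix of error variables, and repmatrix(A, m, n) creates an (m × n) block matrix by tiling copies of a given matrix A; ' × ' denotes a simple multiplication operator. It is assumed that every w_i ∈ R^{n_h} can be written as w_i = w_0 + v_i, where w_0, v_i ∈ R^{n_h}. If the outputs are similar, the vectors v_i are small, and if the outputs differ from each other, the mean vector w_0 is small. In other words, w_0 carries the correlation information and v_i carries the contrast information.

The objective function for solving w_0, V and b is:

$$\min J(w_0, V, \Xi) = \frac{1}{2}w_0^{T}w_0 + \frac{\lambda}{2m}\,\mathrm{trace}(V^{T}V) + \frac{\gamma}{2}\,\mathrm{trace}(\Xi^{T}\Xi) \quad \text{s.t.} \quad Y = Z^{T}W + \mathrm{repmatrix}(b^{T}, l, 1) + \Xi \tag{9}$$

where V = (v_1, v_2, ..., v_m) ∈ R^{n_h×m}, W = (w_0 + v_1, w_0 + v_2, ..., w_0 + v_m), Z = (ϕ(x_1), ϕ(x_2), ..., ϕ(x_l)) ∈ R^{n_h×l}, and λ, γ ∈ R are two real positive regularization penalty parameters. The Lagrange function for the optimization is as follows:

$$L(w_0, V, b, \Xi, A) = J(w_0, V, \Xi) - \mathrm{trace}\left(A^{T}\left(Z^{T}W + \mathrm{repmatrix}(b^{T}, l, 1) + \Xi - Y\right)\right) \tag{10}$$

where A = (α_1, α_2, ..., α_m) ∈ R^{l×m} is the Lagrange multiplier matrix. Using the KKT conditions, the following set of equations is obtained:

$$\frac{\partial L}{\partial w_0} = 0 \Rightarrow w_0 = \sum_{i=1}^{m} Z\alpha_i, \quad \frac{\partial L}{\partial V} = 0 \Rightarrow V = \frac{m}{\lambda} Z A, \quad \frac{\partial L}{\partial b} = 0 \Rightarrow A^{T}1_l = 0_m, \quad \frac{\partial L}{\partial \Xi} = 0 \Rightarrow A = \gamma\,\Xi, \quad \frac{\partial L}{\partial A} = 0 \Rightarrow Z^{T}W + \mathrm{repmatrix}(b^{T}, l, 1) + \Xi - Y = 0 \tag{11}$$

Since w_i = w_0 + v_i, Eq. 11 implies w_0 = (λ/m) Σ_i v_i, so the above optimization problem can be rewritten in terms of V and b as:

$$\min J(V, \Xi) = \frac{\lambda^{2}}{2m^{2}}\,\mathrm{trace}(V 1_m 1_m^{T} V^{T}) + \frac{\lambda}{2m}\,\mathrm{trace}(V^{T}V) + \frac{\gamma}{2}\,\mathrm{trace}(\Xi^{T}\Xi) \quad \text{s.t.} \quad Y = Z^{T}V + \mathrm{repmatrix}\left(\frac{\lambda}{m} Z^{T} V 1_m, 1, m\right) + \mathrm{repmatrix}(b^{T}, l, 1) + \Xi \tag{12}$$

Eq. 8 seeks only small vectors, which decouples the parameters of the different outputs. In Eq. 12, by contrast, the change in the first term yields a problem that additionally seeks a trade-off between small vectors, trace(V^T V), and closeness of all vectors to their average, trace(V 1_m 1_m^T V^T). As in LSSVM, a linear system is obtained from the KKT conditions:

$$\begin{bmatrix} 0_{m \times m} & P^{T} \\ P & H \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0_m \\ y \end{bmatrix} \tag{13}$$

where P = blockdiag(1_l, 1_l, ..., 1_l) ∈ R^{ml×m} has the vectors 1_l at the diagonal positions and zeros elsewhere, H = Ω + γ^{-1} I_{ml} + (m/λ) Q ∈ R^{ml×ml}, Ω = repmatrix(K, m, m), Q = blockdiag(K, K, ..., K), K = Z^T Z ∈ R^{l×l} is the kernel matrix, α = (α_1^T, α_2^T, ..., α_m^T)^T ∈ R^{ml} and y = (y_1^T, y_2^T, ..., y_m^T)^T ∈ R^{ml}, with y_i here denoting the i-th column of Y. This linear system consists of ((l + 1) × m) equations. Finally, the function of MLSSVM is estimated as:

$$f(x) = \varphi(x)^{T}W + b^{T} = \mathrm{repmatrix}\left(\sum_{i=1}^{m}\sum_{j=1}^{l}\alpha_{i,j}K(x, x_j), 1, m\right) + \frac{m}{\lambda}\sum_{j=1}^{l}\alpha^{j}K(x, x_j) + b^{T} \tag{14}$$

where α_{i,j} is the j-th entry of α_i and α^j is the j-th row of A. The linear system in Eq. 13 is hard to solve directly because its coefficient matrix is not positive definite, so it is converted to a positive definite linear system by a small transformation as follows 37 :

$$\begin{bmatrix} S & 0_{m \times ml} \\ 0_{ml \times m} & H \end{bmatrix} \begin{bmatrix} b \\ H^{-1}Pb + \alpha \end{bmatrix} = \begin{bmatrix} P^{T}H^{-1}y \\ y \end{bmatrix} \tag{15}$$

where S = P^T H^{-1} P ∈ R^{m×m} is a positive definite matrix. The solution for α and b then follows from:

$$b = S^{-1}\eta^{T}y, \qquad \alpha = \nu - \eta\, b \tag{16}$$

where S = P^T η, and η and ν are calculated from Hη = P and Hν = y; H ∈ R^{ml×ml} is a positive definite matrix.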
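The two-step solution described above (solve Hη = P and Hν = y, then form S = P^T η) can be sketched in NumPy as follows. This is an illustration under assumptions rather than the authors' implementation: the block structure H = Ω + I/γ + (m/λ)Q with Ω a tiling of the kernel matrix and Q its block-diagonal counterpart follows the multi-output LS-SVR formulation of Xu et al. cited in the text, and the function names, toy data and parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix, K[i, j] = exp(-||A[i] - B[j]||^2 / (2 sigma^2)) (assumed convention)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def mlssvm_fit(X, Y, gamma, lam, sigma):
    """Return the multiplier matrix A (l x m) and the bias vector b (m,)."""
    l, m = Y.shape
    K = rbf_kernel(X, X, sigma)
    # Assumed structure: H = Omega + I/gamma + (m/lam) * Q
    H = np.tile(K, (m, m)) + np.eye(m * l) / gamma + (m / lam) * np.kron(np.eye(m), K)
    P = np.kron(np.eye(m), np.ones((l, 1)))   # (ml x m), blocks of 1_l on the diagonal
    y = Y.T.reshape(-1)                        # stack the m output columns into one vector
    eta = np.linalg.solve(H, P)                # H eta = P
    nu = np.linalg.solve(H, y)                 # H nu  = y
    S = P.T @ eta                              # S = P^T H^{-1} P
    b = np.linalg.solve(S, eta.T @ y)
    alpha = nu - eta @ b
    return alpha.reshape(m, l).T, b            # column i of A is alpha_i

def mlssvm_predict(X_train, A, b, lam, sigma, X_new):
    """Shared part (from w_0) plus output-specific part (from v_i) plus bias."""
    m = A.shape[1]
    Kx = rbf_kernel(X_new, X_train, sigma)
    shared = Kx @ A.sum(axis=1, keepdims=True)
    return shared + (m / lam) * (Kx @ A) + b

# Toy data (hypothetical): three correlated outputs of two inputs.
rng = np.random.default_rng(1)
X = rng.uniform(-2.0, 2.0, size=(30, 2))
Y = np.column_stack([np.sin(X[:, 0]), np.cos(X[:, 1]), np.sin(X[:, 0]) + np.cos(X[:, 1])])

gamma, lam, sigma = 50.0, 1.0, 1.0
A, b = mlssvm_fit(X, Y, gamma, lam, sigma)
Y_hat = mlssvm_predict(X, A, b, lam, sigma, X)
```

Only systems with the positive definite matrices H and S are solved, which is the point of the transformation; for large l a Cholesky factorization of H could be reused for both right-hand sides.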
The selection of the MLSSVM model parameters (the penalty factors λ, γ and the kernel width control factor σ) plays a critical role in building a soft-sensor model. This work employs the ICS optimization algorithm to optimize the MLSSVM parameters, replacing the traditional experience-based and trial-and-error methods. The optimal parameters obtained are used to build a more accurate soft-sensor model.

Cuckoo search optimization algorithm. Optimization algorithms such as GA, PSO and ACO have proved more successful than conventional algorithms at solving real-world problems. The intuition behind CS comes from the breeding behaviour of the cuckoo, which lays its eggs in the nest of another species 38 . The host bird may identify and discard an alien egg, or abandon the nest and build a nest at a new position by utilizing the Lévy flight idea 39 . The basic rules are: every cuckoo lays only one egg at a time and randomly selects a nest in which to hatch it; the optimal solution (best nest) is preserved for the next generation; the number of nests is fixed, and the host bird may discover the alien egg with probability p_a ∈ (0, 1). The nest position is updated using Lévy flight according to the following relation:

$$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus L(\mu), \quad i = 1, 2, \ldots, n \tag{17}$$

where x_i^{(t)}, α and n refer to the current location of the nest, the step size, and the total number of host nests, respectively; ⊕ stands for entry-wise multiplication; L(µ) is a random flight step drawn from a Lévy distribution, with 1 < µ < 3. After the position is updated, the probability p_a is compared with a randomly generated number r ∈ (0, 1). If p_a > r the nest position remains the same, and if p_a < r the nest location x_i^{(t+1)} is changed randomly.
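The standard CS loop described above can be sketched in plain Python. Several details are assumptions made only for illustration: Lévy steps are drawn with Mantegna's algorithm (β = 1.5), α is a fixed scalar, a "randomly changed" nest is re-drawn uniformly within the bounds, the current best nest is always kept, and the objective is a toy sphere function.

```python
import math
import random

def levy_step(rng, beta=1.5):
    """One heavy-tailed step via Mantegna's algorithm (an assumed way to sample Levy steps)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def cuckoo_search(f, bounds, n=15, pa=0.25, alpha=0.1, iters=200, seed=0):
    """Minimise f over box bounds with fixed pa and alpha (standard CS)."""
    rng = random.Random(seed)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    nests = [random_nest() for _ in range(n)]
    fit = [f(x) for x in nests]
    history = []
    for _ in range(iters):
        # Levy flight: propose x_new = x_i + alpha * L and compare with a random nest j.
        for i in range(n):
            cand = clip([v + alpha * levy_step(rng) for v in nests[i]])
            fc = f(cand)
            j = rng.randrange(n)
            if fc < fit[j]:
                nests[j], fit[j] = cand, fc
        # Discovery: if r > pa the nest is changed randomly; the best nest is preserved.
        best = min(range(n), key=fit.__getitem__)
        for k in range(n):
            if k != best and rng.random() > pa:
                nests[k] = random_nest()
                fit[k] = f(nests[k])
        history.append(min(fit))
    best = min(range(n), key=fit.__getitem__)
    return nests[best], fit[best], history

# Toy objective (hypothetical): 2-D sphere function.
best_x, best_f, history = cuckoo_search(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)
```

Because replacements are accepted only when they improve a nest and the best nest is never re-drawn, the best fitness recorded in `history` cannot increase from one generation to the next.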

Improved Cuckoo Search (ICS) optimization algorithm. Although the CS optimization algorithm has excellent global search performance, its convergence speed and local search ability are weak when locating the optimum solution. The reason is that the values of ' p a ' and the step size ' α ' are fixed in the standard Cuckoo Search algorithm. For large α, the convergence speed decays quickly, which results in poor performance of the algorithm. For small α, the algorithm requires too many iterations to reach the best solution. Furthermore, a small ' p a ' increases the quality (accuracy) of the best solution in each generation but decreases solution diversity, whereas a large value of p a increases solution diversity and leads to premature convergence 40 . In standard CS, ' p a ' and ' α ' are the key parameters and play an essential part in the global and local search ability of the algorithm. Therefore, in this paper, the values of ' p a ' and ' α ' are selected adaptively. At the start, the values are set large enough to increase the diversity of solutions, and with each iteration they are decreased to improve the fine-tuning of solutions in later generations. The originally fixed discovery probability is updated according to the following formula 41 :

$$p_a(N_i) = p_{a\max} - \frac{N_i}{N_{\max}}\left(p_{a\max} - p_{a\min}\right) \tag{18}$$

where p_amax, p_amin, N_max and N_i represent the maximum discovery probability, the minimum discovery probability, the maximum number of iterations and the current iteration, respectively. The step size ' α ' is decreased over the iterations in the same adaptive manner, according to Eqs. 19 and 20.
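The adaptive schedules described above can be sketched as two small functions. The linear decrease of p_a uses the four quantities defined in the text; the exponential decrease of α with a decay constant k, the symbols `alpha_max` and `alpha_min`, and all default values are assumptions borrowed from the improved cuckoo search schedule commonly used in the literature, since the text does not state the form of the step-size update.

```python
import math

def discovery_probability(n_i, n_max, pa_max=0.5, pa_min=0.05):
    """Linearly decrease p_a from pa_max (iteration 0) to pa_min (iteration n_max)."""
    return pa_max - (n_i / n_max) * (pa_max - pa_min)

def step_size(n_i, n_max, alpha_max=0.5, alpha_min=0.01):
    """Assumed exponential decay of alpha from alpha_max to alpha_min."""
    k = math.log(alpha_min / alpha_max) / n_max   # decay constant (k is negative)
    return alpha_max * math.exp(k * n_i)

# Schedules over a hypothetical run of 100 iterations.
n_max = 100
pa_schedule = [discovery_probability(i, n_max) for i in range(n_max + 1)]
alpha_schedule = [step_size(i, n_max) for i in range(n_max + 1)]
```

Both schedules start large, which favours exploration, and shrink monotonically, which favours fine-tuning around the best nest in later generations.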

ICS-MLSSVM soft-sensor modeling method. The model parameters of MLSSVM, namely the penalty factors λ, γ and the kernel width control factor σ, play an important role in its performance. As in SVM and LSSVM, a very large value of γ leads to remarkably high accuracy on the training data but lower accuracy on test data, while a low value makes the model less functional and gives poor performance 21 15 . Based on the above references, we propose to use an "Improved Cuckoo Search" algorithm to find the optimum values of the MLSSVM parameters. Figure 1 depicts the ICS algorithm in steps. The steps of ICS-MLSSVM are as follows:
Step 1: Prepare the training, cross-validation and test datasets and perform normalization.
Step 3: Define an objective function. In our work we have used formula 23, where y(i) and y ′ (i) are actual and the predicted values respectively. For the current generation, according to the objective function, the optimal solution f min is calculated and reserved for the next generations.
Step 4: For every new iteration, update the values of p a , α and k using Eqs. 18, 19 and 20.
Step 5: Randomly select a cuckoo x i by using Lévy flight with fitness f (x i ) and calculate fitness function at any other randomly selected host nest. If f (x i ) is less than previously stored fitness value, replace that solution (host nest) with a new solution (newly selected nest). Otherwise, leave the same solution for next iterations.
Step 6: Generate a random number r ∈ (0, 1) following uniform distribution and compare it with updated probability p a . If r > p a , randomly change the bird's nest position and if r < p a , the nest remains unchanged (good quality solution).
Step 7: Test the new generation's nests and keep the best quality solution for the next generation. Finally, the best quality solution x

Results and discussion
Experimental setup. The experiment of l-lysine fed-batch fermentation was carried out on the control system platform of Jiangsu University, using the RT-100L-Y fermenter model. To keep the experiment close to the actual production process, the experimental process was designed as follows:
• The time period for every batch was 72 h and the sampling time period t was 15 min. The auxiliary inputs (temperature T, pH, speed of stirring motor r, dissolved oxygen D_o, acceleration rate of glucose flow u_1, acceleration rate of ammonia flow u_2 and air flow rate u_3) were collected in real time by the testing instruments.
• The key biochemical variables (cell concentration 'X', substrate concentration 'S' and product concentration 'P') were sampled every 2 h and tested offline in the laboratory. The key biochemical variables were then converted from 2-h sampled data to 15-min sampled data (consistent with the number of auxiliary input samples) in MATLAB using the "spline" method of the interpolation function interp1 (https://www.mathworks.com/help/matlab/ref/interp1.html). The cell concentration was obtained by the cell dry weight method: a centrifuge tube was filled with 10 ml of fermentation liquor and centrifuged for 5 min at 3000 r/min; the supernatant was set aside, and the residue was washed twice with distilled water, dried at 105 °C until its weight became constant, and then weighed. S was measured with an SBA-40C multi-function biosensor. P was determined by the modified ninhydrin colorimetric method: 2 ml of the supernatant and 4 ml of the ninhydrin reagent were mixed and heated in boiling water for 20 min; after cooling, the absorbance at 475 nm was measured with a spectrophotometer and P was obtained from the standard l-lysine curve.
• 10 batches were used for testing the modeling competence of the hybrid ICS-MLSSVM method. The initial conditions were set differently between batches, and the feeding strategy was also changed to enhance the differences between batches. The pressure of the fermentation tank was set to 0.1 MPa, the fermentation temperature was adjusted to 30 °C ± 10 °C, and the dissolved oxygen electrode was calibrated for the reference reading with the stirring motor rotating at 400 r/min.

Evaluation metrics. To assess the prediction accuracy, root mean square error (RMSE) and mean absolute error (MAE) are used.
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - y'_i\right)^{2}}, \qquad \mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - y'_i\right|$$

where y and y′ refer to the actual and the predicted value, respectively, and m is the total number of points. Furthermore, the difference between the actual data and the predicted data is plotted to make the error visible.

The error plots on the right-hand side of Figs. 3, 4, 5, 6, 7 and 8 show a clear difference between ICS-MLSSVM and the corresponding comparative methods. Because the range of the y-axis (output concentrations) is much larger than the error between the actual and predicted curves, the error is difficult to see on the prediction plots. The error values in these plots are therefore computed as the difference between the actual and predicted values (y − y′). A significant difference can be observed in the error analysis presented in these plots. All outputs (cell, substrate and product concentration) in such a batch fermentation data set are correlated with each other. For example, if the substrate is consumed (its concentration decreases), the cell and product concentrations increase at the same time. However, the traditional single-output LSSVM can predict only one output at a time, so the correlation among the outputs is not considered in the training process, which affects the prediction accuracy of the model. The proposed MLSSVM regression model exploits correlation and contrast information among all outputs to find an accurate mapping function between the multivariate input space and the multivariate output space. Table 1 illustrates the root mean square error (RMSE) value comparison, and Table 2

Conclusion
In this paper, a hybrid ICS-MLSSVM soft-sensor modeling method is proposed for measuring crucial parameters of the l-lysine fermentation process.
Based on the multi-input and multi-output characteristics of the fermentation process, this paper constructs a multi-output MLSSVM model of the process to measure the key parameters. The proposed MLSSVM method predicts all outputs simultaneously, exploits the useful correlation information among different outputs and requires only a single model for all outputs. In this way, it also reduces computational time compared with single-output LSSVM, in which a separate model must be designed for each output because each model can only predict a single output independently. Furthermore, considering the importance of the three crucial model parameters (the penalty factors λ, γ and the kernel width control factor σ) of the multi-output MLSSVM regression model to the performance of the soft-sensor model, these parameter values are selected using a novel ICS optimization algorithm. ICS replaces the traditional methods based on experience and trial-and-error and finds optimal values of the MLSSVM model parameters, which results in a more accurate soft-sensor model, and the results show the superiority of ICS over CS, PSO and GA. Simulation results show that ICS-MLSSVM outperforms CS-MLSSVM, PSO-MLSSVM and GA-MLSSVM in prediction accuracy and adaptability. The model achieves real-time identification from a small amount of input/output data and thus eliminates the need for an exact kinetic model of the fermentation process. In future, this algorithm can be used to solve complex, non-linear, time-varying and strongly coupled fermentation problems in industry. In future work, we intend to extend this work and use the model in Model Predictive Control to control the fermentation process and maintain the desired conditions to increase the yield.