MLR and ANN Approaches for Prediction of Synthetic/Natural Nanofibers Diameter in the Environmental and Medical Applications

Fiber diameter plays an important role in determining the properties of electrospun nanofibers. However, one major problem is the lack of a comprehensive method that links processing parameters to nanofiber diameter. The objective of this study is to develop artificial neural network (ANN) modeling and multiple linear regression (MLR) analysis approaches to predict the diameter of nanofibers. Processing parameters, including weight ratio, voltage, injection rate, and distance, were considered as independent variables and the nanofiber diameter as the dependent variable of the ANN model. The results of ANN modeling, especially its high accuracy (R² = 0.959) in comparison with the MLR results (R² = 0.564), introduced the prediction of the diameter of nanofibers model (PDNFM) as a comparative model for predicting the diameter of poly(ε-caprolactone) (PCL)/gelatin (Gt) nanofibers. According to the sensitivity analysis of the model, weight ratio, distance, injection rate, and voltage, in that order, were identified as the most significant parameters influencing PDNFM.

www.nature.com/scientificreports
increases by using MLR while it declines when independent variables increase. Nonlinear and dynamic modeling techniques such as artificial neural networks (ANNs) are modeling tools for complex cases, quality control, data mining, and linear and nonlinear multivariate regression problems 16-19. In recent years, the ANN approach, one of the most popular artificial intelligence approaches, has been used to model the electrospinning technique, mostly with the aim of predicting the diameter of electrospun nanofibers 16,20. Recent studies have demonstrated the accuracy of the multilayer perceptron (MLP) artificial neural network relative to other techniques such as radial basis function (RBF) networks and support vector machines (SVMs) in predicting nanofiber diameter. Researchers have attributed the reliability of ANN results in nanofiber studies to the complex interactions between the variables that influence nanofiber formation 21. However, the capability of ANN techniques in predicting nanofiber diameter has not been compared with classical regression methods such as MLR. The objective of this research is to compare the classical regression method with a multilayer perceptron artificial neural network (MLP) for predicting the diameter of electrospun PCL/Gt nanofibers, and to develop a probabilistic model to predict the diameter of PCL/Gt nanofibers (PDNF) using objective criteria.

Material and Methods
Materials. Gelatin from porcine skin, type A (gel strength ~300 g Bloom), PCL (Mw = 80,000 g/mol), glacial acetic acid, and formic acid were all purchased from Sigma-Aldrich.
Preparation of polymer solution and electrospinning. Separate solutions of PCL and gelatin were prepared by dissolving 15% w/w of each polymer in a 9:1 mixture of glacial acetic acid and formic acid (AA/FA) with a magnetic stirrer for 4 h. The PCL and gelatin solutions (PCL/Gt) were then mixed at different weight ratios (80:20, 70:30, 60:40, 50:50, 40:60, 30:70, and 20:80) for 20 h prior to electrospinning. For electrospinning, each PCL/Gt solution was loaded into a 5 ml syringe fitted with a 23 G needle. Electrospinning was carried out at an injection rate of 0.6-2 ml/h. The distance between the needle tip and the collector was 5-20 cm. The applied voltage was in the range of 6-22 kV. All experiments were conducted at room temperature 22,23.
Characterization. The morphology of the nanofibers was observed under a scanning electron microscope (SEM, DSM-960A Model, ZEISS, Germany) at an accelerating voltage of 20 kV. Before SEM imaging, samples were coated with gold. For each sample, the average fiber diameter was determined from about 70 random measurements using ImageJ software.
Artificial intelligence modeling. In recent years, the ANN has become one of the main tools for modeling and controlling electrospinning processes 24,25. As a computing tool, an ANN is a network of several layers containing many interconnected processing elements (PEs), each of which is aware only of the signals it receives 26. ANNs are capable of learning from real samples of a problem, using transfer functions between neurons and specific learning algorithms implemented in computer software 27-29.
Four parameters, namely the applied voltage (X1, kV), the injection rate of the solution (X2, ml/h), the weight ratio of the polymers (X3, wt%), and the needle-to-collector distance (X4, cm), were considered as input variables of the ANN, and the average PCL/Gt nanofiber diameter (Y, nm) was chosen as the output. In this study, hyperbolic tangent, sigmoid tangent, and linear transfer functions were examined to optimize the performance of the neural network 30. Backpropagation (BP) was applied as the learning algorithm to calculate the derivatives of the performance function with respect to the weight and bias variables X.
For evaluation, all samples (761 samples) were randomly divided into three subsets. The training data set contained 60% of all samples (457 samples), the validation data set contained 20% (152 samples), and the test data set contained 20% (152 samples). The validation data set is used to reduce the possibility of over-fitting, or memorizing: when the error on the training data set decreases while the error on the validation data set increases, the training process is stopped and over-fitting is controlled 28,31.
An ANN may become trapped in a local minimum of the error, and the momentum coefficient helps to avoid such traps. The possibility of under-fitting is therefore reduced by using the momentum coefficient. In this research, the momentum coefficient, initial momentum, and learning rate are 0.9, 0.001, and 0.01, respectively. The Levenberg-Marquardt (LM) learning algorithm was used to train the network. This algorithm solves generic curve-fitting problems, but it may become trapped in a local minimum; the momentum coefficient was therefore assigned to avoid the local minimum trap. The Levenberg-Marquardt algorithm is more robust than other algorithms and in many cases yields the best network performance 32,33.
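The 60/20/20 split and the validation-based stopping rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the random seed, and the `patience` value are hypothetical and do not come from the study, which used MATLAB.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Randomly partition sample indices into training, validation, and test sets."""
    rng = np.random.default_rng(seed)  # seed is an arbitrary choice
    order = rng.permutation(n_samples)
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

def should_stop(val_errors, patience=6):
    """Return True once the validation error has not improved for `patience` epochs.

    `patience` is an assumed setting, not a value reported in the text.
    """
    if len(val_errors) <= patience:
        return False
    best_before = min(val_errors[:-patience])
    return min(val_errors[-patience:]) >= best_before

# 761 samples, as reported in the text
train_idx, val_idx, test_idx = split_indices(761)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 761 samples, rounding 60% and 20% gives 457 and 152 samples, and the remaining 152 form the test set, matching the subset sizes reported above.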
To design the structure of the feed-forward back-propagation networks, a program was written in MATLAB (Version R2016b). There is no predefined rule for determining the number of neurons and layers in an ANN structure, so these are defined by trial and error 28,34. In this study, the number of neurons and layers was increased to reduce the output error until further increases no longer improved the accuracy of the model.
Model selection. The performance of the designed ANN was evaluated by different statistical indicators: mean squared error, MSE (Eq. 1); root mean squared error, RMSE (Eq. 2); mean absolute error, MAE (Eq. 3); coefficient of determination, R² (Eq. 4); and the Nash-Sutcliffe model efficiency coefficient, NSE (Eq. 5) 35.
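Equations 1-5 are not reproduced in this text. Assuming the standard definitions of these indicators, they would read as follows; the form of R² (the squared correlation between targets and outputs) is an assumption, chosen because the accompanying symbol definitions include the mean of the output values.

```latex
\begin{align}
\mathrm{MSE}  &= \frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2 \tag{1}\\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2} \tag{2}\\
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right| \tag{3}\\
R^2 &= \frac{\left[\sum_{i=1}^{n}\left(y_i-\bar{y}\right)\left(\hat{y}_i-\bar{\hat{y}}\right)\right]^2}
            {\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2\,\sum_{i=1}^{n}\left(\hat{y}_i-\bar{\hat{y}}\right)^2} \tag{4}\\
\mathrm{NSE}  &= 1-\frac{\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2} \tag{5}
\end{align}
```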
where $y_i$ and $\hat{y}_i$ are the targets and network outputs, $\bar{y}$ is the mean of the target values, $\bar{\hat{y}}$ is the mean of the output values, and $n$ is the number of samples. Sensitivity analysis was conducted to rank the parameters of the prediction of the diameter of PCL/Gt nanofibers model (PDNFM) by the significance of each parameter to the model output.
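The statistical indicators used for model selection (MSE, RMSE, MAE, R², and NSE) can be computed with a few lines of NumPy. The sketch below assumes the standard definitions and takes R² as the squared Pearson correlation between targets and outputs; the function name and the sample values are illustrative only and are not data from the study.

```python
import numpy as np

def regression_metrics(y, y_hat):
    """Compute MSE, RMSE, MAE, R^2, and NSE for targets y and model outputs y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    residuals = y - y_hat
    mse = np.mean(residuals ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(residuals))
    # Assumption: R^2 is the squared Pearson correlation of targets and outputs
    r = np.corrcoef(y, y_hat)[0, 1]
    r2 = r ** 2
    # Nash-Sutcliffe efficiency: 1 minus residual variance over target variance
    nse = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2, "NSE": nse}

# Illustrative diameters in nm (made-up numbers, not measurements)
targets = [210.0, 340.0, 455.0, 520.0]
outputs = [225.0, 330.0, 470.0, 500.0]
print(regression_metrics(targets, outputs))
```

Note that R² defined this way is insensitive to a constant offset between targets and outputs, whereas NSE penalizes it, which is one reason to report both.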
Sensitivity analysis of the model. To analyze the importance of each electrospinning parameter, each input parameter was withdrawn in turn while the other parameters were left unchanged, and the model was then trained for every pattern. In practice, the standard deviation of each input variable was calculated, and each input was varied around its mean value (within the limits of its standard deviation) to determine the resulting changes in the output. The standard deviation of the output values produced by the changes in each input variable was taken as the sensitivity of the model to that variable. The changes in the model output with changes in the input variables (within the limits of the standard deviation) illustrate the trend of the model 21.
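One possible reading of this procedure, in which each input is swept within one standard deviation of its mean while the others are held at their means, is sketched below. The function `sensitivity_by_std`, the number of sweep steps, and the toy model are all hypothetical stand-ins; the trained network from the study is not available here.

```python
import numpy as np

def sensitivity_by_std(model, X, n_steps=21):
    """Score each input by the std of the model output when that input alone
    is varied within +/- one standard deviation of its mean."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        # Hold every input at its mean, then sweep only input j
        probe = np.tile(mean, (n_steps, 1))
        probe[:, j] = np.linspace(mean[j] - std[j], mean[j] + std[j], n_steps)
        scores[j] = np.std(model(probe))
    return scores

def toy_model(X):
    """Stand-in for a trained network: depends strongly on input 2, weakly on input 0."""
    return 5.0 * X[:, 2] + 1.0 * X[:, 0]

rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 4))  # synthetic inputs, not study data
scores = sensitivity_by_std(toy_model, X_demo)
ranking = np.argsort(scores)[::-1]  # most influential input first
print(ranking, scores)
```

For the toy model, input 2 ranks first and input 0 second, while the two unused inputs receive scores of essentially zero.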

Results
In this research, two predictive approaches, MLR analysis and the ANN model, were investigated and their PDNF model predictions compared.
MLR model. Four independent variables (needle-to-collector distance, injection rate of the polymer solution, weight ratio, and applied voltage) were used to predict the diameter of nanofibers. To avoid any possible bias in the selection of test samples, the total samples (761 samples) were randomly divided into two subsets: a training subset containing 80% of the samples (609 samples) and a test subset containing 20% of the samples (152 samples). Using the training subset, the constant coefficients of the regression equation were calculated by minimizing the sum of squared errors. Prediction was then carried out on the test subset (152 samples). Equation 6 was used to predict the PDNFM, where D and V are the needle-to-collector distance and the voltage, respectively. Statistical indices were calculated to estimate the accuracy of the MLR model in the prediction of PDNFM, and the findings are presented in Table 1.
The relation between the target PDNFM values and those predicted by the MLR model is plotted with a linear regression fit (Fig. 1).
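The least-squares fitting step of the MLR model can be sketched as below. Because the study's data and the coefficients of Eq. 6 are not available in this text, the sketch uses synthetic inputs drawn from the reported parameter ranges and an arbitrary linear response with noise; every coefficient in it is a placeholder, not a result.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares: returns [intercept, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend a column of ones for the intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Evaluate the fitted linear model on new inputs."""
    return coef[0] + np.asarray(X, dtype=float) @ coef[1:]

rng = np.random.default_rng(0)
n = 761
# Synthetic inputs within the reported ranges: voltage (kV), injection rate (ml/h),
# PCL weight ratio (wt%), needle-to-collector distance (cm)
X = np.column_stack([
    rng.uniform(6, 22, n),
    rng.uniform(0.6, 2.0, n),
    rng.uniform(20, 80, n),
    rng.uniform(5, 20, n),
])
# Placeholder response: arbitrary coefficients plus noise, NOT the study's Eq. 6
y = 400 - 5 * X[:, 0] + 30 * X[:, 1] - 2 * X[:, 2] + 1.5 * X[:, 3] + rng.normal(0, 20, n)

order = rng.permutation(n)
train, test = order[:609], order[609:]  # 80% / 20% split as in the text
coef = fit_mlr(X[train], y[train])
y_pred = predict_mlr(coef, X[test])
print(coef)
```

Fitting on the 609 training samples and predicting the 152 held-out samples mirrors the procedure described for the MLR model; the accuracy indices would then be computed on `y_pred` against `y[test]`.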

Artificial intelligence modeling.
The parameters affecting PDNFM, as input variables, and PDNFM, as the output, were compiled in MATLAB to design the most accurate ANN structure. These data were used to train the feed-forward neural networks. The maximum value of R² over all data was used as the selection criterion (Table 2). The best ANN structure is (4-28-28-1), which means 4 parameters as inputs, two hidden layers of 28 neurons each, and 1 output. In the model, p_i denotes the input layer values, IW_ji the weights of neurons, LW_ji the weights of layers, and b_i the bias, and tansig is the sigmoid tangent function (tansig(x) = 2/(1 + exp(−2x)) − 1). IW_ji and LW_ji form large weight matrices that are stored in MATLAB; the model is therefore evaluated in MATLAB by running the designed network 32,36,37.
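The forward pass of a 4-28-28-1 network with tansig hidden layers can be sketched as follows. The weights here are random placeholders rather than the trained IW and LW matrices, and the linear output layer is an assumption, since the output transfer function is not stated in this text. Input scaling, which would normally precede the network, is also omitted.

```python
import numpy as np

def tansig(x):
    """Sigmoid tangent as defined in the text; numerically equal to tanh(x)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def init_network(sizes=(4, 28, 28, 1), seed=0):
    """Create placeholder (weight, bias) pairs for each layer of a 4-28-28-1 network."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, p):
    """Propagate inputs p (shape: samples x 4) through the network."""
    a = np.asarray(p, dtype=float)
    for W, b in params[:-1]:
        a = tansig(a @ W.T + b)   # hidden layers use tansig
    W_out, b_out = params[-1]
    return a @ W_out.T + b_out    # assumption: linear output layer

params = init_network()
# One input pattern: voltage (kV), injection rate (ml/h), weight ratio (wt%), distance (cm)
p = np.array([[15.0, 1.0, 50.0, 12.0]])
print(forward(params, p))  # meaningless value: the weights are untrained
```

The structure holds 981 adjustable parameters in total (4×28 + 28 weights and biases in the first hidden layer, 28×28 + 28 in the second, and 28 + 1 in the output layer).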
A scatter plot is used to show the correlation between variables 38. Figure 2 provides the scatter plot of the ANN output versus the target (observed) values of PDNFM for the training, validation, test, and all data. Judging by R², the correlation between the ANN output and the target values of PDNFM is relatively high. Figure 3 compares the target and simulated values of PDNFM in the training, validation, and test data sets and in all data, and shows close agreement between the target and simulated values.
The main application of PDNFM is to predict nanofiber size from the electrospinning processing parameters. The model could be applied as a decision support tool for predicting the diameter of electrospun nanofibers, reducing time and cost. The comparison of PDNFM_MLR and PDNFM_MLP shows that PDNFM_MLP is the more accurate model for predicting the diameter of electrospun PCL/Gt nanofibers (Fig. 4).

Sensitivity analysis of PDNFM.
To analyze the importance of each electrospinning parameter, each input parameter was withdrawn in turn while the other parameters were left unchanged, and the PDNFM was then trained for every pattern. Fig. 5 shows the contribution of each input parameter of the developed PDNFM to the output. The sensitivity analysis indicates that the PCL/Gt weight ratio, the needle-to-collector distance, the injection rate of the polymer solution, and the applied voltage, in that order, are the critical factors for PDNFM (Fig. 5).
The effect of electrospinning processing parameters on MLP outputs (the diameter of PCL/Gt nanofibers) is shown in Fig. 6.

Discussion
In this investigation, the main goal was to assess which modeling technique has better accuracy. The experimental data and the data predicted by the MLR analysis and the MLP model were therefore evaluated to obtain an accurate fit. Error analysis was used for this purpose, and R², MSE, RMSE, and MAE were calculated. The resulting MLP model, with R² = 0.959, is in closer agreement with the experimental results than the MLR analysis. The MLR analysis has a low R² value, meaning that it has a lower accuracy than the MLP model. A correlation coefficient of one represents a perfect correlation between the targets and the output values of the training/testing data 39-41. The results of the ANN modeling, especially its high accuracy (R² = 0.959) in comparison with the MLR results (R² = 0.564), introduced PDNFM_MLP as a comparative model for predicting the diameter of PCL/Gt nanofibers.
This indicates that the MLP model offers higher predictive accuracy. These values are nonetheless considered satisfactory, because the electrospinning process and the diameter of electrospun nanofibers are highly complex 9. Previous studies report similar findings 42-44. Nurwaha and Wang (2013) compared neuro-fuzzy inference systems (ANFIS), support vector machines (SVMs), and MLR for evaluating the diameter of electrospun nanofibers. Their results showed that the SVM model performed better than the ANFIS and MLR techniques: the RMSE and MAE values were 8.21 and 6.56 for the SVM, 9.98 and 8.89 for the ANFIS, and 19.73 and 15.78 for MLR, respectively 43. To determine the effects of the content of poly(butylene adipate) and teriflunomide on the initial burst effect and dissolution behavior, Siafaka et al. (2016) compared ANN and MLR models. The ANN model was more accurate and showed better correlation than the MLR analysis; the R² values for the MLR and ANN models were 0.85 and 0.945, respectively 42. Vle et al. (2015) measured the physical properties of nylon-6 fibers and compared them with values predicted by MLR and ANN models. Considering all relevant data, the ANN model appears to be applicable for efficiently predicting the physical properties of fibers. The ANN model showed good correlation and provided stable responses in comparison to MLR. Overall, these results indicated that the ANN model would be very useful for predicting combined interactions between independent variables 44. ANN techniques offer the advantage of modeling a nonlinear and complicated problem without the need to find suitable functional forms for the problem, and their learning ability also gives them high efficiency in nonlinear system modeling 43,45.
Together, these studies indicate that ANN techniques performed well and gave stable responses in predicting combined interactions between independent parameters 42-44.
The present study explores the effect of the injection rate of the polymer solution, the applied voltage, the PCL/Gt weight ratio, and the tip-to-collector distance on the average nanofiber diameter. The sensitivity analysis revealed a close relationship between the processing parameters and the MLP output. According to the sensitivity analysis results (see Fig. 5), the PCL/Gt weight ratio has a strong effect on the average nanofiber diameter. The fiber diameter decreased with increasing PCL content in the AA/FA solution and with increasing applied voltage (see Fig. 6(a) and (b)); that is, the fiber diameter is inversely correlated with both the polymer weight ratio and the applied voltage. Increasing the PCL content results in a weaker emulsion in the polymer solution (Fig. 6(a)). Other studies suggest that the average diameter decreases with PCL content when acetic acid or an acetic acid/formic acid mixture is used as the solvent. The emulsion structure is related to the absence of miscibility, or very limited miscibility, between PCL and Gt, and to the weak interaction of PCL and Gt with AA and FA. This means that at higher PCL content, the emulsion structure is weakened 22. These results are in agreement with the findings of Denis et al. 22. Furthermore, the viscosity of the polymer solution decreases with an increase in the PCL content of the polymer solution blend. Accordingly, thinner fibers formed because the jet could be stretched more easily by electrostatic forces 46,47. Considering the trends in Fig. 6(b), the applied voltage in electrospinning of PCL/Gt is negatively correlated with the average diameter. At high applied voltage, the electrical field strength is high, which stretches the jet more along its path, and the nanofiber diameter is therefore expected to decrease 46. In general, the decrease in fiber size is attributed to the increased surface charge on the jet at higher voltage or field strength.
This observation is similar to previously published reports 21,48,49. Figure 6(c) shows the effect of the injection rate on the diameter of nanofibers. The diameter first increases and then decreases as the injection rate increases. Previous investigations suggest similar results 9,50. With increasing injection rate, the nanofiber diameter is expected to increase. Accordingly, an increase in the diameter of nanofibers was obtained with an increase in the injection rate of the polymer solution, owing to the larger amount of polymer solution at the tip of the needle 9,46,49,50. However, once the injection rate exceeded a certain limit, the nanofiber diameter decreased continuously. Increasing the injection rate results in a higher electrical field, a higher volumetric charge density on the droplet jet, and a greater tensile force, which stretches the jet along its path and thereby decreases the nanofiber diameter 5,48. One interesting finding is that distance has a twofold effect on nanofiber size (Fig. 6(d)). An increase in the distance between needle and collector was accompanied by an increase in the size of nanofibers, but beyond a certain distance the diameter of nanofibers decreased. This behavior is explained by two competing effects: a decrease in solvent evaporation time before the nanofibers are deposited on the collector, versus a reduction in fiber diameter with increasing distance, owing to the longer solvent evaporation time and the stretching of the jet before it is deposited on the collector 51. Other studies have reported that an increase in the distance between needle and collector causes a decrease in nanofiber size. This is probably due to the formed jet breaking into two or more jets, leading to finer nanofibers 9.
With regard to recent progress in electrospinning, the findings suggest that modeling methods such as ANN techniques can have important implications for controlling and predicting the diameter of electrospun nanofibers, which is a critical factor in determining the properties of nanofibers.

Conclusions
In this research, the application of multiple regression analysis and the MLP model to predicting the diameter of electrospun PCL/Gt nanofibers was studied. The findings suggest that an ANN technique can be used quite effectively to predict the diameter of nanofibers. The main application of PDNFM_MLP is to predict the diameter of nanofibers based on electrospinning processing parameters. As a decision support tool, PDNFM could assist researchers, engineers, and laboratory experts in fabricating electrospun nanofibers with a defined fiber diameter. It can be worthwhile in terms of cost, time, and scientific aims. However, it should be noted that the effects of electrospinning parameters depend strongly on the type of polymer used. Future research that takes more parameters into account is suggested, to achieve higher accuracy over a more extensive application range.

Data availability
This article has no additional data.