Prediction of heavy-section ductile iron fracture toughness based on machine learning

The preparation process and composition design of heavy-section ductile iron are the key factors affecting its fracture toughness, and they are challenging to optimize because of the long casting cycle, high cost and complex influencing factors of this type of iron. In this paper, 18 cubic physical simulation test blocks with a 400 mm wall thickness were prepared by adjusting the C, Si and Mn contents of heavy-section ductile iron using a homemade physical simulation casting system. Four locations with different cooling rates were selected from each block, yielding 72 specimens of heavy-section ductile iron with different compositions and cooling times. Six machine-learning-based fracture toughness prediction models were constructed from the measured data, with the C content, Si content, Mn content and cooling rate as the input data and the fracture toughness as the output data. The experimental results showed that the constructed bagging model predicts the fracture toughness of heavy-section ductile iron with high accuracy, with a coefficient of determination (R2) of 0.9990 and a root mean square error (RMSE) of 0.2373.


Effect of C
C is the basic element of heavy-section ductile iron and promotes graphitization; the carbon content is generally controlled between 3.5 and 3.9 wt%. Too high a carbon content leads to graphite flotation, while too low a content leads to a low spheroidization rate and reduced mechanical properties.

Effect of Si
Si is a strong graphitizing element that can effectively reduce the white iron tendency, increase the amount of ferrite and improve the roundness of the graphite nodules. However, silicon also raises the ductile-brittle transition temperature and reduces the impact toughness. Therefore, the silicon content should not be too high and is generally controlled between 1.7 and 3.8 wt%.

Effect of Mn
The main role of Mn is to increase the stability of pearlite and promote the formation of carbides; these carbides segregate at grain boundaries, which greatly reduces the toughness of ductile iron. The preparation of heavy-section ductile iron generally requires a ferrite matrix to obtain high fracture toughness. However, because the core cools for a long time, a large amount of pearlite easily forms and degrades the toughness. The Mn content is therefore generally limited to less than 0.5 wt%.

Effect of cooling rate
For heavy-section ductile iron, the increased wall thickness slows casting solidification, so spheroidization fading, coarse graphite, chunky graphite, elemental segregation and other defects occur easily, and the microstructure and fracture toughness of heavy-section ductile iron differ significantly from those of normal ductile iron. To obtain heavy-section ductile iron with high fracture toughness, water cooling is generally used to increase the cooling and solidification rate. However, because wall thicknesses differ in actual production, the solidification time of the core of a test block often exceeds 150 min even with water cooling, so this method cannot increase the cooling rate, and thus the fracture toughness, without limit.

The relationship between micro-structure and four factors
To explore the relationship between the four factors (C content, Si content, Mn content and cooling rate), the fracture toughness and the microstructure, samples with different values of these factors were selected for study, as shown in Fig. 1. Figure 1 shows the effects of C, Si, Mn and the cooling rate (a subset of the measured data) on the microstructure of heavy-section ductile iron. Figure 1a,b have different C contents of 3.3 wt% and 3.6 wt%, respectively, with the other components the same. Figure 1c,d have different Si contents of 1.9 wt% and 2.3 wt%, respectively, with the other components the same. Figure 1e,f have different Mn contents of 0.1 wt% and 0.7 wt%, respectively, with the other components the same. Figure 1g,h have different cooling rates, with solidification times of 145 min and 265 min, respectively, and the same composition.
Figure 1a,b show that an appropriate increase in the C content promotes graphite nucleation and increases the number of graphite nodules. When the Si content reaches 2.3 wt%, the number of graphite nodules increases and their roundness improves, as shown in Fig. 1c,d. An increase in the Mn content leads to a significant increase in the amount of pearlite in the matrix structure, and a large amount of chunky graphite is produced, as shown in Fig. 1e,f. When the cooling rate of heavy-section ductile iron is increased, the number of graphite nodules increases greatly and the graphite morphology improves, as shown in Fig. 1g,h.
Figure 2 shows the fracture morphologies from the fracture toughness tests in this paper (a subset of the measured data); the composition of each sample is the same as in Fig. 1.
Figure 2a,b show that an appropriate increase in the C content increases the number of graphite nodules; under external loading, a large number of graphite nodules relieve stress concentrations, so the heavy-section ductile iron exhibits more dimples and tends toward ductile fracture. Figure 2c,d show that when the Si content reaches 2.3 wt%, the increase in the number and roundness of the graphite nodules likewise shifts the heavy-section ductile iron from mixed ductile-brittle fracture to ductile fracture. An increase in the Mn content significantly increases the amount of pearlite in the matrix and produces a large amount of chunky graphite, which leaves many cleavage planes and river patterns on the fracture surface; the fracture morphology changes from mixed ductile-brittle fracture to brittle fracture, as shown in Fig. 2e,f. When the cooling rate of heavy-section ductile iron is increased, the many small and round graphite nodules give the fracture morphology of a typical ductile fracture, as shown in Fig. 2g,h.
Figure 3 shows the measured data in this paper: the influence of C, Si, Mn and the cooling rate (a subset of the measured data) on the fracture toughness of heavy-section ductile iron. Figure 3 shows that the chemical elements C, Si and Mn have nonlinear relationships with the fracture toughness of heavy-section ductile iron; only the cooling rate shows a linear relationship, with the fracture toughness increasing as the cooling rate increases. Consequently, preparing heavy-section ductile iron parts with high fracture toughness in actual production requires a large number of tests to find a reasonable combination of chemical composition and forced cooling method.

Test sample preparation
In this paper, a total of 18 large heavy-section ductile iron test blocks with different chemical compositions were cast. Benxi Q12 pig iron, 75% ferrosilicon, grade 45 steel and graphite powder were melted in a medium-frequency induction furnace, and a spheroidizing treatment was applied using a Ce-Mg-Si nodulizing agent. The molten metal was poured into furan resin sand molds to obtain heavy-section ductile iron cubic test blocks with a size of 400 mm × 400 mm × 400 mm. The chemical composition of the nodulizing agent is shown in Table 1. The 18 heavy-section ductile iron test blocks with different C, Si and Mn contents were labeled Casting 1 through Casting 18, and their chemical compositions are listed in Table 2.
Four positions from the edge to the core of each of the 18 test blocks were selected as sampling sites representing the typical cooling rates of heavy-section ductile iron, as shown in Fig. 4. A total of 72 specimens were sampled for microstructure observation and fracture toughness measurement. The cooling conditions of the 18 test blocks were the same, and the temperature was measured using platinum-rhodium thermocouples and the KingView configuration-software temperature measurement system. Figure 5 shows the casting temperature measurement process and the solidification times of the 4 temperature measurement positions. From Fig. 5c, it can be seen that position 1 cooled fastest, with a solidification time of 135 min, followed by position 2, with a solidification time of 220 min; positions 3 and 4 had solidification times of 255 and 265 min, respectively, each exceeding 250 min. In this paper, the fracture toughness prediction of heavy-section ductile iron was investigated, with the C content, Si content, Mn content and cooling time as the four input factors for machine learning and the fracture toughness as the output.

Fracture toughness test of heavy-section ductile iron
The fracture toughness was tested at room temperature using an MTS 809 electrohydraulic servo material testing machine according to the standard GB/T 4161-2007. The dimensions of the compact tension (CT) fracture toughness specimens are shown in Fig. 6. In this paper, six machine learning methods were selected to study the influence of C, Si, Mn and the cooling rate on the fracture toughness of heavy-section ductile iron, and corresponding regression models for predicting the fracture toughness of heavy-section ductile iron were constructed.

XGBoost model
XGBoost is a parallel regression tree model based on the idea of boosting. Boosting refers to the weighted summation of a number of existing classifiers to obtain the final classifier. The XGBoost model was developed by Chen 30 as an improvement on the gradient-boosted decision tree (GBDT). In this model, each nonleaf node represents a feature, and each leaf node represents a label or decision result. When a decision tree is applied, the sample to be predicted is tested against the judgment condition at each node, according to the feature value corresponding to that node, to determine the next node; this is repeated iteratively until a definite decision result is obtained.
Since the decision tree structure is simple and logical, it is prone to overfitting. The random forest approach was therefore derived to reduce overfitting. A random forest constructs multiple random training sets and trains a weak-learner decision tree on each of them. The decision trees are then integrated to obtain a strong learner with performance superior to any single tree, which avoids overfitting 31. Since the integrated decision trees are independent of each other and exchange no feedback, the gradient-boosted decision tree method was derived. Each tree in the GBDT (gradient-boosted decision tree) is fitted to the residuals (the differences between the observations and the predictions of the preceding trees), and the final result is the sum of the results of all trees. However, this also makes parallel operation impossible. XGBoost extends the GBDT method by presorting and saving the training data before the model is trained and using the sorted data during the iteration process 32. The model calculates the feature gain when a new node is selected and splits on the feature with the larger gain to form the next layer of child nodes. Because the data are presorted, XGBoost can evaluate splits on multiple features at the same time, thus enabling parallel operation and saving model training time. At the same time, its objective function not only includes the common loss function but also adds a regularization term and uses the column-sampling method of random forests, which reduces overfitting and accelerates parallel computation. The core of the XGBoost algorithm is the objective function that combines the evaluation errors of the decision trees:

Obj(m) = Σi Loss(yi, ŷi(m−1) + fm(xi)) + γ Ω(fm)

In this equation, Loss is the loss function of the deviation between the actual value and the predicted value, yi is the actual value of the predicted data, ŷi(m−1) is the result of the previous decision trees, and xi is the input data; the algorithm thus fits the summed results of multiple decision trees, while fm is the approximation function used by the m-th decision tree. Meanwhile, γ is the penalty coefficient of the regularization term Ω used by the decision tree, and the basic decision tree approximation function is

fm(x) = w_q(x)

where q(x) maps a sample x to a leaf of the tree and w is the vector of leaf weights. In this paper, the XGBoost model parameters were searched by the random search method, and fivefold cross-validation was used to evaluate the model performance to arrive at the best model. The model is shown in Fig. 7.
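The random-search-plus-fivefold-cross-validation procedure described above can be sketched with scikit-learn. As an assumption for illustration, sklearn's GradientBoostingRegressor stands in for XGBoost (the xgboost package may not be available), and the data and coefficients below are synthetic, not the measured values from this paper:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.default_rng(0)
# Synthetic stand-in for the measured data: columns are C, Si, Mn (wt%) and cooling time (min).
X = rng.uniform([3.3, 1.7, 0.1, 135], [3.9, 3.8, 0.7, 265], size=(72, 4))
# Hypothetical toughness-like response for illustration only.
y = 30.0 + 5.0 * X[:, 0] - 20.0 * X[:, 2] - 0.02 * X[:, 3] + rng.normal(0.0, 0.5, 72)

# Random search over common boosting hyperparameters, scored by fivefold CV R^2.
param_dist = {
    "n_estimators": [50, 100, 200],
    "max_depth": [2, 3, 4],
    "learning_rate": [0.05, 0.1, 0.2],
    "subsample": [0.7, 0.85, 1.0],
}
search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_dist, n_iter=10, cv=5, scoring="r2", random_state=0,
)
search.fit(X, y)
best_model = search.best_estimator_
```

`search.best_params_` then holds the winning hyperparameter combination, and `search.best_score_` its mean cross-validated R2.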

Support vector regression
Support vector regression (SVR) is a supervised learning algorithm used to predict continuous values and uses the same principles as the support vector machine (SVM). The basic idea is to map the data into a high-dimensional space and find the optimal hyperplane for regression, and thus the line of best fit. Compared with traditional regression algorithms, SVR considers not only the degree of data fit but also the generalizability of the model, so it can effectively handle high-dimensional and nonlinear data 33. In the regression task, the SVR places a restriction on the interval: the deviation of the regression model f(x) from y must be at most ε for all sample points. This range is referred to as the ε-tube.
The SVR optimization problem can be expressed mathematically as 34

min over w, b: (1/2)‖w‖²  subject to |yi − (w·xi + b)| ≤ ε for all i

Here, f(x) = w·x + b is obtained such that it is as close as possible to y, and w and b are the parameters to be determined. The loss is zero when f(x) and y are identical. Support vector regression assumes that a deviation of at most ε between f(x) and y can be tolerated. The loss is calculated if and only if the absolute value of the difference between f(x) and y is greater than ε, which is equivalent to constructing a band of width 2ε centered on f(x). Training samples are considered correctly predicted if they fall within this band, as shown in Fig. 8.
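The ε-tube idea can be illustrated with sklearn's SVR, which the experimental section of this paper uses; the one-dimensional data below are synthetic and for illustration only:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 10.0, size=(80, 1))
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.2, 80)

# epsilon is the half-width of the tube: deviations within epsilon incur no loss,
# while C penalizes points that fall outside the tube.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X, y)
residuals = np.abs(model.predict(X) - y)
```

Most residuals end up near or inside the ε half-width, reflecting the tube constraint in the optimization problem above.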

Multi-layer perception model
The MLP regressor is a supervised learning algorithm. Figure 9 shows an MLP model with only 1 hidden layer; the left side is the input layer, and the right side is the output layer 35.
MLP is short for multilayer perceptron. In addition to the input and output layers, it can have one or more hidden layers in the middle; a linearly separable data problem can be solved even without a hidden layer 36. The layers of the multilayer perceptron shown in Fig. 9 are fully connected to each other. The bottom layer of the multilayer perceptron is the input layer, the middle layer is the hidden layer, and the last layer is the output layer. The input layer is a 4-dimensional vector, and the hidden layer shown contains 4 neurons 37. The neurons in the hidden layer are fully connected to the input layer. Assuming that the input is represented by the vector X, the output of the hidden layer is H = f(W1X + b1), where W1 is the weight matrix (the connection coefficients) and b1 is the bias (in the design in Fig. 9, the bias is 0 by default). The function f can be the commonly used sigmoid or tanh function. Finally, the output layer is connected to the hidden layer through a mapping analogous to multiclass logistic regression, that is, softmax regression; the output of the output layer is Y = softmax(W2H + b2), where H is the hidden layer output f(W1X + b1) and b2 is the bias of the output layer.
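A single-hidden-layer MLP of this kind can be sketched with sklearn's MLPRegressor. Note that for regression sklearn uses an identity output rather than the softmax described above (softmax applies to classification); the 4-dimensional inputs and the target below are synthetic assumptions for illustration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
# Hypothetical 4-dimensional input (C, Si, Mn, cooling time) and toughness-like target.
X = rng.uniform([3.3, 1.7, 0.1, 135], [3.9, 3.8, 0.7, 265], size=(72, 4))
y = 40.0 - 10.0 * X[:, 2] - 0.03 * X[:, 3] + rng.normal(0.0, 0.3, 72)

# One hidden layer with a tanh activation, as in the Fig. 9 description.
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16,), activation="tanh",
                 solver="lbfgs", max_iter=5000, random_state=0),
)
mlp.fit(X, y)
```

Standardizing the inputs matters here because the cooling time (hundreds of minutes) would otherwise dominate the wt% features.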

Gaussian process regression model
The Gaussian process model is a statistical tool for constructing stochastic processes and is widely used in the fields of machine learning, statistics and information processing.The Gaussian process model is based on the principles of probability theory.By modeling and predicting observed data, this model can provide estimates of the probability distribution of unknown data points 38 .
In the Gaussian process model, it is assumed that the stochastic process under study obeys a multivariate Gaussian distribution for any set of inputs. Each point in the input space is projected to a random variable in the output space, and the joint distribution of these random variables constitutes the probabilistic model of the Gaussian process 39.
A Gaussian process model can be described by two underlying components: the mean function and the covariance function (also known as the kernel function). The mean function models the overall trend of the stochastic process, while the covariance function describes the correlation or similarity between different points 40.
Given a set of input points and corresponding observations, predictions can be made using a Gaussian process model. The result of the prediction is a conditional probability distribution over the unknown data points, which includes a predicted mean and a predicted uncertainty (variance). This estimate of the uncertainty is quantified by the covariance function, which portrays the correlation between the input points and reflects the uncertainty of the expected measurements 15. A Gaussian process is written as

f(x) ~ GP(m(x), k(x, x′))

where m(x) is the mean function and k(x, x′) is the covariance function. Considering that observations of the function are generally noisy,

y = f(x) + ε

where ε ~ N(0, σn²) is white noise with a variance of σn². The prior distribution of y can then be expressed as

y ~ N(m(x), K(x, x) + σn²I)

At the test dataset x*, the joint prior distribution of the observation set y and the prediction set ŷ (taking m = 0) can be expressed as

[y; ŷ] ~ N(0, [K(x, x) + σn²I, K(x, x*); K(x*, x), K(x*, x*)])

From this, the posterior distribution p(ŷ | x, y, x*) can be obtained. The output of the model prediction is the predicted mean

ŷ* = K(x*, x)[K(x, x) + σn²I]⁻¹ y

and the uncertainty of the prediction model is reflected by the prediction covariance

cov(ŷ*) = K(x*, x*) − K(x*, x)[K(x, x) + σn²I]⁻¹ K(x, x*)

The standard Gaussian process model can be represented as shown in Fig. 10.
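The posterior mean and covariance above are computed automatically by sklearn's GaussianProcessRegressor; a minimal sketch on synthetic one-dimensional data (the sine target is an assumption for illustration) is:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)
X = rng.uniform(0.0, 10.0, size=(40, 1))
y = np.sin(X[:, 0]) + rng.normal(0.0, 0.1, 40)

# The WhiteKernel term plays the role of the sigma_n^2 noise variance above.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0)
gpr.fit(X, y)
# predict returns both the posterior mean and the posterior standard deviation.
mean, std = gpr.predict(np.array([[5.0]]), return_std=True)
```

The returned standard deviation is the square root of the diagonal of cov(ŷ*), i.e., the pointwise prediction uncertainty.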

Bagging model
Bagging regression is an ensemble learning method that generates multiple subsets of the training data by randomly sampling the training data with replacement and then uses these subsets to train multiple base learners 41,42. The main idea of bagging regression is to improve the generalizability of the model by reducing its variance. Specifically, multiple subsets are generated from the training set X by random sampling with replacement, and each subset is used to train a base learner. Each base learner is a regression model, such as a decision tree regressor or a linear regressor 43. After training, the predictions of the base learners are averaged (or weighted-averaged) to obtain the final prediction Y. Bagging regression thus reduces the variance of the model and improves its generalizability; it can also be parallelized, accelerating model training. The process is shown in Fig. 11.
Let the expectation of a single model be µ; then, the expectation of the bagging regression is

E[(1/T) Σt ft(x)] = (1/T) Σt E[ft(x)] = µ

so averaging leaves the expectation unchanged while reducing the variance.
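That the ensemble output is exactly the average of the base learners can be verified directly with sklearn's BaggingRegressor (synthetic data, default decision-tree base learner):

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor

rng = np.random.default_rng(4)
X = rng.uniform(0.0, 10.0, size=(100, 2))
y = X[:, 0] * X[:, 1] + rng.normal(0.0, 0.5, 100)

# Default base learner is a decision tree; each one is fitted to a bootstrap sample.
bag = BaggingRegressor(n_estimators=50, bootstrap=True, random_state=0)
bag.fit(X, y)

# Reproduce the ensemble prediction by averaging the base learners' outputs
# (each learner may have been trained on a subset/permutation of the features).
per_tree = [est.predict(X[:, feats])
            for est, feats in zip(bag.estimators_, bag.estimators_features_)]
averaged = np.mean(per_tree, axis=0)
```

`averaged` coincides with `bag.predict(X)`, which is the averaging step described above.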

Random forest regression model
The random forest is an optimized combined prediction model proposed by Professor Breiman in 2001 based on the idea of bagging, with a further extension of randomness 44,45. The random forest algorithm uses bootstrap sampling to construct different tree models from random samples. In addition, the selection of the best splitting node of each tree is randomized so that the node variables of each tree also have randomness, which in turn generates a number of regression trees with high prediction accuracy that are mutually uncorrelated 46.
The prediction of the regression model is the average (vote) of the predictions of all the trees. The RF model structure is shown in Fig. 12. Let {h(x, θt), t = 1, 2, …, T} be the T regression trees in a random forest, where θt is a random variable obeying an independent and identical distribution and x is the input variable. Then, the regression result can be expressed as

h̄(x) = (1/T) Σt h(x, θt)
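The averaging formula h̄(x) = (1/T) Σt h(x, θt) can be checked against sklearn's RandomForestRegressor (the 4-feature synthetic data mimic the input format of this paper but are illustrative assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
X = rng.uniform([3.3, 1.7, 0.1, 135], [3.9, 3.8, 0.7, 265], size=(72, 4))
y = 35.0 - 8.0 * X[:, 2] - 0.02 * X[:, 3] + rng.normal(0.0, 0.3, 72)

rf = RandomForestRegressor(n_estimators=100, max_depth=4, random_state=0)
rf.fit(X, y)
# h_bar(x) = (1/T) * sum_t h(x, theta_t): the forest output is the mean over trees.
tree_mean = np.mean([tree.predict(X) for tree in rf.estimators_], axis=0)
```

Unlike bagging with feature subsampling, each forest tree accepts the full feature vector (the feature randomness acts per split), so the trees can be queried on X directly.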

Experimental conditions and settings (experiments)
The models were trained on the measured fracture toughness data of the heavy-section ductile iron. The hardware environment of the simulation platform for this experiment consists of an Intel(R) Core i7-10700 CPU @ 2.90 GHz and an NVIDIA GeForce GTX 1660 SUPER GPU, with 64 GB of RAM. The software environment uses PyCharm, Python 3.8, and the Sklearn, NumPy, pandas and matplotlib libraries.
In this paper, two indicators, the RMSE and R2, are selected to measure the reliability and accuracy of the model predictions. The RMSE measures the error by which the predicted value deviates from the true value, and a smaller value represents a higher prediction accuracy. R2 measures the ability of the model to fit the data: the closer this value is to 1, the better the model fits, and the closer it is to 0, the worse the fit. The formulas for the RMSE and R2 are shown in Eqs. (15) and (16):

RMSE = sqrt((1/n) Σi (yi − ŷi)²)  (15)

R2 = 1 − Σi (yi − ŷi)² / Σi (yi − ȳ)²  (16)
Among them, the R2 value is in the range (0, 1): the closer the value is to 1, the better the prediction; conversely, the closer it is to 0, the worse the prediction. The RMSE value is in the range [0, +∞), and a value closer to 0 indicates a smaller prediction error.
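Eqs. (15) and (16) translate directly into code; the small true/predicted arrays below are made-up values for illustration only:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, Eq. (15)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (16)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)

y_true = [30.0, 32.0, 35.0, 31.0]
y_pred = [30.2, 31.8, 35.1, 31.3]
```

For these values, rmse gives about 0.212 and r2 about 0.987, matching sklearn's `mean_squared_error` (square-rooted) and `r2_score`.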
The genetic algorithm is a heuristic search and optimization technique inspired by biological evolution in nature. It simulates evolutionary processes such as selection, crossover and mutation to find the optimal or a near-optimal solution to a problem, and it is widely used in the field of machine learning. In this study, the samples are trained using fivefold cross-validation, and the hyperparameters are optimized with the genetic algorithm (GA). XGBoost has numerous hyperparameters that need to be set manually; in this paper, we selected several common ones: n_estimators (number of trees), max_depth (maximum tree depth), learning_rate, and subsample. For the SVR model, we chose C (penalty parameter), gamma (kernel coefficient), and the kernel type. The MLP regressor model used alpha (regularization parameter), max_iter (maximum number of iterations), and solver (optimizer selection). The Gaussian process regression model parameters included alpha (the value added to the diagonal of the kernel matrix during model fitting) and the kernel type. For the bagging model, we chose n_estimators (number of base estimators), max_samples (number of samples used to train each base estimator), max_features (number of features used to train each base estimator), and bootstrap (which determines the sampling method for the sample subsets). The random forest model used n_estimators (number of trees) and max_depth (maximum tree depth). The specific parameter ranges and optimized values are shown in Table 3.
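The selection-crossover-mutation loop can be sketched as a minimal GA over the bagging hyperparameters listed above. This is a simplified illustration, not the authors' implementation: the search space, population size and mutation rate are assumptions, the data are synthetic, and fitness is the mean fivefold cross-validated R2:

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
X = rng.uniform([3.3, 1.7, 0.1, 135], [3.9, 3.8, 0.7, 265], size=(72, 4))
y = 38.0 - 9.0 * X[:, 2] - 0.025 * X[:, 3] + rng.normal(0.0, 0.3, 72)

# Hypothetical, much-reduced search space; each gene indexes one option list.
SPACE = {
    "n_estimators": [10, 50, 100],
    "max_samples": [0.6, 0.8, 1.0],
    "max_features": [0.5, 0.75, 1.0],
    "bootstrap": [True, False],
}
KEYS = list(SPACE)

def fitness(genes):
    # Mean fivefold cross-validated R^2 of the candidate hyperparameter set.
    params = {k: SPACE[k][g] for k, g in zip(KEYS, genes)}
    return cross_val_score(BaggingRegressor(random_state=0, **params),
                           X, y, cv=5, scoring="r2").mean()

def evolve(pop_size=6, generations=3):
    pop = [tuple(int(rng.integers(len(SPACE[k]))) for k in KEYS)
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]             # selection: keep the fittest
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            cut = int(rng.integers(1, len(KEYS)))     # one-point crossover
            child = list(parents[i][:cut] + parents[j][cut:])
            if rng.random() < 0.3:                    # mutation: resample one gene
                g = int(rng.integers(len(KEYS)))
                child[g] = int(rng.integers(len(SPACE[KEYS[g]])))
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=fitness)

best_genes = evolve()
best_params = {k: SPACE[k][g] for k, g in zip(KEYS, best_genes)}
```

In practice a GA library (or memoized fitness evaluation) would avoid re-evaluating repeated candidates; the sketch keeps the three evolutionary operators explicit instead.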

Results and analysis
Figure 13 compares the goodness of fit of the six machine learning models optimized by the genetic algorithm. As shown in Fig. 13, the optimized XGBoost, SVR, MLP and random forest models fit the predicted values to the actual values poorly, while the optimized bagging and Gaussian process models show a good fit between the predicted and actual values, closely approaching a straight line.
Table 3. Parameter ranges and optimal parameters for each model.

To further compare the accuracy of the Gaussian process and bagging models in predicting the fracture toughness of heavy-section ductile iron, the RMSE and R2 were selected as the two metrics measuring the reliability and accuracy of the model predictions. As shown in Fig. 14, the R2 values of the XGBoost, SVR, MLP regressor, Gaussian process and random forest models are 0.8662, 0.8901, 0.5942, 0.99 and 0.63, respectively, all lower than the R2 value of bagging (0.9990). The RMSEs of the XGBoost, SVR, MLP regressor, Gaussian process and random forest models are 3.085, 0.661, 4.5467, 0.3937 and 5.21, respectively, all higher than the bagging model's RMSE of 0.2373. Therefore, the bagging model is better for predicting the fracture toughness of heavy-section ductile iron. The bagging model optimized by the genetic algorithm was applied to the 72 fracture toughness specimens of heavy-section ductile iron with different compositions and cooling times constructed in this paper. The experimental results are shown in Fig. 15a: the predicted values and the true values basically coincide, and the prediction is accurate. Figure 15b shows the absolute value of the prediction error; the error is basically less than 0.6, and the maximum does not exceed 0.8.
Bagging is an ensemble learning method that works by constructing multiple weak learners (typically decision trees) and combining their results to improve the overall model performance. The genetic algorithm, on the other hand, is an optimization algorithm that simulates the biological evolution process, searching for the optimal solution to a problem by mimicking evolutionary mechanisms such as natural selection, crossover, and mutation. Through genetic algorithm optimization, the bagging model can better adapt to different data patterns and improve the prediction accuracy for unknown data, which is a significant advantage for materials with complex structures such as heavy-section ductile iron.
In this paper, six machine-learning models optimized by the genetic algorithm are applied to 72 heavy-section ductile iron specimens with different compositions and cooling times.The experimental results show that the optimized bagging model has the best prediction effect, with an R 2 of 0.9990 and an RMSE of 0.2373.
The C content of heavy-section ductile iron is generally controlled between 3.5 and 3.9 wt%; an appropriate carbon content gives the cast iron good fluidity and lubricity, which facilitates filling the mold cavity. However, if the C content exceeds 3.9 wt%, the plasticity and toughness of heavy-section ductile iron are seriously reduced, the iron becomes prone to cracking and fracture, and its hot brittleness increases 47. The Si content of heavy-section ductile iron is generally controlled between 1.7 and 3.8 wt%; the hardness, tensile strength and yield strength of heavy-section ductile iron can be improved by adding an appropriate amount of silicon. However, when the silicon content exceeds 3.8 wt%, the ductile-brittle transition temperature of the ductile iron rises significantly and its plasticity and toughness are reduced 48. The preparation of heavy-section ductile iron generally requires a ferrite matrix to obtain high fracture toughness. Because of the long cooling time in the core of heavy-section ductile iron, a large amount of pearlite easily forms, so the Mn content must be strictly controlled, generally within 0.5 wt% 49.
In this paper, to explore the prediction of the fracture toughness of heavy-section ductile iron products, the ranges of the C, Si and Mn contents of the heavy-section ductile iron were expanded in additional tests, as shown in Table 4.
As shown in Table 4, additional experiments were conducted to prepare four heavy-section ductile iron test blocks with varying C, Si and Mn contents. A total of 16 samples were cut from the edge to the core of the blocks, and the cooling times were the same as those of the previous samples. The basic parameters of each model were the same as those in Table 3, and the same methods were employed.
Figure 16 compares the fitting degree of the six machine learning models. The figure shows that the bagging model optimized by the genetic algorithm and the Gaussian process regression model still fit best, closely approaching a straight line.
To further compare the accuracy of the Gaussian process and bagging models in predicting the fracture toughness of heavy-section ductile iron, this study again selected the root mean square error (RMSE) and coefficient of determination (R2) as the two indicators of the reliability and accuracy of the model predictions. As shown in Fig. 17, the R2 values for the XGBoost, SVR, MLP regressor, Gaussian process and random forest models are 0.9061, 0.9435, 0.7105, 0.9818 and 0.7582, respectively, all lower than the bagging model's 0.9873. Additionally, the RMSE values for these models are 0.6936, 1.0537, 2.38, 0.5938 and 2.17, all higher than the bagging model's 0.4993. These results indicate that after expanding the composition range of the heavy-section ductile iron products, the obtained results still meet the requirements for practical applications.

Conclusions
(1) In this paper, four factors affecting the fracture toughness of heavy-section ductile iron, namely, the C content, Si content, Mn content and cooling rate, were discussed. Using the isothermal section method, 18 cubic physical simulation specimens with different compositions and a wall thickness of 400 mm were cast, and 72 specimens were prepared for microstructure observation and fracture toughness testing. In addition, the relationship between the above four factors and the microstructure and fracture toughness of heavy-section ductile iron was discussed.
(2) Aimed at the problems of the long preparation cycle, high R&D cost and many nonlinear influences for high-fracture-toughness heavy-section ductile iron, a machine-learning-based fracture toughness prediction model for heavy-section ductile iron was established, with the C content, Si content, Mn content and cooling rate as inputs and the fracture toughness as the output.
(3) Compared with the XGBoost, SVR, MLP regressor, Gaussian process, random forest and other models, the bagging model had the best prediction effect, followed by the Gaussian process, with R2 values of 0.9990 and 0.99 and RMSE values of 0.2373 and 0.3937, respectively. These models can meet the design requirements of high-fracture-toughness heavy-section ductile iron for nuclear spent fuel storage and transportation containers and for wind power bases.

Figure 3. Effect of C, Si, Mn and cooling time on the fracture toughness of heavy-section ductile iron (part of the measured data).

Figure 4. Four positions in the castings chosen for temperature measurement and specimen collection.

Figure 5. Casting temperature measurement process and solidification times of the four temperature measurement positions.

Figure 6. Dimensions of the fracture toughness specimen in mm.

Figure 10. Diagram of the standard Gaussian process model.

Figure 14. Evaluation indicators for each model.

Figure 15. Bagging model prediction results: (a) comparison of the true and predicted values, (b) absolute error.