Introduction

Friction stir welding (FSW), a relatively new solid-state welding process, is now widely used in aerospace, shipbuilding, automotive, and other industries.1,2,3 In this process, a rotating rigid tool with a shoulder and a pin is inserted in the joint under pressure.2 It generates heat by friction between the tool and workpiece, softens the alloy but does not melt it. The softened material flows around the tool pin and forges a joint behind the pin.2,3 Since FSW does not involve melting, it avoids the common fusion welding problems such as solidification cracking and loss of volatile alloying elements.1,3 Despite its many advantages, its success depends on a confluence of many complex physical processes that influence the three-dimensional distribution of temperature, velocities of the plasticized material, strain rate, and other mechanical and metallurgical variables. Changes in the temperature and velocity fields, strain rates, and other parameters may result in the formation of voids in the component at a location near the tip of the pin.4 Voids in the welded components affect both the mechanical properties and the serviceability of the joints.

Because of the importance of this problem, significant efforts have been made to understand and develop a theory that can help in mitigating the voids. However, the complexity of many simultaneously occurring physical processes and the large parameter space of the welding variables and materials have so far precluded the establishment of a unified criterion that can be used to avoid the void formation. Efforts have been made to experimentally determine the effects of welding parameters on void formation in commonly used aluminum alloys.4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 Tracers have been used to examine experimentally how the flow of materials affects the void formation.20,21,22 The time-lapse determination of the tracer’s position in some cases indicated that the voids occurred near the bottom of the pin where the flow of the plasticized material was interrupted. Phenomenological models were also used to examine their effectiveness in mitigating void formation in FSW.17,23,24,25,26,27,28,29 It was also suggested that the peak temperature had to be within 80–90% of the solidus temperature of the alloy to avoid the void formation.25 Diverse experimental studies such as ultrasonic30 and radiographic31 detection of voids and theoretical analyses of forces26,32,33 were undertaken to understand the origin of void formation. Although previous work has made progress in identifying the important variables that affect void formation, such as temperature, strain rate, torque, and maximum shear stress on the pin, no rigorous mechanistic explanation or criterion for void formation has emerged.

Here we examine the effectiveness of supervised machine learning (ML) algorithms to forecast the void formation during FSW. One hundred and eight sets of data for the FSW of three aluminum alloys, AA2024, AA2219, and AA6061, obtained from the peer-reviewed literature4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 have been analyzed using a neural network (NN) and a decision tree (DT) to examine the effectiveness of ML to mitigate void formation. Vibration and poor fixtures may affect the quality of welds, at least in principle. However, FSW is routinely performed using different makes and models of machines, and there is no evidence in the literature that the void formation is affected by the selection of mainstream, reliable FSW machines. Therefore, it is reasonable to consider all experimental data for training, validating, and testing of the ML algorithms without any bias from the make or model of the machines. We select NN and DT over other ML algorithms such as K-nearest neighbor (KNN), support vector machines (SVM), and random forest (RF) because of their usefulness for this investigation. For example, NN performs accurately even for a relatively small volume of data and is computationally efficient. The DT is a simple and easy-to-use method, which can handle both numerical and categorical data even with a small data set. In contrast, KNN provides accurate results only for a very large number of data sets. SVM does not provide any model that can be used in the future for predicting voids for new data sets. RF is suitable for multivariable outputs.34

The roles of welding parameters such as the welding speed, rotational speed, tool shoulder radius, plate thickness, axial pressure, pin tip and bottom radii, tilt angle, and material properties such as thermal diffusivity and yield strength on the void formation were examined. These data are easily accessible because welding parameters are generally measured and recorded on the shop floor whenever welding is undertaken, and no further work is needed to obtain them. An analysis of the data showed that if any one of these parameters is kept constant, adjustment of the other parameters may result in joints both with and without voids. In other words, all of these raw welding parameters showed almost the same influence on the void formation. Therefore, a ranking of these raw welding parameters to generate a classification DT was not undertaken. Instead, the data were analyzed using a Bayesian NN.

In many complex engineering systems, the behavior is often described more accurately by a group of variables than by the raw individual variables. An example is the well-studied problem of flow of a fluid in a pipe. In principle, the pipe diameter, average fluid velocity, and the density and viscosity of the fluid can predict if the flow is laminar or turbulent. However, it is well accepted that the nature of flow structure is much better represented by the causative Reynolds number than the four aforementioned individual variables. In FSW, the process variables, temperature-dependent thermophysical properties, and the tool and specimen geometry constitute a very large parameter space where the effects of individual variables are masked by the complexity of the flow of plasticized material that affects the void formation. Since the temperature, strain rate, maximum shear stress on the tool pin, and torque are known to affect the flow of the plasticized material,1 they are likely to be closely correlated with the void formation. Values of these variables are needed to understand the formation of voids during welding. A solution is to use well-tested mechanistic models of FSW35,36,37,38,39 that can calculate the values of these variables for each set of raw welding parameters, tool and specimen geometry, and alloy system. These mechanistic models can be either simple and easy-to-use reduced order analytical models40,41 with simplifying assumptions to reduce computational work or rigorous multiphysics-based numerical models.35,36,37,38,39 These models will enable an evaluation of the role of these causative variables that affect material flow, which can be used in both DT and NN.

The comprehensive mechanistic models35,36,37,38,39 of FSW solve the equations of conservation of mass, momentum, and energy to obtain the temperature and velocity fields in three dimensions, strain rate, shear stress on the pin, and the torque. Figure 1 illustrates the input and the output of the mechanistic models and how the variables that affect the flow of materials are used in DT and NN for the FSW of aluminum alloys. Typical results of the strain rate,37 temperature history, shear stress, and torque are shown in Fig. 1a–d, respectively. A higher strain rate is found on the advancing side, as shown in Fig. 1a. The temperature at a monitoring location increases from room temperature (298 K) to the maximum value (608 K) and then slowly decreases to room temperature, as shown in Fig. 1b. The shear stress on the tool pin follows a sine function and reaches its maximum on the retreating side, at 90° to the welding direction (Fig. 1c). Torque decreases at higher heat input, with both a reduction in welding speed and an increase in rotational speed, as shown in Fig. 1d. Because of the simultaneous rotational and translational motion of the tool, the strain rate, temperature, and stresses are all asymmetric about the axis of the pin.29

Fig. 1

Schematic representation of this research. The components are FSW process, mechanistic models, and machine learning methods (neural network and decision tree). Corresponding experimental test is in the literature.5,6 a The distribution of strain rate plotted for 4-mm thickness above pin tip. b Temperature-time curve during FSW process. c The distribution of shear stress on tool pin with degrees. d The distribution of torque with heat input

One hundred and eight sets of data used in the calculations were for the FSW of three aluminum alloys that had different chemical compositions. In order to avoid compositional effects, the data for each alloy were normalized by dividing each variable by its maximum value for the alloy. These normalized values of local temperature, strain rate, torque, and maximum shear stress were used to train, validate, and test the NN and DT. Starting from the results of the supervised ML, this work aimed to identify a metric that can be used to find an accurate and effective way to predict the formation of voids and avoid them. All important welding parameters and material properties were used to train, validate, and test an NN. They included raw unprocessed welding parameters such as welding speed, rotational speed, tool shoulder radius, plate thickness, axial pressure, pin tip and bottom radii, tilt angle, as well as thermal diffusivity and yield strength. To improve the prediction accuracy and gain a better understanding of the void formation, four calculated causative variables obtained from analytical and numerical models were used with the NN and DT, as the second and third types of data sets, respectively. These four calculated causative variables are temperature, strain rate, torque, and maximum shear stress on the tool pin. Although the different methods are tested here for mitigating defect formation in FSW, they are generic in nature and can be extended to other multifactorial manufacturing issues.42,43,44,45,46,47,48,49,50,51,52,53,54,55,56
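The per-alloy normalization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the record layout, and all numerical values are hypothetical.

```python
from collections import defaultdict


def normalize_by_alloy_max(records, variables):
    """Divide each variable by its maximum value within the same alloy."""
    # First pass: find the per-alloy maximum of every variable.
    max_by_alloy = defaultdict(dict)
    for rec in records:
        alloy_max = max_by_alloy[rec["alloy"]]
        for var in variables:
            alloy_max[var] = max(alloy_max.get(var, float("-inf")), rec[var])
    # Second pass: build normalized copies so the input stays unchanged.
    normalized = []
    for rec in records:
        scaled = dict(rec)
        for var in variables:
            scaled[var] = rec[var] / max_by_alloy[rec["alloy"]][var]
        normalized.append(scaled)
    return normalized


# Hypothetical values for illustration only (not taken from the data set).
records = [
    {"alloy": "AA2024", "temperature": 600.0, "torque": 40.0},
    {"alloy": "AA2024", "temperature": 750.0, "torque": 25.0},
    {"alloy": "AA6061", "temperature": 700.0, "torque": 30.0},
    {"alloy": "AA6061", "temperature": 560.0, "torque": 60.0},
]
normalized = normalize_by_alloy_max(records, ["temperature", "torque"])
```

With this scaling, every variable lies in the interval from 0 to 1 for each alloy, and the case with the largest value for that alloy maps to exactly 1.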

Results and discussion

The effectiveness of an NN-based supervised ML algorithm to forecast the effects of variation of welding parameters and material properties on the formation of voids was examined. The welding parameters were recorded every time an FSW was conducted. Because of the accessibility of the data, it is useful to correlate their values with void formation. One hundred and eight data sets that contained 43 welds with voids and the remaining sets without any voids were examined. The occurrence of voids was decided based on their appearance in the transverse sections of the welds. The target output for the analysis was set as ‘1’ and ‘0’ to represent joints with and without voids, respectively. Among all data, 63 were randomly selected to train the NN. From the remaining data, 15 and 30 sets were selected for the validation and testing,34,42 respectively. The data sets for training, testing, and validation were selected randomly, with the condition that each data set represented the same percentage of welds that contained voids. The accuracy of this method for predicting the void formation was found to be 83.3%. The reason for this modest outcome is not known. This prediction is significantly better than a random guess (50%) and shows that the welding parameters have a hidden connection with the formation of voids. However, no explicit relation between the welding parameters and the void formation has been uncovered. The individual welding parameters affect many simultaneously occurring physical processes during FSW, which in turn affect the void formation. Given that the available database is limited to data taken from the peer-reviewed literature, it is worth asking if the prediction accuracy can be improved.

From the previous experimental and modeling research,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,23,24,25,26,27,28,29 it is known that the welding parameters affect important variables such as temperature, strain rate, torque, and maximum shear stress on the tool pin that affect properties of the plasticized alloy and its smooth flow. Since the disruption of the smooth flow is thought to cause the void formation, the values of these four variables are important for the void formation.

The inadequate and discontinuous flow of the plasticized material around the tool pin is thought to be a cause of void formation. Inadequate flow of plasticized material results from insufficient heat input due to low rotational speed for a given welding speed, inappropriate shoulder diameter, large plate thickness, and other improper welding parameters.5,32 Low frictional force and insufficient flow stress in the area behind the tool pin and near the pin tip due to reduced velocity and low temperature cause inadequate material flow from the retreating side to the advancing side.5 The high strain rate found on the advancing side near the pin tip indicates a high velocity gradient and affects material flow. The flow stress of the plasticized material is affected by the temperature and velocity fields in the weld zone.1 Understanding and controlling the flow of plasticized material is the key to reducing void formation. The plasticity of the alloy depends on the temperature of the alloy as well as the strain rate.37 High temperature ensures softening of the alloy to enable it to easily flow around the pin without disruption.36 In contrast, high local strain rates may result in nonuniformity in the flow of the plasticized material and may disrupt smooth flow and make the process susceptible to void formation.37 The difficulties in the flow of the plasticized material are reflected by high shear stress and torque on the tool pin.35 Therefore, high values of these two parameters indicate susceptibility to void formation. Temperature, strain rate, torque, and maximum shear stress on the tool pin are recognized as the most important factors for the void formation. The effects of these four causative variables on the void formation are described in Fig. 2. It shows that the void disappears at higher temperature but at lower values of strain rate, torque, and shear stress on the tool pin. The values of these four quantities, although dependent on the welding parameters, are not always available from measurements. However, they can be calculated from verifiable mechanistic models, as indicated in Fig. 1.

Fig. 2

Variations in causative variables for different joints. Variations in local temperature, relative strain rate (strain rate/rps), pin total torque, and maximum shear stress on the tool pin for a void-free joint and joints with small and large voids. The welding speed is constant at 100 mm/min. The rotational speeds and corresponding transverse sections of the joints are taken from the literature.4 Temperature, relative strain rate, pin total torque, and maximum shear stress are calculated with numerical models. The local temperature and strain rate values are taken at the location where the void was observed experimentally, so they reflect the relation between the state of material flow and void formation

The easiest way to calculate the values of these important potentially causative factors of void formation is to use a reduced order, back-of-the-envelope analytical model. Details of these calculations are described in the “Methods” section. The four causative variables were therefore calculated with the analytical model, and the computed values were used for forecasting the defect formation using a DT and an NN. Local values of temperature and strain rate were calculated near the tool pin tip on the advancing side, where the voids typically formed.4 The four variables were calculated for all experimental cases adapted from the literature, normalized with respect to their maximum value, and plotted in Fig. 3 as a function of the linear heat input.

Fig. 3

Distribution of the normalized results using analytical models. a Local temperature, b local relative strain rate, c pin total torque and d maximum shear stress on the tool pin with heat input per unit length of the weld. Heat input per unit length represents the ratio of heat input to welding speed. Relative strain rate represents the ratio of strain rate to rotational speed. All data points correspond to the experiments and are adapted from the literature.4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19

Implementation of the NN for these four causative variables is similar to the NN for the unprocessed welding parameters described earlier. The accuracy of this method has been improved from 83.3% when using welding parameters to 93.3% when the four computed variables were used. Unlike the unprocessed welding parameters, these four causative variables clearly correlate better with void formation, and their utilization in ML provided more accurate results.

It is noteworthy that all four of these variables exhibit a threshold value that decides the void formation. For example, the welds corresponding to a normalized local temperature less than 0.93 are susceptible to void formation. Consequently, in a third trial, the same data sets were classified using DT. All four normalized variables are marked with an asterisk. The threshold values of the four variables on which the decisions were based are explained in the ‘Methods’ section and presented in Fig. 3. Among the 108 calculated data points, 63 were selected to train the DT. From the remaining data points, 15 and 30 were selected for validation and testing, respectively.34,42 The DT generated in this method is provided in Fig. 4a. The uniqueness of this method is that it provides the relative importance of the four variables for void prediction. Details about random selection and ranking of variables are presented in the ‘Methods’ section. The variable with the highest information gain (IG) is considered as the root node of the DT. In Fig. 4a, in the first ranking, the maximum shear stress on the tool pin has the highest IG and is selected as the root node. A void is more likely to form with high maximum shear stress (τm), high pin total torque (MT), low temperature (T), and high relative strain rate (εr). However, the testing accuracy of void prediction is 90%, which is less than that obtained using an NN and the same input data set. The main disadvantage of the DT-based ML is that the structure of the tree is significantly dependent on both the normalized results and the threshold values. The relative inaccuracy of the reduced order, simplified analytical model also affects the results.

Fig. 4

Decision trees. The decision trees (DT) are based on a classification scheme to predict the void formation in FSW joints using a the reduced order analytical model and b the rigorous numerical model. The structure of the DT depends on the normalized results and the threshold values for the four causative variables. Therefore, all four causative variables were necessary to generate the DT even though only three of them were selected as the classified nodes

To improve the accuracy of both the NN-based and DT-based ML, the four causative variables were next calculated using a well-tested numerical model of FSW, shown in Fig. s1 in the Supplementary Information. The rigorous numerical model captures the complex physics of heat transfer and flow of plasticized materials around the tool pin and thus accurately calculates the four causative variables. Implementation of the NN for this method was the same as that described before. Because of the accurate predictions of the causative variables, the testing accuracy improved from 93.3%, obtained with the back-of-the-envelope reduced order analytical model, to 96.6%. However, considerable computational work was required for generating the variables from the numerical models.

Finally, the four causative variables computed from a well-tested numerical model were used in a DT. The construction and implementation of the DT are the same as what was used before. The DT is shown in Fig. 4b. The structure of the DT depends on the normalized results and the threshold values. Therefore, all the four causative variables were necessary to generate the DT even though only two of them were selected as the classified nodes in Fig. 4b. The testing accuracy of this method in the void prediction was found to be 96.6%.

The four causative variables are ranked based on their hierarchical influence on void formation. Temperature and maximum shear stress show the most important influence on the void formation, followed by torque and strain rate. The importance of temperature is clear from its effect on the strength of the material. Furthermore, the temperature also affects the flow stress. The shear stress is a measure of the nature of the flow. For example, a high value of shear stress indicates a difficulty of the tool pin in influencing the local flow of plasticized material. Voids can be mitigated by producing smooth material flow in the stir zone. Therefore, both temperature and the maximum shear stress are important factors for the void formation.

The aforementioned five methods were compared based on their accuracies in the void prediction using the confusion matrices46,48 in Fig. 5. The basic structure of the confusion matrix is explained in Fig. 5a. The matrix displays the number of correct and incorrect predictions by comparing the target experimental results with the calculated results. The first three sets of results show that the accuracy improves when the raw welding parameters are replaced by causative variables, which capture the conditions of void formation more accurately. The comprehensive well-tested numerical models provide the best results but require more intensive calculations.
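As an illustration of how a confusion matrix of this kind is built from target and predicted labels, a minimal sketch follows. The function names and the ten labels are hypothetical and are not the data of Fig. 5.

```python
def confusion_matrix(target, predicted):
    """Count outcomes keyed by (target, predicted) for labels 0 and 1."""
    counts = {(t, p): 0 for t in (0, 1) for p in (0, 1)}
    for t, p in zip(target, predicted):
        counts[(t, p)] += 1
    return counts


def accuracy(counts):
    """Fraction of cases where the predicted label matches the target."""
    correct = counts[(0, 0)] + counts[(1, 1)]
    return correct / sum(counts.values())


# Hypothetical labels: '1' for void, '0' for void free.
target = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(target, predicted)
acc = accuracy(cm)
```

The diagonal entries, (0, 0) and (1, 1), are the correct predictions. The entry (1, 0) counts welds with voids that were predicted to be void free, and (0, 1) counts the opposite error.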

Fig. 5

Confusion matrices. Confusion matrices of predicted output using machine learning and target output classification results for 108 experimental results, assigning ‘0’ for no void and ‘1’ for void. a The basic structure of the prediction and target confusion matrices, the results of b method one, c method two, d method three, e method four, and f method five

The results from several existing independent studies suggest that the void formation is caused by inadequate material flow, often resulting from inappropriate heat input and friction force.5,32 However, the easily measurable welding parameters cannot be directly correlated with void formation. In this paper, ML, with its outstanding advantages in solving multifactor problems, has been used to explore and rank the factors that affect void formation. Four causative variables that are known to affect material flow are computed using mechanistic models, and the computed values are correlated with the occurrence of void formation using ML. The causative variables have been ranked based on their importance for the void formation. Furthermore, we identify the conditions for void formation with reasonably good accuracy, which is helpful for engineers seeking to produce void-free FSW joints.

In summary, void formation in the FSW of aluminum alloys was investigated using two machine learning algorithms, an NN and a DT. One hundred and eight points of independent experimental data available in the peer-reviewed literature were analyzed. Both the raw welding parameters and potentially causative computed variables such as temperature, maximum shear stress on the tool pin, torque, and strain rate were investigated. The observations are summarized in Table 1. Below are the specific findings.

Table 1 Detailed information on the five methods used

(1) The variables that affect void formation during FSW of aluminum alloys are found to be temperature near the tool pin, maximum shear stress on the tool pin, torque and strain rate, in decreasing order of influence.

(2) The simplest methodology examined for predicting voids was to feed raw welding parameters and material properties to an NN capable of providing a classification scheme that outputs binary results (void and void free). This approach was able to forecast the void formation with 83.3% accuracy.

(3) The four potentially causative variables of void formation, temperature, maximum shear stress on the tool pin, torque, and strain rate, are superior to the raw welding parameters in predicting void formation during FSW. When these causative variables were computed from reduced order analytical models and used as input data sets for ML algorithms, the accuracies of the void formation predictions were 93.3% and 90% for the NN and DT algorithms, respectively.

(4) When the void formation was correlated with the potentially causative variables, i.e., the local temperature near the pin tip, maximum shear stress on the tool pin, torque, and strain rate, computed from a mechanistic numerical model, both the NN and DT approaches could predict defect formation with 96.6% accuracy.

Methods

Data collection for the welding parameters

The 108 independent experimental results on void formation during FSW of three commonly used aluminum alloys were collected from the literature and marked as ‘1’ and ‘0’ for welds with and without voids, respectively. The temperature and velocity fields, which affect the smooth flow of plasticized alloy, depend on the welding parameters (welding speed, rotational speed, tool shoulder radius, plate thickness, axial pressure, pin tip and bottom radii, and tilt angle) as well as material properties such as thermal diffusivity and yield strength. For cases in which the axial pressure on the tool and the tilt angle are not reported in the literature, reasonable values that are commonly used were adopted. The estimated values of axial pressure and tilt angle are marked with an asterisk in Table 1 of the Supplementary Information.

Potential causative variables computed from analytical models

The calculations of the reduced order analytical models started with the estimation of the velocity field.40,41 Details of these calculations can be found in our previous publications35,40 and are not repeated here. The shape and size of the calculation domain depended on the shoulder radius, pin tip radius, and plate thickness. For simplicity, material properties, sliding, and friction coefficients were assumed to be temperature independent. Within the velocity field, 12 monitoring points were set for detecting local velocity and strain rate. These points were located on the advancing side, on the retreating side, in front of the tool, and at the trailing end, at each of three elevations: around the pin tip, at the mid-height of the pin, and around the pin root. These 12 local results were used for torque and maximum shear stress calculations. The local temperatures near the voids were estimated using the heat conduction equation for a thick plate as follows.57

$$T - T_0 = \frac{Q}{{\rho C_{\mathrm{p}}\left( {4\pi \alpha t} \right)^{1.5}}} \ast {\mathrm{exp}}\left( {\frac{{ - R^2}}{{4\alpha t}}} \right)$$
(1)
$$Q = \frac{2}{3}\pi \left( {\delta \tau + \left( {1 - \delta } \right)\mu _{\mathrm{f}}P_{\mathrm{N}}} \right) \ast w\left( {\left( {r_{\mathrm{s}}^3 - r^3} \right)\left( {1 + \mathrm{tan}\ \phi } \right) + r^3 + 2r^2l_{\mathrm{P}}} \right)$$
(2)
$$\alpha = \frac{k}{{\rho C_{\mathrm{P}}}},R^2 = x^2 + y^2 + z^2$$
(3)

where Q is the total heat input and τ is the shear stress at yielding. δ is the coefficient of slip. μf is the friction coefficient. PN is the axial force. rs and r are the shoulder radius and pin radius, respectively. lp is the pin length. R is the distance from the center of the tool pin to the calculated location at the mid-height of the workpiece. w is the rotational speed. k, ρ, and Cp are the thermal conductivity, density, and specific heat of the work plate material, respectively. α is the thermal diffusivity, as defined in Eq. 3. ɸ is the tool pin tilt angle. T0 is the room temperature, and t is the time needed to reach the steady welding state.57
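Equations 1 to 3 can be transcribed directly into code, as in the minimal sketch below. The function names are hypothetical. The equations are implemented exactly as printed, and the text does not state units, so the caller is assumed to supply a consistent set of units, with the tilt angle in radians and with Q in whatever form Eq. 1 requires. The default room temperature of 298 K is the value quoted earlier in the text.

```python
import math


def thermal_diffusivity(k, rho, cp):
    """Eq. 3: alpha = k / (rho * Cp)."""
    return k / (rho * cp)


def heat_input(delta, tau, mu_f, p_n, w, r_s, r, l_p, phi):
    """Eq. 2 as printed; phi is the tilt angle in radians (assumption)."""
    contact = delta * tau + (1.0 - delta) * mu_f * p_n
    geometry = ((r_s**3 - r**3) * (1.0 + math.tan(phi))
                + r**3 + 2.0 * r**2 * l_p)
    return (2.0 / 3.0) * math.pi * contact * w * geometry


def local_temperature(q, rho, cp, alpha, t, x, y, z, t0=298.0):
    """Eq. 1 as printed, with R^2 = x^2 + y^2 + z^2 from Eq. 3."""
    r_sq = x**2 + y**2 + z**2
    rise = (q / (rho * cp * (4.0 * math.pi * alpha * t) ** 1.5)
            * math.exp(-r_sq / (4.0 * alpha * t)))
    return t0 + rise
```

For fixed Q and t, the exponential term makes the computed temperature rise fall off with the distance R from the tool pin center, which is why the monitoring location matters for the local temperature used in the ML input.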

Calculations of potential causative variables using numerical models

Well-tested numerical models of FSW solve the equations of conservation of mass, momentum, and energy. Their construction, testing, and applications have been reported in detail in our previous publications35,36,37,38,39 and are not repeated here. The rigorous numerical model used in this research calculates heat generation rates, transient heat transfer in three dimensions, and plasticized material flow around the tool pin. Temperatures, strain rates, shear stress, and torque were calculated using a well-tested numerical model.36,37

Random selection, accuracy assessment, and selection of threshold values

Data sets were randomly selected for training, validation, and testing. Among the 108 data points, 43 sets were for welds with voids. The same percentages of defective welds were included in the training, validation, and testing data sets.34,42 Both void and void-free data points were selected randomly to avoid under-fitting and over-fitting during training. Sixty-three randomly selected data points that included 25 welds with voids were utilized for training. After training, 15 data points that contained 6 defective joints were randomly selected as the validation data. The remaining 30 sets were used for testing. The accuracies of training, validation, and testing of these five methods are listed in Table 1.
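The stratified random selection described above can be sketched as follows. This is a minimal illustration with a hypothetical function name and placeholder label ordering. The void counts of 25 and 6 come from the text, and the 12 voids in the testing set follow by subtraction from the 43 total.

```python
import random


def stratified_split(labels, n_void, n_free, seed=0):
    """Split indices into parts with fixed void and void-free counts."""
    rng = random.Random(seed)
    void_idx = [i for i, y in enumerate(labels) if y == 1]
    free_idx = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(void_idx)
    rng.shuffle(free_idx)
    parts, v_start, f_start = [], 0, 0
    for nv, nf in zip(n_void, n_free):
        # Take the next nv void cases and nf void-free cases.
        parts.append(void_idx[v_start:v_start + nv]
                     + free_idx[f_start:f_start + nf])
        v_start += nv
        f_start += nf
    return parts


# Placeholder labels: 43 welds with voids ('1') and 65 without ('0').
labels = [1] * 43 + [0] * 65

# Training 63 = 25 + 38, validation 15 = 6 + 9, testing 30 = 12 + 18.
train, val, test = stratified_split(
    labels, n_void=(25, 6, 12), n_free=(38, 9, 18)
)
```

Each part then contains close to 40% welds with voids, matching the 43 of 108 in the full data set.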

The distribution of the normalized results of all variables from the reduced order analytical model and rigorous numerical model are plotted in Fig. 3 and Fig. s1 (in the Supplementary Information). The threshold values were randomly selected between 0 and 1. For the strain rate, torque, and maximum shear stress, we assigned ‘1’ for void if the normalized results were above the threshold values, and ‘0’ for void free data below the threshold values. However, for temperature, we assigned ‘1’ for void if the results were below the threshold value and ‘0’ for void free if the results were above the threshold value, because unlike the other three variables, higher temperatures indicate void free welds. The agreement between the assigned values (‘0’ and ‘1’) and the target values indicates that the classification schemes can be predicted correctly. For each variable, the threshold value with the least number of wrong predictions (highest classified accuracy) was selected as the best threshold value and used in ML algorithms.
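The threshold selection described above amounts to a search for the value with the fewest wrong predictions. In the minimal sketch below, the function name and the sample values are hypothetical, and an evenly spaced grid of candidates is swapped in for the random candidates mentioned in the text.

```python
def best_threshold(values, targets, void_if_above=True, candidates=None):
    """Return the threshold with the fewest wrong predictions."""
    if candidates is None:
        # Assumption: an evenly spaced grid between 0 and 1.
        candidates = [i / 100 for i in range(101)]
    best_p, best_errors = None, len(values) + 1
    for p in candidates:
        if void_if_above:
            # Strain rate, torque, shear stress: void above the threshold.
            predicted = [1 if v > p else 0 for v in values]
        else:
            # Temperature: void below the threshold.
            predicted = [1 if v < p else 0 for v in values]
        errors = sum(1 for y, t in zip(predicted, targets) if y != t)
        if errors < best_errors:
            best_p, best_errors = p, errors
    return best_p, best_errors


# Hypothetical normalized temperatures and target labels ('1' = void).
temps = [0.85, 0.90, 0.92, 0.95, 0.97, 1.00]
targets = [1, 1, 1, 0, 0, 0]
p, errors = best_threshold(temps, targets, void_if_above=False)
```

The `void_if_above` flag captures the reversed rule for temperature, where values below the threshold indicate a void.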

Neural network

The training data set was used to fit a hyperbolic tangent function by minimizing the logarithmic error.43 The first step aimed to make the actual response of the network move closer to the desired target response in a statistical sense. Second, the actual outputs were continuous values from 0 to 1, which needed to be classified into ‘0’ and ‘1’ for joints without and with voids. The best threshold value, that has the highest classification accuracy, was used for the training, validation, and testing data sets.

The number of hidden nodes for the NN was usually varied from 4 to 8 (twice the number of input nodes). The output of a node was computed with the following hyperbolic tangent function.43

$$y = \tanh \left( {\mathop {\sum}\nolimits_{i = 1}^n {\left( {w_i x_i + \theta _i} \right)} } \right)$$
(4)

where xi and y are the input and the output of a node, wi is the weight, n is the total number of nodes, and θi is the bias dependent on the ith input.43 The NN model with the least log predictive error43 was selected as the best model and used for the validation and testing data sets.
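The node output of Eq. 4 can be written as a short function. This is a minimal sketch with a hypothetical function name and illustrative numbers. It assumes that each bias θi is added inside the sum, following the description of θi as dependent on the ith input.

```python
import math


def node_output(inputs, weights, biases):
    """Eq. 4: hyperbolic tangent of the summed weighted inputs and biases."""
    total = sum(w * x + theta
                for w, x, theta in zip(weights, inputs, biases))
    return math.tanh(total)


# Hypothetical normalized inputs, weights, and biases for one node.
inputs = [0.95, 0.40, 0.60, 0.70]
weights = [1.2, -0.8, -0.5, -1.0]
biases = [0.1, 0.0, 0.05, -0.1]
y = node_output(inputs, weights, biases)
```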

$${\rm{LPE}} = \frac{\beta }{2}\mathop {\sum}\nolimits_{i = 1}^n {\left( {d_i - y_i} \right)^2 + \frac{n}{2}\ln \left( {\frac{{2\pi }}{\beta }} \right)}$$
(5)

where β is the regulariser term. Details of the implementation of the NN for FSW are discussed in our previous publication43 and are not repeated here.
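Equation 5 can be evaluated as in the minimal sketch below. The function name and the sample numbers are hypothetical. The text does not define d_i, so it is assumed here to be the target output for the ith data point, and y_i the corresponding network output.

```python
import math


def log_predictive_error(targets, outputs, beta):
    """Eq. 5: LPE for targets d_i, outputs y_i, and regulariser beta."""
    n = len(targets)
    squared = sum((d - y) ** 2 for d, y in zip(targets, outputs))
    return 0.5 * beta * squared + 0.5 * n * math.log(2.0 * math.pi / beta)


# Hypothetical outputs of two candidate models on the same targets.
targets = [1, 0, 0, 1]
model_a = [0.9, 0.2, 0.1, 0.8]
model_b = [0.6, 0.5, 0.4, 0.5]
lpe_a = log_predictive_error(targets, model_a, beta=1.0)
lpe_b = log_predictive_error(targets, model_b, beta=1.0)
```

For a fixed β and the same number of data points, the second term is identical for all candidates, so the model with the smaller sum of squared errors has the smaller LPE.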

Decision trees

Four causative variables were employed as classifiers using a binary DT.42 Many available methods can be used to rank variables. Here the root and child nodes were selected according to the IG, which depends on the entropy.42,47 In every ranking, the variable with the highest IG was picked as the node. The selection of the proper threshold value “p” is the same as that for the NN. If the answer to “xi > p?” is yes, the left-hand child node is followed; otherwise, the right-hand child node is followed. The objective of each node is to split the data so that the next child nodes are as homogeneous as possible. Each branch takes care of one possibility, which contributes to accurate and effective prediction. The leaf nodes take only the two values of the target output, ‘1’ for void and ‘0’ for void free.
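The entropy-based ranking can be illustrated with a short calculation of the IG for a binary split. This is a minimal sketch using the standard Shannon entropy in bits. The function names, variable names, values, and thresholds are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for cls in (0, 1):
        frac = labels.count(cls) / n
        if frac > 0.0:
            h -= frac * math.log2(frac)
    return h


def information_gain(values, labels, p):
    """Entropy reduction from splitting on the question 'x > p?'."""
    above = [y for v, y in zip(values, labels) if v > p]
    below = [y for v, y in zip(values, labels) if v <= p]
    n = len(labels)
    weighted = ((len(above) / n) * entropy(above)
                + (len(below) / n) * entropy(below))
    return entropy(labels) - weighted


def pick_root(data, labels, thresholds):
    """Return the variable with the highest IG, plus all gains."""
    gains = {name: information_gain(vals, labels, thresholds[name])
             for name, vals in data.items()}
    return max(gains, key=gains.get), gains


# Hypothetical normalized values for four welds ('1' = void).
labels = [1, 1, 0, 0]
data = {
    "shear_stress": [0.9, 0.8, 0.3, 0.2],
    "torque": [0.9, 0.2, 0.8, 0.3],
}
root, gains = pick_root(data, labels, {"shear_stress": 0.5, "torque": 0.5})
```

In this toy example the first variable separates the two classes completely, while the second leaves both child nodes evenly mixed, so the first is chosen as the root node.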