(a) A sample of n = 60 points with one predictor variable (X), each point belonging to one of three classes (green, dark gray, blue). Three possible splits are shown, at X = 20 (giving subsets X < 20 and X > 20), X = 38 and X = 46, along with the number of points in the resulting subsets (n1, n2), their breakdown by class (colored numbers), the impurity of each subset (Ig(S1), Ig(S2)) and the information gain for the split (IGg), all based on the Gini index. The sample's Gini index is Ig = 0.67. (b) The information gain based on the Gini index (IGg), entropy (IGe) and misclassification error (IGc) for all possible first splits. The maxima of IGg, IGe and IGc are at X = 38, 34 and 29–34, respectively. (c) The decision tree classifier for the sample in a, based on IGg. Large text in each node is the number of points, colored by predicted class. Smaller text indicates class membership in each subset. N, no; Y, yes.
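The quantities named in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the figure. It assumes the usual definitions: Gini index 1 - sum(p_k^2), entropy -sum(p_k log2 p_k), misclassification error 1 - max(p_k), and information gain as the parent impurity minus the size-weighted impurities of the two subsets. The function names and the split counts are hypothetical and are not read from the figure. The parent counts assume the three classes are equally represented (20 points each), which is consistent with the stated Ig = 0.67.

```python
from math import log2

def gini(counts):
    """Gini index of a subset given its per-class counts (assumes a non-empty subset)."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def entropy(counts):
    """Entropy in bits; classes with zero count contribute nothing."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)

def misclassification(counts):
    """Misclassification error: fraction of points outside the majority class."""
    n = sum(counts)
    return 1.0 - max(counts) / n

def information_gain(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two subsets."""
    n = sum(parent)
    weighted = (sum(left) / n) * impurity(left) + (sum(right) / n) * impurity(right)
    return impurity(parent) - weighted

# Assumed balanced sample: 20 points in each of the three classes.
parent = [20, 20, 20]
print(round(gini(parent), 2))  # 0.67

# Hypothetical split (counts are illustrative only, not taken from the figure).
left, right = [20, 0, 0], [0, 20, 20]
for name, measure in [("IGg", gini), ("IGe", entropy), ("IGc", misclassification)]:
    print(name, information_gain(parent, left, right, measure))
```

Evaluating `information_gain` at every candidate threshold along X and plotting the result for each impurity measure would give curves of the kind described in panel b.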