A study on sex estimation by using machine learning algorithms with parameters obtained from computerized tomography images of the cranium

The aim of this study is to test whether sex can be predicted by applying machine learning (ML) algorithms to parameters taken from computerized tomography (CT) images of the cranium and mandible, which are known to be dimorphic. CT images of the crania of 150 men and 150 women were included in the study. The 25 parameters determined were tested with different ML algorithms. Accuracy (Acc), specificity (Spe), sensitivity (Sen), F1 score (F1) and Matthews correlation coefficient (Mcc) values were used as performance criteria, and the Minitab 17 package program was used for descriptive statistical analyses. A value of p ≤ 0.05 was considered statistically significant. Among the ML algorithms, the highest prediction performance was obtained with the logistic regression (LR) algorithm, with values of 0.90 Acc, 0.80 Mcc, 0.90 Spe, 0.90 Sen and 0.90 F1. According to the confusion matrix, 27 of 30 males and 27 of 30 females were predicted correctly. The Acc ratios of the other ML algorithms were between 0.81 and 0.88. It was concluded that the LR algorithm applied to parameters obtained from CT images of the cranium can predict sex with high accuracy.

The main purpose of forensic anthropology is to reconstruct the biological profile of deceased individuals; that is, to predict sex, age at death, ancestry and stature from skeletal remains 1. Forensic sex prediction has occupied a large place in the literature since the late 1960s, and identification of sex from the human skeleton has been described as an important factor, even a key element, in both forensic medicine and bio-archaeological contexts 2-4. Sex prediction is an indispensable part of the biological profile. Anthropologists use biomarkers of the skeletal system that vary between the sexes to determine sex 5,6.
It is noteworthy that sex estimation has been studied in the literature with almost all bones of the human skeleton, and that the accuracy of sex determination has frequently been compared across different populations. Various bones such as the femur 2,3, patella 7,8, mandible 9, calcaneus 10, metatarsal bones and phalanges 11,12, occipital condyle 13, hand bones 14,15 and sternum 16 have been used in sex prediction. A large number of studies have reported that the cranium and pelvis, which are considered the most dimorphic parts of the skeleton, can be used in sex prediction with different assessment methods 4,10,16-19.
Identification of sex has some inherent limitations, because it is affected by factors such as ethnicity, socio-economic status, diet and geographic location. The inability to generalize results obtained from a specific population to other populations, especially for skeletal parts such as the cranium, and the resulting need for population-specific studies increase the interest in the cranium and mandible in sex determination 4,20. For these reasons, all techniques reported for identifying sex are specific to the related studies and may not be applicable to different samples or data sets 3.
ML comprises modern classification methods that are used extensively in engineering and are now gradually being integrated into the field of health. These algorithms are classified as supervised, unsupervised and reinforcement learning. Supervised learning algorithms learn the relationship between input and output, unsupervised learning algorithms find the characteristics of data about which there is no prior information, and reinforcement learning algorithms match the input data with desired characteristics 20. The Decision Tree (DT) algorithm is a simple, powerful, fast and frequently used data mining classification algorithm that processes the inputs by dividing them repeatedly 8,21-23. Logistic regression (LR) is a classification algorithm that uses the sigmoid function to model the relationship between the parameters and the output probability. Random Forest (RF) is an ensemble algorithm that builds more than one decision tree within the system 24. Extra Tree Classifier (ETC) is described as an improvement on RF; its advantage is attributed to the random splitting of nodes and the use of all data as the training set 25. Linear Discriminant Analysis (LDA) is a classification algorithm that reveals the differences and relationships between classes 26. Quadratic Discriminant Analysis (QDA) is described as an extension of LDA and is a second-order parametric classifier 27.
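As a minimal sketch of the sigmoid mapping on which LR relies, the following Python snippet turns a weighted sum of parameters into a class probability. The function names, weights, bias and threshold are hypothetical and serve only as an illustration; they are not the fitted coefficients or the implementation used in this study.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1).

    Written in a numerically stable form so that large |z| does not overflow.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(features, weights, bias):
    """Probability of the positive class for one individual.

    features, weights: equal-length sequences of numbers (hypothetical values).
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0
```

In such a sketch the weights would be learned from the training data; a probability at or above the chosen threshold is assigned to one sex and a probability below it to the other.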
Computerized tomography (CT) is an imaging method that can show all tissues, especially bone tissue, with sharp borders. With thin sections, the image orientation can be changed in three dimensions and brought to the orthogonal plane. In this way, length and angle measurements are less affected by orientation. For these reasons, CT provides superior results compared with studies carried out with more conventional osteometric devices 16.
The aim of this study is to show the success of sex prediction by using ML with parameters obtained from CT images of cranium and mandible skeleton.

Results
Of the 25 parameters determined, 20 (NVIC, NSVC, NNL, PC, NIVA, PNIC, VIC, NIC, RML, CML, GHGA, HML, COL, CMHA, HGGC, COIC, HGGMC, HGGMA) were found to differ significantly between males and females (p ≤ 0.05). For 18 of these statistically significant parameters the mean was higher in males, while for 2 parameters (GHGA, CMHA) the mean was higher in females (Tables 1, 2).
ROC analysis was performed with the IBM SPSS (Version 21) package program to reveal the power of the parameters to discriminate between male and female individuals, and the highest AUC was obtained with the CGL parameter (Fig. 1). The AUC, cut-off, p, Sen and Spe values of all parameters are given in Table 3, and the ROC curves and AUC values for each algorithm are given in Fig. 2. To support the reliability of the study, tenfold cross-validation estimates of the algorithms are also reported. Tenfold cross-validation gave Acc ratios of 87.766 ± 0.819 with the LR algorithm, 87.733 ± 0.410 with LDA, 86.533 ± 0.592 with QDA, 85.766 ± 1.045 with RF, 77.200 ± 1.970 with ETC and 80.266 ± 1.396 with DT (Table 4).
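The tenfold cross-validation procedure can be illustrated with a short Python sketch that partitions 150 male and 150 female records into ten folds and summarizes per-fold accuracies as mean and standard deviation. The stratification by sex, the random seed and the use of the sample standard deviation are assumptions made for illustration; the text does not state how the folds were formed or how the ± values were computed.

```python
import random
import statistics

def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds.

    Stratification and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

def summarize(fold_accuracies):
    """Mean and sample standard deviation of the per-fold accuracies."""
    return statistics.mean(fold_accuracies), statistics.stdev(fold_accuracies)

# Sample composition taken from the study: 150 males and 150 females.
labels = ["M"] * 150 + ["F"] * 150
splits = list(stratified_kfold(labels, k=10))
```

Under these assumptions each fold holds out 30 individuals (15 per sex) and trains on the remaining 270, so every individual is used for testing exactly once.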
In our study, the SHAP explanatory model of the RF algorithm was used to reveal the contribution of each parameter to the overall model, and the five largest contributions came from the parameters HGGMC, PC, GGL, HGGA and HGGC (Fig. 4).

Discussion
The aim of this study is to test whether sex can be identified by applying ML to parameters obtained from cranium and mandible CT images brought to the orthogonal plane. In the statistical analysis, the NVIC, NSVC, NNL, PC, NIVA, PNIC, VIC, NIC, RML, CML, HML, COL, HGGC, COIC, HGGMC and HGGMA parameters were found to be statistically significant in distinguishing between the sexes (p ≤ 0.05). Of the ML algorithms tested, LR gave values of 0.90 Acc, 0.80 Mcc, 0.90 Spe, 0.90 Sen and 0.90 F1. According to the confusion matrix, 27 of 30 males and 27 of 30 females were predicted correctly. The Acc ratios of the other ML algorithms were between 0.81 and 0.88. The small dataset, the lack of external validation and the absence of testing in different populations are the limitations of our study.
www.nature.com/scientificreports/
Forensic anthropologists constantly try to improve skeletal identification methods, either by applying various methods to various parts of the skeleton or by developing new methods to determine sex 4. The pelvis and cranium are known as the most dimorphic skeletal parts and form the basis of sex determination research 4,10,17-19. Bertsatos et al. 19 reported an overall Acc ratio of 0.71-0.90 for sex prediction with discriminant function analysis of parameters taken from the cranium. Franklin et al. 28 and Dayal et al. 29 reached Acc ratios of 0.88-0.90 and 0.80-0.85, respectively, with discriminant function analysis of parameters taken from the cranium. In this study, the LR algorithm gave 0.90 Acc, 0.80 Mcc, 0.90 Spe, 0.90 Sen and 0.90 F1. Because the ML results include the Mcc value, which evaluates Acc, Spe and Sen together and indicates the reliability of the algorithm, it is thought that reliability and accuracy were tested with various measures and that reliable results were obtained in the study 12.
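The relationship between the reported confusion matrix and the performance criteria can be checked with a short Python sketch. It assumes that one sex is treated as the positive class (the choice does not matter here because the matrix is symmetric) and uses the standard definitions of Acc, Sen, Spe, F1 and Mcc; the function name is hypothetical.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Standard performance criteria computed from a 2x2 confusion matrix."""
    acc = (tp + tn) / (tp + fn + tn + fp)
    sen = tp / (tp + fn)            # sensitivity (recall of the positive class)
    spe = tn / (tn + fp)            # specificity (recall of the negative class)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sen / (precision + sen)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"Acc": acc, "Sen": sen, "Spe": spe, "F1": f1, "Mcc": mcc}

# 27 of 30 individuals of each sex predicted correctly, as reported for LR.
metrics = binary_metrics(tp=27, fn=3, tn=27, fp=3)
```

With these counts Acc, Sen, Spe and F1 all equal 54/60 = 0.90 and Mcc equals (27·27 − 3·3)/900 = 0.80, which agrees with the values reported for the LR algorithm.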
Although discriminant function analysis is one of the most widely used methods in the literature for sex determination in forensic and archaeological cases, it is known that its error rates are never 0% 2. The fact that the ML algorithms used in the present study were trained on 80% of the data and tested on the remaining 20% increases the reliability of the predictions and is an advantage over discriminant analysis.
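An 80%/20% division of the 300 individuals can be sketched in Python as follows. The stratification by sex and the random seed are assumptions; stratification is consistent with the 30 males and 30 females that appear in the reported confusion matrix, but the splitting procedure itself is not described in the text.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=42):
    """Split indices into train and test sets, keeping class proportions.

    The seed and the per-class rounding are illustrative assumptions.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

# Sample composition taken from the study: 150 males and 150 females.
labels = ["M"] * 150 + ["F"] * 150
train_idx, test_idx = stratified_split(labels)
```

Under these assumptions the training set contains 240 individuals and the test set 60, with 30 of each sex held out for evaluation.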
CT is preferred over conventional osteometric devices (calliper, odontometer, digital distance meter) because it yields bone measurements very close to the original, allows each bone part to be reconstructed and thereby provides an advantage in measuring missing and damaged parts 16,22. As far as we know, studies that associate parameters taken from the cranium and mandible in the orthogonal plane with ML-based sex prediction are very limited. Even when CT is used in current studies, results can differ if the image is not brought to the orthogonal plane, because angular measurements in particular are affected by orientation.
Gillet et al. 30 predicted sex from the cranium by using CT with a geometric morphometric model and reported an Acc ratio of 0.90 for the skull model. Zaafrane et al. 31 reported that they estimated sex with an Acc ratio of 0.90 from cranial parameters in CT images analysed with basic statistical methods. These differences in results can be explained by the fact that the evaluation of sexually dimorphic features depends on group-specific standards and that skeletal characteristics differ among populations, as well as by the methods used and differences in statistical analyses. In another study 32, a support vector machine was applied to 100 skulls and a sex prediction rate of over 90% was obtained with 10% cross validation. In this study, we did not choose image-based algorithms such as CNN and SVM, because only anthropometric points were selected rather than the entire cranium, whereas image-based algorithms produce a result by learning all the points of the given cranium. The anthropometric points were measured manually using the Horos Project program and the results were used as input to the ML algorithms.
It has been reported in the literature that the mandible is highly likely to be recovered intact 33. The reason is that the dense layer of compact bone in the mandible makes it durable 34. It is also reported that mandibular measurements are generally obtained from panoramic radiography images and that these images are affected by orientation 35. According to studies that evaluated only measurements taken from the mandible, with Acc ratios between 0.60 and 0.88, the mandible seems to be a reliable structure for sex prediction 29,35-37. In this study, combining the parameters taken from the mandible with those from the cranium strengthened sex prediction. The RML, CML, GCGA, CFL, PLL, PICA, CGC, PLIC, CGGIC and CGGIA parameters taken from the mandible were found to be statistically significant in sex identification.
In events that deeply affect society, such as war, natural disasters and fire, the identity of individuals should be predicted quickly and accurately; the CT technology and ML algorithms used in the present study show that prediction time can be minimized and high accuracy can be obtained. Considering the high Acc ratio obtained with the LR algorithm, it is thought that the present study will strengthen and contribute to studies on sex prediction.

Materials and methods
Image set and population. The image set in the study consisted of the CT images of 150 male and 150 female individuals whose ages ranged between 20 and 65. Individuals with any surgical operation or pathology of the cranium skeleton were excluded from the study. Average age of the males was 54 (min 20, max 65), while average age of the females was (min 21, max 65). No statistically significant difference was found between the average ages of males and females (p = 0.395).
program, which is a personal workstation in Digital Imaging and Communications in Medicine (DICOM) format. Images in sagittal, transversal and coronal planes were obtained from the transferred images by using 3D Curved Multiplanar Reconstruction (MPR). The line passing through the nasion and inion points of the images in these three planes was determined and all images were brought to the orthogonal plane (Fig. 5A). Later, the CT images brought to the orthogonal plane were overlapped by increasing their section thicknesses (Fig. 5B).
Statistical analysis. Mean, standard deviation, minimum and maximum values were included in the descriptive statistics of each parameter according to sex group. The Anderson-Darling normality test was applied to each parameter to check whether the data were normally distributed. The two-sample t-test was applied to parametric data and the Mann-Whitney U test to nonparametric data, and a value of p ≤ 0.05 was considered statistically significant. To reveal the differences between the parameters in terms of sex, ROC analysis was performed and the ROC curve was included. The Minitab 17 and IBM SPSS (Version 21) package programs were used in the analyses.
Table 6. Angle parameters and abbreviations.