The use of canid tooth marks on bone for the identification of livestock predation

Historically wolves and humans have had a conflictive relationship which has driven the wolf to extinction in some areas across Northern America and Europe. The last decades have seen a rise of multiple government programs to protect wolf populations. Nevertheless, these programs have been controversial in rural areas, product of the predation of livestock by carnivores. As a response to such issues, governments have presented large scale economic plans to compensate the respected owners. The current issue lies in the lack of reliable techniques that can be used to detect the predator responsible for livestock predation. This has led to complications when obtaining subsidies, creating conflict between landowners and government officials. The objectives of this study therefore are to provide a new alternative approach to differentiating between tooth marks of different predators responsible for livestock predation. Here we present the use of geometric morphometrics and Machine Learning algorithms to discern between different carnivores through in depth analysis of the tooth marks they leave on bone. These results present high classification rates with up to 100% accuracy in some cases, successfully differentiating between wolves, dogs and fox tooth marks.


Results
Most graphic results present a relatively high degree of separation among groups, especially in the case of biometric data (Fig. 2). In all cases the degree of separation is significant, as established by multivariate statistical results presented in Table 1. In depth evaluation of sample distribution in each principal component feature space indicate wolves to be the most variable when analyzing tooth mark morphologies. Additionally, foxes appear to be present the least amount of variation, appearing as a tight cluster in both metric (Fig. 2) and geometric morphometric PCAs (Figs 3 and 4).
PCAs for metric analyses are generally presented by PC scores representing a high degree of variance across the first two principal components. In both the inclusion (7 PC Scores in total) and exclusion of OA (6 PC Scores in total), the cumulative proportion of variance represented is over 90%, with the biplot revealing a tendency for PC1 to represent almost all of the variables with the exception of OA (Fig. 2). The tendency for wolves to occupy an area of the graph with larger tooth marks and the fox to appear on the opposing extremity can be logically explained considering the size of the animals being compared. Considering the much smaller dimensions of fox cuspids in comparison with those of wolves, it is reasonable to assume that the variable size holds a significant weight on the comparison of these different samples.
When observing differences in pure morphological feature space, excluding the influence that tooth mark size may have on the sample, differences are still relatively clear ( Table 1). Overlapping of samples increase however patterns in feature space can still be observed for both tooth pit and score marks. Fox tooth scores remain to occupy a more restricted proportion of feature space, while wolves are represented by the highest degree of morphological variability. In shape space (Figs 3A and 4A), dog tooth marks begin to overlap more with samples produced by wolves, yet their differences to fox tooth marks remain clear. In form space (Figs 3B and 4B), overlapping remains however begins to present a clearer separation amongst samples in the case of tooth scores (Fig. 3B). In the case of scores (Fig. 3), geometric morphometric data is represented by a relatively small number www.nature.com/scientificreports www.nature.com/scientificreports/ of dimensions (Shape = 10 PC scores, Form = 14 PC scores) with a slightly clearer separation amongst samples than in the case of pits (Fig. 4). Nevertheless, geometric morphometric data for the case of pits is represented by a high number of dimensions (Shape = 44 PC scores, Form = 51 PC scores), and remain to present significant differences in the distributions of samples as demonstrated by MANOVA results for both shape and form (Table 1).
Exploring morphological variation through results obtained in grid warping calculations reveal foxes to produce deeper tooth marks of both types, while wolves tend to produce more superficial marks. Dogs in each study appear to occupy a midway point between both fox and wolf samples.  www.nature.com/scientificreports www.nature.com/scientificreports/ Finally, SVM models were able to efficiently construct hyperplanes that separated all samples with a 100% classification rate on all types of data sets ( Table 2). The global loss across each model indicate highly accurate decision boundaries, trained and computed in less than 1/10th of a second. Furthermore, obtaining optimal hyperparameters in model tuning took an average of 124.9 milliseconds. Based on classification-misclassification ratios obtained in model evaluation, Kappa, Sensitivity and Specificity metrics are able to highlight the potential these models have for processing data with low rates of misclassification. Considering each of these evaluation metrics and the computational power required to run these algorithms, SVMs can be seen to be the most efficient classification models for morphological data of any type.

Discussion and conclusions
This study presents an additional case of high classification amongst carnivore tooth marks using a hybrid geometric morphometric and machine learning methodological approach. Here we have been able to obtain 100% classification between tooth marks produced by Irish Setter gundogs, wolves and foxes. These results confirm that differentiation is possible and could provide a useful tool for discerning the agents responsible for livestock predation. This is an important advance that could be used to resolve a number of cases, such as insurance fraud 26 , and thus lower tension between landowners and government officials [23][24][25] .
Across areas of Europe and Northern America, the conflict between wolves and farmers is intense. To relieve said tensions, governments have directed large amounts of funding towards the compensation of livestock owners 3,8,16,[20][21][22][23][24][25][26] , nevertheless, for many these subsidies are considered inefficient and thus push for a general reduction of wolf populations 3,24,26 (Supplementary Note S1). This is especially dangerous in areas such as Sweden and Norway where wolf populations are especially low [20][21][22] . On the other hand livestock owners are currently pushing for faster and larger subvention allocations 24,26 , insisting that governments help establish plans to protect livestock including an increase in the rearing of shepherd dogs and the erection of electric fences among others 3,[5][6][7]12,15,25,26 . Nevertheless, multiple sectors argue the wolf to not be the sole cause of livestock death, accusing dogs to be a problem in a number of cases 30,32,33 . In the case of England an approximation of 20,000 yearly attacks have been recorded annually (Supplementary Note S1). In these same areas, evidence exists to argue the wolf to preferably www.nature.com/scientificreports www.nature.com/scientificreports/ attack wild animals over domestic livestock [16][17][18][19] . In response to this, the hereby proposed methodology may prove to be an important development in protecting wolf populations from this generated tension. This is especially evident when determining the agent responsible for animal death cannot be confronted using typical approaches [34][35][36] thanks to the intervention of other agents such as vultures. This study hereby presents an alternative that can be used to investigate insurance claims concerning livestock predation.
Hybrid artificially intelligent algorithms and morphological data processing provide a starting point for further research into different factors of agricultural and natural science studies. This could consequently reduce the  amount of erroneous compensation costs spent by the government in insurance claims of this type. Additionally, while structured light surface scanning was chosen for this study, a further range of other high resolution techniques are also available, including microphotogrammetry and microscopy [38][39][40][41] . Coupled with the statistical techniques employed here, investigation into these type of attacks on livestock can now employ a new methodological approach for carcass analysis. The methods and results presented here are additionally the first to study the different types of canid carnivores in depth. Here we have obtained higher accuracy and classification rates than obtained in some previous efforts to differentiate between other types of carnivores, including jaguars, hyenas and lions, through morphological analyses of their pits 42,43 and scores 43,44 . These high classification rates have been obtained using a new line of investigation into morphological studies, showing the potential of Artificially Intelligent Machine and Deep Learning algorithms for the processing of such data sets 43 . These methods are able to overcome more traditional techniques into carnivore studies at a much higher resolution.
Nevertheless, it is important to point out that this is not yet the absolute solution to all analytical problems in related fields. These valuable results should be interpreted as an important advance that should guide future experimentation and investigation in order to expand our experimental samples and develop our understanding of different carnivore feeding habits. Similarly, larger sample sizes could be key to building highly robust computational models that can obtain even better results.
From one perspective, investigation is needed to confront whether the size of prey be a conditioning factor in tooth mark morphology, as has been observed in the case of other types of carnivore produced damage [42][43][44] . While previous efforts regarding other types of archaeological bone surface modifications have argued the size of the animal to not be a conditioning factor in mark morphology 45 , this question is still to be confronted in the case of carnivore tooth marks. If variances were to be observed, the data provided in this study would be applicable only to wolf and dog prey of a large size (including bovids and equids), while further experimental reference collections would be needed for other animal sizes.
Moreover, the inclusion of more carnivore samples can be considered of great importance to resolve similar conflicts in other areas of the world where wolves are not the only predator. In areas such as Southern America, Asia and Africa, predators such as the cougar, jaguar 10,11,25,34,46 , leopard 47,48 and tiger 49,50 also require investigating. In the case of Europe we should additionally include the bear 24 . It may thus be considered necessary that research of this type be developed further. Through this, we may be able to provide an additional means of investigating numerous questions regarding agricultural and environmental sciences with a greater ecological, economic and social impact.

Material and Methods
A total of 83 carnivore tooth pits and 105 tooth scores were studied and compared for this current sample. All marks were collected on long bones of tibiae and radii, considering how diaphysis are denser than epiphyses and are thus more likely to survive during carnivore feeding.
The samples included tooth marks generated by foxes (pits = 29, scores = 41), wolves (pits = 24, scores = 30) and a typical breed of gundog; the Irish Setter (pits = 30, scores = 34). Tooth marks produced by wolves (pits = 24) were obtained from horses of the Cabárceno Natural Park, Cantabria 51 , as well as natural wolf sites in mount Campelo, near Sobrado Dos Montxes, Galicia (scores = 30) 52 . Tooth marks produced by foxes were obtained on samples originating from sheep of Ayllón, Segovia 53 . Dog sampes were obtained on cow bones from Madrid.
Digital reconstructions of different carnivore tooth marks were performed using the David SLS-2 Structured Light Laser Scanner, located in the TIDOP laboratory at the University of Salamanca (Spain). The digital reconstruction protocol 38 employed the use of a DAVID USB CMOS Monochrome camera, an ACER K132 projector, and a calibration marker board. The scanning process produced 3D models in less than a minute, producing a density of up to 1.2 million points. Final models correspond to meshes that are produced using an algorithm based on the Delaunay triangulation strategy 38 .
Virtual reconstructions of each mark were then imported into different software for landmarking. The Global Mapper software was used to extract cross sections from scores, using tspDig2 (v.2.1.7) for the collection of landmark data. The Avizo software was used for processing entire tooth pits in 3D. Two different landmark models were employed, using a 7-landmark 2D model for scores 44 and a 17-landmark 3D model for pits 42 . Landmark coordinates were then used to calculate the seven measurements described by Bello and Soligo 39 , which have been successfully adapted and employed for tooth mark analysis 44 . These calculated measurements consider the thickness, depth and various angles of each groove. Both geometric morphometric models and the seven measurements employed in this study have been visually described in Supplementary Fig. S1. Landmark coordinate data was then extracted and converted into a standardized format to be imported into R for further statistical treatment.
Data analysis. Metric data was first processed using a Principal Components Analysis (PCA). Measurements were then tested for statistical significance using standard multivariate approaches (MANOVA). Depending on the inter-group variability present within each sample, the MANOVA test employed either a Hotelling-Lawley or Wilk's Lambda test for homogenous or inhomogeneous samples respectively. These steps were repeated including as well as excluding the variable Opening Angle (OA), thus adjusting for this variables' weight on the overall results 54 .
For geometric morphometrics, landmark coordinates were first subjected to an orthogonal tangent projection, known as Generalized Procrustes Analysis (GPA), in order to normalize data for further multivariate statistical analyses [55][56][57] . From this, a PCA is performed, reducing the degree of variance to fewer dimensions in order to provide a more efficient comparison of morphology. Additional to analyses of pure morphology in shape space, form space was investigated after re-scaling the data obtained after Procrustes superimposition through the natural logarithm of Centroid Size. Thin Plate Splines were then calculated to explore morphological variation across (2019) 9:16301 | https://doi.org/10.1038/s41598-019-52807-0 www.nature.com/scientificreports www.nature.com/scientificreports/ the different Principal Component (PC) scores 58 . Additional MANOVA testing was performed to assess the significance of morphological variance. PC scores were then extracted to be used for the training of the Machine Learning model.
Considering its recent success in geometric morphometric applications 43 , a supervised Support Vector Machine (SVM) algorithm was used to train a classification model based on morphological tooth mark data. SVMs map out input vectors into a non-linear high dimensional feature space, using hyperplanes to calculate the degree of separation between samples 59,60 In order to define said feature space, a kernel function was used. This helps to overcome traditional limitations imposed by linearity. The constructed hyperplane can be described as a discriminant classifier decision surface which uses a maximized margin or decision boundary to reduce chances of overfitting. The consequent hyperplane divides the samples into n -1 dimension, where n is the number of variables.
Data was bootstrapped 1000x to overcome issues that may be produced by the size of samples. Models were then trained using a 70:30% training/testing split. For geometric morphometrics SVMs were trained on the PC scores, while for biometric data the models were trained directly on the measurements. Considering previous observations on the value of the variable size in geometric morphometric analysis of taphonomic traces 43 , each model was constructed using either form or shape data considering the type of tooth mark being analyzed. Following the statistical protocol described by Courtenay et al. 43 , shape was used for carnivore tooth score marks while form was used for the processing of pits.
SVMs were then trained using k-fold Cross Validation (k = 10) in order to ensure the model could efficiently adjust its weights. Model optimization was performed following a standard Machine Learning protocol using back propagation to adjust weights during optimization. The corresponding objective function thus employs Gradient Descent to minimize the loss and control overfitting. Additional hyper parameter optimization included the fine tuning of cost and gamma values, established via a random search loop function programmed in R. This algorithm ran for 50 iterations during the tuning of SVM models, using a random combination of cost and gamma values until finding the optimum setting that can ensure the best separation of samples in feature space. These optimal parameters are then extrapolated and used to construct the final classification model.
Model evaluation followed a standardized Machine Learning and Deep Learning approach, using Kappa, Sensitivity, Specificity and Balanced Accuracy values 60 . These values are presented as numbers between 0 (poor) and 1 (high performing) and are calculated via a confusion matrix, assessing the rate and ratio of misclassification and correct classification results. These metrics are interpreted by considering how low Sensitivity and Specificity values indicate high misclassification rates, while the accepted threshold of the Kappa statistic considers a model performing >0.8 to be powerful 60  Finally, the SVM training process was microbenchmarked employing 200 iterations and a mean time value in milliseconds was taken as a final result.
A final R code that can be executed on any type of morphological file is available online at https://github.com/ LACourtenay/Support-Vector-Machine-for-Morphological-Analysis. A similar version of this code is also available in this repository for the processing of measurements stored in a csv comma delimited file format.