Integrated image and location analysis for wound classification: a deep learning approach

The global burden of acute and chronic wounds presents a compelling case for enhancing wound classification methods, a vital step in diagnosing and determining optimal treatments. Recognizing this need, we introduce an innovative multi-modal network based on a deep convolutional neural network for categorizing wounds into four categories: diabetic, pressure, surgical, and venous ulcers. Our multi-modal network uses wound images and their corresponding body locations for more precise classification. A unique aspect of our methodology is incorporating a body map system that facilitates accurate wound location tagging, improving upon traditional wound image classification techniques. A distinctive feature of our approach is the integration of models such as VGG16, ResNet152, and EfficientNet within a novel architecture. This architecture includes elements like spatial and channel-wise Squeeze-and-Excitation modules, Axial Attention, and an Adaptive Gated Multi-Layer Perceptron, providing a robust foundation for classification. Our multi-modal network was trained and evaluated on two distinct datasets comprising relevant images and corresponding location information. Notably, our proposed network outperformed traditional methods, reaching an accuracy range of 74.79–100% for Region of Interest (ROI) without location classifications, 73.98–100% for ROI with location classifications, and 78.10–100% for whole image classifications. This marks a significant enhancement over previously reported performance metrics in the literature. Our results indicate the potential of our multi-modal network as an effective decision-support tool for wound image classification, paving the way for its application in various clinical contexts.


Introduction
Wound diagnosis and treatment are a pressing issue worldwide, with a considerable population suffering from wounds.As per a 2018 retrospective analysis, the costs for wound treatment have been estimated to be between $28.1 billion to $96.8 billion [3], reflecting the tremendous financial and medical burden.The most commonly observed wounds include diabetic foot ulcer (DFU), venous leg ulcer (VLU), pressure ulcer (PU), and surgical wound (SW), each associated with a significant portion of the population [22][23] [24] [25].Given these circumstances, effective wound classification is crucial for timely and adequate treatment.Until recently, wounds were predominantly classified manually by specialists, often leading to inconsistencies due to lack of specific guidelines.However, the advent of artificial intelligence (AI) has brought about significant changes in healthcare, including wound diagnosis [4].AI, specifically deep learning (DL), has proven to be a game-changer in medical image analysis, enabling accurate, time-efficient, and cost-effective wound classifications.Data-driven techniques like DL, which require minimal human intervention, have been extensively utilized for identifying patterns and relationships in complex data [5] [11].DL encompasses several methods, like Convolutional Neural Networks (CNN), Deep Belief Networks (DBN), Deep Boltzmann Machines (DBM), Stacked Autoencoders, and many more.These techniques have found applications in various medical diagnostic fields, including wound image analysis [26] [29].Studies have underscored the effectiveness and efficiency of deep convolutional neural networks in wound diagnosis and analysis [14][15] [16].Notwithstanding the advancements, the accuracy of wound classification models remains constrained due to the partial information incorporated in the classifiers.The present research introduces an innovative approach to address this limitation by including wound location as a significant feature in the wound classification process.Wound location, a standard entry in electronic health record (EHR) documents, is instrumental in wound diagnosis and prognosis.A body map has been utilized to facilitate accurate and consistent wound location documentation [28], enhancing the classifier's performance by providing a more holistic set of data for classification.The classifier trained on both image and location features outperforms those reliant solely on image data.A simplified workflow of this study is shown in Figure 1.The developed wound classifier takes both wound image and location as inputs and outputs the corresponding wound class.

Related Works
In this review, we revisit the relevant research in the field of wound image classification, segmented into categories based on the methodology of each study.

A. Deep Learning Based Classification
A.1.Convolutional Neural Networks (CNNs) with SVM: A method proposed by Abubakar et al. [30] distinguished between burn wounds and pressure ulcers using pre-trained deep architectures such as VGG-face, ResNet101, and ResNet152 in combination with an SVM for classification.Similarly, Goyal et al. [31] predicted the presence of infection or ischemia in Diabetic Foot Ulcers (DFUs) using Faster RCNN and InceptionResNetV2 networks, in combination with SVM.

A.2. Advanced Deep Learning Techniques:
Advanced methods involving two-tier transfer learning were utilized in studies which used architectures like MobileNet, InceptionV2, and ResNet101.Goyal et al. [32] presented DFUNet for classification of DFUs, while Nilsson et al. [33] applied a CNN-based method using VGG-19 for venous ulcer image classification.In another significant study, Alaskar et al. [38] applied deep CNNs for intestinal ulcer detection in wireless capsule endoscopy images.Using AlexNet and GoogleNet architectures, they reported a classification accuracy of 100% for both networks.Ahsan et al. [41] discusses the use of deep learning algorithms to automatically classify diabetic foot ulcers (DFU), a serious complication of diabetes that can lead to lower limb amputation if untreated.The authors examined various convolutional neural network (CNN) architectures, including AlexNet, VGG16/19, GoogleNet, ResNet50, MobileNet, SqueezeNet, and DenseNet.They used these models to categorize infection and ischemia in the DFU2020 dataset.To address the issue of limited data and to reduce computational cost, they fine-tuned the weights of the models.Additionally, affine transform techniques were employed for data augmentation.The results revealed that the ResNet50 model achieved the highest accuracy rates, reaching 99.49% for ischemia and 84.76% for infection detection.

A.3. Multi-Class Classification
Techniques: Shenoy et al. [34] proposed a method to classify wound images into multiple classes using deep CNNs.Rostami et al. [36] proposed an ensemble DCNN-based classifier to classify entire wound images into surgical, diabetic, and venous ulcers.Anisuzzaman et al. [28] proposed a multi-modal classifier using wound images and their corresponding locations to categorize them into multiple classes, including diabetic, pressure, surgical, and venous ulcers.This paper introduced an image and location classifier and combined it together to create a multi-modal classifier.In this study, two different datasets were used namely AZH dataset that consists of 730 wound images with four classes, Medetec dataset which consists of 358 wound images with three classes.Also, they introduced a new dataset AZHMT dataset which is a combination of AZH and Medetec dataset containing 1088 wound images.The reported maximum accuracy on mixed-class classifications varies from 82.48% to 100% in different experiments and maximum accuracy on wound-class classifications varies from 72.95% to 97.12% in various experiments.

B. Wound Image Classification Using Novel Approaches
Novel techniques have been presented to overcome challenges in wound classification.Alzubaidi et al. [35] introduced DFU_QUTNet, a deep architecture model for classification of DFUs.They reported a maximum F1-Score of 94.5% obtained from combining DFU_QUTNet and SVM.Another interesting method was presented by Sarp et al. [37] who classified chronic wounds using an explainable artificial intelligence (XIA) approach.

C. Traditional Machine Learning-Based Classification
C.1.SVM-Based Techniques: Traditional machine learning techniques have also found significant use in wound image classification.Yadav et al. [39] used color-based feature extraction and SVM for binary classification of burn wound images.Goyal et al. [40] used traditional machine learning and DCNN techniques for detecting and localizing DFUs, with Quadratic SVM classifiers trained on feature-rich patches extracted from the images.Through this review, it is evident that both traditional and advanced machine learning techniques have demonstrated promising results in wound image classification, providing valuable insights for future research in this area.

Materials and Methods
This study encompasses three distinct subsections, each elucidating the specific methodology employed in this study: Whole Image Classification, Region of Interest (ROI) Extracted Image Classification, and ROI with Body Map Location Image Classification.It should be noted that each of these subsections utilizes the same fundamental base classifier for the image data analysis.Datasets were anonymized, partitioned, and augmented before processing through a proposed architecture.The proposed model incorporated transfer learning, convolution blocks, axialattention mechanisms, and Adaptive-gated MLP.Model performance was evaluated using accuracy, precision, recall, and the F1-score.

A.1 AZH Dataset:
The AZH Dataset is a collection of prefiltered 730 ROI images and 538 Whole wound images, varying in size and depicting four types of wounds: venous, diabetic, pressure, and surgical.Captured over two years at Milwaukee's AZH Wound and Vascular Center, the images were collected using an iPad Pro (software version 13.4.1)and a Canon SX 620 HS digital camera, and subsequently labeled by a wound specialist from the center.While most of the dataset comprises unique patient cases, some instances involve multiple images from a single patient, taken from different body sites or at varying stages of healing.These were classified as separate due to distinct wound shapes.This dataset, unfortunately, couldn't be expanded due to resource limitations.It's important to note that the data doesn't involve any human experimentation or usage of human tissue samples.Instead, it utilizes de-identified wound images, publicly available at link: (Link).Each image only includes the wound and immediate skin area, eliminating any unnecessary or personal data to protect patient identity.The University of Wisconsin-Milwaukee has vetted the dataset's use for compliance with university policy.Figure 3 and Figure 4 show images from whole and ROI images.

B. Body map for location
A body map serves as a simplified, symbolic, and accurately phenotypic representation of an individual's body [42].Primarily used in the medical field, body maps are effective tools for identifying and locating physical afflictions such as bruises, wounds, or fractures.They are especially valuable in forensic science for identifying bodily changes during post-mortem examinations and in medical practice for pinpointing the location of infections [43].By offering a detailed overview of the body, they inform practitioners about other body areas that might be affected and require attention during the healing process.Furthermore, in the realm of scientific research, body maps function as verifiable evidence, validating observable bodily changes caused by internal diseases.The design of a comprehensive body map with 484 distinct parts is credited to Anisuzzaman et al. [28].PaintCode [44] was employed to prepare this body map, with initial references being drawn from several credible sources [45][46] [47].The fundamental framework for this design originated from the Original Anatomy Mapper [48], which directly paired each label and outline.The extreme intricacy involved in the detailed depiction of each feature on the body map led to a pre-selection of 484 features or regions.This process was overseen and approved by wound professionals at the AZH wound and vascular center, ensuring the map's medical accuracy and applicability.Major part of body map is shown in the Figure 2.Each number denotes a location in this case.Table 1 shows a few examples of locations and their related numbers.

C.1. ROI extraction:
The extraction of Region of Interest (ROI) from wound images presents a robust methodology for diagnosing and tracking wound progression.As aforementioned, the ROI includes the wound itself and a portion of the surrounding healthy skin, which collectively encompasses the vital elements of the wound's condition.The developed wound localizer is a significant tool for this extraction process, as it is capable of automatically cropping single or multiple ROIs from each image [49].
Each ROI represents one of six categories -diabetic, venous, pressure, surgical, background, and normal skin.These categories are critical in understanding the etiology of the wound, allowing for more accurate and personalized treatment plans.However, the diversity of the wounds is also reflected in the different sizes and shapes of the extracted ROIs, each telling a unique narrative of the wound's journey.Importantly, the ROI's rectangular form and variable size allow for greater adaptability in handling various wound types and sizes.It is an efficient method to focus on the essential wound characteristics while reducing unnecessary information that could potentially introduce noise into the data.

C.2. Data Split:
During this study, we utilize two distinct methods to partition the dataset.The utilization of two distinct partitioning provides a richer understanding of the model's behavior, potential biases, sensitivities, and the ability to generalize.It also facilitates a more robust and comprehensive evaluation process.The first approach consisted of splitting the data into training (70%), testing (15%), and validation (15%).The second partitioning method diverged slightly from the first, allocating 60% of the data for training, 15% for validation, and increasing the testing set to comprise 25% of the total dataset.Table 2 shows both types of dataset splits on ROI images.

C.3. Data Augmentation:
Each image in the training set was augmented using transformation methods such as resizing, rotation, flipping (both vertically and horizontally), and application of affine transforms (including scaling, rotating, and shifting).Additional alterations such as the application of Gaussian noise and coarse dropout (i.e., random rectangular region removal) were also performed.These transformations were probabilistically applied, creating a diverse set of augmented samples as shown in Figure 5.The transformations ensured robustness of the model against variations in the data.

D.1 Base Models:
The proposed model utilizes three pre-trained Convolutional Neural Networks (CNNs) -ResNet152, VGG16, and EfficientNet-B2.In ResNet152, modifications include not only the removal of the average pooling and fully connected layers, but also alterations to the third block of the second layer by removing its last two layers and removing the final four layers of the overall model.For VGG16, the last twelve layers are omitted, capturing more primitive patterns.
The last layer of EfficientNet-B2 is removed to maintain consistency with the modifications made to the other two models.These models, applied in parallel to the input, capture different levels of features.

D.2 Custom Layers:
The custom layers comprise a Convolutional Block (ConvBlock), followed by a Parallel Squeeze-and-Excitation (P_scSE) block [51], and a dropout layer.The ConvBlock is a combination of a convolution layer and a ReLU activation function, capturing spatial information and introducing non-linearity.
The P_scSE block blends Channel-wise Squeeze-and-Excitation (cSE) and Spatial Squeeze-and-Excitation (sSE) operations.The cSE focuses on channel interdependencies, providing global context, while the sSE concentrates on the spatial interdependencies of each channel, maintaining detailed spatial information.Outputs from the cSE and sSE are merged using max-out and addition operations [51] as shown in Figure 7.

D.3 Aggregation and Fully Connected Layers:
The base models' outputs are concatenated and fed through sequences of ConvBlocks and P_scSE blocks to merge and process multi-level features.The output is then flattened and passed through a dense layer.The output is further processed through a fully connected layer block.This block includes two dense layers enriched with axial-attention mechanisms, an enhancement over traditional attention mechanisms, focusing on individual dimensions separately.Interspersed with ReLU activation functions and dropout operations, the axial-attention mechanism boosts important features, helping the model to recognize complex patterns and dependencies in the data.
If wound location is used along with image data, then we use Adaptive-gated MLP to analyze wound location separately before concatenating them with the fully connected layers in the above model.This module is constructed as a series of linear transformations, axial attentions, and ReLU activations, followed by a final linear transformation to map the output to the target size.The MLP is gated, meaning it can learn to selectively propagate information through the network.
The orderly arrangement of data is vital for the efficient functioning of the model.Consistency in the output from the image and location data is essential, thus necessitating the synchronous feeding of properly sequenced data into the model.This alignment was maintained by associating each Region of Interest (ROI) with a unique index number and mapping the corresponding wound location to this number.Given the categorical nature of wound location data, it was represented using one-hot encoding.

D.4 Output Layer:
The final dense layer maps to the number of output classes.

E. Performance Metrics
In our study, we employed various evaluation metrics such as accuracy, precision, recall, and the F1-score to scrutinize the effectiveness of the classifiers.The related mathematical formulations for these assessment metrics are illustrated in Equations 1 through 4. The abbreviations TP, TN, FP, and FN in these equations stand for True Positive, True Negative, False Positive, and False Negative, respectively.For a more comprehensive understanding of these equations and associated theories, readers are referred to reference [50].

Results
In the present investigation, we deployed the advanced computational capacities of Google Colab Pro Plus A100, fortified with 40GB of memory.This enabled a methodical analysis involving both Region of Interest (ROI) and whole image-based classifications.The experimental setup involved processing images of 256x256 pixel dimensions, batched into groups of 32, across a course of 100 epochs.Our learning parameters were finely tuned to optimize the learning process: a learning rate of 0.0001 was chosen, with a minimum rate limit of 0.00001.To enhance the efficiency of our learning process, we applied the Adam optimizer [52].

Classification Categories:
The classifiers were extensively trained to distinguish among various classes represented in the images, specifically: Diabetic (D), Venous (V), Pressure (P), Surgical (S), Background (BG), and Normal Skin (N).Further specifications and results regarding these classes will be provided in the ensuing sections of this paper.
Loss Function: Cross Entropy was chosen as our loss function, given the multi-class and binary nature of our image classifications.Its mathematical formulation is as follows 26 [56]: For multi-class problems, the cross-entropy loss, L, is: Here yi is the actual label and (pi) is the predicted probability for each class (i).For binary classification problem, the binary cross entropy loss, L, is computed as: The optimization process strives to minimize this loss, thereby reducing the discrepancy between our model's predictions (pi) and the actual labels (yi).Further sections will elucidate the efficacy of this loss function within our research context.

A. ROI Classification
The primary phase of the ROI classification trial pertains to the classification of 6 unique types of wound patches, specifically: diabetic, venous, pressure, surgical, BG, and N. Subsequently, the 5category classification problem comprised three types of wound labels alongside BG and N categories.When addressing the 4-category classification, the objective centered on the categorization of the wound patches into one of the four classes: BG, N, along with two different wound labels.In the context of 3-category classification, the aim was to sort the wound patches into one of the three groups: D, P, S, V.For binary classification, a range of combinations including N, D, P, S, V were utilized to categorize the wound patches into two distinct groups.

B. Whole Image Classification
In the whole image classification, the precision, recall, and F1-score measurements show that the incorporation of these metrics, alongside accuracy, provides a more comprehensive understanding of the model's performance.Table 5 depicts these additional measurements, and they reveal interesting patterns that match with the observed accuracy rates.
For the 4-class classification comprising D, P, S, and V, precision, recall, and F1-scores were observed at 83.22%, 83.13%, and 82.26% respectively for the 70-15-15 data split.For the 60-15-25 split, these scores were slightly lower, coming in at 78.60%, 78.10%, and 76.75%, respectively.This pattern is similarly reflected in the accuracy measurements for the same class combination and data splits.
In the 3-class classification, the D, S, V combination showed a high precision of 93.48%, recall of 92.64%, and F1-score of 92.54% for the 70-15-15 split.Conversely, the D, P, S combination demonstrated lower values, with a precision of 82.66%, recall of 81.35%, and F1-score of 80.72% in the same split.Focusing on the 2-class classification, all N-related combinations (N, D; N, P; N, S; N, V) achieved perfect precision, recall, and F1-score of 100% in both data splits.However, other combinations like D, P and P, S displayed lower scores.The D, P combination, for instance, recorded precision, recall, and F1-score of 89.38%, 87.17%, and 86.50% respectively for the 70-15-15 split, and 86.03%, 84.37%, and 83.61% respectively for the 60-15-25 split.
In conclusion, the whole image classification performance, as depicted by precision, recall, F1score, and accuracy, varies based on the number of classes and the specific class combinations.Nrelated combinations in the 2-class category consistently showed perfect precision, recall, and F1scores, indicating optimal classification performance.These results provide significant insights and avenues for further research and optimization in whole image classification.

C. Cross Validation
Cross-validation is a robust methodology we employed in our experiments to validate the performance of our machine learning model.It involves splitting the data into several subsets or 'folds', in this case, five.We then train the model on all but one of these folds, testing the model's performance on the unused fold as displayed in Table 7 .This process is repeated for each fold, giving us a better understanding of how our model might perform on unseen data.It's particularly useful in situations where our dataset is limited, as it maximizes the use of data for both training and validation purposes.Due to resource constraints, our experimental scope was confined to select procedures.As such, we were only able to conduct a limited subset of experiments, focusing on those deemed most crucial and promising.
In the first scenario, we explored an approach called "ROI without location" with an 80-20 data split.Here, the average accuracy varied across different groupings of classes.The accuracy for a grouping of six classes fluctuated between 80.01% to 85.34%, giving an average of 82.58%.For five classes, it varied from 80.71% to 87.14%, with an average of 82.28%.In a group of four classes, we observed a higher average accuracy of 95.65%, while three classes gave us an average accuracy of 74.80%.
The second method we looked at was "ROI with location".Here, we noticed a similar pattern to our first method.The six-class grouping showed an average accuracy of 83.77%, with individual tests ranging from 80.10% to 86.91%.The five-class grouping had an average accuracy of 81.85%, ranging between 78.57% and 84.28%.For four classes, the average accuracy was high again at 95.50%, while the three classes gave us an average of 76.60%.Finally, we examined the "whole image" method with the same 80-20 data split.A four-class grouping resulted in an average accuracy of 78.34%.One group of three classes managed a much higher average accuracy of 89.86%, while the other three-class group had an average accuracy of 78.22%.[32], and Aguirre et al. [33].The comparison is based on accuracy as the evaluation metric, which is a common criterion for classification tasks.For each work, we have tested our model on the same dataset and compared the results as displayed in Table 8. Figure 8 and Figure 9 display confusion matrix for 3-class (P, S and V) and 6-class (BG, N, D, P, S, V). Figure 9 also display ROC plots for 6-class ROI image with location-based image classification.In the case of Rostami et al.'s work [36], the classification was carried out on a 6-class, 5-class, and 4-class basis using the AZH dataset.Our model outperformed the previous work by a notable margin across all class divisions.For example, in the 6-class division (BG, N, D, P, S, V), our model improved the accuracy by approximately 11.73%.Furthermore, for the 5-class and 4-class divisions, our model consistently showed improvements, highlighting its efficiency and robustness.Anisuzzaman et al.'s work [28] also used the AZH dataset, with a focus on 6-class, 5-class, and 4class divisions.Our model yielded better accuracy results, such as an increase of about 1.34% in the 6-class division.The consistency of improved performance in all divisions showcases the broad applicability of our model.As for the work of Goyal et al. [32], they only classified into a 2-class division (N, D) using the DFU dataset.When tested on the AZH dataset, our model demonstrated 100% accuracy, similar to their findings.This highlights the versatility of our model in achieving high accuracy across different datasets.Aguirre et al. [33] conducted their research on a dataset of 300 wound images with a 2-class division.Their model yielded an 85% accuracy rate, while our model, when tested on the AZH dataset, showed a significant improvement in the accuracy rates, ranging from 92.70% to 100%.

B. Limitations and Future Research:
While our research demonstrates the strengths of our model, it is not without limitations.For instance, all comparisons in our current study were conducted using the AZH and Medetec dataset.Although our model performed commendably on this dataset, the results might not be fully generalizable to all datasets.Hence, the applicability of our model to other datasets remains an area for further investigation.It's noteworthy that our study does not solely rely on accuracy as the evaluation metric.In an attempt to provide a comprehensive evaluation, we also considered other metrics such as precision, recall, and the F1 score.This thorough approach helps to give a well-rounded understanding of our model's performance.However, despite its strong performance, there could be scenarios or datasets where the model might not yield the same level of success, a potential caveat to be explored in future work.Future research should be focused on testing the model with larger and more diverse datasets to ensure its generalizability.Specifically, addressing the issue of overlap between healthy and diseased skin, possibly through refining the image preprocessing or feature extraction stages, could yield significant improvements.Furthermore, conducting comparative studies using a wider range of evaluation metrics could offer a broader understanding of the model's strengths and weaknesses.In addition to further empirical evaluation, there is also potential to investigate the theoretical properties of the model.Understanding why the model performs as it does could lead to insights that drive further improvements.

Conclusion
In this study, we presented a multi-modal wound classification network that uniquely incorporates both images and corresponding wound locations to categorize wounds.Differing from previous research, our approach utilizes a pre-existing body map and two datasets to classify wounds based on their locations.Our model is built on a novel deep learning architecture, featuring parallel squeeze-and-excitation blocks (P_scSE), adaptive gated multi-layer perceptron (MLP), axial attention mechanism, and convolutional layers.The integration of image and location data contributed to superior classification outcomes, demonstrating the potential of multi-modal data utilization in wound management.Despite the benefits, our work has some limitations, including data scarcity which affects the generality of our model.Looking ahead, future research will aim to enhance our model by incorporating more modalities such as pain level, palpation findings, general observations, wound area and volume, and patient demographics.Addressing data overlaps in wound location will also be a priority to enhance classification accuracy.Our efficient wound care algorithm has significant potential for automation in wound healing systems, offering cost-effectiveness and aiding clinicians in prompt diagnosis and development of suitable treatment plans.Especially in resource-scarce areas, AIenabled wound analysis can contribute to rapid diagnosis and quality treatment.However, this necessitates proper technical training for both patients and physicians, which will also be a focus of future work.Expanding our dataset will help improve our model's performance and better serve wound care providers and patients alike.

Figure 1 :
Figure 1: Expected workflow of this research

Figure 4
Figure4excellently illustrates the variation in extracted ROIs from different classes of the wound dataset.This showcases the versatility of our wound localizer, capable of handling wounds of different origins, sizes, and stages.It successfully extracts the ROI, making the most relevant information available for analysis.

Figure 3 :
Figure 3: Sample images from the AZH Wound and Vascular Center database.The rows from top to bottom display diabetic, pressure, surgical and venous samples, respectively.

Figure 4 :
Figure 4: Sample ROIs.The columns from left to right display background, normal skin, diabetic, pressure, surgical and venous ROIs, respectively.

Figure 5 :
Figure 5: Data Augmentation with leftmost original image.The rows from top to bottom display background, normal skin, diabetic, pressure, surgical and venous ROIs, respectively.

Figure 6 :
Figure 6: Proposed model architecture outline

Figure 8 :
Figure 8: Confusion matrix for three class classification on P-S-V, left column displays dataset with 70/15/15 and right column displays dataset with 60/15/25.

Table 1 :
Body Map examples of Lower leg region

Table 2 :
AZH and Medetec original dataset count and class Abbreviations [28]ROI and wound location:In the ROI dataset we have two additional classes named normal skin and background which were created manually by selecting skin region for normal skin and any additional information as background from the original whole image dataset Error!Bookmark not defined.[28].Sample of these two classes are shown in Figure4.All of these were verified by wound specialists.Wound location was associated with each ROI image and assigned values from the body map discussed in section B. All the six classes abbreviation is shown in

Table 3 .
The dataset was split in two different ways, one is 70-15-15 and the other is 60-15-25, to observe and compare the best results.In A.1.ROI multiclass classification without wound location:The results of the ROI classifier's performance without wound location evaluation varied across different scenarios.For the 6-class classification case (BG, N, D, P, S, V), the test accuracy was 85.the 5-class classification scenario, the results varied between the class combinations.The BG, N, D, S, V combination showed superior performance with test accuracies, precisions, recalls,

Table 3 :
ROI image without location-based classification with different data split (Left -70%,15%,15%, Right -60%,15%,25%).P = Precision, R = Recall, F = F1-score, Acc = Accuracy Following the inclusion of wound location data in conjunction with image data, Table4displays the performance metrics from experiments using an Adaptive-gated MLP to separately analyze the wound location.This data was subsequently concatenated with the fully connected layers of the prior model.86% and 91.54%, and F1-score at 91.85% and 91.50% across the two data splits.Conversely, the BG, N, D, P, S combination demonstrated lower accuracy at 84.90% and 84.39%, precision at 85.28% and 85.56%, recall at 84.90% and 84.39%, and F1-score at 84.96% and 83.92%.In the context of the 4-class classification, the BG, N, D, V combination once again showed impressive metrics with accuracy rates of 95.87% and 96.22%, precision at 96.06% and 96.37%, recall at 95.87% and 96.22%, and F1-score at 95.83% and 96.24%.On the other hand, the BG, N, P, S combination witnessed a decrease in performance, registering accuracy levels of 90.90% and 88.88%, precision at 91.50% and 88.90%, recall at 90.90% and 88.88%, and F1-score at 91.03% and 88.72% for each respective data split.For the 3-class and 2-class classification models, a range of performance scores were observed.The 2-class case, particularly the N, D combination, achieved perfect performance with accuracy, precision, recall, and F1-score all at 100% in both data splits.The D, P class combination, however, recorded the lowest performance levels for this category with accuracy at 86.14% and 86.41%, precision at 86.14% and 86.73%, recall at 86.00% and 86.41%, and F1-score at 86.03% and 86.22%.In conclusion, the results show that the incorporation of wound location data alongside image data led to variations in accuracy, precision, recall, and F1-score based on the number and combination of classes, as well as the distribution of the data split.Furthermore, the use of an Adaptive-gated MLP for separate wound location analysis consistently resulted in promising outcomes across all experiments.

Table 7 :
Cross Validation performed on randomly selected classes for each of the above experiment.

Table 8 :
Previous work comparison.(L -With Location, NL -Without Location)

Discussion A. Comparison with previous work:
[28]study presents a comprehensive comparison of our model's performance with those of previous studies, namely the research conducted by Rostami et al.[36], Anisuzzaman et al.[28],Goyal et al.