Introduction

Breast cancer is one of the most common malignancies in women and poses a significant risk to their health1,2. In recent years, its incidence has risen year by year3. Early diagnosis, early treatment and tumor biology determine the prognosis of patients4, so accurate diagnosis of breast cancer is essential. Ultrasound is widely recognized as a convenient and safe method for breast cancer screening5,6. Ultrasound indicators, especially the shape, orientation, margin, internal echo and internal blood flow grade of a nodule, are associated with whether the nodule is benign or malignant7. However, the misdiagnosis rate is high when benign and malignant breast nodules are distinguished on the basis of a single ultrasound index8. A comprehensive analysis of multiple ultrasound characteristics is therefore necessary.

The Breast Imaging Reporting and Data System (BI-RADS) lexicon was established to ensure standardization and objectivity in ultrasound diagnosis9. However, BI-RADS grading partly depends on the sonographer's experience10. Some benign and malignant breast nodules are misclassified, especially by junior physicians, and misdiagnosing benign breast nodules as malignant may lead to unnecessary biopsies11. Many studies have reported that mathematical models coupled with ultrasound characteristics can serve as a means of automatically discriminating among diseases12,13. Logistic regression (Logistics)14, support vector machine (SVM)15, artificial neural network (ANN)16, partial least squares discriminant analysis (PLS-DA)17, linear discriminant analysis (LDA)18, K-nearest neighbor (KNN)19 and random forest (RF)20 are commonly used models for disease diagnosis. Logistics, PLS-DA and LDA are linear models, while ANN, KNN and RF are non-linear models21. SVM can solve both linear and non-linear classification problems15. Each model has its own computational characteristics, yet few studies have compared the diagnostic performance of these models. In the present study, we compared the diagnostic performance of ultrasonic imaging combined with different mathematical models, including Logistics, SVM, ANN, PLS-DA, LDA, KNN and RF, in distinguishing benign from malignant breast nodules.

Materials and methods

Patient information

This retrospective study had no adverse effects on patients and was approved by the Institutional Review Committee of the First Affiliated Hospital of Nanjing Medical University. Breast nodules from female patients who underwent ultrasound examination at the First Affiliated Hospital of Nanjing Medical University between June 2018 and December 2022 were collected. The inclusion criteria were complete clinical data and pathological confirmation of the nodule lesion. Borderline diseases such as lobulated tumors of grade II were excluded. All images were independently evaluated by two senior physicians, each with more than a decade of experience. The two physicians met to discuss and reassess controversial issues such as the grading of nodule features. If a consensus could not be reached, the controversial images were reviewed by a third senior physician and a discussion was held to reach a final conclusion. A total of 926 breast nodules were included, comprising 388 benign and 538 malignant ones. The benign nodules included fibroadenoma, adenosis, intraductal papilloma, fibrocystic disease of the breast, lobulated tumors of grade I, hyperplasia of glands, and cyst with inflammatory changes. The malignant nodules included mucinous breast cancer, solid papillary carcinoma of the breast, invasive breast cancer (invasive lobular carcinoma, invasive ductal carcinoma and mixed invasive carcinoma), and ductal carcinoma in situ.

Instruments and methods

The birth number, menarche and breast appearance (redness, swelling, and dimples) of patients were recorded. A thorough breast ultrasound was performed using an ESAOTE MyLab Twice color Doppler ultrasound system with an LA523 high-frequency linear array probe (3 to 12 MHz). The initial preset conditions were as follows: imaging gain at 65%, dynamic range at 10, enhancement at 4, density at 1, depth at 44 mm, persistence at 6, dynamic compression at 3, and transducer resolution set to low (RES-L). In practice, the B-mode image was adjusted to include the target lesion according to the patient's actual situation so as to achieve optimal resolution. In the color Doppler ultrasound mode, the sampling frame size was optimized to fully encompass the mass. The color gain was optimized to detect low-velocity vascular flow within target lesions with minimal background noise. Ultrasound features included background echotexture, nodule size, shape, margin, internal echo, echo intensity, calcification, Adler grade, resistance index and axillary lymph node. These ultrasound features were graded for each nodule according to the BI-RADS lexicon9. Background echotexture was classified as either homogeneous or heterogeneous. Homogeneous background echotexture is defined by the predominant presence of parenchyma displaying a uniform hyperechoic appearance with minimal isoechoic or hypoechoic characteristics and less than 25% fibroglandular tissue22. All other background echotextures were defined as heterogeneous.

Pathological diagnosis of nodules was taken as the dependent variable (Y), and the above ultrasound features and patient’s clinical information were taken as independent variables (X). The assignment of these variables is shown in Table 1.

Table 1 Information on the variable assignment.

Statistical analysis

The relationship between ultrasonographic characteristics and clinical information (age, birth number, menarche, breast appearance and pathological type) was analyzed using Pearson's correlation, with P < 0.05 indicating a significant correlation. Principal component analysis (PCA) was adopted to analyze the difference in nodule features between the benign and malignant groups. Before the models were established, the relevant variables were selected via the stepwise regression method, where a variable with a P-value < 0.05 was considered to have a significant relationship with the dependent variable23. For data analysis, the 926 nodule cases were randomly divided into a training set (90%) and a prediction set (10%) using the Monte Carlo cross-validation method24,25. To mitigate errors arising from a single calculation, the Monte Carlo simulation was performed 100 times. The average diagnostic rate of each model for benign and malignant breast nodules over the 100 computations was calculated separately for the training and prediction sets. The diagnostic efficacy of the models was compared using the area under the receiver operating characteristic (ROC) curve (AUC) in the prediction set. All models were implemented in MATLAB software. A t-test was used to analyze the difference in diagnostic effectiveness among the models, with P < 0.05 indicating a significant difference.
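The repeated random splitting described above can be sketched as follows. The study ran its models in MATLAB; this Python outline is only an illustration under stated assumptions. The function names (`monte_carlo_cv`, `fit_centroids`, `predict_centroid`) and the nearest-centroid placeholder classifier are hypothetical and are not taken from the original analysis; any of the seven models could be substituted through the `fit` and `predict` arguments.

```python
import random
from statistics import mean


def monte_carlo_cv(X, y, fit, predict, n_runs=100, test_frac=0.10, seed=0):
    """Repeat a random train/prediction split and average per-class rates.

    X: list of feature vectors; y: labels (0 = benign, 1 = malignant).
    fit(X_train, y_train) -> model; predict(model, x) -> 0 or 1.
    Returns (mean benign rate, mean malignant rate) on the prediction sets.
    """
    rng = random.Random(seed)
    n = len(y)
    n_test = max(1, round(n * test_frac))
    benign_rates, malignant_rates = [], []
    for _ in range(n_runs):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random split on every run
        test, train = idx[:n_test], idx[n_test:]
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = {0: 0, 1: 0}
        totals = {0: 0, 1: 0}
        for i in test:
            totals[y[i]] += 1
            hits[y[i]] += int(predict(model, X[i]) == y[i])
        # Skip a class in any run whose prediction set happens to lack it.
        if totals[0]:
            benign_rates.append(hits[0] / totals[0])
        if totals[1]:
            malignant_rates.append(hits[1] / totals[1])
    return mean(benign_rates), mean(malignant_rates)


def fit_centroids(X, y):
    """Placeholder classifier (an assumption): mean feature vector per class.

    Assumes both classes are present in the training data.
    """
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Assign x to the class with the nearest centroid (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))
```

With `n_runs=100` and `test_frac=0.10`, the call `monte_carlo_cv(X, y, fit_centroids, predict_centroid)` mirrors the 90%/10% split repeated 100 times, returning the averaged diagnostic rates for benign and malignant cases in the prediction set.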

Ethics statement

This study was approved by the Institutional Review Committee of the First Affiliated Hospital of Nanjing Medical University (Approval number: 2022-SR-048), and patient informed consent was waived. The study retrospectively analyzed ultrasonographic images from patients' previous examinations, which posed no potential risk or harm to the patients. The study was conducted in strict adherence to the Declaration of Helsinki. No patient privacy data were included in the data collection, and the data remained strictly confidential throughout the collection process.

Results

Breast nodule features

A total of 926 breast nodules were collected, including 388 benign nodules (41.9%) and 538 malignant nodules (58.1%). Each breast nodule was described by 10 ultrasound characteristics and 5 items of clinical information. In Pearson's correlation analysis, the benign or malignant status of breast nodules was significantly correlated with patient's age (r = 0.548, P < 0.05), shape (r = 0.520, P < 0.05), resistance index (r = 0.491, P < 0.05), calcification (r = 0.419, P < 0.05), axillary lymph node (r = 0.414, P < 0.05), birth number (r = 0.389, P < 0.05), internal echo (r = 0.348, P < 0.05), margin (r = 0.346, P < 0.05), Adler grade (r = 0.289, P < 0.05), background echotexture (r = 0.200, P < 0.05) and nodule size (r = 0.137, P < 0.05) (Fig. 1). Logistics based on individual ultrasonographic characteristics showed that resistance index had the highest area under the ROC curve (AUC = 0.764), followed by shape (AUC = 0.735), calcification (AUC = 0.703), axillary lymph node (AUC = 0.676) and Adler grade (AUC = 0.660) (Fig. 2). In addition, many correlations were found among ultrasonographic characteristics and clinical information. For example, Adler grade was significantly positively correlated with nodule size, shape, calcification, resistance index, and axillary lymph node (P < 0.01). Breast appearance was significantly correlated with margin and axillary lymph node (P < 0.01).
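As an illustration of how a single-variable AUC can be obtained, the sketch below computes it from the Mann-Whitney rank statistic. This is not the authors' MATLAB procedure, and the function name `auc_single_feature` is hypothetical. The shortcut is reasonable for one predictor because a one-variable logistic model yields probabilities that are monotonic in that variable, so its AUC equals the rank-based AUC of the raw variable when the fitted coefficient is positive (and one minus that value when it is negative).

```python
def auc_single_feature(values, labels):
    """AUC for one feature via the Mann-Whitney statistic.

    values: feature value per nodule (higher is assumed more suspicious);
    labels: 1 for malignant, 0 for benign.
    The result is the probability that a randomly chosen malignant case
    scores higher than a randomly chosen benign one, with ties counted
    as one half.
    """
    pos = [v for v, t in zip(values, labels) if t == 1]
    neg = [v for v, t in zip(values, labels) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a data set of this size; a rank-based implementation would be preferable for much larger samples.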

Figure 1

Correlation analysis of ultrasonographic characteristics and clinical information, where *P value < 0.05; **P value < 0.01; ***P value < 0.001.

Figure 2

Area under the ROC curve (AUC) of individual variables using Logistics.

Principal component analysis and variable selection

PCA projects samples from a high-dimensional space to a lower-dimensional space, revealing the spatial distribution characteristics of different samples in the data26. PCA demonstrated a certain degree of overlap in the distribution regions of benign and malignant nodule samples (Fig. 3). This overlap introduces an error rate in diagnosing benign and malignant breast nodules, which is consistent with the diagnostic results of individual ultrasonographic characteristics. Therefore, we applied several mathematical models to analyze ultrasonographic characteristics and clinical information for diagnosing benign and malignant nodules. The stepwise regression method showed that six variables had a significant relationship with the dependent variable: age (Coeff = 3.553, P < 0.01), background echotexture (Coeff = 0.887, P < 0.01), shape (Coeff = 1.835, P < 0.01), calcification (Coeff = 2.157, P < 0.01), resistance index (Coeff = 2.786, P < 0.01), and axillary lymph node (Coeff = 2.320, P < 0.01) (Fig. 4).
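The projection step of PCA can be sketched in a few lines. The outline below is a pure-Python illustration, not the MATLAB routine used in the study: the function name `pca_scores` is hypothetical, components are extracted by power iteration with deflation, and the features are only mean-centred (whether the original analysis also scaled them is not stated, so no scaling is assumed here).

```python
import math
import random


def pca_scores(X, n_components=2, n_iter=500, seed=0):
    """Project the rows of X onto the leading principal components.

    Centres each column, builds the sample covariance matrix, and finds
    each component by power iteration, deflating the matrix afterwards.
    Returns (scores, components).
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Z = [[row[j] - means[j] for j in range(d)] for row in X]
    cov = [[sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        # Random unit start vector, so it is not orthogonal to the target.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left in the deflated matrix
            v = [x / norm for x in w]
        # Rayleigh quotient gives the variance along this component.
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
        components.append(v)
        # Deflate: remove the variance explained by this component.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    scores = [[sum(z[j] * c[j] for j in range(d)) for c in components]
              for z in Z]
    return scores, components
```

Plotting the first two columns of `scores`, colored by pathology, gives the kind of score plot in which overlap between benign and malignant samples becomes visible. The sign of each component is arbitrary, so a plot may appear mirrored relative to one produced by other software.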

Figure 3

PCA of ultrasonographic characteristics in benign and malignant breast nodules.

Figure 4

Variable selection using the stepwise regression method.

Model analysis

Each mathematical model was built on the training set using the six variables selected by the stepwise regression method. After 100 random runs of Monte Carlo simulation, the diagnostic results of the seven methods in the training and prediction sets are shown in Table 2. In the training set, the diagnostic rates of all tested methods ranged from 0.849 to 0.999 for benign nodules and from 0.915 to 0.971 for malignant nodules, indicating that the diagnostic rate of all models was satisfactory. In the prediction set, the diagnostic rates of Logistics, PLS-DA, Linear SVM, RF, ANN, KNN and LDA for benign nodules were 0.845, 0.833, 0.881, 0.850, 0.858, 0.846, and 0.851, respectively, and Linear SVM had a higher value than the other methods (P < 0.05). The diagnostic rates of Logistics, ANN and LDA for malignant nodules (ranging from 0.910 to 0.912) were the highest, and the diagnostic rate of KNN was the lowest (0.865). Among these methods, only RF and KNN had diagnostic rates in the prediction set (ranging from 0.846 to 0.898) that were much lower than those in the training set (> 0.947). The AUC value of Linear SVM was the highest (0.890), followed by ANN (0.883), LDA (0.880), Logistics (0.878), RF (0.874), PLS-DA (0.866), and KNN (0.855). The AUC values of all models were higher than those of individual ultrasonographic characteristics (ranging from 0.494 to 0.764).

Table 2 Diagnostic results for benign and malignant breast nodules using different mathematical models.

Discussion

Ultrasound is the primary method used to differentiate between benign and malignant breast nodules27. Our results indicated that patient's age, birth number, shape, resistance index, calcification, axillary lymph node, internal echo, margin, Adler grade and background echotexture were significantly correlated with whether breast nodules were benign or malignant. It has been reported that breast nodules with irregular morphology, indistinct borders, a hypoechoic pattern and suspected calcifications suggest malignancy28. In addition, there was a significant correlation between nodule size and pathology. In general, the larger the nodule, the higher its degree of malignancy. However, it is important to note that smaller nodules are often overlooked by patients and sonographers. Neovascularization plays a crucial role in the onset, progression, invasion, and metastasis of breast cancer29. The higher the Adler grade and resistance index of a nodule, the higher the possibility of malignancy, which warrants attention. In the BI-RADS classification, the probability of malignancy ranges from 2 to 95% for nodules in category 430. When a nodule is classified as category 4, breast surgeons may recommend additional examinations, such as mammography, magnetic resonance imaging, or even core needle biopsy. These often increase the economic cost and psychological burden for patients. In addition, BI-RADS classification depends on the experience of physicians31. Our results showed that the AUC values of the tested models ranged from 0.855 to 0.890, which was better than the values for BI-RADS classification by junior and senior physicians (0.718-0.790 and 0.766-0.870, respectively) reported in the literature32,33. Our findings suggest that all seven models could effectively predict benign and malignant nodules, which could help doctors judge the probability of malignancy and reduce unnecessary examinations for patients34.

The stepwise regression method showed that age, background echotexture, shape, calcification, resistance index, and axillary lymph node had significant relationships with breast nodule pathology (P < 0.05), suggesting that these variables contributed substantially to the model. Using these relevant variables instead of all variables has the potential to enhance model performance, simplify the model, and avoid overfitting35. Among the models, Linear SVM had the highest diagnostic rate for benign breast nodules, and Logistics, ANN and LDA had the highest diagnostic rates for malignant breast nodules. Linear SVM could therefore be recommended for diagnosing benign nodules, while Logistics, ANN and LDA could be recommended for diagnosing malignant ones. The diagnostic rates of KNN and RF in the prediction set were significantly lower than those in the training set, which is likely attributable to model overfitting36. The results in the prediction set better reflect the performance of these two models than those in the training set. However, the overfitting of RF and KNN in the training set could limit their application in practical work.

Many reports indicate that artificial intelligence technology can classify images directly into benign and malignant categories by extracting feature variables such as color, contour, and texture37,38,39. However, these image feature variables extracted by artificial intelligence technology often lack clinical significance, posing significant limitations during the practical application of the model. In contrast, the clinical information and ultrasonographic characteristics of breast nodules were used for building our models and these variables hold clinical significance. This approach not only uncovers the relationship between breast nodule features and pathology, but also enhances the generalizability of these models.

In our study, the models were constructed using algorithm programs provided by MATLAB software, which reduced the complexity of data analysis. Nevertheless, a certain level of programming knowledge is still required when using MATLAB software, particularly for modifying or correcting inappropriate commands. It is also important to note that a single computation can produce biased estimates, and the Monte Carlo cross-validation method can be employed to obtain robust statistical results.

While mathematical models cannot fully replace doctors, they can effectively aid in diagnosis. However, the present study has some limitations. For example, it relied on a limited sample of nodules for assessment. The assessment of controversial nodules by doctors involves subjectivity, and inaccurate judgment of nodule characteristics can further affect the diagnosis40. In future work, we will incorporate more variables, including pathological types and ultrasound elasticity, to improve the diagnostic performance of the models. In addition, we will use ultrasound images coupled with mathematical models to predict the presence of lymph node metastasis in malignant nodules.