Introduction

Breast cancer remains one of the leading causes of cancer-related death among women worldwide1. Methods are being sought that allow early and accurate diagnosis of cancer, precise differentiation of the character of a lesion, and reliable qualification for biopsy. X-ray mammography has reduced breast cancer mortality by up to 45% and is recommended as a screening method2, but its sensitivity is limited in women with dense breasts; in patients with extremely dense breasts it is reported to be 48%3. Conventional ultrasound (US) is a complementary examination in women with dense breasts. It allows assessment of the cancer features according to the BI-RADS lexicon4, and is faster, cheaper, and more accurate than mammography5,6. On the other hand, US is limited by its reproducibility and by the quality of the equipment, and is therefore operator- and device-dependent7,8. Machine learning techniques for breast cancer classification have also developed rapidly. Methods based on B-mode image analysis can achieve very high diagnostic performance9,10,11,12,13. A systematic review of the efficacy of cancer classification using various methods of machine learning and imaging diagnostics, including ultrasound, can be found in Yassin et al.14.

Evaluation of tumors based on the BI-RADS classification can be improved by including quantitative ultrasound (QUS) methods. While B-mode ultrasound does not provide reliable quantitative information on the tissues examined, QUS techniques allow tissue to be evaluated in terms of its structure and mechanical properties. They can be a valuable complement to ratings based on BI-RADS. Some QUS parameters are closely related to BI-RADS descriptors, while others go beyond the tumor traits described there15. The combination of QUS methods and the BI-RADS assessment has the potential to better identify patients with benign lesions and at the same time increase the accuracy of breast lesion classification16. The use of quantitative parameters enables clinicians to limit the number of biopsies while maintaining the same level of detection of malignant cases17. Tissue traits that serve as biomarkers of neoplastic change are usually related to tissue scattering properties, the statistical properties of the echo, or the texture of tumor images and parametric maps. Classification is based on the differences in the values of these parameters between malignant and benign lesions, which reflect changes in tissue structure caused by tumour development.

Histopathological analysis shows the differences between the tissue of benign and malignant tumours. Malignant tumours are highly cellular, and the cells tend to form clusters. Benign tumours have a more regular arrangement of cells. Additionally, invasive ductal carcinoma, which forms the majority (80–90%) of invasive breast cancers, is characterized by a significantly increased amount of dense fibrous tissue stroma18. It is important for classification techniques that the specific morphological characteristics of cancer tissue are associated with malignancy and have a significant impact on the tumour image and on parameters calculated from its ultrasonic echoes. These features result in a more complex scatterer composition in malignant tumours19, which can lead to a specific signature of backscattered echoes in QUS.

The scattering properties of tissues can be evaluated by modelling the statistics of the signal envelope using a probability density function (PDF). An overview of distributions used to model scattered signals in soft tissue can be found in the work of Destrempes and Cloutier20. One of the most important distributions used for modelling soft tissue scattering is the Nakagami distribution21. It describes tissue scattering statistics well and its shape parameter is easy to estimate, which has made it very popular in QUS techniques16,19,22,23,24,25,26,27. Another feature studied in the context of tumour classification is the information entropy, which is a measure of the uncertainty of a random variable28,29. Hughes proposed the use of entropy in the quantitative assessment of changes in the structure of a scattering medium30. Tsui et al. have shown that entropy is sensitive to the variability in the concentration of scatterers in tissue phantoms31. They then proposed a parametric imaging method based on the entropy determined in small windows32. Another example of a QUS parameter is the texture of the ultrasound image, which also reflects the variability in the structural echogenicity of the tumour, as it describes the correlation between the spatial distribution and the intensity of pixels in the image. The most often used texture parameters are extracted from the gray-level co-occurrence matrix (GLCM). This method was first described by Haralick et al.33 and is now often used in ultrasound tumour classification25,34.

The breast lesion classification results presented so far have been based on data from the inside of the assessed tumours. However, there are reasons to also use data from the tissue surrounding the tumour, as malignant and benign lesions can have different effects on neighboring tissues18. They differ in the morphological features of the cells and, essentially, in the proliferation of the stromal component. Malignant tumours are not encapsulated, are not cohesive, and are characterized by an irregular pattern of growth. Their borders are not well defined, and they spread into adjacent tissue rather than displacing it or pushing it aside, thereby damaging it. In contrast, benign tumours usually have a covering made up of normal cells and their borders are mostly well defined. They do not penetrate the adjacent tissue and do not damage it.

Furthermore, the B-mode examination underestimates the size of the tumour. Gruber et al.35 studied the imaging of tumours in 121 patients using mammography, sonography and magnetic resonance imaging to assess which method is the most accurate in pretherapeutic sizing of primary breast cancer. Tumour size was found to be significantly underestimated in ultrasound, and the mean difference between the sonographic and histological size was 8 mm. It is worth noting that the greatest difference between sonographic sizing and actual histological tumour size was found for invasive lobular breast cancer. These results have been confirmed by Stein et al., who analysed data from 6543 breast cancer patients and assessed the accuracy of tumour size measurement by ultrasound36. The mean tumour diameter determined by ultrasound was 18.3 mm, whereas the histological mean tumour diameter was 20.8 mm. It follows from these results that an analysis limited to the area of the tumour visible in the B-mode image does not take into account tissue changes in the peripheral area of the tumour and its close vicinity. These areas are important for classification because they are modified differently during the development of malignant and benign tumours.

The presented study investigated the possibility of distinguishing between malignant and benign tumors using QUS parameters calculated on the basis of signals scattered in the tissue surrounding the tumors, which is the main novelty of the research. The ultrasound data was acquired from 116 patients diagnosed with a suspicious breast lesion. The parameters used in tumour classification were the shape parameter of the Nakagami distribution, the entropy, and ten texture parameters. In our study, parameters determined for the tumour and the rim of the tissue surrounding the tumour were used as independent features in the construction of multi-parametric classifiers.

The observations in this study suggest that QUS examination of the peritumoral tissue is more effective than examination of the tumour tissue itself. A multi-parametric classifier operating on data from the tissue surrounding the tumours distinguished lesions better than the best classifier operating only on tumour data. On the other hand, the classifier that combined parameters determined separately from tumour and peritumoral tissue data proved to be the best, which suggests that the quantitative ultrasound information contained in the tumour and in the peritumoral tissue is complementary.

Methods

Data acquisition

Research was carried out in the Department of Radiology, Maria Skłodowska-Curie Memorial Institute of Oncology in Warsaw. The Institutional Review Board approved the study protocol. All the procedures performed in the study that involved human participants were in accordance with guidelines set by the 1964 WMA Declaration of Helsinki and its later amendments or comparable ethical standards. All patients gave signed informed consent for breast US examination and for ‘backscatter US’ statistical studies. Ultrasound B-mode images and radiofrequency (RF) data were acquired by an experienced sonographer using an Ultrasonix SonicTOUCH® machine (Ultrasonix Medical Corporation, Richmond, BC, Canada) equipped with an L14-5/38 linear probe. The transmit frequency was set at 10 MHz (centre frequency ~7.2 MHz), and the sampling frequency was equal to 40 MHz. The focus was set in the middle of the tumour and the number of lines per image was 510. The breast US examinations were performed according to the American College of Radiology BI-RADS guidelines using longitudinal and transverse scan planes4. Solid lesions of BI-RADS category 3, 4, or 5 were included in this study. Ultrasound data from 116 tumours were collected, including 57 malignant and 59 benign cases. The group of malignant lesions contained the following tumor types: ductal (21), lobular (10), cribriform (8), tubular (5), micropapillary (3), and mixed (10). The tumors were assessed as low grade (13), intermediate grade (35) and high grade (9). The tumour contour was determined manually by an experienced breast sonographer. The minimal, mean and maximal tumour equivalent diameters (i.e. the diameter of a circle whose area is equal to the area of the tumour) in the data set were 4.7, 12.7, and 26.6 mm, respectively. Core biopsy samples were taken from patients with BI-RADS 4 and 5 lesions. Patients with BI-RADS 3 lesions underwent fine needle aspiration biopsy (FNAB) and a 2-year follow-up. Ultrasound data were categorized as benign or malignant on the basis of histopathological, cytological and patient follow-up findings.

ROI selection

QUS analysis was performed on selected regions of interest (ROI) – a tumour area (internal ROI) and an area surrounding the tumour (external ROI). An example of a tumour image with the ROIs as well as the further data processing scheme is shown in Fig. 1. The width of the analysed tumour rim was selected based on the results of the classification using individual parameters. On average, the best classification results were obtained with a rim width of around 5 mm. This 5 mm width of tissue surrounding the tumours was analysed in all comparisons of the tumour classification efficiency based on data from the internal and external ROI. For convenience, throughout the text, the tissue contained in the contour of the tumour identified by the sonographer is called ‘internal ROI’ and the tissue in the 5 mm thick rim that surrounds the tumour is called ‘external ROI’. Internal parameters define parameters that operate on internal ROI data, as opposed to external parameters that use data from an external ROI. Internal and external classifiers are defined by analogy.
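The construction of the external ROI can be illustrated with a minimal sketch. It assumes that the manually drawn contour is available as a binary mask and that the rim consists of all pixels outside the mask lying within the rim width of any tumour pixel; the function name `external_rim`, the pixel spacings and the brute-force distance search are illustrative assumptions, not the procedure used in the study.

```python
import math

def external_rim(mask, rim_mm, dy_mm, dx_mm):
    """Return a boolean mask of pixels outside the tumour within rim_mm of it.

    mask   : list of rows of booleans, True inside the tumour contour
    rim_mm : rim width in millimetres (5 mm in the text)
    dy_mm, dx_mm : assumed pixel spacing in the axial and lateral directions
    """
    rows, cols = len(mask), len(mask[0])
    inside = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    rim = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                continue  # tumour pixels belong to the internal ROI
            # Brute-force search: is any tumour pixel within rim_mm?
            rim[r][c] = any(
                math.hypot((r - ri) * dy_mm, (c - ci) * dx_mm) <= rim_mm
                for ri, ci in inside
            )
    return rim

# Toy example: a single-pixel "tumour" on a 1 mm grid with a 1 mm rim.
mask = [[False] * 5 for _ in range(5)]
mask[2][2] = True
rim = external_rim(mask, rim_mm=1.0, dy_mm=1.0, dx_mm=1.0)
rim_pixel_count = sum(sum(row) for row in rim)
```

In the toy example only the four direct neighbours of the tumour pixel fall into the rim, because the diagonal neighbours are farther than 1 mm away. For realistic image sizes a distance transform or morphological dilation would be far more efficient than this exhaustive search.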

Figure 1

Diagram showing the extraction of internal and external ROI from the B-mode image, parametric images of both ROIs and the ROI-averaged parameters determined from them.

Nakagami shape parameter

The Nakagami shape parameter was used to characterize tissue microstructure. The probability density function (PDF) of the Nakagami distribution is given by equation (1) 21:

$$P(A)=\frac{2}{{\rm{\Gamma }}(NAK)}{(\frac{NAK}{{\rm{\Omega }}})}^{NAK}{A}^{2NAK-1}\,\exp \,(-\frac{NAK}{{\rm{\Omega }}}{A}^{2})$$
(1)

where \({\rm{\Gamma }}\) is the gamma function, NAK is the shape parameter, and \({\rm{\Omega }}\) is the scaling parameter associated with average signal power. The shape parameter NAK was estimated using the method of moments21 according to the following formula (2):

$$NAK=\frac{{\langle {A}^{2}\rangle }^{2}}{{\sigma }^{2}({A}^{2})}$$
(2)

where A is the amplitude, \(\langle \rangle \) is the mean and \({\sigma }^{2}()\) is the variance.
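As an illustration of formula (2), the following sketch computes the method-of-moments estimate from a list of envelope amplitudes. The function name `nakagami_shape` and the use of the population (biased) variance are assumptions made for this example only.

```python
import math
import random

def nakagami_shape(envelope):
    """Method-of-moments estimate of formula (2): <A^2>^2 / var(A^2)."""
    power = [a * a for a in envelope]
    n = len(power)
    mean_power = sum(power) / n
    # Population variance of A^2 (an assumption; the text does not specify).
    var_power = sum((p - mean_power) ** 2 for p in power) / n
    return mean_power ** 2 / var_power

# Small deterministic example: A^2 = [1, 1, 1, 9], mean 3, variance 12.
m_toy = nakagami_shape([1.0, 1.0, 1.0, 3.0])

# Synthetic Rayleigh envelope: A^2 is exponentially distributed, for which
# the Nakagami distribution reduces to the Rayleigh case (shape parameter 1).
random.seed(0)
rayleigh_envelope = [math.sqrt(random.expovariate(1.0)) for _ in range(20000)]
m_rayleigh = nakagami_shape(rayleigh_envelope)
```

For the deterministic example the estimate is 9/12 = 0.75; for the synthetic Rayleigh envelope it should come out close to 1, up to sampling error.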

Entropy

In this study, weighted entropy37 was used as a measure of local signal envelope heterogeneity. Weighted entropy (ENT) was estimated using formula (3):

$$ENT(A)=-\,\sum _{i=1}^{n}\,w({A}_{i})P({A}_{i})\,{\log }_{2}P({A}_{i})$$
(3)

where n represented the number of samples in a data block, and w and P were the weight and the probability associated with the i-th amplitude value Ai, respectively. Amplitude values A divided by the sum of all amplitude values in the window were used as the weights. The probability P was estimated using the histogram method.
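A minimal sketch of formula (3) is given below. It assumes an equal-width histogram for the probability estimate; the number of bins (`n_bins`) and the function name `weighted_entropy` are assumptions, since the text does not specify the histogram settings.

```python
import math

def weighted_entropy(envelope, n_bins=16):
    """Weighted entropy of formula (3) with histogram-based probabilities.

    Weights are the amplitudes divided by the sum of all amplitudes in the
    window; n_bins is an assumed histogram resolution.
    """
    lo, hi = min(envelope), max(envelope)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant window

    def bin_of(a):
        return min(int((a - lo) / width), n_bins - 1)

    counts = [0] * n_bins
    for a in envelope:
        counts[bin_of(a)] += 1

    n = len(envelope)
    total = sum(envelope)
    ent = 0.0
    for a in envelope:  # the sum runs over the n samples of the data block
        p = counts[bin_of(a)] / n
        ent -= (a / total) * p * math.log2(p)
    return ent

# Two equally populated bins: P = 0.5 for every sample, weights sum to 1.
ent_toy = weighted_entropy([1.0, 1.0, 3.0, 3.0], n_bins=2)
ent_const = weighted_entropy([2.0, 2.0, 2.0], n_bins=4)
```

In the first example every sample has probability 0.5, so the result is 0.5 · Σw = 0.5; a constant window has a single occupied bin and therefore zero entropy.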

Textural features

The textural features of the US images were extracted using the Gray Level Co-occurrence Matrix (GLCM). The GLCM is a matrix that contains probabilities of occurrence of certain gray tones in a pair of pixels being in a particular relative spatial position. For the purpose of the GLCM calculation the gray scale of the ultrasound images was limited to a 20–100 dB range and quantized into 20 gray levels which resulted in GLCMs of size 20 × 20. The spatial relationship in any considered pair of pixels was defined as a vertical or horizontal displacement by 0.3 mm (4 pixels). We considered the vertical and horizontal textural features to carry different information, and therefore the GLCMs obtained for vertical and horizontal spatial relationships were used for determination of separate texture parameters marked with the letters ‘V’ and ‘H’ respectively. This approach is different in comparison to the previously cited works25,34, where the average parameters were used, estimated on the basis of the GLCM matrix calculated in four directions (0°, 45°, 90° and 135°). Based on each GLCM a number of texture parameters were calculated. These were: contrast (CON), correlation (COR), energy (ENE), homogeneity (HOM), and variance (VAR):

$$CON=\sum _{i,j}\,|i-j{|}^{2}GLCM(i,j)$$
(4)
$$COR=\sum _{i,j}\,\frac{(i-{\mu }_{i})\,(j-{\mu }_{j})}{{\sigma }_{i}{\sigma }_{j}}GLCM(i,j)$$
(5)
$$ENE=\sum _{i,j}\,GLCM{(i,j)}^{2}$$
(6)
$$HOM=\sum _{i,j}\,\frac{GLCM(i,j)}{1+|i-j{|}^{2}}$$
(7)
$$VAR=\sum _{i,j}\,{(i-{\mu }_{i})}^{2}GLCM(i,j)$$
(8)

where i and j indicated the discrete gray levels and were also the indices of GLCM elements. The μ and σ denoted the mean and standard deviation of the i and j coordinates. It must be noted that each GLCM was normalized (so that the sum of its elements was equal to 1) before determination of these parameters.
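Formulas (4)–(8) can be sketched as follows for an image that has already been quantized to integer gray levels. The function names `glcm` and `texture_features`, the non-symmetric pair counting and the handling of zero variance are assumptions made for illustration.

```python
import math

def glcm(image, levels, dr, dc):
    """Normalized co-occurrence matrix for the pixel displacement (dr, dc)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]  # elements sum to 1

def texture_features(p):
    """CON, COR, ENE, HOM and VAR of formulas (4)-(8) for a normalized GLCM."""
    idx = [(i, j) for i in range(len(p)) for j in range(len(p))]
    mu_i = sum(i * p[i][j] for i, j in idx)
    mu_j = sum(j * p[i][j] for i, j in idx)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i, j in idx)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i, j in idx)
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i, j in idx)
    denom = math.sqrt(var_i) * math.sqrt(var_j)
    return {
        "CON": sum((i - j) ** 2 * p[i][j] for i, j in idx),
        # Correlation is undefined for a constant image (assumed NaN here).
        "COR": cov / denom if denom > 0 else float("nan"),
        "ENE": sum(p[i][j] ** 2 for i, j in idx),
        "HOM": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in idx),
        "VAR": var_i,
    }

# Toy image with two gray levels and a horizontal displacement of one pixel.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
features_h = texture_features(glcm(image, levels=2, dr=0, dc=1))
```

For the toy image the horizontal GLCM has three equally likely pairs, (0,0), (0,1) and (1,1), which gives CON = 1/3, COR = 0.5, ENE = 1/3, HOM = 5/6 and VAR = 2/9. In the setting described above, the displacement would be 4 pixels and the matrices would have 20 levels, with `dr` and `dc` swapped to obtain the ‘V’ and ‘H’ parameter sets.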

Parameter estimation

The spatial parametric images of tumours and the surrounding rims were generated using the sliding window technique. The parametric image represented a map of the parameter value distribution in the analysed ROI, external or internal. The parameter value (Nakagami parameter, entropy, or any of the texture parameters) was estimated from the block of ultrasonic data in each window. The window size was 1 mm × 1 mm, which corresponded to three times the pulse length, as recommended by Tsui et al.38. Window overlap in the axial and lateral directions was 92%. Parametric maps were corrected to minimize the impact of acoustic beam formation and the system transfer function. Data from the reference tissue phantom (Dansk Fantom Service, model 1126-B) were obtained with the same scan settings as in the patient study. These data were used to generate reference parametric maps. Based on these maps, correction curves were determined for each of the considered parameters and were then used to correct the parametric maps of tumors and their surroundings. Next, for each tumour the parameter value was averaged over all pixels of the parametric maps in the two scanned planes of the tumour. The averaged parameter was used as a predictor of neoplastic changes in subsequent classifications. The data processing scheme is presented in Fig. 1.
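The sliding window technique can be sketched as below. The sketch assumes window sizes given in samples, a step derived from the fractional overlap, and an arbitrary per-window estimator; ROI masking and the phantom-based correction are omitted, and the function name `sliding_window_map` is hypothetical.

```python
def sliding_window_map(data, win_rows, win_cols, overlap, estimator):
    """Apply estimator to overlapping windows and return a parametric map.

    data     : 2-D list of envelope (or image) samples
    win_rows, win_cols : window size in samples
    overlap  : fractional overlap between neighbouring windows (0.92 in the text)
    estimator: function mapping a flat list of samples to one parameter value
    """
    step_r = max(1, round(win_rows * (1.0 - overlap)))
    step_c = max(1, round(win_cols * (1.0 - overlap)))
    rows, cols = len(data), len(data[0])
    param_map = []
    for r in range(0, rows - win_rows + 1, step_r):
        line = []
        for c in range(0, cols - win_cols + 1, step_c):
            block = [data[rr][cc]
                     for rr in range(r, r + win_rows)
                     for cc in range(c, c + win_cols)]
            line.append(estimator(block))
        param_map.append(line)
    return param_map

def mean(values):
    return sum(values) / len(values)

# Toy data: a 4 x 4 ramp, 2 x 2 windows with 50% overlap, mean as estimator.
data = [[4 * r + c for c in range(4)] for r in range(4)]
param_map = sliding_window_map(data, 2, 2, overlap=0.5, estimator=mean)

# ROI-averaged parameter: the mean over all pixels of the parametric map.
roi_average = mean([v for line in param_map for v in line])
```

Any per-window estimator with this signature, such as a Nakagami shape or entropy estimate, could be passed in place of `mean`. In the study the average was additionally restricted to map pixels belonging to the internal or external ROI and taken over both scan planes.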

Statistical analysis

Twelve parameters were considered including Nakagami parameter, weighted entropy, and ten texture parameters determined from the GLCM matrix, five for each vertical and horizontal spatial relationship. For the classification of tumours based on single-parameter and multi-parameter classifiers, the k-nearest neighbours (k-NN) algorithm39 was used, with k equal to 4 and standardized Euclidean distance. Classifiers were cross-validated through the ‘leave-one-out’ technique40. Evaluation of the classification results was based on the analysis of the receiver operating characteristic (ROC) curves41, in particular the area under the ROC curve (AUC), sensitivity, specificity, and accuracy. In order to compare the effectiveness of the classification based on internal and external ROI data, AUC values were determined for all single-parameter classifiers (SPCs) and multi-parameter classifiers (MPCs) built up of all the possible combinations of single parameters. This ‘exhaustive search’ approach42 was applied to three types of MPCs. The first one consisted of parameters determined using data from the tumour (internal ROI), the second from data collected from the tumour rim (external ROI), and the third one used both types of parameters. The best classifiers of each type were chosen based on their highest AUC values. Parameters included in selected MPCs are shown in the results section. To assess the statistical significance of classification results, corresponding p-values were determined through a two-sided Wilcoxon rank sum test. All calculations were done using Matlab® 2017a (The MathWorks, Inc., Natick, MA).
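The classification scheme described above can be sketched in a few lines. The original calculations were done in Matlab; the Python version below is only an illustration, and it assumes that the k-NN score is the fraction of malignant neighbours, that the feature scaling is computed from the training fold, and that ties in the AUC count as one half. All function names and the toy data are hypothetical.

```python
import itertools
import math

def loo_knn_scores(X, y, k=4):
    """Leave-one-out k-NN scores using standardized Euclidean distance."""
    n, d = len(X), len(X[0])
    scores = []
    for t in range(n):
        train = [i for i in range(n) if i != t]
        scale = []
        for f in range(d):
            vals = [X[i][f] for i in train]
            mu = sum(vals) / len(vals)
            sd = math.sqrt(sum((v - mu) ** 2 for v in vals) / (len(vals) - 1))
            scale.append(sd or 1.0)  # avoid division by zero
        dist = sorted(
            (math.sqrt(sum(((X[i][f] - X[t][f]) / scale[f]) ** 2
                           for f in range(d))), i)
            for i in train
        )
        # Score: fraction of malignant (label 1) cases among the k neighbours.
        scores.append(sum(y[i] for _, i in dist[:k]) / k)
    return scores

def auc(scores, y):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, label in zip(scores, y) if label == 1]
    neg = [s for s, label in zip(scores, y) if label == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def best_subset(features, y, k=4):
    """Exhaustive search over all non-empty parameter subsets."""
    names = list(features)
    best_auc, best_names = -1.0, ()
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            X = [[features[name][i] for name in subset] for i in range(len(y))]
            value = auc(loo_knn_scores(X, y, k), y)
            if value > best_auc:
                best_auc, best_names = value, subset
    return best_auc, best_names

# Toy data: one separating feature and one uninformative feature.
y = [0] * 6 + [1] * 6
features = {
    "good": [0, 1, 2, 3, 4, 5, 10, 11, 12, 13, 14, 15],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
}
top_auc, top_names = best_subset(features, y)
n_subsets_of_12 = sum(math.comb(12, size) for size in range(1, 13))
```

With twelve parameters the exhaustive search visits 2^12 − 1 = 4095 non-empty subsets, each of which requires a full leave-one-out cross-validation.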

Results

Boxplots characterizing individual parameter values estimated from benign and malignant tumours are presented in Fig. 2, in gray and black respectively. The maximum whisker length was set to 1.5 times the interquartile range. The AUC values for all single-parametric classifiers estimated inside and outside the tumour are shown in Fig. 3. The usefulness of the tissue surrounding breast tumours for classifying benign and malignant lesions was examined using multi-parametric classifiers. For this purpose, the classification efficiency was checked for all possible classifiers built from internal parameters only and from external parameters only. As a result, two groups of AUC values were obtained, each with a count of 4095, which is the number of non-empty subsets of twelve parameters (\({2}^{12}-1\)). A comparison of the AUC values in these two groups is shown in Fig. 4a. Figure 4b shows a histogram of the differences between the AUC values estimated for the same classifier, first using data collected outside the tumour (external ROI) and then using data from the tumour tissue (internal ROI).

Figure 2

Comparison of standardized values of individual parameters, internal (a) and external (b), calculated for benign (gray lines) and malignant (black lines) tumours.

Figure 3

AUC values comparison between single-parametric classifiers estimated inside and outside the tumour.

Figure 4

AUC values comparison of all internal and external multi-parametric classifiers (a) and AUC differences between corresponding external and internal multi-parametric classifiers (b).

The AUC, sensitivity, specificity, accuracy, and statistical significance of the SPCs and of the best (in terms of AUC) MPCs are presented in Tables 1 and 2, respectively. Statistical significance has been divided into three classes: extremely significant (\(p < 0.001\)), highly significant (\(p < 0.01\)) and significant (\(p < 0.05\)), marked with ***, **, and *, respectively. Table 2 contains the performance assessment of the best MPC built only from internal parameters, the best MPC based only on external parameters, and the best MPC using the external and internal parameters together. The constituent parameters of the best classifiers were as follows. Parameters NAK, CONH, ENEV, HOMV, and VARH were components of the best internal classifier. Parameters CORV, CORH, HOMV, VARV, and VARH were components of the best external classifier. The combined multi-parametric classifier used the parameters CORV and ENEV determined from the tumour and CONV, ENEV, ENEH, HOMV, and HOMH determined from the tissue surrounding the tumour.

Table 1 Performance comparison of single-parametric internal and external classifiers.
Table 2 The performance parameters of the best multi-parametric classifiers.

Discussion

In the presented study, the effectiveness of breast cancer classification was compared on the basis of QUS analysis of two types of ultrasound data, collected from within the tumour and from the surrounding tissue. The quantitative parameters included the shape parameter of the Nakagami distribution, weighted entropy, and a set of texture parameters. In Fig. 2 boxplot pairs corresponding to the values of parameters determined for benign and malignant tumours are shown. For almost all boxplot pairs, the difference of medians in the external parameter group was greater than in the group of internal parameters. The only exceptions were CONH and CORH. A greater difference between medians generally translates into a higher classification efficiency. This is clearly visible in Fig. 3, which compares the AUC values for lesion classification using external and internal SPCs. In most cases the AUC for external parameters was larger than for internal parameters. The only exceptions were again the CONH and CORH parameters. The largest difference in AUC was observed for the HOMV parameter (ΔAUC = 0.22). The AUC values for external SPCs exceeded 0.8 in six cases. P-values indicated that most SPCs differentiated benign and malignant tumours with statistical significance. Among external SPCs, 11 out of 12 were characterized by a very high statistical significance (\(p < 0.001\)). In the case of internal SPCs, there were five cases with similar statistical significance. No statistically significant difference (\(p > 0.05\)) was found for two SPCs, based on internal VARH and external CORH, and the resulting AUC values (0.59 and 0.55 respectively) were close to 0.5, which corresponds to a random classification. It is worth noting that both of these parameters were calculated from the GLCM determined for the horizontal direction. Their counterparts calculated for the vertical direction GLCM were characterized by a significantly lower p-value and higher AUC (see Table 1). The reason for this difference can be twofold. The Point Spread Function (PSF) of ultrasonic imaging systems operating in the pulse-echo mode differs markedly between the vertical and horizontal directions, which can affect the estimates of texture parameters determined horizontally and vertically. Another explanation may be that malignant tumours tend to grow deeper into the breast more often than benign ones, which may make their structure anisotropic. It seems that the textural parameters of ultrasound images for vertical and horizontal directions should be considered as describing different texture properties. Mixing them, i.e. calculating the average parameter values for several directions at once, may lead to the loss of information contained in each of them individually. However, it should be emphasized that the proposed explanations are only hypothetical and require further research.

Histograms of AUC values calculated for multi-parametric classifiers are shown in Fig. 4a. It can be seen that the external MPCs on average perform better than the internal ones. Differences were also determined between the AUC values estimated in the external and internal ROI for each MPC. The histogram of the ΔAUC is shown in Fig. 4b. Most of the differences are clearly positive, which means that in most cases the external classifier had a higher AUC than its internal equivalent. Of the 4095 compared classifiers, only eight internal classifiers had a higher AUC (Fig. 4b). As in the case of single-parameter classifiers, multi-parametric classifiers using external parameters gave clearly better results than those using internal parameters. It is also worth noting that the effectiveness of the best classifier operating solely on external parameters was similar to the performance of the best classifier overall (Table 2), which used two internal parameters and five external parameters and reached the highest AUC (0.94) and accuracy (0.91).

The presented results suggest a difference in the ultrasound backscatter from the tissue surrounding benign and malignant tumors. These results are consistent with previous studies, which have provided a potential biological explanation. The peritumoral tissue has been intensively studied due to its important role in the development of cancer and its spread. Peritumoral invasion of cancer cells is a prognostic factor significantly associated with an increased risk of recurrence and death in patients with breast cancer43,44. Alterations in the stroma play an important role in the invasiveness of breast cancer. Itoh et al.45 suggested that changes in the peritumoral stroma, in particular an increase in stiffness, were due to infiltration of tumor cells. In the early stage of cancer spread, the amount of collagen in the tissue surrounding the tumor increases46. These collagen changes on the margins of the tumour are known as ‘desmoplasia’. In breast cancer, increased collagen cross-linking leads to increased focal adhesion and induces tumour invasion47. Moreover, the orientation of collagen fibers also facilitates the invasion of cancer cells48. The fibers, which are ‘curly’ and anisotropic in normal stroma, in the case of invasive cancer form a characteristic signature of straight and bundled collagen fibers oriented perpendicularly to the tumor border46,49,50. Tumor cells on the border of the tumor with the stroma enter the stroma along the radially arranged collagen fibers. It has been suggested that these tumor-associated collagen signatures may serve as a feature helpful in identifying and characterizing breast tumors50. An important role of the peritumoral tissue was demonstrated by Tadayyon et al.51 when monitoring the effects of neoadjuvant chemotherapy. Changes in the tissue surrounding the tumor caused by the therapy were large enough to be detected using ultrasound. The QUS parameters estimated in the tumor and tumor margin indicated the degree of tumor response to therapy and could predict 5-year recurrence-free survival. A high diagnostic value of the acoustic features of peritumoral tissue has been suggested, which is consistent with our results. It is also worth noting that Tadayyon et al. chose a margin width of 5 mm as optimal for characterizing the tumor response to chemotherapy, and the same width of the tumor margin was optimal in the classification results presented above. This suggests the existence of a certain range of peritumoral tissue thickness in which malignant tumors affect the acoustic properties of the tissue and in which changes in these properties can be estimated using QUS methods.

The use of external parameters improves the classification, but also has some limitations. Some tumors are located so superficially that the surrounding tissue includes the skin, which affects the determined QUS parameters. In such cases, the skin-related part of the QUS map should not be used when determining the average parameters. Incorrect determination of external parameters may also be caused by high attenuation in the tumor. In the resulting acoustic shadow the signal-to-noise ratio (SNR) is low and the QUS parameters are subject to a large error. In extreme cases, the acoustic shadow makes it impossible to determine the lower edge of the tumor, and thus to determine the areas for calculating internal and external parameters.

Conclusions

The obtained results indicate that the signals received from the surroundings of breast tumours contain important information allowing for the classification of tumours as malignant or benign. Moreover, the majority of the individual parameters tested showed a higher classification efficiency when they were calculated for the tumour surroundings rather than for its interior. A similar relationship was observed in the case of multi-parametric classifiers, which is important in the context of the development of new cancer classification methods based on QUS techniques. In our research, the multi-parameter classifier using a combination of internal and external parameters was the best, although the best classifier using only external parameters was not significantly worse. This also suggests that external parameters are even more valuable for classification than internal ones. In conclusion, we believe that quantitative analysis of peritumoral tissue can complement conventional US of breast tumours, thereby making it easier to diagnose breast lesions. The results of the study also show that alterations in peritumoral stroma caused by malignant neoplasms translate into changes in tissue properties that can be detected using quantitative ultrasound techniques. These findings may be useful in research on the development and spread of cancerous tissue.