A multicenter, hospital-based and non-inferiority study for diagnostic efficacy of automated whole breast ultrasound for breast cancer in China

This study is the first multicenter non-inferiority study to critically evaluate the effectiveness of handheld ultrasound (HHUS) and automated breast ultrasound (ABUS) for breast cancer detection in China. This was a multicenter, hospital-based study in which five hospitals participated. Women (30–69 years old) meeting defined criteria were invited for breast examination by HHUS, ABUS and/or mammography (MG). For BI-RADS category 3, an additional magnetic resonance imaging (MRI) test was provided to distinguish true negative results from false negative results. For women classified as BI-RADS category 4 or 5, either core aspiration biopsy or surgical biopsy was done to confirm the diagnosis. Between February 2016 and March 2017, 2844 women signed the informed consent form, and 1947 of them were included in the final analysis (680 were 30 to 39 years old, 1267 were 40 to 69 years old). For all participants, non-inferiority Z tests showed that ABUS sensitivity (91.81%) was non-inferior to HHUS sensitivity (94.70%), with P = 0.015. In the 40–69 age group, non-inferiority Z tests showed that ABUS sensitivity (93.01%) was non-inferior to MG sensitivity (86.02%) with P < 0.001 and that HHUS sensitivity (95.44%) was non-inferior to MG sensitivity (86.02%) with P < 0.001. The sensitivities of ABUS and HHUS were both superior to that of MG (P < 0.001, superiority test). For all participants, ABUS specificity (92.89%) was non-inferior to HHUS specificity (89.36%) with P < 0.001. A superiority test showed that the specificity of ABUS was superior to that of HHUS with P < 0.001. In the 40–69 age group, ABUS specificity (92.86%) was non-inferior to MG specificity (91.68%) with P < 0.001 and HHUS specificity (89.55%) was non-inferior to MG specificity (91.68%) with P < 0.001. ABUS specificity was not superior to MG specificity (P = 0.114, superiority test). The sensitivity of ABUS/HHUS is superior to that of MG. The specificity of ABUS/HHUS is non-inferior to that of MG. In China, for an experienced US radiologist, both HHUS and ABUS have better diagnostic efficacy than MG in symptomatic individuals.

Breast cancer is the most common cancer in women worldwide 1. Over half of all cases (53.0%) occur in less developed regions of the world 1,2. There is a trade-off between the benefit and harm of breast cancer screening 1,3-9. False-positive results lead to anxiety and unnecessary, often invasive diagnostic procedures. Breast cancer screening can often over-diagnose disease and lead to unwarranted treatment. The accuracy of screening services may vary from one population to another, implying that a single screening procedure may not be universally effective.
Inclusion criteria. Inclusion criteria were (1) female patients 30-69 years old; (2) women who visited a doctor for breast cancer examination; and (3) no visible signs of breast cancer. Exclusion criteria were: (1) women who were pregnant, breastfeeding or planning to become pregnant; (2) a history of lumpectomy, contralateral mastectomy or breast augmentation; (3) surgical or percutaneous biopsy in the last 12 months; (4) diagnosis of or treatment for cancer in the last 12 months. In each hospital, at least 300 subjects were scanned following a similar routine and workflow (Fig. 1).

Qualification of research center and staff.
A total of five hospitals were included in this study: Sun Yat-sen University Cancer Hospital; Chinese Academy of Medical Sciences Cancer Hospital; Tianjin Medical University Affiliated Tumor Hospital; Hangzhou First People's Hospital; and Shanghai Jiaotong University Affiliated Xinhua Hospital.
The system under evaluation in this study was the Invenia 3D-Automated Breast Ultrasound System (ABUS), manufactured by GE Healthcare (Sunnyvale, CA, USA). ABUS is a computer-based system for evaluating the complete breast. For each evaluation, each breast was imaged in three views, lateral (LAT), anteroposterior (AP) and medial (MED), with an automated 6 to 14 MHz linear array transducer attached to a rigid compression plate (covering a volume of 15.4 × 17.0 × 5.0 cm). Each view acquired up to about 300 2D images, which were reconstructed in the coronal plane from the skin to the chest wall. The standardized review process involves using a patented, thick-slice coronal plane for quick navigation through the breast, as well as the use of "survey mode," which is similar to cine and allows the radiologist to interpret many images rapidly. The acquisition time for each view was approximately 60 s, with about 3-4 min per breast.
HHUS was performed in the supine position by experienced radiologists. The devices used to conduct HHUS included the GE LOGIQ9 (GE Medical Systems, Milwaukee, WI, USA), the Aixplorer system (Supersonic Imagine, Aix-en-Provence, France), the iU22 Ultrasound System (Philips Medical Systems, Bothell, WA, USA) and the s2000 (Siemens Medical Solutions, Mountain View, CA, USA).
The devices used to perform mammography and obtain mammographic images included the GE Senographe DS (GE Medical Systems, Milwaukee, WI, USA), the Hologic Selenia (Hologic, Bedford, MA, USA) and the Fujifilm FDR MS-2500 (Fujifilm Corp, Tokyo, Japan). HHUS was performed by 10 radiologists, all with ≥ 5 years of experience in US examination and diagnosis. ABUS diagnosis was likewise made by 5 radiologists with at least 5 years of experience in HHUS. MG diagnosis was performed by another 10 radiologists, all with at least 5 years of experience in MG diagnosis. MRI diagnosis was performed by another 5 radiologists, all with ≥ 5 years of experience in breast MRI diagnosis. The radiologist performing and interpreting the US images and a different radiologist interpreting the MG were not permitted to know the results of the other current screening examination until their interpretations had been recorded, although prior breast imaging (if any) was available together with risk factors and biopsy/surgical history.
For HHUS, ABUS, and MG, BI-RADS assessment results were sorted into six categories: 0 = incomplete, 1 = normal, 2 = benign, 3 = probably benign, 4 = suspicious, 5 = highly suggestive of malignancy. The highest BI-RADS category among HHUS, ABUS, and MG served as the referral reference. For BI-RADS category 3, a magnetic resonance imaging (MRI) test was required to distinguish true negative results from false negative results. For BI-RADS categories 1 to 2, there was no referral, except that 10% of these women were randomly selected for MRI examination. For women classified as category 4 or 5, either core aspiration biopsy or surgical biopsy was done, followed by a pathological diagnosis. The MRI BI-RADS categories were also assessed. Women with an MRI BI-RADS category of 1 to 3 were considered negative. Otherwise, the woman received a biopsy to obtain pathological information.
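The referral rules in this paragraph can be summarized as a small decision function. The following Python sketch is only an illustration: the function names, the returned labels and the handling of category 0 are assumptions made for the example and are not part of the study protocol.

```python
from typing import Dict


def referral_action(birads: Dict[str, int]) -> str:
    """Map the highest BI-RADS category across modalities to the next step.

    `birads` holds one category (0-5) per modality, e.g. HHUS, ABUS, MG.
    The returned labels are hypothetical names chosen for this sketch.
    """
    if not birads:
        raise ValueError("at least one BI-RADS assessment is required")
    if any(c not in range(6) for c in birads.values()):
        raise ValueError("BI-RADS categories must be integers from 0 to 5")
    highest = max(birads.values())
    if highest >= 4:
        return "biopsy"        # category 4 or 5: core or surgical biopsy
    if highest == 3:
        return "mri"           # category 3: MRI to separate true/false negatives
    if highest in (1, 2):
        return "no_referral"   # apart from the 10% randomly selected for MRI
    return "incomplete"        # category 0: handling assumed, not stated in the text


def status_after_mri(mri_birads: int) -> str:
    """MRI BI-RADS 1-3 counts as negative; any other category goes on to biopsy."""
    return "negative" if 1 <= mri_birads <= 3 else "biopsy"
```

For example, a woman rated category 2 on HHUS, 4 on ABUS and 3 on MG would be referred for biopsy, because the highest category decides the referral.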

Results
From February 2016 to March 2017, a total of 2844 women consented to participate in our study. Of these, 1947 women were eligible for the study and completed the scanning examination. Breast density was also reassessed by the radiologist and classified using BI-RADS density category 1 ("almost entirely fat"), category 2 ("scattered fibroglandular densities"), category 3 ("heterogeneously dense"), or category 4 ("extremely dense"). Overall, 24.31% of women were classified by radiologists as having BI-RADS breast density type 1-2 and 75.69% as type 3-4 (Table 1). For analyses, categories 1-2 were categorized as "low-density breasts," and categories 3-4 were defined as "high-density breasts."
In total (1947 subjects) (
Non-inferiority and superiority analysis of sensitivity.
ABUS vs. HHUS. In the 30-39 age group (Table 4), non-inferiority Z tests showed that ABUS sensitivity (87.21%) was non-inferior to HHUS sensitivity (91.86%) with P = 0.325. Because HHUS sensitivity was higher than ABUS sensitivity, a superiority test of ABUS vs. HHUS was not available. In the 40-69 age group (Table 4), non-inferiority Z tests showed that ABUS sensitivity (93.01%) was non-inferior to HHUS sensitivity (95.44%) with P = 0.014. A superiority test of HHUS vs. ABUS was also not available.
For all participants (Table 5), non-inferiority Z tests showed that ABUS sensitivity (91.81%) was non-inferior to HHUS sensitivity (94.70%), with P = 0.015. Therefore, it can be inferred that the overall sensitivity of ABUS is not inferior to that of HHUS. A superiority test of HHUS vs. ABUS for all participants was not available.
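To make the logic of a non-inferiority Z test concrete, the sketch below implements one common form, the one-sided Wald test for a difference of two proportions. It is an illustration under stated assumptions rather than the procedure used in this study: the non-inferiority margin is not given in this section, the unpaired variance ignores that both modalities were applied to the same women, and the function name and example counts are hypothetical.

```python
import math
from typing import Tuple


def noninferiority_z(x_new: int, n_new: int, x_ref: int, n_ref: int,
                     margin: float) -> Tuple[float, float]:
    """One-sided Wald Z test of H0: p_new - p_ref <= -margin.

    Returns (z, p_value); a small p_value supports non-inferiority.
    Assumes two independent samples, which is a simplification.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    p_new = x_new / n_new
    p_ref = x_ref / n_ref
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    if se == 0:
        raise ValueError("standard error is zero; the Wald test is undefined")
    z = (p_new - p_ref + margin) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p_value


# Hypothetical counts and margin, NOT taken from the study.
z, p = noninferiority_z(90, 100, 90, 100, margin=0.10)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

The margin shifts the null hypothesis, so a new test can be declared non-inferior even when its observed sensitivity is somewhat lower than that of the reference, provided the shortfall is convincingly smaller than the margin.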
In the low-density breast subgroup (Table 4), non-inferiority Z tests showed that ABUS sensitivity (94.06%) was non-inferior to MG sensitivity (91.09%) with P = 0.008 and that HHUS sensitivity (95.05%) was non-inferior to MG sensitivity (91.09%) with P < 0.001. The McNemar superiority test showed that ABUS sensitivity was not superior to MG sensitivity (P = 0.183) and that HHUS sensitivity was not superior to MG sensitivity (P = 0.079).
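The McNemar test compares two modalities applied to the same patients by looking only at the discordant pairs, that is, cancers detected by one modality but missed by the other. The following minimal Python sketch computes the exact two-sided version; the text does not state which variant (exact or asymptotic, one-sided or two-sided) was used, so this choice, the function name and the argument names are assumptions.

```python
import math


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: cases positive on modality A only; c: cases positive on modality B only.
    Under H0 each discordant pair falls on either side with probability 0.5.
    """
    if b < 0 or c < 0:
        raise ValueError("discordant counts must be non-negative")
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Because concordant pairs carry no information about which modality is better, two modalities with high and similar sensitivities can yield few discordant pairs and therefore a non-significant superiority result.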

Discussion
This multicenter study demonstrated that ultrasound (ABUS or HHUS) is superior to mammography in patients with dense breasts, but performs as well as mammography in patients with low-density breasts. Furthermore, we found that the sensitivity of both ABUS and HHUS is superior to that of MG, and that ABUS specificity (92.34%) was non-inferior to MG specificity (90.56%). This conclusion suggests that US, at least in symptomatic populations, is more effective at detecting breast cancer than MG.
ABUS has been shown to achieve the same diagnostic accuracy as HHUS 18,28,29. In the study by Su Kyung Jeh, the diagnostic performance of ABUS was higher than that of HHUS with respect to specificity and accuracy 29. Chang et al. 25 reported that both ABUS and HHUS had high sensitivity (both 100%) and high specificity (95.0% and 85.0%, respectively) for 69 lesions. In addition, ABUS had a higher diagnostic accuracy (97.1%) than HHUS (91.4%) for breast masses. The authors concluded that ABUS is a promising modality in breast imaging. In our study, ABUS achieved higher accuracy than HHUS (ABUS 92.66% vs. HHUS 90.50% in all subjects; ABUS 92.90% vs. HHUS 91.08% in the 40-69 age group). In addition, ABUS had the highest specificity among HHUS, ABUS and MG (ABUS 92.89% vs. HHUS 89.36% in all subjects; ABUS 92.86% vs. MG 91.68% vs. HHUS 89.55% in the 40-69 age group). This may be because ABUS can display more coronal plane-related information such as mass margins, shape, spiculations, and distortion associated with tissue retraction (Fig. 2). Meanwhile, breast cancer detection rates were higher with HHUS and ABUS than with MG (20.18% for HHUS and 19.57% for ABUS vs. 14.54% for MG). Most of the breast cancers detected were invasive. The reason may be that most of the participants were actually symptomatic and had tumors with a mean diameter close to 20 mm.
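For reference, the sensitivity, specificity, false positive rate and accuracy compared in this paragraph all derive from a 2 × 2 table of imaging result against the reference standard. The short Python sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not data from this study.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard diagnostic accuracy measures from a 2 x 2 table.

    tp/fn: cancers called positive/negative; tn/fp: non-cancers called
    negative/positive, each judged against the reference standard.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fpr": fp / (tn + fp),  # false positive rate = 1 - specificity
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for illustration only.
print(diagnostic_metrics(tp=90, fn=10, tn=80, fp=20))
```

Accuracy weights sensitivity and specificity by the share of cancers in the sample, so a modality with lower sensitivity can still have higher accuracy when most examined women do not have cancer, as with ABUS versus HHUS here.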
In this study, the sensitivity of ABUS was lower than that of HHUS, which may be related to compression of the mammary gland tissue during ABUS operation, resulting in unclear display of some lesions. However, according to the non-inferiority analysis, the sensitivity of ABUS is not inferior to that of HHUS. Meanwhile, the specificity of ABUS is superior to that of HHUS in both the 30-39 age group and the 40-69 age group. This may be because the additional information from the coronal plane helps to differentiate between benign and malignant breast lesions 18,30-32. Therefore, in terms of diagnostic efficacy, ABUS and HHUS each have their own strengths and weaknesses in sensitivity and specificity. Moreover, non-inferiority tests also demonstrated that US (ABUS/HHUS) specificity is non-inferior to that of MG. It must also be noted that the ABUS diagnosis was actually made by a radiologist with considerable HHUS experience. In addition, given the ABUS user interface and workflow habits, operators generally read the conventional 2-D information first, before interpreting the coronal plane. Therefore, in routine ABUS operation, the diagnostic decision may rely more on information from the conventional 2-D sections. Thus, the comparison of diagnostic efficacy between ABUS and HHUS may reflect, to a greater extent, a comparison between the capabilities of two experienced HHUS radiologists rather than between two different machines. However, the greater strength of ABUS lies in its standardized cross-sections and its potential tele-consultation capabilities, which enable experienced radiologists to serve less-experienced areas.
The distinction between efficacy as measured in experimental studies and the effectiveness of a mass population intervention is a crucial one for public health decision-making. A limitation of this study is therefore that all conclusions come from a symptomatic population, which limits the extension of the evidence to the asymptomatic population. In the future, randomized controlled validation studies in asymptomatic populations are still needed.
In summary, the sensitivity of ABUS/HHUS is superior to that of MG. The specificity of ABUS/HHUS is non-inferior to that of MG. Therefore, given the affordability, feasibility and good performance of ultrasound, ABUS provides a standardized and reproducible imaging approach that can be used for breast cancer detection.

Figure 1. Flowchart showing patient selection and study design.

Figure 2. A 32-year-old woman with ductal carcinoma of the left breast. (A) shows a hypoechoic lesion on handheld ultrasound (HHUS). (B-D) show the heterogeneous hypoechoic lesions in the medial (B), anterior-posterior (C) and lateral (D) positions of automated breast ultrasound (ABUS).

Table 1. Patient demographic and clinical characteristics at enrollment.

Table 4. Non-inferiority and superiority analysis results in the 40-69 years age group. *Non-inferiority P value; # superiority P value; FPR: false positive rate; AUC: area under curve; AC: accuracy.