Main

Neoadjuvant chemotherapy (NACT) is an increasingly used approach for patients with locally advanced or primarily inoperable breast cancer, and is an option for patients with potentially chemosensitive tumours (Fisher et al, 1997; Kaufmann et al, 2006). As the achievement of pathological complete response (pCR) is significantly associated with favourable disease-free and overall survival, it has been proposed as a surrogate clinical endpoint for long-term survival (van der Hage et al, 2001; Bear et al, 2006; von Minckwitz et al, 2012; Cortazar et al, 2014).

Recent studies have demonstrated that shrinking tumours need less surgical treatment (von Minckwitz et al, 2008). Taken to the extreme, patients with a pCR would require only a reduced extent of surgery or potentially no surgery at all (Kummel et al, 2014). These hypotheses still need to be tested in prospective, randomised trials. The most important precondition for such a trial would be an exact diagnosis of a pCR without surgery.

Up to now, prediction of pCR, that is, diagnosing a pCR without surgery, has been based on tumour biology at diagnosis, the applied NACT regimen, and breast imaging results, all with mediocre accuracy (Gianni et al, 2005; Chagpar et al, 2006; Goldstein et al, 2007; Tiezzi et al, 2007; Shin et al, 2011). Tumour biology has already been shown to predict pCR to some extent; for example, triple-negative breast cancers (TNBC) show pCR rates of up to 64% (von Minckwitz et al, 2014b) and HER2+ tumours pCR rates of up to 66% (Schneeweiss et al, 2013), whereas luminal-type-like tumours show much lower pCR rates (von Minckwitz et al, 2012).

To assess clinical tumour response, physical examination, breast ultrasound, mammography, and breast magnetic resonance imaging (MRI) may be used (Fiorentino et al, 2001; Denis et al, 2004; Londero et al, 2004; Yeh et al, 2005), of which breast MRI was found to be the most accurate for evaluating tumour response to NACT (Croshaw et al, 2011; Marinovich et al, 2013). The definition and assessment of clinical complete response (cCR) differed considerably among imaging studies, even when the procedures were based on the WHO (Miller et al, 1981) or the EORTC/RECIST (Eisenhauer et al, 2009) definitions. For example, Schott et al determined the sensitivity of physical examination, mammography, ultrasound, and MRI for detecting a pCR in this situation to be 50%, 50%, 25%, and 25%, respectively (Schott et al, 2005); Shin et al reported the accuracy of pCR prediction in cases with a cCR after NACT to be 38% for mammography, 13% for ultrasound, and 75% for MRI (Shin et al, 2011).

Owing to this diagnostic uncertainty, surgery after NACT is considered obligatory for all patients to completely remove residual disease in non-pCR cases and/or to diagnose a pCR (NICE, 2009).

Aim of the study

In order to improve pCR prediction (diagnosis without surgery), we aimed to explore the ability of minimally invasive biopsy (MIB) techniques to predict (diagnose) pCR in the breast after NACT.

Patients and Methods

Study design and patient cohort

In this multicentre analysis, we included anonymised, prospectively assembled data of patients with histologically confirmed non-metastatic invasive breast cancers treated between 2009 and 2013. Patients were consecutively enrolled if they had a cCR diagnosis (according to the definitions at unit level) after having received NACT. We intentionally focused this study on a cohort of cCR cases because cCR after NACT is becoming increasingly frequent, owing both to the widespread use of NACT and to the growing efficacy of treatment protocols. To date, NACT results in a cCR rate of about 23–28% (von Minckwitz et al, 2014a). The study was organised and funded by the German Breast Group (data management and quality control) and the participating centres (equipment, personnel, local administration). At all sites, informed consent was obtained from each patient for the performance of the MIB after NACT as a diagnostic procedure. The ethics approval of the institutional review board (Heidelberg University, Medical School, Germany) was obtained for the retrospective, anonymised, pooled analysis of the prospectively assembled data from all centres. Patients’ characteristics are shown in Table 1 – Cohort description.

Table 1 Cohort description

Treatment

Patients received NACT according to national guidelines (anthracycline- and taxane-based regimens; HER2+ tumours were treated with a trastuzumab-based regimen; Wockel and Kreienberg, 2008). Some patients were treated in phase III NACT protocols (GeparQuinto and GeparSixto; von Minckwitz et al, 2014b). All patients underwent breast-conserving surgery or mastectomy after MIB.

MIB procedures after NACT

Core-cut (CC) and vacuum-assisted (VAB)-MIB techniques were performed according to the respective national guidelines (ACR Guidelines and Standards Committee BL et al, 2008; Wockel and Kreienberg, 2008) before surgery. The technical procedure (equipment, machines, and guidance) did not differ in principle from standard diagnostic biopsies, although guidance might be more challenging after NACT than in a primary diagnostic setting, as discussed below. Core cut was performed by experienced physicians preferentially using 14 gauge needles, VAB using 9–11 gauge needles. Physicians were regarded as experienced if they had at least 3 years of continuous practice in the respective techniques with >50 procedures per year. The interval between MIB and surgery was 7 days at most.

Histopathological evaluation and assessment of pathological tumour response to NACT

Pathological examination and immunohistochemistry of the pre-treatment MIBs were performed as part of routine clinical practice according to national guidelines (Wockel and Kreienberg, 2008). Tissue from the MIB and from the surgical specimen was examined by the same local pathologist (non-blinded setting) and reported as containing invasive and/or non-invasive tumour cells or not. We defined pCR as the complete histopathological absence of vital invasive and non-invasive tumour cells in all breast specimens removed as part of breast-conserving surgery or mastectomy (ypT0).

Radiological assessment and interpretation – definition of cCR

Cases were classified as cCR when no signs of residual disease were detected on physical examination and/or ultrasound and/or mammography and/or MRI (Eisenhauer et al, 2009), according to the local standard operating procedures at unit level.

Statistical analysis

The statistical analysis was performed with SPSS Statistics software version 20.0 (IBM Corp., Armonk, NY, USA). This is an explorative study using descriptive statistical methods. Reported P-values are not adjusted for multiplicity owing to the high number of performed tests and therefore have to be interpreted descriptively. Phi-coefficients were assessed to evaluate the representativity of the correlations found in logistic regression.

To analyse the diagnostic accuracy, we calculated the false-negative rate (FNR), sensitivity, specificity, and negative predictive value (NPV) for the whole study cohort and the three defined subgroups (TNBC, HER2+, and HR+/HER2−), without adjustment for the prevalence of cCR and pCR. It was thereby assumed that estimates for the prevalence of cCR and pCR can be directly deduced from this representative cohort, and that the NPV can therefore be estimated from the sample. Corresponding 95% confidence intervals were calculated based on a normal approximation for binomially distributed data.

We defined the NPV to be the most important measure in order to address our research question. To calculate the measures of diagnostic accuracy, we used the following definitions: if both MIB and the surgical specimen were staged as ypT0 in pathological workup, it was counted as a true-negative MIB (pCR was correctly detected). If MIB was staged as ypT0 and the surgical specimen was ypTis or higher, it was referred to as a false-negative MIB (pCR was falsely assumed). If MIB was staged as ypTis or higher and the surgical specimen was ypT0, it was defined as a false-positive MIB (pCR was falsely ruled out). If both MIB and the surgical specimen were staged ypTis or higher, it was counted as a true-positive MIB (pCR was correctly ruled out).
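The definitions above can be illustrated with a minimal Python sketch. It is not the SPSS workflow used in the study; the function names and the toy case list are hypothetical and serve only to show how each case is labelled and how the accuracy measures and a normal-approximation confidence interval follow from the four counts.

```python
from math import sqrt

def classify(mib_positive: bool, surgery_positive: bool) -> str:
    """Label one case; 'positive' means residual tumour (ypTis or higher)."""
    if mib_positive and surgery_positive:
        return "TP"  # pCR correctly ruled out
    if mib_positive:
        return "FP"  # pCR falsely ruled out
    if surgery_positive:
        return "FN"  # pCR falsely assumed
    return "TN"      # pCR correctly detected

def accuracy_measures(cases):
    """Compute diagnostic accuracy measures from (MIB, surgery) pairs."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for mib_positive, surgery_positive in cases:
        counts[classify(mib_positive, surgery_positive)] += 1
    tp, fp, fn, tn = (counts[k] for k in ("TP", "FP", "FN", "TN"))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "FNR": fn / (tp + fn),
        "NPV": tn / (tn + fn),
    }

def wald_ci(p: float, n: int, z: float = 1.96):
    """95% confidence interval from the normal approximation to the binomial."""
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Hypothetical toy data (not study data): 4 TN, 2 FN, 1 FP, 3 TP
toy_cases = (
    [(False, False)] * 4 + [(False, True)] * 2
    + [(True, False)] * 1 + [(True, True)] * 3
)
measures = accuracy_measures(toy_cases)
print(measures)
```

The sketch does not guard against empty denominators, which would need handling for subgroups without any negative or positive results.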

We also compared the NPVs of the two centres that contributed most cases with the NPVs of the entire cohort, to test if there might have been a learning effect in centres contributing many cases.

Univariate logistic regression analysis was performed to investigate whether the kind of MIB procedure (VAB vs CC), the use of a clip marker, and the number of biopsy specimens taken by MIB were predictors of a correct diagnosis of pCR in MIB. The statistical significance of the differences in odds ratios (OR) between the predictors was assessed by means of the Mantel–Haenszel test.

Finally, we present the relationship between false-negative biopsies and the amount of residual tumour by reporting the frequencies of the ypT stages of the cases with false-negative biopsies.

Manuscript structure

The manuscript structure and content adhere to the STARD statement (Bossuyt et al, 2003).

Results

Patient and tumour characteristics

A total of 164 patients with invasive breast cancers treated in 26 different institutions met the inclusion criteria for this analysis. Treatment was performed and data collected from 2009 to 2013.

With regard to the different biological breast cancer subtypes, 50 (30.5%) were HR+/HER2−, 62 (37.8%) were HER2+, and 52 (31.7%) were TNBC. Core cut was performed in 116 patients and VAB in 46 patients. Core cut was predominantly guided by ultrasound (n=112; 96.6%); only two CCs were guided by mammography. The VAB was guided by mammography in 16 patients and by ultrasound in 30 patients. In two cases the information on MIB technique was not available.

In all, 93 cases (56.7%) were diagnosed with pCR after surgery. The pCR rate was highest in patients with HER2+ tumours (n=46; 74.2%) followed by patients with TNBC (n=35; 67.3%), and patients with HR+/HER2− tumours (n=12; 24.0%). For details see Table 1.

Negative predictive values and FNRs of MIB diagnosis of pCR

The NPVs were 71.3% (95% CI: (63.3%; 79.3%)) in the whole cohort, 75.6% (95% CI: (63.0%; 88.1%)) in TNBC, 83.7% (95% CI: (73.3%; 94.0%)) in HER2+, and 42.9% (95% CI: (24.5%; 61.2%)) in HR+/HER2− cancers. The FNR of MIB in the whole cohort was 49.3% (95% CI: (40.4%; 58.2%)). The FNRs differed slightly between the subgroups, from 64.7% (95% CI: (50.7%; 78.7%)) in TNBC through 50.0% (95% CI: (36.0%; 64.0%)) in HER2+ to 42.1% (95% CI: (23.8%; 60.4%)) in HR+/HER2− cancers. For details see Tables 2 and 3.
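As a cross-check, the overall NPV and its confidence interval can be recomputed from counts stated in the text: 164 patients, 93 with pCR, six false-positive and 35 false-negative MIBs. The true-negative and true-positive counts below are inferred from these figures rather than taken from Table 2, so the following Python sketch should be read as a plausibility check under that assumption.

```python
from math import sqrt

# Figures reported in the text
n_total = 164   # patients in the cohort
n_pcr = 93      # surgical specimen ypT0
fp = 6          # MIB positive, surgical specimen ypT0
fn = 35         # MIB ypT0, surgical specimen ypTis or higher

# Inferred counts (assumption: every patient falls into exactly one cell)
tn = n_pcr - fp                # 87
tp = n_total - n_pcr - fn      # 36

npv = tn / (tn + fn)
fnr = fn / (fn + tp)
specificity = tn / (tn + fp)

# Normal approximation for the NPV, with n = number of negative MIBs
half_width = 1.96 * sqrt(npv * (1 - npv) / (tn + fn))
npv_ci = (npv - half_width, npv + half_width)

print(f"NPV {npv:.1%} (95% CI {npv_ci[0]:.1%}; {npv_ci[1]:.1%})")
# NPV 71.3% (95% CI 63.3%; 79.3%)
print(f"FNR {fnr:.1%}, specificity {specificity:.1%}")
# FNR 49.3%, specificity 93.5%
```

The point estimates and the NPV interval agree with the reported values, which supports the inferred two-by-two table.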

Table 2 Results of MIB and pathological result of surgical specimens for the overall cohort
Table 3 Statistical quality criteria of MIB diagnosis of vital tumour cells

None of the mammographically guided VABs showed a false-negative result (0 out of 16 cases; NPV 100%; FNR 0%), whereas the CCs showed 28 false-negative results in 116 cases (NPV 70.2%; FNR 60.9%). In the ultrasound-guided VAB group, 8 of 15 negative MIBs were true-negative (NPV 53.3% (95% CI: 28.1%; 78.6%)).

Evaluation of a possible learning effect in high volume-contributing centres

In our analysis, there was no indication of a learning effect or of a change in procedure due to experience. More specifically, the NPV of the two largest contributing centres (72.4%; 95% CI: 60.9–83.9%) did not differ significantly from that of the entire cohort (71.3%; 95% CI: 63.3–79.3%).

Predictors of pCR in univariable logistic regression analysis

In univariate logistic regression, the superiority of VAB in general (mammographically and ultrasound guided) over CC in general was not statistically significant (OR 1.15; 95% CI: (0.44; 3.05), P=0.776). The MIB procedures guided by a clip marker tended to achieve a higher rate of true-negative results (OR 1.98; 95% CI: (0.81; 4.85), P=0.137) than those without a clip marker. The NPV was also higher with a clip marker (74.2% (95% CI: 65.3%; 83.1%)) than without (62.1% (95% CI: 44.4%; 79.7%)). Taking more than three biopsies by MIB did not lead to a higher accuracy compared with taking fewer than three (OR 0.67; 95% CI: (0.20; 2.26), P=0.516). For details see Table 4.
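For a single binary predictor such as clip marker use, the odds ratio estimated by univariate logistic regression should coincide with the cross-product ratio of the corresponding two-by-two table, and a Wald interval can be formed on the log scale. The Python sketch below illustrates this; the function name and the counts are hypothetical and are not the study data, which are not given at this level of detail in the text.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald 95% CI from a two-by-two table.

    a: predictor present, outcome present
    b: predictor present, outcome absent
    c: predictor absent,  outcome present
    d: predictor absent,  outcome absent
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only (not study data)
odds_ratio, lower, upper = odds_ratio_ci(a=30, b=10, c=20, d=20)
print(f"OR {odds_ratio:.2f} (95% CI {lower:.2f}; {upper:.2f})")
```

An interval that includes 1, as for all three predictors reported above, is consistent with the absence of a statistically significant association.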

Table 4 Univariate logistic regression. Predictors of true-negative MIB

False-negative biopsies and residual disease

In all, 35 MIBs were false-negative. In only 11 cases (31.4% of the false-negative biopsies), the surgical specimen showed minimal in situ residual disease (ypTis). Of those showing invasive residual disease 19 cases had a tumour <2 cm (ypT1) and 5 cases >2 cm (ypT2).
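The distribution of residual disease among the false-negative biopsies can be expressed as shares of the 35 cases. The short Python sketch below uses only the counts stated above; the variable names are illustrative.

```python
# ypT stage of the surgical specimen in the 35 false-negative MIBs
false_negatives_by_stage = {"ypTis": 11, "ypT1": 19, "ypT2": 5}

total = sum(false_negatives_by_stage.values())
shares = {stage: count / total for stage, count in false_negatives_by_stage.items()}

for stage, share in shares.items():
    print(f"{stage}: {share:.1%}")

# Share of false-negative biopsies with invasive residual disease
invasive_share = (false_negatives_by_stage["ypT1"] + false_negatives_by_stage["ypT2"]) / total
print(f"invasive: {invasive_share:.1%}")
```

About two thirds of the false-negative biopsies therefore missed invasive residual disease rather than in situ disease alone.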

Discussion

To the best of our knowledge, this is the first publication on the use of MIB procedures to diagnose pCR in breast cancer patients after NACT. The motivation and underlying idea of this research is hypothetical but clinically relevant: if pCR can be diagnosed with sufficient certainty without a surgical intervention, the latter might be without benefit for the patient. So far, there is no consensus on what NPV would be high enough to justify discussing a reduction of the extent of surgery, but the NPV found in this study population (71.3%) is clearly too low to recommend therapeutic consequences.

Nevertheless, it is of interest that no false-negative result was obtained in patients who underwent mammographically guided VABs (NPV=100%; n=16). This finding suggests that MIB procedures using the most reliable guidance to the (former) cancer location (which is usually a clip marker in cCR situations) might have the potential to achieve a very high accuracy in diagnosing a pCR.

Moreover, the overall NPV of MIB in our analysis was higher than the NPV of standard breast imaging (including MRI) as reported by Croshaw et al (NPV=60%; Croshaw et al, 2011). The latter value is in line with the NPV of imaging alone in this analysis (NPV of diagnosing a pCR with imaging alone: 56.6%; 95% CI: (55.3%; 64.3%)).

A correct localisation of the potentially remaining tumour cells and/or clip marker is crucial for a representative MIB. In this multicentre analysis, MIB procedures were carried out according to national guidelines, but were not standardised between the contributing units and were used experimentally, with little prior experience in this special clinical setting. The ultrasound-guided VAB subgroup presented a lower NPV of 53.3%, suggesting that tumour localisation was more accurate using mammographic guidance. The poorer performance of ultrasound-guided MIB might be due to the fact that clip markers optimised for ultrasound detection have been developed only recently and there is still little experience in their detection.

Obtaining multiple specimens in a biopsy (more than three specimens taken either in the same or different positions) did not improve the probability of a true-negative result (OR 0.67; 95% CI: (0.20; 2.26), P=0.516) in logistic regression analysis. One possible explanation might be that in cases of difficult localisation of the tumour in imaging, more biopsies were taken in order to increase the probability of hitting the tumour bed despite poor imaging conditions. This result might suggest that more biopsies cannot overcome poor localisation conditions.

In our view, an important result of the analysis is that within the group of mammographically guided VAB (a procedure in which a clip marker is always used to detect the former tumour bed) there was no false-negative MIB result, hinting at a positive impact of the use of a clip marker and of the larger volume of the VAB specimen (9–11 gauge in VAB rather than 14 gauge in CC). We also tested whether the superior performance of VAB might be due to a selection bias, for example, because the radiologist’s decision for a certain technique was influenced by tumour biology or response to therapy, by comparing the structure of the VAB group with that of the overall cohort, but found no significant differences.

As a minor aspect: In the overall cohort MIB had a specificity of 93.5% (95% CI: (89.2%; 97.9%); six false-positive results). One would not expect false-positive biopsies, because if the surgical specimen does not contain tumour cells, the MIB should not show any tumour cells either. One possible explanation is that a tumour of smaller size than the MIB specimen is completely removed through the MIB and therefore the surgical specimen is tumour-free. It may also be possible that the tumour is missed by the surgery and invasive tumour cells remain in the patient, which might lead to a false-positive biopsy. However, since a systematic, interdisciplinary tumour board review was performed in all cases after surgery, which is used to monitor the appropriateness of surgical treatment for all patients, we may expect that the rate of tumours missed by surgery is extremely low.

To draw any conclusions regarding a possible reduction of the extent, or at best omission, of breast tumour surgery in cases of assumed (and MIB-diagnosed) pCR, further prospective studies with standardised MIB procedures and a more detailed pathological workup, including information on the representativity of the specimen, are needed.