Efficacy and safety of core-needle biopsy in initially detected thyroid nodules via propensity score analysis

We compared the efficacy and complications of core-needle biopsy (CNB) with those of fine-needle aspiration (FNA) in a large population of patients with initially detected thyroid nodules via a propensity score analysis. Outpatients with initially detected thyroid nodules who had undergone CNB or FNA between January 2013 and December 2013 were selected. This study included 4,822 thyroid nodules from 4,553 consecutive patients. Significant differences in patients’ baseline characteristics were adjusted for via propensity score analysis. Subgroup analyses were performed for nodules ≥ 1 cm. The non-diagnostic result rate, malignancy rate, complication rate, and diagnostic accuracy were compared. A 1:1 matching of 1,615 patients yielded no significant differences between the two groups for any covariate. The non-diagnostic result rate was significantly lower in the CNB group than in the FNA group (5.2% vs. 12.1%), while the malignancy rate (23.7% vs. 11.8%) and sensitivity (75.9% vs. 55.6%) were significantly higher. However, the specificities were similar (100% and 99.9%, respectively). Propensity score and subgroup analyses showed similar results. The complication rate was similar between groups in matched cohorts. CNB is a promising and safe diagnostic tool for patients with initially detected thyroid nodules.


Methods
Study population. The protocol of this observational study was approved by the institutional review board of Asan Medical Center, a tertiary referral center, which waived the requirements for informed written consent for use of these data. All methods were performed in accordance with the relevant guidelines and regulations.
The study population was obtained from a historical cohort of 6,762 thyroid nodules from 6,493 consecutive patients who underwent CNB or FNA between January 2013 and December 2013 at Asan Medical Center, an academic, tertiary referral hospital located in South Korea. Patients who had previously undergone CNB or FNA were excluded (n = 1,940). Finally, a total of 4,822 cases of initially detected thyroid nodules from 4,553 patients were included in this study: 2,114 nodules from 1,928 patients who had undergone CNB (CNB group) and 2,708 nodules from 2,625 patients who had undergone FNA (FNA group). Figure 1 shows the flowchart for CNB and FNA patient inclusion.
As a reference standard, malignancy was diagnosed after surgery or after CNB. Benign nodules were diagnosed in one of the following ways: after surgery; after CNB; after at least two sets of benign findings on FNA on different occasions; or after benign cytology findings on FNA with the nodule size remaining stable after 1 year 20 .
Analysis of US findings. US examinations were performed using one of three US systems: an iU22 or an HDI-5000 unit (Philips Healthcare, Bothell, WA), or an EUB-7500 unit (Hitachi Medical Systems, Tokyo, Japan). Each system was equipped with a linear, high-frequency probe (5-14 MHz). All US-guided procedures were performed by radiologists under the supervision of two faculty radiologists (J.H.B. and J.H.L., with 19 and 14 years of clinical experience, respectively, in performing and evaluating thyroid US). US-guided CNB and FNA procedures for thyroid nodules were performed according to current practice guidelines 21 . US-guided CNBs were performed under local anesthesia with 1% lidocaine using a disposable 1.1- or 1.6-cm excursion, 18-gauge, double-action, spring-activated needle (TSK Ace-cut; Create Medic, Yokohama, Japan) 11,14,15,20 . Before insertion of the core needle, the vessels along the approach route were evaluated by power Doppler US in order to prevent hemorrhage. Using a freehand technique, the core needle was advanced from the isthmus of the thyroid toward the nodule 20 . After the needle tip had been advanced to the edge of the nodule, the distance of fire (1.1 or 1.6 cm) was measured before sequential firing of the stylet and cutting cannula of the needle 20 . Several reports have suggested that CNB may be beneficial for thyroid nodules with certain characteristics, such as calcification or a predominantly cystic component 22,23 . On the basis of this evidence, we performed CNB for heavily calcified nodules and predominantly cystic nodules, for which FNA may be less effective. Additionally, the choice between CNB and FNA was determined by the referring physicians' preference.
US-guided FNAs were routinely performed using a 23-gauge needle. Direct smears were made in all cases, and all smears were immediately fixed with alcohol after the FNA procedure and stained with Papanicolaou stain 15 . The number of needle passes was determined by the operator during the procedure, and a maximum of four needle passes was permitted for each nodule. Additional FNAs were recommended in the case of incomplete visual assessment results. The maximum number of FNAs performed during a single session was three. The adequacy of the procedure was monitored via real-time US, and the adequacy of the tissue core was assessed by visual inspection 20 . An additional CNB was performed when targeting of the lesion was considered inaccurate. The maximum number of CNBs performed during a single session was two. Each patient was observed after firm local compression of the biopsy site for 10-20 minutes following the biopsy. If a patient complained of pain or neck swelling, a repeat US examination was performed to evaluate possible complications.
Histopathologic analysis of CNB specimens and cytopathologic analysis of FNA. All CNB specimens and FNA cytological analyses were reviewed by a thyroid cytopathologist (D.E.S., with 11 years of clinical experience in thyroid cytopathology). FNA cytology diagnoses were categorized into six categories according to the Bethesda System for Reporting Thyroid Cytopathology 15,24 . Although standardized CNB diagnostic criteria for thyroid nodules had yet to be established at the time of our study, the histologic results of CNB were also categorized into the six categories of the Bethesda System (Supplementary materials) 11, 14, 15, 24, 25 .
Image review. The US images were reviewed independently by two radiologists (C.H.S. and J.H.B., with 5 and 19 years of clinical experience, respectively, in performing and evaluating thyroid US), neither of whom had any information regarding the patient's clinical history, previous imaging results, or histologic results. Discrepancies between the US findings of the two investigators were resolved by consensus. The nodules were evaluated for the following US features: internal composition (solid, predominantly solid, predominantly cystic, or cystic), shape (ovoid to round, taller than wide, or irregular), margin (smooth, spiculated, or ill-defined), echogenicity (isoechoic, hypoechoic, markedly hypoechoic, or hyperechoic), and the presence of microcalcifications, macrocalcifications, or rim calcifications 20, 26, 27 .
Statistical analysis. The study outcomes compared were the non-diagnostic result rate, malignancy rate, complication rate, and diagnostic accuracy. Major and minor complications were defined according to the Society of Interventional Radiology 28,29 . A major complication was defined as a potentially life-threatening event requiring clinical treatment, or an event that may lead to substantial morbidity or disability or result in lengthened hospitalization 28,29 . All other complications were considered minor.
The diagnostic criteria for malignancy were defined as Bethesda category 6 (malignancy). The diagnostic accuracy of CNB included its sensitivity and specificity for a diagnosis of malignancy.
It was anticipated that the CNB and FNA groups would differ substantially with respect to preprocedural characteristics. Therefore, we collected information on baseline variables that were available in both CNB and FNA registries to facilitate adjusted comparisons. To limit the effects of selection bias and potentially confounding variables in this observational study, we adjusted for significant differences in patient baseline characteristics using propensity score matching and inverse probability weighting 30-32 . Propensity scores, which estimate the probability of undergoing a given procedure on the basis of patient characteristics, were derived from a logistic regression model. After propensity score matching was performed, we compared the baseline variables between the two groups. The distribution of propensity scores was examined by procedure modality to confirm sufficient overlap between the groups and thereby ensure comparability 31 . To assess whether the model was adequately calibrated, the c-statistic and the Hosmer and Lemeshow Goodness-of-Fit test were calculated. Inverse probability weighting, a propensity score-based technique that can be used to compensate for imbalances between study groups, was also performed 32 . Additionally, subgroup analyses were performed for nodules ≥ 1 cm. Significance was defined as p < 0.05. All statistical analyses were performed using SAS 9.4 (SAS Institute, Cary, NC).
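The matching and weighting steps described above can be illustrated with a minimal Python sketch. The source states only that the analyses were run in SAS 9.4 and does not specify the matching algorithm, so the greedy nearest-neighbor rule, the caliper of 0.05, the choice of CNB as the modeled group, the function names, and the toy propensity scores are all assumptions made for illustration. The scores are taken as already estimated by a logistic model.

```python
def greedy_match(treated, control, caliper=0.05):
    """Pair each treated unit with its nearest unused control on propensity score.

    treated, control: dicts mapping a unit id to its propensity score.
    Candidate pairs farther apart than `caliper` are left unmatched.
    """
    available = dict(control)  # controls not yet used (matching without replacement)
    pairs = []
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Nearest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


def ipw_weight(ps, in_treated_group):
    """Inverse probability weight: 1/ps for treated units, 1/(1 - ps) for controls."""
    return 1.0 / ps if in_treated_group else 1.0 / (1.0 - ps)


# Toy propensity scores (hypothetical, not study data).
cnb = {"c1": 0.30, "c2": 0.62, "c3": 0.90}
fna = {"f1": 0.28, "f2": 0.33, "f3": 0.60, "f4": 0.10}

pairs = greedy_match(cnb, fna)
print(pairs)
print(ipw_weight(0.25, True), ipw_weight(0.75, False))
```

In this toy example `c3` stays unmatched because no FNA score lies within the caliper, which is how a matched cohort ends up smaller than the pooled one. Weighting, by contrast, keeps every unit and instead up-weights those that were unlikely to receive the procedure they actually received.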

Results
Characteristics of the study population. All patients tolerated the CNB and FNA procedures. The baseline characteristics of the study patients, according to the procedure, are shown in Table 1. In the CNB group, these patients included 446 men and 1,668 women with a mean age of 53.4 ± 12.6 years (mean ± SD). The mean size of the 2,114 nodules was 1.68 ± 1.23 cm, with 1,396 nodules (66.0%) being ≥ 1.0 cm. Of these patients: 860 underwent CNB because their nodules had suspicious features seen on US; 125 had heavily calcified nodules; and 72 had predominantly cystic nodules. The remaining 1,057 patients underwent CNB due to their referring physician's preference. In the FNA group, these patients included 639 men and 2,069 women with a mean age of 53.9 ± 11.8 years (mean ± SD). The mean size of the 2,708 nodules was 1.24 ± 0.82 cm, with 1,474 nodules (54.5%) being ≥ 1.0 cm. Significant differences were evident between the two groups in terms of nodule composition, margin, echogenicity, and calcification. The frequency of suspicious US characteristics including spiculated margins, marked hypoechoicity, and micro/macrocalcifications in the CNB group was significantly higher than that in the FNA group 26 . The mean patient follow-up periods were 20.8 ± 13.7 months in the CNB group and 20.6 ± 13.9 months in the FNA group.
Characteristics of the patients matched for propensity scores. Propensity score matching was performed for the entire population. A total of 1,615 patients who underwent CNB were matched 1:1 by propensity score with patients who underwent FNA on age, sex, and US characteristics (nodule size, composition, shape, margin, echogenicity, and calcifications). The logistic model including the 8 variables yielded a c-statistic of 0.7243. The Hosmer and Lemeshow Goodness-of-Fit test showed that the propensity score model had an adequate level of calibration (p = 0.164). In matched cohorts, there was no longer any significant difference between the CNB group and the FNA group for any covariate (Table 1).
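For reference, the c-statistic of a propensity model is the probability that a randomly chosen unit from one group receives a higher score than a randomly chosen unit from the other group, with ties counted as one half. A minimal sketch follows; the function name is hypothetical and the scores are made up rather than taken from the study.

```python
def c_statistic(scores_group1, scores_group0):
    """Concordance probability over all cross-group pairs (ties count 0.5)."""
    concordant = 0.0
    for s1 in scores_group1:
        for s0 in scores_group0:
            if s1 > s0:
                concordant += 1.0
            elif s1 == s0:
                concordant += 0.5
    return concordant / (len(scores_group1) * len(scores_group0))


# Toy scores (hypothetical): group 1 tends to score higher than group 0.
group1 = [0.8, 0.6, 0.4]
group0 = [0.5, 0.3, 0.4]
c = c_statistic(group1, group0)
print(round(c, 4))
```

A value of 0.5 would mean the model cannot tell the groups apart, and 1.0 would mean complete separation, in which case no overlap would be left for matching.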
Outcomes for the pooled cohorts and the matched cohorts. The results and final diagnoses of the CNB and FNA groups are summarized in Table 2. Table 3 shows the study outcomes according to the procedure modality in the pooled cohorts and the matched cohorts. The non-diagnostic result rate was significantly lower in the CNB group (5.2%; 109 of 2,114 nodules) than in the FNA group (12.1%; 329 of 2,708 nodules; p < 0.001). The malignancy rate was significantly higher in the CNB group (23.7%; 501 of 2,114 nodules) than in the FNA group (11.8%; 319 of 2,708 nodules; p < 0.001). The complication rate was slightly higher in the CNB group (0.33%; 7 of 2,114 nodules) than in the FNA group (0.074%; 2 of 2,708 nodules; p = 0.048). None of the included patients experienced any major complications associated with intervention or hospitalization. Eight patients developed a hematoma after the procedure, but the hematoma resolved following compression and rest for 1 hour. One patient developed a vasovagal reaction.
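The pooled rates above follow directly from the reported counts, as the short Python sketch below shows. The source does not name the test behind its p-values, so the uncorrected Pearson chi-square test used here is an assumption, and the function name is hypothetical.

```python
import math


def pearson_chi2_2x2(events_a, total_a, events_b, total_b):
    """Uncorrected Pearson chi-square test for a 2x2 table (1 degree of freedom)."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


CNB_TOTAL, FNA_TOTAL = 2114, 2708  # nodules per group, from the text
counts = {  # outcome: (CNB events, FNA events), from the text
    "non-diagnostic": (109, 329),
    "malignancy": (501, 319),
    "complication": (7, 2),
}

rates, p_values = {}, {}
for outcome, (cnb, fna) in counts.items():
    rates[outcome] = (100 * cnb / CNB_TOTAL, 100 * fna / FNA_TOTAL)
    _, p_values[outcome] = pearson_chi2_2x2(cnb, CNB_TOTAL, fna, FNA_TOTAL)
    print(outcome, rates[outcome], p_values[outcome])
```

For the complication comparison the expected cell counts are small, so an exact test would normally be preferred, and the p-value from this sketch need not reproduce the reported value.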
Final diagnoses were obtained for 1,703 of 2,114 nodules (80.6%) in the CNB group and 2,190 of 2,708 nodules (80.9%) in the FNA group, and these nodules were included in the evaluation of diagnostic performance. Six hundred forty-five of 1,703 nodules (37.9%) in the CNB group and 461 of 2,190 nodules (21.1%) in the FNA group were diagnosed after surgery. The sensitivity for diagnosing malignancy in the CNB group (75.9%; 501 of 660) was significantly higher than in the FNA group (55.6%; 239 of 439; p < 0.001). However, the specificities were similar (100%, 1,043 of 1,043, and 99.9%, 1,758 of 1,760, respectively). The false-negative rates were low for both CNB (2.4%; 21 of 883) and FNA (1.0%; 17 of 1,629).
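As a worked check of the diagnostic accuracy figures, sensitivity is the share of finally malignant nodules that the procedure classified as malignant, and specificity is the share of finally benign nodules that it did not. The Python sketch below uses the counts printed in the text; the helper name is hypothetical.

```python
def proportion_pct(numerator, denominator):
    """Return a proportion as a percentage."""
    return 100.0 * numerator / denominator


# Counts as printed in the text.
cnb_sensitivity = proportion_pct(501, 660)    # malignant on CNB / finally malignant
cnb_specificity = proportion_pct(1043, 1043)  # not malignant on CNB / finally benign
fna_specificity = proportion_pct(1758, 1760)  # not malignant on FNA / finally benign

print(round(cnb_sensitivity, 1), round(cnb_specificity, 1), round(fna_specificity, 1))
```

Applied to the FNA sensitivity counts as printed (239 of 439), the same calculation gives about 54.4% rather than the reported 55.6%, so one of those printed figures may contain a typographical error.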
Propensity score matching and inverse probability weighting showed unchanged results for the non-diagnostic result rate, malignancy rate, sensitivity, and specificity. Although there was a significant difference between the CNB and FNA groups in the complication rate in unmatched cohorts (p = 0.048), there was no significant difference following propensity score matching or inverse probability weighting (p = 0.375 and p = 0.188, respectively).

Discussion
In our study, we compared the outcomes of CNB and FNA in a large population of patients with initially detected thyroid nodules by performing a propensity score analysis. Our observational study showed that CNB achieved a significantly lower non-diagnostic result rate and a higher malignancy rate than FNA. These results were consistent across propensity score matching, inverse probability weighting, and subgroup analysis.
Although there was a significant difference in the complication rate between the two groups in unmatched cohorts, there was no significant difference in matched cohorts. The sensitivity of the CNB group was significantly higher than that of the FNA group; however, the specificities were similar. Subgroup analyses of nodules ≥ 1 cm also showed consistent results. Therefore, CNB appears to be a promising and safe diagnostic tool for patients with initially detected thyroid nodules. Currently, there are no definitive guidelines on how CNB should be used in the evaluation of initially detected thyroid nodules. Although data on CNB remain limited, we hope that the evidence favoring the use of CNB, including the results of our study, will be validated by future clinical trials.
There were significant differences between the two groups in terms of nodule size, nodule composition, margin, echogenicity, and calcification in pooled cohorts. The malignancy rate was also higher in the CNB group than in the FNA group (23.7% vs. 11.8%, respectively). This was probably due to the significantly larger nodule size in the CNB group (1.68 cm) than in the FNA group (1.24 cm) and to the significantly higher incidence of suspicious US characteristics in the CNB group. To reduce the effects of selection bias and potential confounding variables in this observational study, propensity score matching and inverse probability weighting were performed. In matched cohorts, no significant differences remained between the CNB group and the FNA group for any covariate. The malignancy rate remained higher in the CNB group than in the FNA group after propensity score matching (22.5% vs. 14.7%, respectively) and inverse probability weighting (22.7% vs. 9.0%, respectively).
Although a small pilot study (31 patients) reported that CNB was more effective than FNA for initially detected suspicious thyroid nodules 18 , this large population study verified its efficacy and safety. The current literature has demonstrated that, compared with FNA, CNB yields a significantly lower pooled proportion of non-diagnostic results and a higher sensitivity for diagnosing malignancy in nodules with previous non-diagnostic FNA results 10,11,15 and previous indeterminate FNA results 14,15 . The advantages of CNB may be explained by its ability to sample large amounts of tissue, to assess histologic architecture (rather than relying on cytological evaluation), and to function with a low degree of operator dependence, provided that targeting of the thyroid nodule is successful 15 . In terms of cost, a recent study demonstrated that CNB can identify benign thyroid nodules among those with previous indeterminate FNA results, so that these patients can avoid diagnostic surgery and hospitals can reduce their surgical costs by one-third 33 . There is no study comparing the costs of CNB and FNA, and further study will be needed. In our study, there were no major complications after CNB or FNA. Minor complications were rare in both groups. Complications were observed in only 7 of the 2,114 nodules after CNB (0.33%) and in only 2 of the 2,708 nodules after FNA (0.074%). Before 1980, large-needle biopsy (14-gauge) without US guidance was less effective and yielded more complications 34,35 ; however, modern, spring-activated biopsy needles (18-22 gauge) used under US guidance can achieve lower complication rates. These findings suggest that both CNB and FNA procedures are safe, well tolerated, and have a low incidence of complications. Nevertheless, operator experience is important for performing CNB to minimize procedure-related complications.
Therefore, CNB should only be performed by clinically experienced and trained operators under US guidance, who are familiar with the US anatomy of the thyroid and perithyroidal areas 36 .
Since thyroid malignancies smaller than 1 cm are often considered clinically insignificant, the recent American Thyroid Association guidelines have recommended that only nodules > 1 cm should be evaluated 1 . Therefore, we performed subgroup analyses of nodules ≥ 1 cm, which showed consistently and significantly lower non-diagnostic result rates and higher malignancy rates in the CNB group than in the FNA group, in pooled and matched cohorts alike. In addition, there was no significant difference between the CNB and the FNA groups in terms of complication rate and diagnostic accuracy. These findings affirm CNB efficacy for nodule sizes ≥ 1 cm.
The major limitation of our study was that it was based on observational data. First, our results should be interpreted with some reservation because of possible selection bias toward suspicious nodules in the CNB group, given that US findings influenced the choice of procedure. Second, as this was a single-center study, the generalizability of these results may be limited. Third, even with propensity score matching and inverse probability weighting, the comparison of CNB with FNA may not be accurate when bias arises at the level of the ordering physician.
In conclusion, this study demonstrated that the significantly lower non-diagnostic result rates and higher malignancy rates in the CNB group compared with the FNA group were consistent overall across propensity score and subgroup analyses. Therefore, CNB appears to be a promising and safe diagnostic tool for patients with initially detected thyroid nodules.