Clinical Value of Lymph Node Ratio Integration with the 8th Edition of the UICC TNM Classification and 2015 ATA Risk Stratification Systems for Recurrence Prediction in Papillary Thyroid Cancer

Recently, the 2015 American Thyroid Association (ATA) risk stratification and the 8th edition of the American Joint Committee on Cancer/Union for International Cancer Control (AJCC/UICC) TNM staging system were released. This study was conducted to assess the clinical value of the lymph node ratio (LNR) as a predictor of recurrence when integrated with these newly released stratification systems, and to compare the predictive accuracy of the modified systems with that of the newly released systems. The optimal LNR threshold value for predicting papillary thyroid cancer (PTC) recurrence was 0.17857 using the Contal and O’Quigley method. The 8th edition of the AJCC/UICC TNM staging system with the LNR and the 2015 ATA risk stratification system with the LNR were significant predictors of recurrence. Furthermore, calculation of the proportion of variance explained (PVE), the Akaike information criterion (AIC), Harrell’s c index, and the incremental area under the curve (iAUC) revealed that the 8th edition of the TNM staging system with the LNR, and the 2015 ATA risk stratification system with the LNR, showed the best predictive performance. Integration of the LNR with the TNM staging and the ATA risk stratification systems should improve prediction of recurrence in patients with PTC.

Thyroid cancer is the most common endocrine cancer. Because the incidence of thyroid cancer is increasing, several risk stratification and treatment guidelines for patients with differentiated thyroid cancer (DTC) have been devised [1][2][3][4][5][6]. Recently, to provide a system that predicts disease-specific survival (DSS) more accurately, the 8th edition of the TNM staging system was released by the American Joint Committee on Cancer/Union for International Cancer Control (AJCC/UICC). However, the 8th TNM staging system is limited in its ability to estimate recurrence risk and to reflect the biological behavior of DTC [7][8][9]. In practice, predicting recurrence might be more important than predicting DSS because DTC has a much lower mortality rate than other common malignancies.
The 8th TNM staging system underestimates the significance of lymph node metastasis. By contrast, the revised 2015 ATA guidelines point out the importance of metastatic lymph nodes (MLNs), indicating that the number and size of MLNs in the neck are predictors of papillary thyroid cancer (PTC) recurrence. In line with this, complete evaluation of MLNs is now conducted more accurately. However, this new system also has a limitation: its stratification strategy, which comprises three groups, is too simple for a personalized medicine approach that needs to reflect the diverse clinical or pathological behaviors of DTC [10][11][12][13][14][15].
The ratio of MLNs to examined LNs (the lymph node ratio [LNR]) has also been investigated as an important predictor of PTC recurrence (i.e., recurrence-free survival [RFS]) and DSS [16,17]. Here, we examined clinico-pathological data in a cohort of PTC patients who underwent total thyroidectomy (TT) and central compartment node dissection (CCND), with or without modified radical neck dissection (MRND), at a single tertiary referral hospital. To determine the optimal LNR threshold for PTC recurrence, we performed statistical analyses using the Contal and O'Quigley method based on the log rank test. We then examined the effect of integrating the LNR with the 8th TNM staging or 2015 ATA risk stratification system to predict RFS. Finally, we compared the predictive accuracy of this modified system with that of the previous/current TNM and ATA risk stratification systems.
Classification of patients with PTC according to the 8th edition of the AJCC/UICC staging system combined with the LNR.
The 8th TNM staging system recommends that the age threshold should be changed from 45 years to 55 years. Remarkably, minor ETE was removed, resulting in new categories T3a and T3b. The definition of a central neck LN (N1a) was also changed to include both level VI and level VII compartments [8,15,18]. However, here, there was no change in the N stage because no patient had level VII LN metastasis. Supplementary Table 1 shows patients classified according to the 7th and 8th TNM staging systems. As expected, a significant number of T3 patients from the 7th T staging system were reclassified as T1 (1074, 44.3%) and T2 (155, 6.4%) in the 8th T staging. Accordingly, a large number of stage II, III, and IV patients from the 7th TNM were reclassified as stage I (724, 29.9%) and II (279, 11.5%).
As described in Methods, the optimal threshold value calculated for the LNR was 0.17857.

Risk factors for tumor recurrence.
Recurrence was detected in 134 (5.5%) patients during follow-up, and the mean disease-free survival time was 46.2 months (range, 12-170 months). As presented in Supplementary Table 4, diverse clinico-pathological factors as well as the mean LNR differed between the recurrence and non-recurrence groups (mean values, 0.35 vs. 0.13, respectively; P < 0.0001). Combining the LNR with the 8th TNM staging and 2015 ATA stratification systems revealed that the recurrence group tended to have a higher LNR than the non-recurrence group, particularly in the 8th TNM stage II and III and the 2015 ATA intermediate- and high-risk groups.
Univariate and multivariate analyses performed to estimate the hazard ratio (HR) of clinico-pathological factors revealed that advanced stage, especially 8th TNM III and IV, showed an increasing HR for PTC recurrence (Table 2). Consistent with the results from Supplementary Table 4, univariate analysis revealed that a high LNR increased the HR for 8th TNM stages I, II, and III, and multivariate analysis revealed an increase for all stages. Univariate and multivariate analyses conducted to calculate the HR of clinico-pathological factors in the ATA risk stratification systems also revealed an association between advanced risk and a higher HR (Table 3). Again, when combined with the LNR, univariate and multivariate analyses revealed that a high LNR increased the HR for all risk groups.
Comparison of performance and predictive accuracy of the TNM staging and ATA risk stratification systems for RFS when combined with the LNR.
To compare the predictive power of the TNM staging systems with or without the LNR, we calculated the PVE value, the AIC, Harrell's c index, and the iAUC for the 7th TNM, the 8th TNM, the 8th TNM with the LNR (threshold = 0.4), and the 8th TNM with the LNR (threshold = 0.17857). Because our own and other previous studies suggested an optimal LNR threshold of 0.4 for predicting recurrence or disease-specific mortality, we used 0.4 as the second LNR threshold in the analysis [16,19]. The 8th TNM with the LNR (threshold = 0.17857) showed the highest predictive accuracy by all four calculation methods (Table 4). The corresponding comparison for the ATA risk stratification systems is shown in Table 5.
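As an illustration of one of the four measures, the following is a minimal sketch of Harrell's c index for right-censored data. It is not the code used in this study (the analyses were run in SPSS, SAS, and R); the function name `harrell_c_index` and the toy inputs are hypothetical. The sketch assumes that a pair of patients is comparable when the patient with the shorter follow-up time had an event, and that tied risk scores count as half-concordant, which matters here because staging systems assign the same score to many patients.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's c index for right-censored data (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if recurrence was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk
                   (e.g., an ordinal stage or risk group)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Comparable pair: patient i recurred strictly before patient j's time
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier recurrence had the higher score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the scores order the patients perfectly
print(harrell_c_index([5, 10, 15, 20], [1, 1, 0, 1], [4, 3, 2, 1]))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking, so a staging system with a higher c index orders recurrence times more accurately.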

Discussion
This study aimed to estimate the effect of integrating the LNR with TNM staging and ATA risk stratification systems to predict PTC recurrence. Indeed, the 8th TNM staging system is important for assessing the risk of mortality, but may not be appropriate for predicting the risk of disease recurrence. Although the revised 2015 ATA guidelines emphasized the characteristics of MLNs, including the number of involved LNs, the size of the largest involved LN, and extranodal extension, these potential prognostic factors do not have pertinent threshold values [18,20]. To optimize management, more tailored risk stratification of LN status, such as the LNR, is needed in PTC patients to provide an objective determination of prognosis.
In fact, the LNR is a known prognostic variable in PTC. To inform the management of low- and high-risk tumors, surgeons should ensure that the LN yield is adequate during surgery and pathologists also need to perform a careful histologic examination of specimens. Based on this idea and on the relatively large amount of existing research data, we postulated that, among the diverse characteristics of MLNs, the LNR was the most suitable for clinical application [19,21-28]. Supporting our postulation, a higher LNR increased the HR of same-stage tumors, even after adjusting for epidemiological and basic clinico-pathological tumor characteristics. For TNM stage I, a high LNR increased the HR to 4.925 (confidence interval [CI], 2.896-8.375), indicating that the LNR might be an important predictor of recurrence, even for low-stage disease. Consistent with this, for stage II a high LNR increased the HR from 3.102 (CI, 1.241-7.753) to 4.727 (CI, 2.291-9.755). These results for stage I and II PTC suggest that a high LNR is the most powerful predictor of recurrence in those with low-stage disease since the HRs for stage I and II PTC are quite similar (4.925 vs. 4.727, respectively). Further, integrating the LNR with the 2015 ATA risk stratification system enhanced the predictive performance. The HR of the intermediate-risk group with a high LNR was higher than that of the high-risk group with a low LNR (10.011; CI, 3.657-27.404 vs. 7.549; CI, 1.88-30.318).
The problem with the LNR as a predictor is determining an optimal threshold, since the definition of the LNR and the extent of surgery differ between previous studies. In fact, in a previous study, we reported an optimal threshold of 0.4 or 0.5 for central and lateral compartment LN groups, respectively [16]. We did not include patients with no MLNs (pN0) in the previous study, and LNR thresholds were calculated by selecting the inflection points of the binomial logistic regression curves for the probability of recurrence. However, in the present study, we included pN0 patients to discover the appropriate LNR across all TNM stages. We also re-assessed the optimal threshold using the Contal and O'Quigley method, a technique that uses the log rank test and is suitable for determining thresholds across all stages. This re-assessment yielded a threshold of 0.17857.
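The idea behind a log-rank-based threshold search can be sketched as follows. This is a simplified illustration, not the procedure used in this study: it scans every observed LNR value as a candidate cutpoint and keeps the one with the largest two-sample log-rank chi-square statistic, and it omits the adjustment that the Contal and O'Quigley method applies to the P value for having tested many cutpoints. The function names and the toy data are hypothetical.

```python
def logrank_chi2(times, events, in_high_group):
    """Two-sample log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Numbers at risk just before time t (overall and in the high group)
        n = sum(1 for x in times if x >= t)
        n1 = sum(1 for x, g in zip(times, in_high_group) if x >= t and g)
        # Events occurring exactly at time t (overall and in the high group)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, in_high_group)
                 if x == t and e and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_cutpoint(values, times, events):
    """Return (cutpoint, statistic) maximizing the log-rank statistic.

    Patients with value > cutpoint form the "high" group. The largest
    observed value is skipped so that the high group is never empty.
    Returns None if all values are identical.
    """
    best = None
    for cut in sorted(set(values))[:-1]:
        high = [v > cut for v in values]
        stat = logrank_chi2(times, events, high)
        if best is None or stat > best[1]:
            best = (cut, stat)
    return best


# Toy data (hypothetical): low-LNR patients are censored late,
# high-LNR patients recur early.
lnr = [0.0, 0.05, 0.1, 0.15, 0.3, 0.4, 0.5, 0.6]
months = [60, 58, 62, 61, 10, 12, 8, 15]
recurred = [0, 0, 0, 0, 1, 1, 1, 1]
print(best_cutpoint(lnr, months, recurred))
```

Because the cutpoint is chosen to maximize the statistic, the unadjusted log-rank P value at the selected threshold is optimistic, which is the reason the full method corrects it.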
To compare the two thresholds, we estimated the performance of the TNM staging and ATA risk stratification systems using four statistical calculations: PVE, AIC, Harrell's c index, and iAUC. According to the results, the 8th TNM staging system plus an LNR of 0.17857 was the most accurate predictor of RFS and DSS (data not shown). Harrell's c index and iAUC could not be calculated for DSS due to the low mortality rate in this cohort. Likewise, all four calculation methods indicated that the 2015 ATA risk stratification system with an LNR of 0.17857 was the most accurate predictor of RFS and DSS (data not shown). In terms of predicting overall survival (OS), the 2015 ATA risk stratification yielded a higher PVE (1.174) when integrated with the previous LNR of 0.4 than when integrated with the present LNR of 0.17857 (PVE = 1.165) (data not shown). Taken together, these statistical analyses suggested that the new LNR of 0.17857 was useful as a standard indicator to determine RFS across all stages of PTC, and the previous LNR of 0.4 was also valuable to predict OS in patients with pN1. Therefore, our data show the critical importance of the LNR threshold if the LNR is to be used as a prognostic marker.
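For reference, the AIC used in such comparisons has the standard form below. The notation is an assumption introduced here rather than taken from the study: $k$ is the number of estimated regression coefficients and $\ell(\hat{\beta})$ is the maximized log partial likelihood of the Cox model.

```latex
\mathrm{AIC} = -2\,\ell(\hat{\beta}) + 2k
```

Among models fitted to the same cohort and outcome, the one with the lower AIC is preferred, so adding the LNR to a staging system is favored only if the gain in fit outweighs the penalty for the extra parameters.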
This study has a few limitations. First, it was retrospective and we could not verify the precise pathologic status of MLNs, including the size of micrometastatic foci and extranodal extensions. Second, the study population was sampled from a single tertiary referral hospital and might thus have been biased toward more advanced disease cases. Third, the study included only patients who underwent TT with CCND or MRND (other surgical procedures such as hemi-thyroidectomy or lobectomy were excluded). These inclusion and exclusion criteria might introduce selection biases when determining the LNR threshold. Finally, it was impossible to perform head-to-head comparisons of the TNM staging and ATA risk stratification systems in terms of OS, RFS, and DSS since statistical differences in covariates between the two systems limited scientific analyses.
In conclusion, integration of the LNR with the TNM staging and the ATA risk stratification systems increases the accuracy of predicting PTC recurrence. Further studies will be undertaken to examine the predictive role of well-defined analyses of MLNs in a clinical setting.

Patients.
The clinico-pathologic characteristics of 2705 patients with PTC (from January 1991 to December 2010) were analyzed retrospectively via complete review of medical charts and pathology reports. To ensure adequate LN dissection yield, patients with fewer than six central LNs harvested by CCND or fewer than 18 LNs harvested by MRND were excluded (n = 281). Among the patients finally enrolled (n = 2424), 1422 (58.7%) underwent TT with prophylactic or therapeutic CCND, and 1002 (41.3%) underwent TT with prophylactic or therapeutic CCND plus therapeutic MRND for clinically suspicious or pathologically confirmed N1b nodes. The mean follow-up duration was 114.0 months (range, 63-265 months). Of the patients who underwent TT, 2313 (95.4%) received RAI ablation at 4-8 weeks post-surgery using a dose based on the ATA guidelines. The follow-up protocol was also based on ATA guidelines. PTC recurrence was confirmed by imaging modalities and/or pathologic diagnosis by ultrasound-guided fine needle aspiration biopsy (US-FNAB). Some patients were included in analyses reported in previous papers [16,29].
Measurement of the LNR.
Two experienced pathologists independently re-examined the LN status of specimens for evidence of LN metastasis. The LNR was defined as the number of MLNs divided by the total number of LNs retrieved from the central compartment with or without nodes from lateral compartments. The Contal and O'Quigley method for PTC recurrence was used to calculate the threshold value for the LNR; the value was 0.17857 [30].
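The LNR definition above translates directly into a small calculation, sketched here for illustration. The function names are hypothetical, and the sketch assumes that a "high" LNR means a value strictly greater than the threshold; the text does not state whether the boundary value itself counts as high or low.

```python
LNR_THRESHOLD = 0.17857  # threshold reported in this study


def lymph_node_ratio(metastatic_nodes: int, examined_nodes: int) -> float:
    """Number of metastatic LNs divided by the total number of LNs retrieved."""
    if examined_nodes <= 0:
        raise ValueError("at least one lymph node must be examined")
    if not 0 <= metastatic_nodes <= examined_nodes:
        raise ValueError("metastatic nodes must be between 0 and examined nodes")
    return metastatic_nodes / examined_nodes


def is_high_lnr(metastatic_nodes: int, examined_nodes: int,
                threshold: float = LNR_THRESHOLD) -> bool:
    """Assumption: 'high' means strictly above the threshold."""
    return lymph_node_ratio(metastatic_nodes, examined_nodes) > threshold


# Hypothetical patients: (metastatic LNs, examined LNs)
for mlns, total in [(0, 8), (1, 6), (2, 10), (9, 24)]:
    ratio = lymph_node_ratio(mlns, total)
    print(mlns, total, round(ratio, 3), is_high_lnr(mlns, total))
```

A pN0 patient has an LNR of 0 and therefore always falls into the low-LNR group, which is consistent with including pN0 patients when a single threshold is applied across all stages.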
Statistical analysis.
Student's t-test and the Chi-square test or Wilcoxon rank sum test were used to compare groups as appropriate. Univariate and multivariate Cox regression analyses were performed to identify independent predictors of RFS. P < 0.05 indicated statistical significance. To estimate the performance of the TNM staging and ATA risk stratification systems, four statistical parameters were calculated: the proportion of variance explained (PVE), the Akaike information criterion (AIC), Harrell's c index, and the time-dependent receiver operating characteristics (ROC) curve (incremental area under the curve, iAUC). All statistical analyses were performed using IBM SPSS statistics 23.0 (SPSS Inc., Chicago, IL, USA), SAS (version 9.4, SAS Inc., Cary, NC, USA) and R package version 3.1.3 (http://www.R-project.org).
Ethical approval and informed consent.
This study was approved by the institutional review board of Severance Hospital and was conducted in accordance with the recommendations of the institutional review board, which waived the requirement for informed consent due to its retrospective nature.

Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Table 5. Comparative analysis of the performance and predictive accuracy of the ATA risk stratification for recurrence-free survival. Abbreviations: PVE, proportion of variance explained; AIC, Akaike information criterion; iAUC, incremental area under the curve; LNR, lymph node ratio.