Main

Fuhrman grade (FG) represents an important prognostic factor in patients with renal cell carcinoma.1 It relies on nuclear size, shape, and prominence of nucleoli.1 To date, 11 studies have tested the ability of FG to predict prognosis in chromophobe renal cell carcinoma.2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 Seven of those failed to confirm the value of FG.2, 3, 4, 5, 6, 7, 8 However, all relied on small sample sizes (n=49–291), so statistical power may have been insufficient.2, 3, 4, 5, 6, 7, 8 Conversely, four others reported the opposite finding.9, 10, 11, 12 In those studies, FG accurately predicted prognosis in populations that included patients with chromophobe renal cell carcinoma.9, 10, 11, 12

Based on this lack of consensus, we decided to examine the discriminant accuracy of FG in prediction of cancer-specific mortality after partial or radical nephrectomy in patients with chromophobe renal cell carcinoma. Specifically, we tested and quantified the added value of FG relative to other established prognostic factors. Additionally, we compared the gains in discriminant accuracy related to the use of the conventional four-tiered FG scheme relative to modified three-13 and two-tiered14 FG schemes.

Materials and methods

Study Population

Patients diagnosed with renal cell carcinoma of all stages and treated with partial or radical nephrectomy were identified within the Surveillance, Epidemiology, and End Results registry between 1988 and 2008. Only patients with chromophobe histological subtype were included. Exclusions consisted of patients aged <18 years and those with missing FG information. Additional exclusions consisted of patients with missing tumor size, pathological tumor stage, nodal and/or distant metastases information. Death certificate-only and/or autopsy cases were removed from our analyses. This resulted in 1862 assessable patients.

Description of Variables

FG represented the main variable of interest. The conventional four-tiered FG system, originally described by Fuhrman et al,1 was examined. It consisted of four strata. Grade 1 tumors have round uniform nuclei 10 μm in diameter with minute or absent nucleoli. Grade 2 tumors have slightly irregular nuclei with diameters of 15 μm and visible nucleoli at × 400. Grade 3 tumors have moderate nuclear irregularity, diameters of at least 20 μm, and large nucleoli readily visible at × 100. Finally, grade 4 tumors have clumped chromatin and markedly irregular or pleomorphic nuclei, with multilobated forms.

The conventional four-tiered FG scheme may be modified. We relied on two modified FG schemes. The modified two-tiered FG scheme was described by Zisman et al14 and consisted of grouping FG 1 and 2, as well as grouping FG 3 and 4. The modified three-tiered FG system was described by Ficarra et al.13 It consisted of grouping FG 1 and 2 and keeping FG 3 and 4 unchanged. Other variables consisted of patient age, gender, race (white, black, other), tumor size measured in centimeters, nephrectomy type (partial or radical), tumor stage (pT1, pT2, pT3, pT4), nodal stage (pN0/x, pN1–2), and presence of distant metastases (M0, M1).
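The regrouping of the conventional four-tiered scheme into the modified three- and two-tiered schemes can be illustrated with a minimal Python sketch. The stratum labels, function names, and the toy cohort are assumptions made for illustration only; they are not the registry data or the code used in this study.

```python
from collections import Counter

# Mapping from the conventional four-tiered Fuhrman grade (1-4) to the
# modified schemes described in the text. Labels are illustrative.
THREE_TIERED = {1: "1-2", 2: "1-2", 3: "3", 4: "4"}
TWO_TIERED = {1: "1-2", 2: "1-2", 3: "3-4", 4: "3-4"}


def regroup(grades, scheme):
    """Map a list of four-tiered grades onto a modified scheme."""
    return [scheme[g] for g in grades]


def distribution(labels):
    """Return the proportion of patients in each stratum."""
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in sorted(counts)}


# Hypothetical toy cohort of ten patients, not the registry data.
toy_grades = [1, 2, 2, 2, 3, 3, 4, 2, 2, 3]
three = regroup(toy_grades, THREE_TIERED)
two = regroup(toy_grades, TWO_TIERED)
print(distribution(three))
print(distribution(two))
```

Keeping the two mappings as plain dictionaries makes it explicit that the modified schemes only merge existing strata and never reassign a patient across the FG 2/3 boundary.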

Statistical Analyses

Frequencies and proportions, as well as means, medians and interquartile ranges were reported for categorical and continuously coded variables, respectively.

In the first step, power analyses with 85% power and 5% type 1 error were performed in order to assess the number of events required to detect a 10% difference in cancer-specific mortality-free survival at 5 years.
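The text does not state which formula was used for the power calculation. As one possible illustration, the sketch below uses Schoenfeld's approximation for the number of events required by a two-sided log-rank test, under the assumptions of proportional hazards and a chosen allocation between two groups. The function names, the example survival proportions, and the allocation fraction are all hypothetical, and the sketch is not expected to reproduce the figures reported in this study.

```python
import math
from statistics import NormalDist


def hazard_ratio_from_survival(s_ref, s_alt):
    """Hazard ratio implied by two survival proportions at the same time
    point, assuming proportional hazards: S_alt = S_ref ** HR."""
    return math.log(s_alt) / math.log(s_ref)


def required_events(hr, power=0.85, alpha=0.05, frac_group1=0.5):
    """Schoenfeld's approximation for the number of events needed by a
    two-sided log-rank test to detect hazard ratio `hr`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type 1 error
    z_power = NormalDist().inv_cdf(power)
    p1, p2 = frac_group1, 1 - frac_group1
    return math.ceil((z_alpha + z_power) ** 2 / (p1 * p2 * math.log(hr) ** 2))


# Hypothetical 5-year survival proportions differing by 10 percentage points.
hr = hazard_ratio_from_survival(0.95, 0.85)
print(required_events(hr, power=0.85, alpha=0.05))
```

The required number of events depends strongly on the assumed baseline survival and group allocation, which is why those inputs are exposed as parameters rather than fixed.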

In the second step, we relied on univariable Cox regression to assess the statistical significance of all examined variables, including the conventional four-tiered FG scheme and the two modified FG schemes. Additionally, we computed the discriminant accuracy of each predictor using the area under the curve method. The 5-year cancer-specific mortality-free survival rates were also computed for this step.
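The text does not name the estimator behind the 5-year cancer-specific mortality-free survival rates. Assuming a Kaplan-Meier estimate, the computation can be sketched as follows; the function name and the toy follow-up data are hypothetical and serve only to show the mechanics.

```python
def kaplan_meier(times, events, horizon):
    """Kaplan-Meier survival probability at `horizon`.

    times  -- follow-up time per patient (same unit as horizon)
    events -- 1 if a cancer-specific death was observed, 0 if censored
    """
    survival = 1.0
    # Distinct event times up to the horizon, in increasing order.
    event_times = sorted(
        {t for t, e in zip(times, events) if e == 1 and t <= horizon}
    )
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
    return survival


# Hypothetical follow-up in years for ten patients, not the registry data.
toy_times = [1, 2, 3, 4, 6, 7, 8, 9, 10, 12]
toy_events = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(kaplan_meier(toy_times, toy_events, horizon=5))
```

Patients censored before an event time leave the risk set without counting as deaths, which is how the estimate accommodates incomplete follow-up.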

Finally, four separate multivariable Cox regression models for prediction of cancer-specific mortality were fitted. The first model included all variables except FG. The second model included all variables plus the conventional four-tiered FG scheme. The third and fourth models included all variables plus, respectively, the three- and two-tiered FG schemes. Within each multivariable model, we tested the independent predictor status of all variables, including the conventional four-tiered FG scheme and the modified three- and two-tiered FG schemes. Finally, we computed the discriminant accuracy of each model. For the first model, the discriminant accuracy estimate did not include FG. For the second model, it included the conventional four-tiered FG scheme. For the third and fourth models, it included, respectively, the modified three- and two-tiered FG schemes. The differences in discriminant accuracy between the four models were compared with the Mantel–Haenszel test.15, 16
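Discriminant accuracy for censored survival data is often summarized with a concordance index, which is interpreted like an area under the curve (50% is chance, 100% is perfect discrimination). The text does not specify the exact estimator, so the following sketch of Harrell's concordance index is an assumption for illustration. The function name, the toy data, and the 0/1 coding of the two-tiered FG scheme as a risk score are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed death
    before patient j's last follow-up. It is concordant when patient i
    also has the higher risk score; ties in score count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # pairs are anchored on a patient with an observed death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in years, death indicator, and a
# two-tiered FG risk score (0 = FG 1-2, 1 = FG 3-4).
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 0, 1, 0, 0]
toy_scores = [1, 0, 1, 0, 0]
print(concordance_index(toy_times, toy_events, toy_scores))
```

For a multivariable model, the risk score would be the model's linear predictor rather than a single coded variable, so the same function can be applied to models with and without FG.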

All tests were two-sided, with statistical significance set at 0.05. Analyses were conducted using the R statistical package (the R Foundation for Statistical Computing, version 2.13.1).

Results

Patient characteristics are shown in Table 1. Of the entire cohort (n=1862), the majority were male (59%) and white (81%). Average patient age was 60 years (median 60). Radical nephrectomy was performed in 75% of patients. Average tumor size was 5.9 cm (median 4.5 cm). Most patients harbored T1 tumors (65%). One percent and 2% had nodal and distant metastases, respectively.

Table 1 Descriptive characteristics of the study population (n=1862) of patients with chromophobe renal cell carcinoma treated with partial or radical nephrectomy within the Surveillance, Epidemiology, and End Results (SEER) database between years 1988 and 2008

FG 1, 2, 3, and 4 were recorded in, respectively, 8%, 57%, 30%, and 5% of patients. A modification of the conventional four-tiered FG scheme consisted of grouping FGs 1 and 2, which resulted in a three-tiered FG scheme (1–2 vs 3 vs 4). This scheme resulted in a distribution of 65%, 30%, and 5% respectively in FG 1–2, 3, and 4 subgroups. A two-tiered FG scheme has also been proposed and consisted of additionally pairing FG 3 and 4 together (1–2 vs 3–4). The distribution of the modified two-tiered FG scheme was 64.9% and 35.1%, respectively, in FG 1–2 and 3–4 subgroups.

In power analyses, 46 deaths were required to detect a 10% difference in cancer-specific mortality-free survival at 5 years with 85% statistical power and 5% type 1 error. A total of 65 deaths were recorded in our study, which exceeded this requirement. These deaths were stratified according to T-stage and FG in Table 2.

Table 2 Cancer-specific death frequency and proportion stratified according to T-stage and Fuhrman grade

The overall 5-year cancer-specific mortality-free survival rate was 94.8%. Stratification of 5-year cancer-specific mortality-free survival rates according to the four-tiered FG system revealed 96.8%, 96.5%, 91.8%, and 89.1% cancer-specific mortality-free survival rates for, respectively, FG 1, 2, 3, and 4 (all log-rank P≥0.09). When the same stratification was performed according to the three-tiered FG system, the 5-year cancer-specific mortality-free survival rates were 96.5%, 95.6%, and 92.2% for FG 1–2, 3, and 4, respectively (all log-rank comparisons P≤0.008). For the two-tiered FG system, the 5-year cancer-specific mortality-free survival rates were 98.0% and 91.4% for FG 1–2 and 3–4, respectively (log-rank P<0.001).

In univariable Cox regression analyses predicting cancer-specific mortality (Table 3), the conventional four-tiered FG scheme achieved overall statistical significance (P=0.01). Patients with FG 2, 3, and 4 were, respectively, at 0.9, 1.9, and 2.9-fold higher risk of cancer-specific mortality relative to FG 1 patients. However, none of these individual differences achieved statistical significance (all P≥0.1). The modified three-tiered FG scheme achieved overall statistical significance (P=0.003). Patients with FG 3 and 4 were, respectively, at 2.1 and 3.1-fold higher risk of cancer-specific mortality relative to FG 1–2 patients. Both individual groups achieved statistical significance (P≤0.005). Finally, the modified two-tiered FG scheme also achieved statistical significance (P≤0.01). The cancer-specific mortality rate was 2.2-fold higher in patients with FG 3–4 relative to FG 1–2 patients.

In univariable discriminant accuracy analyses (Table 3), the conventional four-tiered FG scheme ranked third (area under the curve=62.9%) after tumor size (area under the curve=72.4%) and tumor stage (area under the curve=68.9%). The modified three-tiered FG scheme achieved 62.4% accuracy vs 61.9% for the modified two-tiered FG scheme. The difference between the conventional four-tiered FG scheme and the two modified FG schemes was statistically significant (P≤0.03).

Table 3 Univariable and multivariable Cox regression analyses predicting the probability of cancer-specific mortality

In multivariable analyses (Table 3), the conventional four-tiered FG scheme failed to achieve independent predictor status (P=0.2). The modified three-tiered FG scheme also failed to achieve independent predictor status (P=0.1). Only the modified two-tiered FG scheme reached independent predictor status (P=0.04). The modified two-tiered FG scheme revealed that FG 3–4 patients had a 1.7-fold higher rate of cancer-specific mortality relative to FG 1–2, after adjusting for the effect of all other variables.17, 18, 19

In multivariable analyses of discriminant accuracy (Table 3), the multivariable model that included all variables except FG resulted in 80.3% accuracy. Conversely, the multivariable model that included the conventional four-tiered FG scheme reached 79% accuracy. The multivariable model that included the modified three-tiered FG scheme reached 78.5% accuracy. Finally, the multivariable model that relied on the modified two-tiered FG scheme reached 79.5% accuracy. The difference in discriminant accuracy between the model without FG and the one with the conventional four-tiered FG scheme was statistically significant (P=0.01). Similarly, the differences in discriminant accuracy estimates between the model without FG and the two models with modified FG schemes were statistically significant (P≤0.02).

Discussion

The natural history of treated renal cell carcinoma can be predicted using several established variables.20 The stage of the primary, presence of lymph-node invasion, presence of distant metastases, and tumor grade represent examples of established predictors.17, 18, 19, 21

In renal cell carcinoma, FG represents the most widely used grade stratification scheme.1 It was endorsed for use in renal cell carcinoma by the Rochester Renal Cell Carcinoma Consensus Conference, with additional endorsement by the College of American Pathologists.22 Despite its confirmed value in clear cell renal cell carcinoma,16 the benefit of FG in chromophobe renal cell carcinoma is less obvious. For example, seven groups of investigators recently questioned the value of FG in chromophobe renal cell carcinoma.2, 3, 4, 5, 6, 7, 8 However, all seven investigator groups relied on relatively small patient samples (n=49–291) and all tested FG in multivariate analyses.2, 3, 4, 5, 6, 7, 8 Lack of statistical significance in such analyses may originate from insufficient sample size. Based on this consideration, we decided to test the added value of FG in a large population-based sample of individuals treated with partial or radical nephrectomy for chromophobe renal cell carcinoma. Our analyses were twofold. First, we tested the discriminant accuracy and added value of the conventional four-tiered FG scheme in prediction of cancer-specific mortality. Additionally, we tested the same parameters using modified three- and two-tiered FG schemes.

Our results showed several important findings. First, the conventional four-tiered FG scheme (area under the curve=62.9%) ranked third in prediction of cancer-specific mortality in patients with chromophobe renal cell carcinoma. Specifically, only tumor size (area under the curve=72.4%) and stage (area under the curve=68.9%) showed higher discriminant accuracy than the conventional four-tiered FG scheme. The modified three- (area under the curve=62.4%) and two-tiered (area under the curve=61.9%) FG schemes ranked fourth and fifth, respectively. Taken together, these findings showed that the conventional FG and the two modified FG schemes offered reasonable discriminant accuracy when they were considered individually.

The above univariable findings sharply contrasted with the subsequent multivariable findings. Specifically, in multivariable analyses we quantified the added value of FG relative to other established predictors of prognosis in chromophobe renal cell carcinoma. Here, the inclusion of the conventional four-tiered FG scheme failed to add any discriminant ability. On the contrary, the discriminant accuracy decreased after inclusion of FG, from 80.3% to 79% (−1.3%; P=0.013). Similar decreases in discriminant ability were recorded when modified FG schemes were considered. For example, the use of the modified three-tiered FG scheme resulted in a 1.8% decrease vs 0.8% when the modified two-tiered FG scheme was examined.

The above findings, which originated from the largest cohort of patients with chromophobe renal cell carcinoma to date (n=1862), clearly do not support the use of FG in prediction of cancer-specific mortality. Consequently, alternative grading schemes should be used. Such systems should be tested in large independent cohorts for the purpose of validation. For example, Paner et al6 suggested the use of an alternative grading scheme. This system is known as the chromophobe tumor grading system. It relies on geographic nuclear crowding and anaplasia.6 So far, its value has been examined in 124 patients with chromophobe renal cell carcinoma. It demonstrated a benefit by virtue of achieving independent predictor status. Similarly, Finley et al4 from the University of California in Los Angeles compared the chromophobe tumor grading system to FG in a relatively small sample of 82 patients. Both grading schemes failed to achieve independent predictor status in multivariable analysis.4 Nonetheless, the chromophobe tumor grading system represents an example of a potentially valuable grading scheme. It awaits further testing in large patient populations.

Our analysis is unique with respect to its sample size. However, our methodology has been used before.15, 16 Rioux-Leclercq et al,15 as well as Sun et al,16 examined the potential added value of the conventional four-tiered FG scheme and of two modified FG schemes in patients with all histological subtypes of renal cell carcinoma and in patients with exclusively clear cell renal cell carcinoma, respectively. In patients with all histological subtypes of renal cell carcinoma, the modified FG schemes demonstrated predictive accuracy equal to that of the conventional four-tiered FG scheme (area under the curve=84.6%).15 In patients with exclusively clear cell renal cell carcinoma histology, the predictive accuracies of the three- and two-tiered FG schemes were, respectively, 83.8% and 83.6%, compared with 83.8% for the conventional four-tiered FG scheme.16 Based on the equivalence of discriminant accuracy values, Rioux-Leclercq et al did not suggest superiority of the modified FG schemes relative to the conventional four-tiered FG scheme. Based on the general simplicity of the modified two-tiered FG scheme, Sun et al suggested its use in clinical practice. Unlike those previous studies, our study showed no gain from the inclusion of any of the tested FG schemes.

Our study has limitations. Its design is retrospective, as in all previous studies that assessed the value of FG in renal cell carcinoma.2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16 Also, our study is limited by the variables that were recorded in the Surveillance, Epidemiology, and End Results registry. Information on the presence of tumor necrosis and nucleolar prominence would have added to its strength. Lack of central pathological review represents another important limitation. It is also shared with all other studies assessing FG. Last but not least, lack of access to tissue specimens renders the assessment of inter- and intra-observer variability impossible.

In conclusion, our study represents the largest assessment of the value of FG in patients with chromophobe renal cell carcinoma. Our results show that FG does not add any value when other variables are considered. Based on these findings, we do not recommend the use of FG to predict cancer-specific mortality in patients with chromophobe renal cell carcinoma.