Gini's mean difference and the long-term prognostic value of nodal quanta classes after pre-operative chemotherapy in advanced breast cancer

Gini's mean difference (GMD, mean absolute difference between any two distinct quantities) of the restricted mean survival times (RMSTs, expectation of life at a given time limit) has been proposed as a new metric where higher GMD indicates better prognostic value. GMD is applied to the RMSTs at a 25-year time horizon to evaluate the long-term overall survival of women with breast cancer who received neoadjuvant chemotherapy, comparing a classification based on the number (pN) versus a classification based on the ratio (LNRc) of positive nodes found at axillary surgery. A total of 233 patients treated in 1980–2009 with documented number of positive nodes (npos) and number of nodes examined (ntot) were identified. The numbers were categorized into pN0, npos = 0; pN1, npos = [1,3]; pN2, npos = [4,9]; pN3, npos ≥ 10. The ratios npnx = npos/ntot were categorized into Lnr0, npnx = 0; Lnr1, npnx = (0,0.20]; Lnr2, npnx = (0.20,0.65]; Lnr3, npnx > 0.65. The GMD for the pN-classification was 5.5 (standard error: ± 0.9) years, not much improved over a simple node-negative vs. node-positive classification, which showed a GMD of 5.0 (± 1.4) years. The GMD for the LNRc-classification was larger, 6.7 (± 0.8) years. Among other conventional metrics, the Cox-model c-index was 0.668 for LNRc vs. 0.641 for pN, indicating commensurate superiority of the LNRc-classification. The usability of GMD-RMSTs warrants further investigation.
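As an illustration of the definitions above, the following minimal Python sketch computes an unweighted Gini's mean difference and assigns the pN and LNRc categories from npos and ntot. The function names are hypothetical, and the unweighted average over all pairs is an assumption: the text does not state whether the classes are weighted by size.

```python
from itertools import combinations

def gini_mean_difference(values):
    """Mean absolute difference over all pairs of distinct items (unweighted)."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def pn_class(npos):
    """pN category from the number of positive nodes."""
    if npos == 0:
        return "pN0"
    if npos <= 3:
        return "pN1"
    if npos <= 9:
        return "pN2"
    return "pN3"

def lnr_class(npos, ntot):
    """LNRc category from the ratio npnx = npos / ntot."""
    npnx = npos / ntot
    if npnx == 0:
        return "Lnr0"
    if npnx <= 0.20:
        return "Lnr1"
    if npnx <= 0.65:
        return "Lnr2"
    return "Lnr3"
```

For example, the GMD of a list of per-class RMSTs (in years) would be obtained with `gini_mean_difference(rmsts)`.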


Section 1. Introduction
In western countries, breast cancer is the most common cancer type in women and the incidence has been rising continuously (1). In Belgium, 6628 new cases were reported in 1998, which corresponds to 35.5% of all cancers in women. In 1997, 2416 deaths from breast cancer were reported (2). The primary treatment for small tumours is breast conserving surgery (BCS) or mastectomy (3). There is a general consensus about the survival benefit of adjuvant radiotherapy after BCS (4,5). The role of adjuvant radiotherapy after mastectomy is more controversial. It is generally accepted that post-mastectomy radiotherapy (PMRT) is beneficial in high-risk patients (patients with a large tumour (T3-stage, >5 cm) and/or with 4 or more positive axillary lymph nodes) (4,6). For lower-risk patients (small tumour, and node-negative or node-positive with <4 positive nodes), however, the evidence has been considered insufficient to support the use of PMRT (4). In their investigation of the US Surveillance, Epidemiology, and End Results (SEER) population data, other authors also failed to find a survival benefit for PMRT in low-risk patients (7,8). These population studies were, however, criticized for their retrospective nature: adverse selection factors could have masked any beneficial role of PMRT in the low-risk patients (8).
Since extensive meta-analyses have failed to provide an answer to the issue of PMRT in low-risk patients, there is a need to consider alternative investigation strategies. In our radiotherapy department at the Academisch Ziekenhuis, Vrije Universiteit Brussel (AZ-VUB), since its creation in 1984, node-positive patients have received PMRT regardless of the size of the primary tumour and regardless of the number of positive axillary lymph nodes. Unlike in the SEER, the treatment outcome of our patients would therefore not be biased by adverse selection. This suggests the possibility of a comparative study between the SEER and the AZ-VUB. The primary objective of the present study is thus to compare the survival outcome of patients from the SEER and from the AZ-VUB, in order to gain insight into the role or lack of role of PMRT in women with low-risk breast cancer (primary tumour <= 5 cm, node-negative or node-positive with 1 to 3 positive axillary lymph nodes).
The structure of the paper is as follows. Section 2 describes the SEER data. Section 3 describes the AZ-VUB data and summarizes the data selection. Section 4 lists the statistical methods used. Section 5 addresses the issue of fundamental comparability of the SEER and the AZ-VUB data, namely, are findings on the effects

Invasion of (or fixation to) pectoral fascia or muscle; deep fixation; attachment or fixation to pectoral muscle or underlying tissue
40 Invasion of (or fixation to) chest wall, ribs, intercostal or serratus anterior muscles
50 Extensive skin involvement: skin edema, peau d'orange, "pigskin," en cuirasse, lenticular nodule(s), inflammation of skin, erythema, ulceration of skin of breast, satellite nodule(s) in skin of primary breast
60 (50) plus (40)

Prior to 1998, the definition of "First Course" for all malignancies except leukemias was (30): all cancer-directed treatment administered to the patient within four months after the initiation of therapy (e.g. within 4 months of excisional biopsy). All modalities of treatment were included regardless of sequence or the degree of completion of any component method. Exceptions were:
1. If it was documented that the planned first course of therapy continued beyond or began after four months of initiation, include all as first course.
2. Should there be a change of therapy due to apparent failure of the original planned and administered treatment or because of progression of the disease, the later therapy should be excluded from the first course and considered part of a second course of therapy.
From 1998, the 4-month time window was increased to 1 year.
The SEER program has been described as the gold standard for cancer registration in the United States (25). The SEER registries have a comprehensive quality assurance program, including case-finding audits and education and training of personnel (31). The completeness of case incidence ascertainment is 98% and that of follow-up is 95% (25). In addition, each year the SEER registries reabstract medical records for a sample of cases to evaluate the accuracy of each of the data elements collected from the records (32).
For surgery and radiotherapy, several studies have used Medicare data to verify agreement between SEER records and insurance claims. Medicare is the primary health insurer for 97% of the US population 65 years or older (32). Good agreement between the SEER and Medicare claims was shown for breast cancer inpatients who underwent cancer-directed surgery (33,34). The accuracy of records was poorer for outpatients or for those who did not undergo cancer-directed surgery (no surgery or biopsy-only cases). The kappa measure of agreement (ranging from 0, agreement no better than chance, to 1, perfect agreement) was 0.70 for outpatients and 0.88 for inpatients (34).
For receipt of radiotherapy, Du et al found that more than 18% of Medicare patients identified as receiving radiotherapy were not so identified by SEER, and 7% of those identified by SEER were not identified by Medicare (35). The agreement was good for local and regional stages, with kappa respectively 0.82 and 0.77, whereas the agreement was poor for distant stage, kappa 0.074, and unstaged patients, kappa 0.50 (35). Virnig et al found a good agreement between SEER and Medicare with kappa 0.87 (36). The agreement was consistent across registries and over time during the study period 1991-1996 (36).
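For readers unfamiliar with the kappa statistic quoted above, a minimal Python sketch of Cohen's kappa for two sources coding the same patients is given below. The function name and the toy inputs are illustrative assumptions; this is not the computation performed in the cited studies.

```python
def cohen_kappa(source_a, source_b):
    """Cohen's kappa for two equally long lists of categorical codes."""
    n = len(source_a)
    labels = set(source_a) | set(source_b)
    # Observed proportion of records on which the two sources agree.
    p_obs = sum(a == b for a, b in zip(source_a, source_b)) / n
    # Agreement expected by chance from the marginal frequencies.
    p_exp = sum((source_a.count(l) / n) * (source_b.count(l) / n) for l in labels)
    # Undefined (division by zero) when both sources use a single identical label.
    return (p_obs - p_exp) / (1 - p_exp)
```

For instance, with 1 coding "radiotherapy recorded" and 0 "not recorded", `cohen_kappa(registry_codes, claims_codes)` returns 1 for identical lists and 0 when agreement is what chance alone would produce.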
Limitations of the SEER data are the lack of information on comorbidity, on the use of diagnostic procedures, and on recurrences. Receipt of systemic therapy and details of radiotherapy are not available.
In our studies using the SEER data, we have selected patients in whom breast cancer was the first primary tumour, histologically confirmed, hospital based, in whom cancer-directed surgery had been performed. These criteria correspond to cases where the SEER data have been shown to be the most reliable. Selection of patients was further limited by period of diagnosis, from 1988 (availability of detailed tumour extension data) to 1997 (before the change of treatment definition). The 9-registries data was used as the main investigation database, reserving the other registries for future validation studies.
Post-mastectomy RT, GGS 2004

Description
In our hospital, the radiotherapy department started functioning in 1984. The list of patients who were referred for radiation treatment at our hospital was recorded in a stand-alone dBase file. Maintenance was done by the chief nurse M.S. The structure of field records changed progressively over the years (Table 4). Data were related to billing and work management. Tumour data were "diagnose" and "TNM" (for definition of TNM, see Appendix).

Table 4. Fields of the dBase file, whose structure changed over the years: MAAND, JAAR, NAAM, VOORNAAM, DOSSIERNR, DIAGNOSE, TNM, AMBULANT, PATIENTNR, SIMULATIE, HERSIMUL, EXTERN, VERW_ARTS, PLANNING, POSTNR, SELNR, BLOKKEN, MASKER, PLANNINGCT, META, AANTALZITT, AANTALVELD, TOT_DOSIS, TOESTELTYP, BID, TOESTELT2, HOSPITALIZ, AANTZITT2, GEHOSPIT, AANTVELD2, FRACAMB, LIGDAGEN, TARIF1, FRACHOSP, TARIF2, TARIF3, D1, D2, D3, HUISARTS, ADRES, TEL, HADRES, STAD, HTEL.

From 1994-1995, the listing of patient treatment became part of the hospital network appointment-scheduling-billing system. Clinical tumour data for breast cancer patients were collected in a separate database by Dr. C.C. The goal of the Breast Cancer database was to form the basis for evaluation of treatment outcomes. Sources of data were paper and electronic medical records, radiation treatment files, and the surgical-senology database maintained by Dr. J.L. Current maintenance is by Dr. M.V. The maintained database structure is shown in Appendix A1 "Mamma juni 2005". The database is formed by the concatenation of the original dBase and the surgical database mentioned above, expanded with other descriptive fields. New records are appended whenever a patient receives a first simulation procedure (the appointment for this first technical contact is determined with the hospital's appointment system). Data are abstracted from the radiation treatment folder, which is completed at simulation time (Appendix A2). The radiation treatment folder itself is a summary of main pathology and treatment features. Subsequent data about the patient are added during the course of consultations and hospitalisations, or from outcome information communicated by colleagues, patient, family or social department. The current database is implemented in Filemaker. There is no direct link with the hospital's Electronic Medical Datafile (EMD).

Selection of patients
For the purpose of the present study, we aimed at a selection of patients that should be compatible with prior SEER studies. The criteria were:
• women
• first primary
• invasive carcinoma (exclude non-epithelial tumours, sarcoma, lymphoma, in situ)
• unilateral (exclude synchronous bilateral invasive carcinomas; allow synchronous contralateral in situ)
• pT1-2 (maximum tumour diameter <= 50 mm)
• M0 (non-metastatic)
• total number of examined axillary lymph nodes (ntot) known
• total number of positive axillary lymph nodes (npos) known
• if pTx, use cT
• post-surgery radiation treatment delivered within 4 months of definitive surgery, or within 12 months if radiation was preplanned in sequence with adjuvant chemotherapy
• exclude patients treated with Halsted operation
• exclude patients not operated or receiving biopsy only
• further restrictions: ntot <= 50, age 25-95 (exclusion of extreme outliers).
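The selection criteria above can be sketched as a simple record filter. The following Python sketch is only an illustration: the `Record` fields and the `eligible` function are hypothetical names, not the fields of the actual Filemaker database or its selection script.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass
class Record:
    # All field names are hypothetical, chosen to mirror the criteria.
    female: bool
    first_primary: bool
    invasive_carcinoma: bool
    synchronous_bilateral_invasive: bool
    m0: bool
    size_mm: Optional[float]        # maximum tumour diameter
    ntot: Optional[int]             # examined axillary nodes
    npos: Optional[int]             # positive axillary nodes
    age: int
    surgery: str                    # "bcs", "mastectomy", "halsted", "biopsy", "none"
    months_to_rt: Optional[float]   # surgery to start of radiation
    chemo_before_rt: bool

def eligible(r: Record) -> bool:
    """Apply the selection criteria to one record."""
    if not (r.female and r.first_primary and r.invasive_carcinoma):
        return False
    if r.synchronous_bilateral_invasive or not r.m0:
        return False
    if r.size_mm is None or r.size_mm > 50:
        return False
    if r.ntot is None or r.npos is None or r.ntot > 50:
        return False
    if not 25 <= r.age <= 95:
        return False
    if r.surgery not in ("bcs", "mastectomy"):
        return False
    if r.months_to_rt is None:
        return False
    # 12-month window when radiation follows planned adjuvant chemotherapy.
    limit = 12 if r.chemo_before_rt else 4
    return r.months_to_rt <= limit
```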

Variables and endpoints of interest
Variables that were a priori considered for analyses were:

Imputation of partly missing tumour size
The imputation is based on the distribution of known tumour size by pT-stage, taking into account the distribution skewness and rounding of measurements, using Table 9. The imputation procedure is as follows.
1) Case clinical-radiological size available: If the clinical size matches pT, use the clinical size. If there is a mismatch and the stage is not T4, interpolate between the pT mean size (see Table) and the clinical size. If the interpolated value is larger than allowed by pT, use the largest pT boundary value. Example: for pT1 (size <=20 mm) with clinical size 50 mm, the value half-way between the mean of 13.8 mm and the recorded clinical size is 31.9 mm, which contradicts pT1 -> use 20 mm. For pT1 with recorded clinical size 25 mm, the half-way value is 19.4 mm -> use 19.4 mm.
2) Case clinical size not available: Use median size according to pT1-pT3, with random "jitter" (Table). If pT0, assign size as 0.1 mm. If pT not available, or pTis or pT4: use cT0-cT3 if available.
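Case 1 of the procedure above can be sketched in Python as follows. The function name and its parameters are hypothetical; only the pT1 values (upper boundary 20 mm, mean 13.8 mm) come from the worked example in the text, and the handling of T4 and of case 2 (median with random jitter) is not covered.

```python
def impute_size(clinical_mm, pt_lower, pt_upper, pt_mean):
    """Impute tumour size (mm) when a clinical size is available.

    pt_lower and pt_upper delimit the size range of the pT stage
    (lower bound exclusive, upper bound inclusive); pt_mean is the
    mean known size for that stage.
    """
    if pt_lower < clinical_mm <= pt_upper:
        return clinical_mm                   # clinical size matches pT
    halfway = (pt_mean + clinical_mm) / 2    # interpolate on mismatch
    return min(halfway, pt_upper)            # cap at the pT boundary

# Worked examples from the text (pT1: size <= 20 mm, mean 13.8 mm)
size_a = impute_size(50, 0, 20, 13.8)   # half-way 31.9 exceeds pT1, capped at 20
size_b = impute_size(25, 0, 20, 13.8)   # half-way value, about 19.4
```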
Note that patients with tumour size >50 mm are not analyzed in the present paper.
The validity of the imputation for tumour size was subsequently verified in patients for whom pathological description was found (Table 10). The average signed error was 1 mm, and the average absolute error was 5 mm.
2) If startRT is unknown, but the end date of RT is known:
- case total RT dose known: number of fractions = total RT dose (Rtdose + Rtboost) / 2 (usual fraction dose). Number of fractions / 5 (usual number of fractions per week), multiplied by 7 -> estimated duration of RT in days.
3) Else, neither start RT nor end RT known: DateDgc <- 3/3/3333 (we assign a dummy date of origin beyond the next millennium; consequently, the time from the dummy origin to any event during our lifetime will be negative).
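The duration estimate of step 2 amounts to the following arithmetic, shown here as a minimal Python sketch. The function and parameter names are hypothetical; the defaults of 2 per fraction and 5 fractions per week are the usual values stated in the text.

```python
def estimated_rt_days(rt_dose, rt_boost, dose_per_fraction=2.0, fractions_per_week=5):
    """Estimated duration of radiotherapy in days from the total dose."""
    n_fractions = (rt_dose + rt_boost) / dose_per_fraction
    return n_fractions / fractions_per_week * 7

# Example: a total dose of 50 plus a boost of 10 gives 30 fractions,
# i.e. 6 treatment weeks.
days = estimated_rt_days(50, 10)
```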

Database difficulties
We will briefly mention difficulties in exploring our own database. The two major problems (most time consuming) were related to dates, and to the sequence of primaries.
Discrepancies in dates of death were encountered. We noted a high frequency of the first day of the month, and a high frequency of the 15th day. These probably indicate rounding when the exact date of death was unknown. Whenever there was a discrepancy between the database and the medical record, the shorter survival duration was selected. Rarely, there was a mismatch between the date of death mentioned in the medical record and the date of death mentioned in the autopsy report. Whenever available, we used the autopsy date. Some typing errors were noted, e.g. a month larger than 12, or the letter o or O typed for the digit 0. Some difficulties were related to century rollover due to the use of 2-digit year numbers.
The other problem was the sequence of primaries. Sometimes this could only be ascertained by browsing through several medical reports.

Final selection
Records for which ntot, npos, tumour size and diagnostic dates were entirely missing were rejected.
After cleaning records, verification and correction of data using the medical and the radiation treatment files, there were 3907 records identified. Selection of the data used a Filemaker script and retrieved 2109 records. After exclusion of ntot outliers (1 case) and age outliers (1 case, 21 years old), and exclusion of cases diagnosed before 1984 (the radiotherapy department had not yet been created), there were 2092 individual records available for analysis. All these patients received post-surgery radiation treatment, not more than 4 months after surgery (without adjuvant chemotherapy), or not more than 1 year after surgery (if adjuvant chemotherapy was given prior to radiation treatment).

Section 4. Methods
The outcome of interest in this paper is the time to death (event), or survival time (37). Details of the procedures are available in textbooks. They are reproduced here for convenience.
The survival probability S(t) is the probability that a patient survives from the time origin to a specified time t. Since not all patients can be observed during their whole lifetime, censoring occurs, that is, patients are still alive (no event) at the end of the observation period, or are lost to follow-up before the end of the observation period.
Estimation of survival used the Kaplan-Meier product-limit method (38). Denoting $j$ an index for the ordered survival times $(t_1, t_2, \ldots, t_{j-1}, t_j)$, $r_j$ the number of patients alive (at risk) just before $t_j$, and $d_j$ the number of events at $t_j$, the estimated survival probability is
$$S(t_j) = S(t_{j-1}) \left(1 - \frac{d_j}{r_j}\right), \qquad S(0) = 1.$$
Equivalently,
$$S(t) = \prod_{j:\, t_j \le t} \left(1 - \frac{d_j}{r_j}\right).$$
Note that $S(t)$ is the product of successive conditional survival rates. The Kaplan-Meier survival estimate assumes that censoring is noninformative.
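The product-limit computation can be illustrated with a minimal pure-Python sketch. The function name and the toy data are assumptions for illustration; in practice a dedicated survival analysis package would be used. Patients censored at an event time are counted as still at risk at that time.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t_j, S(t_j))] at each distinct event time.

    times  : follow-up time of each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)   # d_j per event time
    surv, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)             # r_j
        surv *= 1.0 - deaths[t] / at_risk                     # conditional survival
        curve.append((t, surv))
    return curve

# Toy data: five patients, two of them censored (at times 3 and 8)
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```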
A closely related function is the hazard rate $h(t)$, that is, the conditional failure rate of individuals who are under observation at time $t$ having an event at that time. The hazard can be computed from the survival probability by
$$h(t) = -\frac{d}{dt} \log S(t),$$
where log denotes the natural logarithm.
The cumulative hazard $H(t)$ is defined as the integral of the hazard, and is related to the survival by
$$H(t) = \int_0^t h(u)\,du = -\log S(t).$$
The cumulative hazard can also be computed by the Nelson-Aalen estimator:
$$H(t) = \sum_{j:\, t_j \le t} \frac{d_j}{r_j},$$
that is, $H(t)$ is the sum of successive death rates.
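As with the survival estimate, the Nelson-Aalen sum can be written as a short pure-Python sketch. The function name and the toy data are illustrative assumptions.

```python
from collections import Counter

def nelson_aalen(times, events):
    """Return [(t_j, H(t_j))], the cumulative hazard at each event time."""
    deaths = Counter(t for t, e in zip(times, events) if e)   # d_j per event time
    cum, curve = 0.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)             # r_j
        cum += deaths[t] / at_risk                            # add the death rate d_j / r_j
        curve.append((t, cum))
    return curve

# Same toy data as a five-patient cohort with two censored observations
cum_hazard = nelson_aalen([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

In small samples, exp(-H(t)) computed from this estimator is close to, but not identical with, the Kaplan-Meier estimate of S(t).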
Multivariate survival analyses used the Cox proportional hazards regression model (39,40). Denote covariates $X_1, X_2, \ldots, X_p$ (for example $X_1$ = histology, $X_2$ = grade, $X_3$ = location, etc.), and $X_{i1}, X_{i2}, \ldots, X_{ip}$ as the covariates for the patient $i$, where $i = 1, 2, \ldots, n$. The Cox model specifies the hazard for the patient $i$ as
$$h_i(t) = h_0(t) \exp(b_1 X_{i1} + b_2 X_{i2} + \cdots + b_p X_{ip}),$$
where $h_0(t)$ is an unspecified nonnegative function of time called the baseline hazard, and $b_1, b_2, \ldots, b_p$ are coefficients.
Taking the natural logarithm and suppressing the subscripts $i$, it might be written as
$$\log h(t) = \log h_0(t) + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p.$$
Estimates of the coefficients $b_1, b_2, \ldots, b_p$ are obtained by the method of maximum partial likelihood.
The Cox model assumes that the ratio of the hazard functions for any two patient subgroups (i.e. two groups with different values of the explanatory variables $X_1, X_2, \ldots, X_p$) is constant over follow-up time.
In the model, the coefficient $b_j$ (for the $j$th covariate) represents the increase in log hazard if the covariate $X_j$ is increased by one unit and all other covariates are held constant. When $X_j$ is continuous, this implies the assumption that the relationship between the covariate and the log hazard, i.e. the functional form, is linear.
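The interpretation of the coefficients can be made concrete with a small Python sketch. The function names and the coefficient values in the example are hypothetical and are not estimates from this study; the point is only that the baseline hazard cancels in the ratio, so the hazard ratio does not depend on time.

```python
import math

def relative_hazard(b, x):
    """exp(b_1*x_1 + ... + b_p*x_p): hazard relative to the baseline h_0(t)."""
    return math.exp(sum(bk * xk for bk, xk in zip(b, x)))

def hazard_ratio(b, x_a, x_b):
    """Hazard ratio between two covariate patterns; h_0(t) cancels out."""
    return relative_hazard(b, x_a) / relative_hazard(b, x_b)

# Hypothetical coefficients for two covariates; the two patients differ
# by one unit on the first covariate only, so the ratio equals exp(0.3).
ratio = hazard_ratio([0.3, -0.1], [1, 2], [0, 2])
```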
Several types of residuals (differences between expected and observed survival) are defined in the Cox model. The paper used the Schoenfeld residuals to assess the proportional hazards assumption, i.e. the constancy in time of the hazard ratios. The martingale residuals and the generalized additive model (GAM) were used to assess the functional form of continuous covariates (40).

Section 5. Concordance SEER and AZ-VUB
Previous research work on the SEER data has shown that registry geographical area, age at diagnosis, tumour size, extent of nodal involvement, tumour location, histology, histopathological grade, hormone receptor status, surgery and radiation were statistically significant prognostic factors, while laterality was not significant (13).
For the AZ-VUB, Table 11, based on the 2092 available records, shows results of multivariate models. As for subsequent Tables, a hazard ratio >1 indicates relative increased risk of death vs. reference for categorical variables, or relative increase per unit change for continuous variables. Tumour location, histology, estrogen receptor and grade appeared non-significant. Nevertheless, the values of the hazard ratios are concordant with the SEER. We note that patients internally referred appeared to have a better prognosis. This suggests the possibility of geographical variation, but this was not investigated.

The functional forms for age, tumour size and number of positive nodes from the AZ-VUB are non-linear (Figure 1). The shapes are in keeping with earlier SEER studies (16,21,22). Using the respective transforms found from the SEER data,
• $\mathrm{age} + |\mathrm{age} - 47.5|^{1.5}$ for age (in years),
• $\exp(-\exp(-(\mathrm{size} - 15)/10))$ for size (in mm),
• $\log_e((\mathrm{npos} + 0.5)/(\mathrm{ntot} - \mathrm{npos} + 0.5))$ for node numbers,
we note that the global model was slightly improved (larger R-square and likelihood ratio tests, Table 12), and the transforms satisfied the linearity test (Table 13). Although deviation from the proportional hazards assumption was significant for histology and for the nodal transform (Table 12), the rho-values indicated small deviations, thus excluding major deviation from proportional hazards.

We also examined disease-free survival (defined as survival without recurrence, metastasis or second primary) and recurrence-free survival (defined as survival without recurrence or metastasis). The models are qualitatively similar (Table 14).

We conclude from this section that:
1) The SEER data do not have information on local-regional or metastatic recurrence. Analysis of the AZ-VUB data indicates that overall survival is an acceptable surrogate of specific disease outcome.
2) Results from modeling the SEER data are applicable to the AZ-VUB data, indicating the relative similarity of breast cancer in the two databases. This is important since this allows us to proceed to merging for direct comparisons.
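The three covariate transforms used in this section can be written as small Python functions. This is a sketch with hypothetical function names; in particular, reading the age transform as an exponent of 1.5 on |age - 47.5| is an assumption about the original typeset formula.

```python
import math

def age_transform(age):
    """age + |age - 47.5|^1.5, age in years (exponent 1.5 is an assumption)."""
    return age + abs(age - 47.5) ** 1.5

def size_transform(size_mm):
    """exp(-exp(-(size - 15) / 10)), size in mm."""
    return math.exp(-math.exp(-(size_mm - 15) / 10))

def nodal_transform(npos, ntot):
    """log((npos + 0.5) / (ntot - npos + 0.5)): log odds of nodal involvement."""
    return math.log((npos + 0.5) / (ntot - npos + 0.5))
```

The 0.5 added to both counts keeps the nodal transform finite for node-negative patients (npos = 0) and for patients in whom all examined nodes are positive.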

Section 6. Comparison of SEER and AZ-VUB treatment outcome
The two databases were merged: 83686 patients from SEER and 2092 patients from AZ-VUB. The median follow-up for patients alive was 73 months (mean 76, range 1-143) for the SEER and 70 months (mean 74, range 1-250) for the AZ-VUB. Figure 2 shows the unadjusted overall survival comparison of the SEER and AZ-VUB patients, regardless of treatment and nodal status. There is a statistically significant overall survival advantage for the AZ-VUB patients. Table 15 shows the respective 5-year and 10-year overall survival, indicating a 1% survival advantage for AZ-VUB patients at 5 years and 3% at 10 years.

The question that arises is: is the survival difference attributable to differences in tumour and/or treatment characteristics, or to differences in the populations unrelated to tumour/treatment? The multivariate model from Table 16 indicates that the survival advantage of AZ-VUB patients remains significant when adjusted for other characteristics. However, as a reminder, all AZ-VUB patients received RT; therefore the model cannot answer whether the survival advantage is due to treatment or to population differences. Thus, more detailed analyses are required. Considering that there is an overall survival difference both in the above unadjusted and in multivariate analyses, it appeared justified to perform subgroup comparisons. Before proceeding to subgroup comparisons, we need to examine the distribution of treatments (Table 17), in order to determine what comparisons can be reasonably considered.

Table 17. Distribution of surgical (bcs) and radiation (rt) treatments by registry (isaz), by nodal status (N1) and extent of nodal involvement (n4). "mbb" = might be biased.

Table 17 shows that among the SEER patients, radiation (RT) is uncommon in mastectomy patients who are node-negative or node-positive with fewer than 4 positive nodes (row 2 vs. 1, and row 6 vs. 5). Omission of RT after breast conserving surgery, although more substantial, is also uncommon (row 3 vs. 4, row 7 vs. 8, row 11 vs. 12). The distribution is in keeping with general guidelines (4). Comparisons among the SEER patients have been done before (8,14). Repeating the same analyses would neither provide additional information nor solve the problem of these treatment biases. However, among the AZ-VUB patients, by definition all patients received RT. Therefore the best comparisons that might be performed are between the least likely biased treatments from the SEER and the AZ-VUB.

The comparisons are done within the context of multivariate models in Tables 18-22. Label abbreviations refer to Table 17. In each of the Tables 18-22, results of interest are highlighted in italics.
a) Breast conserving surgery patients who received radiation: modeling indicates no significant difference in survival outcome between SEER and AZ-VUB patients treated with BCS and RT (Table 18).
b) Mastectomy patients: RT did not appear significant, but a survival advantage was noted for AZ-VUB patients (Table 19).
c) Node-negative mastectomy patients: comparing AZ-VUB patients who received RT with SEER patients who did not receive RT shows no significant difference for RT (Table 20).
d) Node-positive mastectomy patients: comparing AZ-VUB patients who received RT with SEER patients who did not receive RT shows a significant advantage with RT (Table 21). The mortality reduction with RT is maintained regardless of whether few nodes are involved (columns "1-3 positive nodes") or more nodes are involved (columns "4+ positive nodes").
e) Node-positive mastectomy patients with 4+ positive nodes: comparing SEER with AZ-VUB patients who both received RT shows no significant difference between SEER and AZ-VUB patients (Table 22).
We conclude from this section that, within multivariate models that adjust for tumour characteristics:
1) With the exception of node-negative mastectomy patients in whom the comparison appears inconclusive (Table 20),
2) the survival advantage of AZ-VUB vs. SEER was observed only whenever the comparison was with SEER patients who did not receive RT (Table 21, all 3 columns),
3) whereas, whenever AZ-VUB and SEER patients both received RT, no significant survival difference was found (Table 19 and 22).
4) The small difference in overall survival between the overall SEER and AZ-VUB patients noted at the beginning of this Section (Figure 2) appears thus attributable to a large disadvantage in survival among node-positive mastectomy patients who did not receive RT as shown by the hazard ratios of Table 21, or graphically in Figure 3.

Section 7. Discussion
The results are inconclusive regarding the role of post-mastectomy radiotherapy among node-negative patients, but show a significant survival advantage among node-positive patients, regardless of nodal category. We cannot account for unknown factors such as co-morbidity, social or economic conditions of patients, therefore we cannot exclude potential unknown population differences. However, Section 5 shows the qualitative comparability of SEER and AZ-VUB prognostic factors. Section 6 shows the similarity of survival outcome of breast conserving surgery patients receiving radiation, and the similarity of 4+ node-positive mastectomy patients receiving radiation between SEER and AZ-VUB. These findings argue for the comparability between the patient populations, and argue that differences in survival are attributable to the differences in treatment strategies among some subgroups of patients.
Randomized clinical trials of post-mastectomy radiotherapy were not designed to address the problem of subgroups (6). Some authors have argued that there is a lack of evidence of a survival advantage in patients with 1-3 positive nodes (4), while others found a survival advantage for these patients in randomized clinical trials (41,42). Our results are in keeping with the latter findings. It should also be remarked that a growing literature draws attention to a high rate of local-regional recurrences in mastectomy patients with 1-3 positive nodes who did not receive radiation treatment (43,44). While individual treatment decisions ultimately have to take into account other issues such as quality of life, which was not investigated in this paper, in view of the literature and of the present results, the survival outcome and the risk of recurrence need to be discussed whenever post-surgery treatment is considered. In any case, to change our guidelines to omit radiotherapy would not represent progress, but instead a disservice to our patients.
Regarding our secondary objective to identify problems in our own registry, there is an obvious difference in size between the SEER database and our own. With the SEER, one can afford to discard records with missing data. There are strengths in our registry, such as the recording of details of treatments (not analyzed in the present paper), but the limited size will require improving the quality of follow-up. Limited follow-up restricts the power of survival analyses. Currently, tracing of patient status is done from several alternate sources. It might be on the basis of follow-up consultations, but this follow-up can be nonexistent or very sparse, e.g. once yearly. Tracing outcome might be based on an occasional medical service, e.g. a laboratory examination requested by the family practitioner. On an individual basis, our social department requests the last date of residence or the date of death from the administration of the county where the patient was last known to reside. These are however circumstantial follow-ups that are clearly incomplete, as seen in our shorter median follow-up of 70 months as compared with the SEER's 73 months, despite our longer follow-up range. Linking with billing records might improve our follow-up assessments, but would be applicable only to patients who are referred to our hospital. Another potential improvement could be a link with the social security reimbursement system in order to trace events, like the SEER-Medicare linkage. This might help improve the accuracy of treatment records, though it might bias follow-up towards patients with more healthcare problems. Another alternative would be a direct link with the county population registries.
Regarding cancer registration in general, there have been some improvement recently with the social security reimbursement for multidisciplinary consults, which provides an incentive for minimal registration of cancer data. It might perhaps evolve in the future towards more detailed registration. But, for the time being, the only source of detailed individual patients cancer data immediately available is the SEER, which explains that it is a reference for any data exploration in cancer. As shown in Table 1, the SEER data was made possible by financial investments. In our case, registration was done on a voluntary basis, which means in fact considerable hidden costs in time rarely available. This clarifies the paradox that we had more facilities with the SEER than with our own patient records.   Metastasis in 4 to 9 axillary lymph nodes (at least 1 tumor deposit larger than 2.0 mm) pN2b Metastasis in clinically apparent c internal mammary nodes in the absence of axillary lymph node metastasis pN3 Metastasis in 10 or more axillary lymph nodes, or in infraclavicular lymph nodes, or in clinically apparent c ipsilateral internal mammary nodes in the presence of 1 or more positive axillary lymph nodes; or in more than 3 axillary lymph nodes with clinically negative microscopic metastasis in internal mammary nodes or in ipsilateral supraclavicular lymph nodes pN3a Metastasis in 10 or more axillary lymph nodes (at least 1 tumor deposit greater than 2.0 mm), or metastasis to the infraclavicular lymph nodes pN3b Metastasis in clinically apparent c ipsilateral internal mammary lymph nodes in the presence of 1 or more positive axillary lymph nodes; or in more than 3 axillary lymph nodes and in internal mammary nodes with microscopic disease detected by sentinel lymph node dissection but not clinically apparent. 
pN3c Metastasis in ipsilateral supraclavicular lymph nodes

There are instances when the pathologist cannot make this determination because the complete staging procedure, such as a lymph node dissection, has not been performed or because information about a prior procedure is unavailable. In such situations "X" is used rather than a number in the TNM designation.

a. Classification is based on axillary lymph node dissection with or without sentinel lymph node dissection. Classification based solely on sentinel lymph node dissection without subsequent axillary dissection is designated (sn) for "sentinel node," eg, pN0(i+)(sn).
b. Isolated tumor cells (ITC) are defined as single tumor cells or small cell clusters not greater than 0.2 mm. They may be detected by routine histologic examination or by immunohistochemical (IHC) or molecular methods. ITCs do not usually show evidence of malignant activity (eg, proliferation or stromal reaction).
c. Clinically apparent is defined as detected by imaging studies (excluding lymphoscintigraphy) or by clinical examination. Not clinically apparent is defined as not detected by imaging studies (excluding lymphoscintigraphy) or by clinical examination.
d. Micrometastases may show histologic evidence of malignant activity (eg, proliferation or stromal reaction).
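
The axillary-count part of the pN categories above can be illustrated with a minimal sketch. It rests on several assumptions: only the number of positive axillary nodes is known; internal mammary, infraclavicular and supraclavicular involvement, deposit size, isolated tumor cells and micrometastases are all ignored; and the pN1 range of 1 to 3 positive nodes is taken from the categorization given in the abstract rather than from the table fragment shown here. The function name `pn_from_axillary_count` is hypothetical and is not part of any staging software.

```python
def pn_from_axillary_count(npos: int) -> str:
    """Map a count of positive axillary nodes to a coarse pN category.

    Simplified illustration only: non-axillary nodes, deposit size,
    isolated tumor cells and the a/b/c subcategories are ignored.
    """
    if npos < 0:
        raise ValueError("npos must be non-negative")
    if npos == 0:
        return "pN0"   # no positive axillary node
    if npos <= 3:
        return "pN1"   # 1 to 3 positive nodes (range assumed from the abstract)
    if npos <= 9:
        return "pN2"   # 4 to 9 positive nodes
    return "pN3"       # 10 or more positive nodes


if __name__ == "__main__":
    for n in (0, 2, 4, 9, 10):
        print(n, pn_from_axillary_count(n))
```

Such a mapping covers only the node-count criterion. A case where the pathologist cannot make the determination would still need the "X" designation described above, which this sketch does not attempt to model.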

Distant Metastasis (cM and pM)
MX Presence of distant metastasis cannot be assessed
M0 No distant metastasis
M1 Distant metastasis

TNM Descriptors
For identification of special cases of TNM or pTNM classifications, the "m" suffix and "y," "r," and "a" prefixes are used. Although they do not affect the stage grouping, they indicate cases needing separate analysis. The "m" suffix indicates the presence of multiple primary tumors in a single site and is recorded in parentheses: pT(m)NM. The "y" prefix indicates those cases in which classification is performed during or following initial multimodality therapy (ie, neoadjuvant chemotherapy, radiation therapy, or both chemotherapy and radiation therapy). The cTNM or pTNM category is identified by a "y" prefix. The ycTNM or ypTNM categorizes the extent of tumor actually present at the time of that examination. The "y" categorization is not an estimate of tumor prior to multimodality therapy (ie, before initiation of neoadjuvant therapy). The "r" prefix indicates a recurrent tumor when staged after a documented disease-free interval, and is identified by the "r" prefix: rTNM. The "a" prefix designates the stage determined at autopsy: aTNM.

Additional Descriptors

Residual Tumor (R)
Tumor remaining in a patient after therapy with curative intent (eg, surgical resection for cure) is categorized by a system known as R classification.

RX Presence of residual tumor cannot be assessed
R0 No residual tumor
R1 Microscopic residual tumor
R2 Macroscopic residual tumor

For the surgeon, the R classification may be useful to indicate the known or assumed status of the completeness of a surgical excision. For the pathologist, the R classification is relevant to the status of the margins of a surgical resection specimen. That is, tumor involving the resection margin on pathologic examination may be assumed to correspond to residual tumor in the patient and may be classified as macroscopic or microscopic according to the findings at the specimen margin(s).

Vessel Invasion
By AJCC/UICC convention, vessel invasion (lymphatic or venous) does not affect the T category indicating local extent of tumor unless specifically included in the definition of a T category. In all other cases, lymphatic and venous invasion by tumor are coded separately as follows:

Lymphatic (Small Vessel) Invasion (L)
LX Lymphatic vessel invasion cannot be assessed
L0 No lymphatic vessel invasion
L1 Lymphatic vessel invasion

Venous (Large Vessel) Invasion (V)
VX Venous invasion cannot be assessed
V0 No venous invasion
V1 Microscopic venous invasion
V2 Macroscopic venous invasion

Regional Lymph Nodes (pN0): Isolated Tumor Cells
Isolated tumor cells (ITC) are single cells or small clusters of cells not more than 0.2 mm in greatest dimension. Lymph nodes or distant sites with ITC found by either histologic examination, immunohistochemistry, or nonmorphologic techniques (eg, flow cytometry, DNA analysis, polymerase chain reaction [PCR] amplification of a specific tumor marker) should be classified as N0 or M0, respectively. Specific denotation of the assigned N category is suggested as follows for cases in which ITC are the only evidence of possible metastatic disease.

pN0 No regional lymph node metastasis histologically, no examination for isolated tumor cells (ITCs)
pN0(i-) No regional lymph node metastasis histologically, negative morphologic (any morphologic technique, including hematoxylin-eosin and immunohistochemistry) findings for ITCs
pN0(i+) No regional lymph node metastasis histologically, positive morphologic (any morphologic technique, including hematoxylin-eosin and immunohistochemistry) findings for ITCs
pN0(mol-) No regional lymph node metastasis histologically, negative nonmorphologic (molecular) findings for ITCs
pN0(mol+) No regional lymph node metastasis histologically, positive nonmorphologic (molecular) findings for ITCs

Sentinel Lymph Nodes
The sentinel lymph node is the first node to receive drainage from a primary tumor. There may be more than 1 sentinel node for some tumors. If a sentinel node contains metastatic tumor, it indicates that other more distant nodes may also contain metastatic disease. If sentinel nodes are negative, other regional nodes are less likely to contain metastasis. Sentinel lymph nodes that have been examined for ITCs are denoted as follows:

pN0(sn) No sentinel lymph node metastasis histologically (ie, none greater than 0.2 mm), no additional examination for isolated tumor cells (ITCs)
pN0(i-)(sn) No sentinel lymph node metastasis histologically (ie, none greater than 0.2 mm), negative morphologic findings for ITCs
pN0(i+)(sn) No sentinel lymph node metastasis histologically, positive morphologic findings for ITCs
pN0(mol-)(sn) No sentinel lymph node metastasis histologically, negative nonmorphologic findings for ITCs
pN0(mol+)(sn) No sentinel lymph node metastasis histologically, positive nonmorphologic findings for ITCs