
Breast-cancer-specific mortality in patients treated based on the 21-gene assay: a SEER population-based study



The 21-gene Recurrence Score assay is validated to predict recurrence risk and chemotherapy benefit in hormone-receptor-positive (HR+) invasive breast cancer. To determine prospective breast-cancer-specific mortality (BCSM) outcomes by baseline Recurrence Score results and clinical covariates, the National Cancer Institute collaborated with Genomic Health and 14 population-based registries in the Surveillance, Epidemiology, and End Results (SEER) Program to electronically supplement cancer surveillance data with Recurrence Score results. The prespecified primary analysis cohort was 40–84 years of age, and had node-negative, HR+, HER2-negative, nonmetastatic disease diagnosed between January 2004 and December 2011 in the entire SEER population, and Recurrence Score results (N=38,568). Unadjusted 5-year BCSM estimates were 0.4% (n=21,023; 95% confidence interval (CI), 0.3–0.6%), 1.4% (n=14,494; 95% CI, 1.1–1.7%), and 4.4% (n=3,051; 95% CI, 3.4–5.6%) for Recurrence Score <18, 18–30, and ≥31 groups, respectively (P<0.001). In multivariable analysis adjusted for age, tumor size, grade, and race, the Recurrence Score result predicted BCSM (P<0.001). Among patients with node-positive disease (micrometastases and up to three positive nodes; N=4,691), 5-year BCSM (unadjusted) was 1.0% (n=2,694; 95% CI, 0.5–2.0%), 2.3% (n=1,669; 95% CI, 1.3–4.1%), and 14.3% (n=328; 95% CI, 8.4–23.8%) for Recurrence Score <18, 18–30, and ≥31 groups, respectively (P<0.001). Five-year BCSM by Recurrence Score group are reported for important patient subgroups, including age, race, tumor size, grade, and socioeconomic status. This SEER study represents the largest report of prospective BCSM outcomes based on Recurrence Score results for patients with HR+, HER2-negative, node-negative, or node-positive breast cancer, including subgroups often under-represented in clinical trials.


Despite unprecedented advances in breast cancer diagnosis and treatment, health-care quality and outcomes remain variable, with significant disparities associated with many factors, such as age and race, and the location of care.1,​2,​3,​4 Leading organizations, including the Institute of Medicine,5 the American Society of Clinical Oncology,6,7 and the European Organisation for Research and Treatment of Cancer,8 have emphasized the need for new research models to more precisely identify what works in clinical practice and encourage appropriate value-based cancer treatment.

In recent years, technologies such as multigene expression analysis, next-generation sequencing, and liquid biopsy, have raised the potential to define appropriate patient subgroups for more precise cost-effective care and improved health outcomes. The 21-gene assay has been clinically validated in ‘prospective–retrospective’ studies on archival tumor tissue to provide both prognostic and predictive information for chemotherapy benefit in early stage, hormone-receptor-positive (HR+), node-negative, or node-positive breast cancer.9,​10,​11,​12 Recently, the first results from the Trial Assigning Individualized Options for Treatment (TAILORx), a multi-center, prospectively conducted trial of 10,253 women with early-stage breast cancer, were reported.13 For the 1,626 trial participants with Recurrence Score results <11 (TAILORx low-range stratum) who received hormonal therapy alone without chemotherapy, 5-year freedom from distant recurrence was 99.3% (95% confidence interval (CI), 98.7–99.6%). These findings demonstrate that patients with low Recurrence Score results can be effectively spared from adjuvant chemotherapy. Results from two other recent studies provide additional evidence of excellent outcomes for patients treated with hormonal therapy alone based on low Recurrence Score results. 
In the Clalit Health Services study, patients with node-negative or micrometastatic disease and Recurrence Score results <18 (standard low-range group), nearly all of whom (98%) were treated with hormonal therapy without chemotherapy, had <1% risk of distant recurrence and 0% risk of breast-cancer-specific mortality (BCSM) at 5 years.14 In the prospective PlanB trial, patients with 0–3 positive lymph nodes and Recurrence Score results ≤11, who were treated with hormonal therapy alone, had 3-year relapse-free survival exceeding 98%.15 As we await results of the TAILORx mid-range stratum (Recurrence Score results 11–25), it would be desirable to have additional evidence of the utility of the 21-gene assay in patients with Recurrence Score result <18, the standard cutoff used in contemporary clinical practice for selection of hormone therapy alone.

Initiated in 1973, the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute is an authoritative population-based cancer surveillance program, covering ~30% of the US population and capturing over 98% of incident cancer cases in these covered regions.16 SEER Program registries collect standardized patient information on demographics, primary tumor site, and characteristics (histology, grade, stage, and so on), first course of treatment, and survival (survival time, vital status, and cause of death), as mandated by respective state laws. SEER has required the collection of breast cancer multigene test results for cases diagnosed with breast cancer in 2010 and after. To supplement registry data with more complete and accurate multigene test results, we electronically linked Recurrence Score results from the Genomic Health Clinical Laboratory database with each of the SEER registries.

In this first report, we determined the relationship between Recurrence Score results and prospective BCSM for the large population in the SEER program with node-negative and node-positive breast cancer, including subgroups (e.g., racial minorities, the elderly, and the young) that are often under-represented in clinical trials.



A total of 379,103 patients were newly diagnosed with primary invasive breast cancer between 2004 and 2011 in the participating SEER registries (Table 1; Supplementary Figure S1). These patients resided in 12 individual SEER states: California (3 registries, 41%), New Jersey (12%), Georgia (11%), Washington (13 Puget Sound region counties only, 6%), Connecticut (5%), Kentucky (5%), Louisiana (5%), Michigan (metropolitan Detroit counties only, 5%), Iowa (4%), Hawaii (2%), New Mexico (2%), and Utah (2%).

Table 1: Patient demographics, by nodal status and Recurrence Score assay status (N=241,681)

Of 379,103 patients, 45,287 (12%) had HR+, nonmetastatic disease and Recurrence Score results, including 40,134 with node-negative disease, 4,691 with micrometastases or up to three positive nodes (N+(mic,1–3)), and 462 with four or more positive nodes or unknown/missing nodal status. Nearly all patients (99.5%) who had 21-gene assay testing had fewer than four positive lymph nodes. Only 165 of 16,202 patients with 4–9 positive nodes and 41 of 7,320 patients with 10 or more positive nodes had testing. Demographic characteristics of tested and untested patients of all ages diagnosed between 2004 and 2011 with node-negative or node-positive (N+(mic,1–3)) disease are shown in Table 1. Median follow-up of patients with node-negative disease was longer than that of patients with node-positive (N+(mic,1–3)) disease (39 months versus 30 months), reflecting later adoption of the test for use in patients with node-positive disease.

A total of 38,568 patients (10%) were eligible for the prespecified primary analysis (had HR+, HER2-negative, node-negative, nonmetastatic disease; had Recurrence Score results; and were 40–84 years of age; Supplementary Figure S1). Median age of the prespecified primary analysis cohort was 57 years; 99.4% were female; 84% were white; 29% and 54% had tumors of histologic grade 1 and grade 2, respectively; 25% and 53% had tumors of ≤1 cm and >1 to 2 cm in size, respectively. Median follow-up for the primary analysis cohort was 39 months; 8,239 (21%) patients had >5 years of follow-up.

Recurrence Score result and breast-cancer-specific mortality (Node-negative)

Of 38,568 patients in the prespecified primary analysis cohort, 21,023 (55%) had Recurrence Score results <18, 14,494 (38%) had results 18–30, and 3,051 (8%) had results ≥31. BCSM was significantly associated with Recurrence Score results (P<0.001) (Figure 1). Unadjusted 5-year estimates of BCSM were 0.4% (95% CI, 0.3–0.6%), 1.4% (95% CI, 1.1–1.7%), and 4.4% (95% CI, 3.4–5.6%) for patients with Recurrence Score results <18, 18–30, and ≥31, respectively. Chemotherapy use was reported as ‘yes’ in 7%, 34%, and 69% of patients in the Recurrence Score <18, 18–30, and ≥31 groups, respectively (the remaining patients in each group were reported in SEER as ‘no/unknown chemotherapy use’).
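The group sizes and percentages quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable and function names are illustrative and do not come from the study.

```python
# Group sizes reported for the prespecified primary analysis cohort.
group_counts = {"<18": 21_023, "18-30": 14_494, ">=31": 3_051}
cohort_total = 38_568


def group_percentages(counts, total):
    """Return each group's share of the cohort, rounded to a whole percent."""
    return {label: round(100 * n / total) for label, n in counts.items()}


# The three groups should partition the cohort exactly.
counts_add_up = sum(group_counts.values()) == cohort_total
shares = group_percentages(group_counts, cohort_total)

print(counts_add_up)
print(shares)
```

Because each share is rounded independently, the rounded percentages need not sum to exactly 100.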

Figure 1

Five-year estimates of breast-cancer-specific mortality, by Recurrence Score group (prespecified primary analysis). Patients with HR+, HER2-negative, node-negative breast cancer who had Recurrence Score (RS) results <18 (green), 18–30 (yellow), or ≥31 (red) were included in the primary analysis. Five-year estimates of breast-cancer-specific mortality (BCSM) with 95% CIs (green, yellow, and red shading) were 0.4% (0.3–0.6%) in the RS <18 group, 1.4% (1.1–1.7%) in the RS 18–30 group, and 4.4% (3.4–5.6%) in the RS ≥31 group. Five-year estimates of BCSM±s.e. are shown to the right of their respective lines. Numbers of patients at risk in each group are shown beneath the graph.

Five-year estimates of overall survival and non-breast-cancer-specific mortality (non-BCSM) were also determined. There was a significant difference in overall survival among Recurrence Score groups (P<0.001), with earlier mortality corresponding to Recurrence Score results ≥31 (data not shown). There was no significant difference in non-BCSM among Recurrence Score groups (P=0.10). Five-year other-cause mortality for the primary analysis cohort overall was 5.1% (95% CI, 4.8–5.4%), and 5.1% (95% CI, 4.7–5.6%), 5.0% (95% CI, 4.5–5.5%), and 5.6% (95% CI, 4.6–6.9%) for Recurrence Score <18, 18–30, and ≥31 groups, respectively. Age was the strongest predictor of non-BCSM (P<0.001).

Recurrence Score result and breast-cancer-specific mortality (Node-positive)

A total of 4,691 patients with positive lymph nodes (N+(mic,1–3)) had Recurrence Score results (Supplementary Figure S1). Of these, 2,694 (57%) had Recurrence Score results <18, 1,669 (36%) had results 18–30, and 328 (7%) had results ≥31. In addition, 50% of these patients had tumors >10 to 20 mm in size, 55% had moderate grade tumors, and 78.3% were 50 years of age or older (Table 1). Among patients with node-positive (N+(mic,1–3)) disease, 5-year BCSM was significantly different for the three Recurrence Score groups: 1.0% (95% CI, 0.5–2.0%) in the <18 group, 2.3% (95% CI, 1.3–4.1%) in the 18–30 group, and 14.3% (95% CI, 8.4–23.8%) in the ≥31 group (P<0.001). Chemotherapy use was reported as ‘yes’ (rather than ‘no/unknown’) in 23%, 47%, and 75% of patients in the Recurrence Score <18, 18–30, and ≥31 groups, respectively. Excluding patients with micrometastatic disease, the 5-year BCSM for patients with Recurrence Score results <18 and 1–3 positive nodes (n=2,617) was 1.3% (95% CI, 0.6–2.9%).

Subgroups and breast-cancer-specific mortality

Recurrence Score group was significantly prognostic (P≤0.001) for 5-year BCSM for the node-negative and the node-positive (N+(mic,1–3)) populations as wholes, in every node-negative subgroup of age, grade, race, and socioeconomic status (SES), and in the node-positive (N+(mic,1–3)) subgroups with substantial numbers of patients (>10 to 20 mm, moderate grade, age 60–69 years, and White; Figure 2a–j; Supplementary Table S1). Notably, 5-year BCSM was 1.3% or lower for patients with Recurrence Score results <18, regardless of nodal status and age group. Similarly, low BCSM was observed for patients <70 years of age with Recurrence Score results 18–30. However, for patients with Recurrence Score results ≥31, 5-year BCSM was substantially higher and ranged from 2.0 to 21.6%, depending on age group (Figure 2a). Among patients with node-positive (N+(mic,1–3)) disease, those with Recurrence Score results ≥31 had 5-year BCSM that exceeded 9.5%, regardless of age group (Figure 2b). For patients with both node-negative and node-positive (N+(mic,1–3)) disease, reported chemotherapy use generally decreased with increasing age.

Figure 2

Five-year estimates of breast-cancer-specific mortality, by Recurrence Score group (subgroup analyses). Patients with HR+, HER2-negative, node-negative (a,c,e,g,i) and node-positive (micrometastases and up to three positive nodes; b,d,f,h,j) breast cancer who had Recurrence Score (RS) results <18 (green), 18–30 (yellow), or ≥31 (red) were included in subgroup analyses by age (a,b), race (c,d), socioeconomic status (e,f) as defined by the Yost composite index,41 tumor grade (g,h), and tumor size (i,j). Five-year estimates of breast-cancer-specific mortality (BCSM)±s.e. are shown. Percentages of patients with chemotherapy use reported as ‘yes’ as a proportion of all patients (‘yes’ or ‘no/unknown’ chemotherapy use) are shown beneath the graph.

For patients with node-negative or node-positive (N+(mic,1–3)) disease, those with higher Recurrence Score results had higher 5-year BCSM than those with lower results, regardless of race (White or Black; Figure 2c,d) and regardless of SES quintile (Figure 2e,f). Of note, among patients with node-negative disease and Recurrence Score results ≥31, 5-year BCSM appeared to decrease with higher SES quintiles, despite similar reported chemotherapy use across SES quintiles (68–70%; Figure 2e). Among patients with node-positive (N+(mic,1–3)) disease and Recurrence Score results ≥31, 5-year BCSM exceeded 9%, regardless of SES quintile (Figure 2f).

Among patients with node-negative disease, Recurrence Score group was significantly prognostic (P<0.05) for subgroups analyzed by tumor grade (Figure 2g) and tumor size (Figure 2i), except for patients with very small tumors. For patients with tumors ≤5 mm in size, the estimated risk is elevated in the group with Recurrence Score results ≥31, although the estimate lacks precision (1.9%; 95% CI, 0.3–12.5%). Among patients with node-negative disease and tumors >4 cm in size, Recurrence Score group was significantly prognostic (P=0.005), but all three Recurrence Score groups have estimated 5-year BCSM of 2.4% or higher (Figure 2i). For patients with node-positive (N+(mic,1–3)) disease, those with higher Recurrence Score results generally had worse 5-year BCSM than those with lower results, regardless of tumor size (Figure 2j).

With respect to tumor grade, regardless of nodal status, 5-year BCSM generally increased with worsening grade, as expected (Figure 2g,h). Within grade categories, however, the group with Recurrence Score results <18 consistently had low 5-year BCSM (<1% for node-negative disease; <2% for node-positive (N+(mic,1–3)) disease), even for patients with high tumor grade. For patients with node-negative disease and Recurrence Score results ≥31, 5-year BCSM was >2%, even for patients with low tumor grade.

Multivariable model with adjustment for baseline covariates

The effects of Recurrence Score results and potential confounding variables were assessed using a Cox regression model. Compared with patients with a Recurrence Score result <18 and without adjusting for other covariates, patients with Recurrence Score results 18–30 and results ≥31 had higher hazards for BCSM. These effects were modestly attenuated after adjustment for grade, tumor size, age, and race (Table 2); however, the Recurrence Score 18–30 and ≥31 groups remained at significantly increased hazards (P<0.001). Because Recurrence Score results are known to affect treatment decisions, an additional model was fit with a treatment interaction. In this model, Recurrence Score results remained prognostic among both patients for whom chemotherapy was reported as ‘yes’ and as ‘no/unknown,’ but the strength of the association was attenuated for those with chemotherapy reported as ‘yes’ (P=0.03 for covariate-adjusted interaction). Comparable models fitted using Recurrence Score result as a continuous linear variable were also significant for prognosis, with and without adjustment for covariates (P<0.001 for both).

Table 2: Multivariable model for breast-cancer-specific mortality (patients with hormone-receptor-positive, HER2-negative, node-negative breast cancer, N=40,134)

In the covariate-adjusted model, increases in the hazard of BCSM were significantly associated with poorly differentiated tumors (hazard ratio (HR) 2.1 (95% CI, 1.3–3.2), P=0.002), tumors >4 cm in size (HR 3.4 (95% CI, 1.3–8.7), P=0.010), age 70–79 years (HR 2.4 (95% CI, 1.2–5.0), P=0.014), and age ≥80 years (HR 6.1 (95% CI, 2.6–14.3), P<0.001). Patients with very small tumors (≤5 mm) did not have significantly different outcomes than patients with tumors >5 to 10 mm in size. The youngest patients (<40 years) did not have significantly different outcomes than patients who were 40–49 years or 50–59 years of age. In contrast, older patients at the time of diagnosis (≥70 years) had significantly worse BCSM in both the unadjusted and adjusted models.
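For readers who want to relate a reported hazard ratio to its confidence interval and P value, the following is a minimal sketch. It assumes the intervals are Wald-type and symmetric on the log scale, which is conventional for Cox models but is not stated in this section; the function names are hypothetical. Because the published HRs and limits are rounded, the P values recovered this way agree only approximately with the reported ones.

```python
import math


def log_hr_standard_error(lower, upper, z_crit=1.96):
    """Back out the standard error of log(HR) from a 95% CI.

    Assumes a Wald interval: exp(log(HR) +/- z_crit * se).
    """
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def wald_p_value(hr, lower, upper):
    """Two-sided Wald P value for the null hypothesis HR = 1."""
    se = log_hr_standard_error(lower, upper)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hazard ratios and 95% CIs quoted in the covariate-adjusted model above.
reported = {
    "poorly differentiated": (2.1, 1.3, 3.2),
    "tumor >4 cm": (3.4, 1.3, 8.7),
    "age 70-79 years": (2.4, 1.2, 5.0),
    "age >=80 years": (6.1, 2.6, 14.3),
}

for label, (hr, lo, hi) in reported.items():
    print(f"{label}: approximate P = {wald_p_value(hr, lo, hi):.4f}")
```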

The multivariable model also demonstrated differences in outcomes by race. An alternate model, replacing race with Yost quintile, as a marker for neighborhood SES, also showed worsening outcomes with lower SES; however, neither race nor SES were significant when both terms were included in the same model, and both effects are relatively weak compared to the genomic and clinical factors.

TAILORx clinical trial cutpoints

Since its initial development, the 21-gene assay results report has used Recurrence Score cutpoints of 18 and 31, and has provided individualized risk estimates for the specific Recurrence Score result for each patient. Alternative cutpoints, however, have been implemented in clinical trials. The ongoing TAILORx randomized trial uses cutpoints of 11 and 25 to define Recurrence Score categories <11, 11–25, and >25.13 Applying these alternative categories to our study cohort, the 5-year BCSM in the HR+, node-negative, non-age-restricted cohort were 0.4% (95% CI, 0.2–0.6%) for 7,281 patients with Recurrence Score results <11, 0.7% (95% CI, 0.6–0.8%) for 26,462 patients with Recurrence Score results 11–25, and 3.6% (95% CI, 3.0–4.4%) for 6,391 patients with Recurrence Score results >25 (P<0.001). Considering both the standard cutpoints and the TAILORx cutpoints, the 5-year BCSM for 10,589 patients with Recurrence Score results 18–24 and 3,905 patients with Recurrence Score results 25–30 were 1.0% (95% CI, 0.8–1.4%) and 2.4% (95% CI, 1.8–3.2%), respectively.
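The regrouping described above amounts to binning each Recurrence Score result with a different pair of cutpoints. A minimal sketch follows, assuming integer Recurrence Score results; the function name and the use of `bisect` are illustrative choices and are not taken from the study's analysis code.

```python
from bisect import bisect_right

# TAILORx categories as described in the text: <11, 11-25, >25.
TAILORX_LABELS = ("<11", "11-25", ">25")
# Lower bounds of the second and third categories for integer scores.
TAILORX_LOWER_BOUNDS = (11, 26)


def tailorx_category(recurrence_score: int) -> str:
    """Map an integer Recurrence Score result to its TAILORx category."""
    return TAILORX_LABELS[bisect_right(TAILORX_LOWER_BOUNDS, recurrence_score)]


# Patient counts reported for the HR+, node-negative, non-age-restricted cohort.
tailorx_counts = {"<11": 7_281, "11-25": 26_462, ">25": 6_391}
tailorx_total = sum(tailorx_counts.values())

print([tailorx_category(rs) for rs in (10, 11, 25, 26)])
print(tailorx_total)
```

The three reported counts sum to 40,134, the size of the node-negative cohort of all ages given in the Results.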


Our prespecified primary analysis of the population-based SEER database electronically supplemented with Recurrence Score results showed that the Recurrence Score result was significantly associated with the likelihood of BCSM (P<0.001). In multivariable analysis that adjusted for the prognostic baseline covariates of patient age, tumor size, tumor grade, and race, as well as reported chemotherapy use, the Recurrence Score result remained strongly predictive of BCSM (P<0.001).

We showed that patients with node-negative disease in the SEER program with Recurrence Score results <18, who had low rates of chemotherapy use reported as ‘yes’ (7%), had 5-year BCSM of 0.4% (95% CI, 0.3–0.6%), a finding that is consistent with those of earlier, prospective–retrospective clinical validation studies performed on archival specimens.9,11,17 For example, 5-year BCSM for patients with Recurrence Score results <18 was 0.9% (95% CI, 0.3–2.8%) in the National Surgical Adjuvant Breast and Bowel Project (NSABP) B-14 study and 1.1% (95% CI, 0.5–1.6%) in the Kaiser study (L. Habel, personal communication). That our results for patients with Recurrence Score results <18 were similar to earlier results, despite dissimilar time periods in which each cohort was enrolled, demonstrates the capacity of the Recurrence Score result to identify patients with excellent prognosis, regardless of the era in which they were diagnosed or the specific treatments they received. Moreover, the favorable outcomes we observed in patients with node-negative disease and Recurrence Score results <18 confirm—and extend—findings of the TAILORx trial, the Clalit Health Services study, and the PlanB trial of the Women’s Healthcare Study Group in Germany, all of which reported excellent outcomes for patients with low Recurrence Score results.13,​14,​15

For patients with node-positive (N+(mic,1–3)) disease, the Recurrence Score result was strongly predictive of BCSM, despite the relatively short-median follow-up. For patients with Recurrence Score results <18, 23% of whom had chemotherapy use reported as ‘yes,’ the 5-year BCSM was 1.0% (95% CI, 0.5–2.0%), similar to the 5-year BCSM observed for patients with node-negative disease. These results in more than 4,600 patients with node-positive disease reconfirm the Recurrence Score results from the SWOG 8814 and ATAC studies showing that the 21-gene assay identifies a group of patients with node-positive disease and Recurrence Score results <18 with favorable outcomes.11,12 The ongoing Treatment (Rx) for Positive Node, Endocrine Responsive Breast Cancer (RxPONDER) clinical trial,18 in which patients with node-positive disease and Recurrence Score results <25 are randomized to endocrine therapy with or without adjuvant chemotherapy, should provide more definitive information on the effect of chemotherapy. Our SEER results provide prospective evidence that certain patients with node-positive disease and low Recurrence Score results have favorable 5-year BCSM, and reassurance that the randomization in RxPONDER was justified.

Prior to our study, there were relatively few data regarding Recurrence Score results and outcomes in younger and older patients with breast cancer. For example, only 3% of patients in the NSABP B-14 study,19 5% in the ongoing TAILORx trial (J. Sparano, personal communication), and 0% in the ongoing MINDACT trial were older than 70 years.20 Our population-based SEER study is therefore noteworthy for providing the largest experience to date with patients ≥70 years of age, including 4,647 with node-negative disease and 880 with node-positive (N+(mic,1–3)) disease, and with patients <40 years, including 1,480 with node-negative disease and 165 with node-positive (N+(mic,1–3)) disease. Our study found that regardless of age, patients with Recurrence Score results <18 had excellent outcomes: 5-year BCSM was <1.3% in the node-negative group and <1.7% in the node-positive (N+(mic,1–3)) group. Importantly, although younger age is considered an unfavorable prognostic factor,21 the 5-year BCSM of 0.0% that we observed in 682 patients <40 years with node-negative disease and Recurrence Score results <18 indicates that the Recurrence Score assay identified a subset of very young patients that had excellent outcomes.

We noted that outcomes were especially poor for older patients with node-negative disease and Recurrence Score results ≥31: 5-year BCSM was 10.4% for patients 70–79 years and 21.6% for patients ≥80 years. Within this Recurrence Score group (≥31), 72% of patients <70 years, but only 53% of patients ≥70 years, had chemotherapy use reported as ‘yes.’ Previous studies have reported that older women with breast cancer generally receive less aggressive treatment than younger patients do and are more likely to die from their disease.22,23,24,25 Importantly, practice guidelines by the National Comprehensive Cancer Network on ‘Older Adult Oncology’ acknowledge that older women with breast cancer ‘often do not receive “standard of care.”’26 Clearly, as population demographics shift, it is imperative that we understand and take action to lessen BCSM in older patients. A more-detailed analysis of the older population in SEER is underway to examine comorbidities, competing risks, and outcomes in patients with and without 21-gene assay testing.

We further show that the Recurrence Score result was significantly prognostic for BCSM in every subgroup of race, SES, and pathology (with the possible exception of tumors ≤5 mm in size, although the general trend for this subgroup was consistent with observations made in subgroups with larger tumors). For several subgroups historically under-represented in clinical studies, including patients of lower SES and patients who identify as non-White, our results add substantially to the body of knowledge relating the Recurrence Score results to outcomes.

Finally, Recurrence Score biology at diagnosis was a very strong predictor of BCSM but, as expected, was not predictive of other-cause mortality. Age was the strongest predictor of other-cause mortality. It will be important in future studies to assess in greater detail BCSM in the context of other-cause mortality and patient comorbidities.

The availability now of prospective outcomes for over 50,000 patients with Recurrence Score results across multiple studies carries implications for both clinical practice and breast cancer staging. Use of clinical and pathologic factors alone (e.g., luminal A-like or luminal B-like) may be inadequate to select patients for consideration of adjuvant chemotherapy. The 21-gene Recurrence Score assay has the capacity to further risk-stratify within any clinical or pathologic category (including luminal A-like and luminal B-like),27,28 based on tumor biology. At present, the American Joint Committee on Cancer (AJCC) is working to revise the current TNM breast cancer staging system, largely limited to anatomic information, to incorporate new molecular testing information. Collaborations between SEER and diagnostic testing companies, such as the research model pioneered in this study, can provide evidence from large high-quality datasets to support updated criteria for cancer staging.

In general, observational studies can provide valuable information on diagnosis, treatment, and outcomes in actual clinical practice. When very large, observational studies can provide new information about subgroups of patients that randomized clinical trials are often not powered to assess. Nonetheless, observational studies, like ours, have limitations. First, data in the SEER Program are derived from patients who were not randomized to treatment. This potentially introduces bias and confounding factors for estimating treatment effects that cannot be fully controlled. To enhance the rigor of our large observational study, we applied the Good Research for Comparative Effectiveness (GRACE) Principles of good study design and implementation to the methodology and analyses.29 Second, the SEER Program collects no information on breast cancer recurrence and progression. Third, chemotherapy use is under-reported in SEER. Although the magnitude of under-reporting is unknown, one study involving the Medicare population suggested a 30% relative under-ascertainment.30 For the reasons of under-reporting of chemotherapy use and selection bias in who elected to get chemotherapy, we do not report BCSM by treatment choice. Although we do report the percent for whom chemotherapy is reported as ‘yes’ versus ‘no/unknown’ in various subgroups, the limitations of this variable should be kept in mind when interpreting these results. At the time of this analysis with SEER survival follow-up only through 2012, the extent of patient follow-up beyond 5 years was limited. However, with the large sample size, the confidence intervals for the 5-year BCSM estimates are narrow. Moreover, the reported overall benefit of chemotherapy has been observed by 5 years.31 Finally, other multigene signatures for breast recurrence risk were not included in this analysis. It should be noted that the 21-gene assay accounted for >93% of tests reported to SEER by manual collection in 2010–2012.

Our study nevertheless had a number of strengths. First, ours is the largest-to-date study of prospective outcomes based on Recurrence Score results in node-negative and node-positive disease. Second, the SEER Program is remarkable for its stringent ascertainment of population-based patient-specific data at the time of diagnosis and at the time of death. Third, the population-based nature of the SEER registries combined with the size of the database of the Genomic Health Clinical Laboratory (the only laboratory that performs the 21-gene assay) ensures that study results reflect real-world practices.

In the future, we plan to conduct additional analyses, including among others: (a) continued follow-up of the SEER registries to encompass an ever-increasing number of patients; (b) in-depth analyses of important subgroups by, for example, race, ethnicity, SES, and sex; (c) determination of propensity scores for chemotherapy benefit; (d) assessment of factors that influence assay ordering, including geography; and (e) comparison of manually collected and Genomic Health Clinical Laboratory-reported Recurrence Score results. In addition, we plan to conduct an analogous analysis after merging with the SEER-Medicare database.

In conclusion, this study represents a new model for collaboration between National Cancer Institute, SEER registries, and industry to more efficiently and completely capture important genomic test results to inform understanding of ‘real-world’ oncology practice. Our study results strongly reinforce the findings of the prospectively designed TAILORx trial, other prospective outcomes studies, and numerous earlier prospective–retrospective validation studies.9,​10,​11,​12,​13,​14,​15,17 Our SEER study provides additional evidence in >44,000 patients with node-negative and node-positive (N+(mic,1–3)) disease that the 21-gene assay accurately predicts prospective outcomes, independent of patient age, tumor size, and tumor grade.

Materials and methods

Study population

The electronic linkage of the 21-gene Oncotype DX Breast Recurrence Score assay (Genomic Health, Inc., Redwood City, CA, USA) results from the Genomic Health Clinical Laboratory database with the SEER registries database was based on protected health information included in these databases. The linkage was performed by Information Management Services (IMS, Calverton, MD, USA), a National Cancer Institute contractor that manages cancer surveillance data for SEER Program registries, using the Link Plus software (Centers for Disease Control and Prevention, Division of Cancer Prevention and Control, National Program of Cancer Registries; Atlanta, GA, USA), a deterministic SAS algorithm, and manual adjudication of partial matches by registry staff. De-identified data were released to the study team after SEER approval of a custom data request. This linkage allowed for the inclusion of Recurrence Score results from 2004 and for more complete Recurrence Score capture from 2010 forward (manuscript in preparation).

The primary survival analysis cohort, statistical methodology, and study end point were prespecified before the data linkage was performed. The primary survival analysis cohort was specified as all patients in the SEER research database diagnosed between 1 January 2004 and 31 December 2011 who had lymph node-negative, HR+, HER2-negative, primary invasive breast cancer, and a Recurrence Score result in the Genomic Health Clinical Laboratory database. Standard cutoffs that define risk groups by Recurrence Score results are as follows: low (<18), intermediate (18–30), and high (≥31). Patients were excluded if they had metastatic disease, any previous history of invasive cancer (but not prior ductal carcinoma in situ), or multiple Recurrence Score results for any reason (e.g., multifocal disease or concurrent primary tumors).
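The standard risk-group cutoffs can be written as a small classification function. This is a sketch for illustration only: the function name is hypothetical, and the 0–100 validity check reflects the assay's reporting range, which is an assumption not stated in this section.

```python
def standard_risk_group(recurrence_score: int) -> str:
    """Assign the standard risk group: low (<18), intermediate (18-30), high (>=31)."""
    # Assumption: Recurrence Score results are reported on a 0-100 scale.
    if not 0 <= recurrence_score <= 100:
        raise ValueError(f"Recurrence Score out of range: {recurrence_score}")
    if recurrence_score < 18:
        return "low"
    if recurrence_score <= 30:
        return "intermediate"
    return "high"


# Boundary values on either side of each cutoff.
for rs in (17, 18, 30, 31):
    print(rs, standard_risk_group(rs))
```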

To be more consistent with the patient populations of the NSABP B-14 and Kaiser clinical validation studies,9,17 the primary survival analysis was restricted to patients with lymph node-negative, HR+ HER2-negative breast cancer who were 40–84 years of age at diagnosis. Secondary survival analyses of various subgroups included patients of all ages and patients with node-positive disease.

Patients were considered HR+ if their tumors were estrogen receptor- or progesterone receptor-positive (ER+ or PR+) by the SEER-reported ER and PR immunohistochemistry results (borderline results were considered positive) and by the 21-gene assay quantitative reverse transcription-PCR (RT-PCR) single-gene ER or PR results. Patients were considered node-negative or node-positive based on the data collected by SEER. According to the AJCC 6th and 7th editions,32 micrometastases were considered lymph node-positive and isolated tumor cells were considered lymph node-negative. The SEER database includes no information on HER2 status prior to 2010. Thus, HER2-negative patients were identified among those with Recurrence Score results who had 21-gene assay quantitative RT-PCR single-gene HER2 scores <11.5.33,34,35 Rare cases in which tumors were classified in SEER as undifferentiated or anaplastic were considered to be poorly differentiated. Both male and female patients were included in all analyses.

End point

The study end point for all outcome analyses was BCSM. Underlying causes of death (CODs) were ascertained by the SEER registries through linkages with state death certificates and the National Death Index from the National Center for Health Statistics.36 In addition, vital status was ascertained through linkages with other sources, such as state Departments of Motor Vehicles, state voter registration databases, and the Social Security Administration. To correct the known errors with COD attribution, the SEER program developed a special COD variable that maps underlying CODs to the primary cancer diagnosis.37 We used this variable to assign a broad set of CODs to capture deaths from breast cancer among patients diagnosed with an incident breast cancer in SEER. Patients who did not die of breast cancer were censored at time of last follow-up (31 December 2011) or at time of death from other causes.

To protect patient confidentiality, time to event in SEER is captured in months rather than days. Therefore, actuarial methods rather than Kaplan–Meier methods were utilized for event and freedom-from-event calculations. Five-year BCSM was selected as the most important summary outcome measure because it is a clinically meaningful end point and because BCSM follow-up beyond 5 years was limited.
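To illustrate how an actuarial (life-table) estimate differs from a Kaplan–Meier estimate when time is recorded in whole months, the Python sketch below computes cumulative mortality through a fixed horizon. It is a minimal textbook version under stated assumptions, not the study's implementation: the function name `actuarial_mortality`, the record layout (completed months of follow-up, breast cancer death flag), and the convention that patients censored within a month contribute half a month at risk are all assumptions made here.

```python
from collections import Counter
from typing import Iterable, Tuple


def actuarial_mortality(records: Iterable[Tuple[int, bool]],
                        horizon_months: int = 60) -> float:
    """Life-table estimate of cumulative mortality through `horizon_months`.

    Each record is (completed months of follow-up, died of breast cancer).
    Patients with a False flag are treated as censored, whether at last
    follow-up or at death from another cause.
    """
    records = list(records)
    deaths = Counter(months for months, event in records if event)
    censored = Counter(months for months, event in records if not event)

    at_risk = len(records)   # number entering the current monthly interval
    survival = 1.0
    for month in range(horizon_months):
        d = deaths[month]
        w = censored[month]
        # Actuarial adjustment: withdrawals count as at risk for half the interval.
        effective = at_risk - w / 2.0
        if effective > 0:
            survival *= 1.0 - d / effective
        at_risk -= d + w
    return 1.0 - survival


if __name__ == "__main__":
    # Toy cohort: one death and two censored patients in month 5, one patient
    # followed beyond 5 years. Effective number at risk in month 5 is 4 - 2/2 = 3.
    toy = [(5, True), (5, False), (5, False), (80, False)]
    print(round(actuarial_mortality(toy), 4))
```

The half-interval adjustment is what distinguishes the life-table approach: with exact event days unavailable, deaths and censoring times inside the same month cannot be ordered, so withdrawals are assumed to occur, on average, midway through the interval.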

Statistical analysis

The primary survival analysis was prespecified to be performed on eligible patients in the three categorical Recurrence Score groups based on the standard 21-gene assay Recurrence Score cutpoints: <18, 18–30, or ≥31. The continuous Recurrence Score result (0–100) was evaluated in secondary analyses. Actuarial estimates of survival were computed through 5 years with 95% CIs. The log-rank test was used to compare the three Recurrence Score groups. Hazard ratios were calculated using Cox regression models, which were fit using SAS PROC PHREG (SAS/STAT version 9.4, SAS Institute Inc., Cary, NC, USA). Proportional hazards assumptions were assessed and met in the final models. Two sets of prognostic models were fit: one set used the three-category Recurrence Score group variable and the second, as a sensitivity analysis, used the continuous Recurrence Score result.
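For orientation, the two sets of Cox models can be written in generic form. The notation below is an assumption introduced here rather than taken from the study: $g$ is the Recurrence Score group, $s$ the continuous Recurrence Score result, $\mathbf{z}$ the vector of adjustment covariates, $h_0(t)$ the unspecified baseline hazard, and the low-risk group is assumed to be the reference category.

```latex
\begin{align}
  h(t \mid g, \mathbf{z}) &= h_0(t)\,
    \exp\!\big(\beta_1\,\mathbb{1}[g = \text{intermediate}]
             + \beta_2\,\mathbb{1}[g = \text{high}]
             + \boldsymbol{\gamma}^{\top}\mathbf{z}\big), \\
  h(t \mid s, \mathbf{z}) &= h_0(t)\,
    \exp\!\big(\beta\, s + \boldsymbol{\gamma}^{\top}\mathbf{z}\big).
\end{align}
```

Under this parameterization, $e^{\beta_1}$ and $e^{\beta_2}$ are the hazard ratios for the intermediate and high groups relative to the low group, and $e^{\beta}$ is the hazard ratio per one-unit increase in the continuous result. The proportional hazards assumption is that these ratios do not vary with $t$.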

It is important to note that the Recurrence Score result is provided to patients and physicians to guide treatment decisions. Although the result is one of many factors that influence treatment decision-making, the group with Recurrence Score results <18 has been shown in multiple previous studies to have a much lower rate of reported chemotherapy use than the groups with higher Recurrence Score results.38,39,40,41 Thus, the difference in BCSM curves among the three Recurrence Score groups was expected to be smaller, in percentage terms, than if all patients were treated with hormonal therapy alone. Adjusting for treatment was not straightforward because treatment itself is influenced by the Recurrence Score result and there is a known issue with ascertainment bias for treatment. As a sensitivity analysis, multivariable models were fit including Recurrence Score results, treatment, and an interaction in the model; however, these models should not be used to evaluate comparative effectiveness.

Subgroup analyses were conducted by age, race, tumor size, tumor grade, and SES Index. The SES Index was created by SEER and is based on the attributes of each patient’s census tract at diagnosis, as reflected by the Yost composite index. To further protect patients’ privacy, the SES Index is categorized in quintiles.42


  1. Age-related disparity in immediate prognosis of patients with triple-negative breast cancer: A population-based study from SEER cancer registries. PLoS ONE 10, e0128345 (2015).
  2. et al. Meta-analysis of survival in African American and white American patients with breast cancer: ethnicity compared with socioeconomic status. J. Clin. Oncol. 24, 1342–1349 (2006).
  3. Disparities in breast cancer characteristics and outcomes by race/ethnicity. Breast Cancer Res. Treat. 127, 729–738 (2011).
  4. et al. Cancer statistics for African Americans, 2016: Progress and opportunities in reducing racial disparities. CA Cancer J. Clin. doi:10.3322/caac.21340 (22 February 2016).
  5. Progress Since 2000: Workshop Summary. (The National Acad. Press, 2000). Available at (accessed on 14 April 2016).
  6. et al. American Society of Clinical Oncology policy statement: disparities in cancer care. J. Clin. Oncol. 27, 2881–2885 (2009).
  7. ASCO Institute for Quality. CancerLinQ. Available at (accessed on 14 April 2016).
  8. The European Organisation for Research and Treatment of Cancer. Aims & Mission. Available at (accessed on 14 April 2016).
  9. et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N. Engl. J. Med. 351, 2817–2826 (2004).
  10. et al. Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer. J. Clin. Oncol. 24, 3726–3734 (2006).
  11. et al. Prediction of risk of distant recurrence using the 21-gene recurrence score in node-negative and node-positive postmenopausal patients with breast cancer treated with anastrozole or tamoxifen: a TransATAC study. J. Clin. Oncol. 28, 1829–1834 (2010).
  12. et al. Prognostic and predictive value of the 21-gene recurrence score assay in postmenopausal women with node-positive, oestrogen-receptor-positive breast cancer on chemotherapy: a retrospective analysis of a randomised trial. Lancet Oncol. 11, 55–65 (2010).
  13. et al. Prospective validation of a 21-gene expression assay in breast cancer. N. Engl. J. Med. 373, 2005–2014 (2015).
  14. et al. Real-life analysis evaluating 2028 N0/Nmic breast cancer patients for whom treatment decisions incorporated the 21-gene Recurrence Score result: 5-year KM estimate for breast cancer-specific survival with Recurrence Score results ≤30 is >98%. 2015 San Antonio Breast Cancer Symposium. Abstract P5-08-02. Available at (accessed on 14 April 2016).
  15. et al. West German Study Group phase III PlanB trial: First prospective outcome data for the 21-gene Recurrence Score assay and concordance of prognostic markers by central and local pathology assessment. J. Clin. Oncol. doi:10.1200/JCO.2015.63.5383 (29 February 2016).
  16. Howlader, N. et al. SEER Cancer Statistics Review, 1975–2009 (National Cancer Institute, Bethesda, MD, USA, 2012). Available at (accessed on 14 April 2016).
  17. et al. A population-based study of tumor gene expression and risk of breast cancer death among lymph node-negative patients. Breast Cancer Res. 8, R25 (2006).
  18. National Cancer Institute. Tamoxifen citrate, letrozole, anastrozole, or exemestane with or without chemotherapy in treating patients with invasive RxPONDER breast cancer. Available at (accessed on 14 April 2016).
  19. Data on file. Genomic Health, Inc.
  20. European Organisation for Research and Treatment of Cancer. MINDACT (Microarray In Node negative Disease may Avoid ChemoTherapy). Available at (accessed on 14 April 2016).
  21. Different patterns in the prognostic value of age for breast cancer-specific mortality depending on hormone receptor status: a SEER population-based analysis. Ann. Surg. Oncol. 22, 1102–1110 (2015).
  22. et al. Association between age at diagnosis and disease-specific mortality among postmenopausal women with hormone receptor-positive breast cancer. JAMA 307, 590–597 (2012).
  23. et al. Effect of age and comorbidity in postmenopausal breast cancer patients aged 55 years and older. JAMA 285, 885–892 (2001).
  24. et al. Variation in 'standard care' for breast cancer across Europe: a EUROCARE-3 high resolution study. Eur. J. Cancer 46, 1528–1536 (2010).
  25. Breast cancer treatment guidelines in older women. J. Clin. Oncol. 23, 783–791 (2005).
  26. National Comprehensive Cancer Network. Clinical Practice Guidelines in Oncology: Older Adult Oncology. Version 1.2016. Available at (accessed on 14 April 2016).
  27. et al. Clinical impact of risk classification by central/local grade or luminal-like subtype vs. Oncotype DX: First prospective survival results from the WSG phase III PlanB trial. Eur. J. Cancer 51, S311 (abstract 1937) (2015).
  28. et al. Luminal subtypes vs. early Ki-67 response and Oncotype DX in early breast cancer: WSG-ADAPT study. 14th St. Gallen International Breast Cancer Conference (2015). Abstract P231. Available at (accessed on 14 April 2016).
  29. et al. GRACE principles: recognizing high-quality observational studies of comparative effectiveness. Am. J. Manag. Care 16, 467–471 (2010).
  30. et al. Comparison of SEER treatment data with Medicare claims. Med. Care doi:10.1097/MLR.0000000000000073 (15 March 2014).
  31. Early Breast Cancer Trialists’ Collaborative Group. Effects of chemotherapy and hormonal therapy for early breast cancer on recurrence and 15-year survival: an overview of the randomised trials. Lancet 365, 1687–1717 (2005).
  32. American Joint Committee on Cancer. Breast Cancer Staging: 7th edn. Springer-Verlag: New York. Available at (accessed on 14 April 2016).
  33. et al. HER2 concordance between central laboratory immunohistochemistry and quantitative reverse transcription polymerase chain reaction in Intergroup Trial E2197. J. Clin. Oncol. 26, abstract 22009 (2008).
  34. et al. Comparison of HER2 testing by IHC/FISH and RT-PCR in estrogen receptor negative or borderline patients with early stage breast cancer. Mod. Pathol. 26, 70A (abstract 282) (2013).
  35. et al. Human epidermal growth factor receptor 2 assessment in a case-control study: comparison of fluorescence in situ hybridization and quantitative reverse transcription polymerase chain reaction performed by central laboratories. J. Clin. Oncol. 28, 4300–4306 (2010).
  36. National Center for Health Statistics. Available at (accessed on 14 April 2016).
  37. et al. Improved estimates of cancer-specific survival rates from population-based data. J. Natl. Cancer Inst. 102, 1584–1598 (2010).
  38. Impact of a commercial reference laboratory test recurrence score on decision making in early-stage breast cancer. J. Oncol. Pract. 3, 182–186 (2007).
  39. et al. Prospective multicenter study of the impact of the 21-gene recurrence score assay on medical oncologist and patient adjuvant breast cancer treatment selection. J. Clin. Oncol. 28, 1671–1676 (2010).
  40. et al. The effect of Oncotype DX recurrence score on treatment recommendations for patients with estrogen receptor-positive early stage breast cancer and correlation with estimation of recurrence risk by breast cancer specialists. Oncologist 16, 1520–1526 (2011).
  41. et al. The effects of Oncotype DX Recurrence Scores on chemotherapy utilization in a multi-institutional breast cancer cohort. Breast Cancer Res. Treat. 126, 797–802 (2011).
  42. Using a composite index of socioeconomic status to investigate health disparities while protecting the confidentiality of cancer registry data. Cancer Causes Control 25, 81–92 (2014).


We acknowledge Anna Lau for medical writing and editorial assistance, and Cindy Loman for statistical graphics support. The ideas and opinions expressed herein are those of the author(s) and endorsement by any State, Department of Public Health, the National Cancer Institute, the Centers for Disease Control and Prevention, or their Contractors and Subcontractors is not intended nor should be inferred. The Surveillance, Epidemiology and End Results (SEER) Program is funded by the National Cancer Institute (NCI). Genomic Health performed the work to electronically submit the Recurrence Score results, but provided no funding for this study. SEER registries were supported as follows: California—the collection of cancer incidence data used in this study was supported by: the California Department of Public Health pursuant to California Health and Safety Code Section 103885; the Centers for Disease Control and Prevention (CDC) National Program of Cancer Registries (NPCR), under cooperative agreement 5NU58DP003862-04/DP003862; the NCI SEER Program, under contract HHSN261201000140C awarded to the Cancer Prevention Institute of California, contract HHSN261201000035C awarded to the University of Southern California, and contract HHSN261201000034C awarded to the Public Health Institute. Georgia—this work was supported by the NCI, under contract HHSN261201300015I, task order HHSN26100006, and by the CDC, under cooperative agreement 5/U58/DP003875-03. Hawaii—this work was supported by the NCI, under contract HHSN261201300009I, task order HHSN26100005, and by the University of Hawaii Cancer Center. Iowa—this work was supported by the NCI, under contract HHSN261201300020I, task order HHSN26100006, and by the University of Iowa Holden Comprehensive Cancer Center Support Grant through NCI grant 2P30CA086862-11. 
Kentucky—collection of the data by the Kentucky Cancer Registry used in this research project was supported by the NCI SEER, under contract HHSN2612013000131, and by the CDC NPCR, under cooperative agreement 5NU58DP003907-04-00. Louisiana—this work was supported by the NCI SEER Program, under contract HHSN261201300016I, task order HHSN26100006, and by the Louisiana State University Health Sciences Center School of Public Health. Michigan—this work was supported by the NCI, under contract HHSN261201300011I; the Karmanos Cancer Institute and Wayne State University Comprehensive Cancer Center Support Grant through NCI grant P30CA022453; and institutional funds from the Karmanos Cancer Institute and Wayne State University. New Mexico—this work was supported by the NCI, under contract HHSN26120130010I, task order HHSN26100005, and by the University of New Mexico Comprehensive Cancer Center Support Grant through NCI grant 2P30CA118100-1. Utah—this work was supported by the University of Utah and Huntsman Cancer Institute Foundation. Washington—this work was supported by the NCI, under contract HHSN26120130012I, task order HHSN26100005, and by the Fred Hutchinson Cancer Research Center Support Grant through NCI grant 5P30CA015704-41. R.C. has received research funding from Genomic Health.

Author information


  1. National Cancer Institute, Bethesda, MD, USA

    • Valentina I Petkov, Nadia Howlader, Kathleen Cronin & Lynne Penberthy
  2. Genomic Health, Inc., Redwood City, CA, USA

    • Dave P Miller, Nathan Gliner, Frederick L Baehner & Steven Shak
  3. IMS, Inc., Calverton, MD, USA

    • Will Howe & Nicola Schussler
  4. University of California, San Francisco, CA, USA

    • Frederick L Baehner
  5. Public Health Institute, Cancer Registry of Greater California, Sacramento, CA, USA

    • Rosemary Cress
  6. University of Southern California, Los Angeles, CA, USA

    • Dennis Deapen
  7. Cancer Prevention Institute of California, Fremont, CA, USA

    • Sally L Glaser
  8. Stanford Cancer Institute, Stanford, CA, USA

    • Sally L Glaser
  9. University of Hawaii Cancer Center, Honolulu, HI, USA

    • Brenda Y Hernandez
  10. Department of Epidemiology, University of Iowa, Iowa City, IA, USA

    • Charles F Lynch
  11. Connecticut Tumor Registry, Connecticut Department of Public Health, Hartford, CT, USA

    • Lloyd Mueller
  12. Karmanos Cancer Institute, Wayne State University, Detroit, MI, USA

    • Ann G Schwartz
  13. Cancer Surveillance System, Fred Hutchinson Cancer Research Center, Seattle, WA, USA

    • Stephen M Schwartz
  14. Rutgers School of Public Health, Piscataway, NJ, USA

    • Antoinette Stroup
  15. Cancer Institute of New Jersey, New Brunswick, NJ, USA

    • Antoinette Stroup
  16. Utah Cancer Registry, Department of Internal Medicine, and Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA

    • Carol Sweeney
  17. University of Kentucky, Markey Cancer Center, Lexington, KY, USA

    • Thomas C Tucker
  18. Emory University, Atlanta, GA, USA

    • Kevin C Ward
  19. New Mexico Tumor Registry, University of New Mexico Comprehensive Cancer Center, Albuquerque, NM, USA

    • Charles Wiggins
  20. Louisiana State University Health Sciences Center, New Orleans, LA, USA

    • Xiao-Cheng Wu


Competing interests

D.P.M., N.G., F.L.B., and S.S. are employees of Genomic Health. The remaining authors declare no conflict of interest.

Corresponding author

Correspondence to Valentina I Petkov.

Supplementary information

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit