Introduction

Over 200,000 patients with cancer are diagnosed with brain metastases (BM) annually in the United States1,2. Furthermore, BM incidence is increasing in the context of advances in systemic therapy and the ubiquity of magnetic resonance imaging (MRI). While multiple validated models exist to estimate survival in patients with BM3,4,5, there is a dearth of models that predict the presence of BM at initial cancer diagnosis. Identifying patients with BM is crucial to guiding their optimal multidisciplinary management6.

The National Comprehensive Cancer Network (NCCN) guidelines provide considerations for brain MRI in only select circumstances for small cell lung cancer (SCLC), non-small cell lung cancer (NSCLC), breast cancer, kidney cancer, colorectal cancer (CRC) and melanoma7,8,9,10,11,12,13. NCCN recommends brain MRI for all patients with SCLC and most patients with NSCLC. The NCCN has no recommendations for brain MRI in CRC, and it is only recommended for breast and kidney cancer if there are suspicious central nervous system (CNS) symptoms. For melanoma, NCCN recommends brain MRI for stage IV disease and states it can be considered for stage IIIB/C/D disease. Notably, there is limited evidence to support these recommendations.

Given the limitations of the current guidelines regarding brain MRI for patients with a new diagnosis of cancer, we aimed to develop and validate cancer-specific models to predict the presence of BM at the time of cancer diagnosis. The results of this work are intended to help the multidisciplinary team of physicians who care for patients with cancer at risk of harboring BM. We developed and validated these models, whose accuracy varies across cancer types.

Methods

Data

The National Cancer Database (NCDB) is a joint project of the Commission on Cancer of the American College of Surgeons and the American Cancer Society. It consists of de-identified information regarding patient demographics, tumor characteristics, first-course treatment for the corresponding diagnosis, and survival for ~70% of patients diagnosed with cancer within the United States14. The data used in this study were derived from a de-identified NCDB file and were thus exempt from institutional review. The American College of Surgeons and the Commission on Cancer have not verified and are not responsible for the analytic or statistical methodology employed, or the conclusions drawn from these data by the investigator. No informed consent is required when using NCDB data.

The NCDB 2010–2018 data, including demographic and clinical characteristics, were used for analysis. The primary outcome was defined as the presence of BM at the time of diagnosis. Summary statistics, including mean with standard deviation for continuous variables and frequency with percentage for categorical variables, were reported overall and stratified by the 6 cancer sites (breast cancer, melanoma, kidney cancer, CRC, SCLC, and NSCLC). Other cancers were excluded given the relatively low incidence of brain metastases15,16. Model fitting was performed for each cancer type by considering 10 common risk predictors (e.g., patient age, sex, race, tumor grade, clinical primary tumor stage (T stage), clinical nodal stage (N stage), presence of bone metastases, presence of lung metastases, and presence of liver metastases) and cancer-specific factors (including available tumor markers) that have previously been shown to be prognostic17,18,19,20,21,22,23,24. Notably, patients in our analysis may have had stage I-III disease (if they were coded as having no bone, lung, liver, or brain metastases), or stage IV disease with brain-only metastases (if they were coded as having brain metastases but no bone, lung, or liver metastases).

Statistical analysis

Group comparisons between patients with and without BM were based on Pearson Chi-square tests (for categorical variables) and the two-sample t-test as well as the Wilcoxon Rank Sum test (for the continuous variable, age). For each cancer type, we fitted a multivariable logistic regression model using the corresponding set of covariates of interest. Based on the logistic regression analyses, odds ratios (ORs) for each predictor and their 95% confidence intervals (CIs) were estimated. After model fitting for each cancer site, we used the Area Under the Receiver Operating Characteristic Curve (AUC) as the primary evaluation metric of model performance. For each cancer type, we randomly split the full data into training and testing datasets in a 7:3 proportion, and we estimated the AUC, the optimal probability cut point, and several supplementary metrics, including overall accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) based on the corresponding optimal cutoff, over 100 random-splitting simulations. Further, we developed nomograms and a Webtool for each cancer type based on the logistic regression models to predict BM at diagnosis as guidance for clinicians. Finally, we calculated the estimated risk of BM at diagnosis for each patient based on the logistic regressions and summarized the characteristic distributions in 3 risk subgroups (<1% [low], 1–10% [intermediate], and >10% [high]) for each cancer type. The cutoffs between low, intermediate, and high risk are arbitrary, as there is no well-defined pretest probability at which brain MRI is recommended, though the authors feel that most clinicians would not pursue a brain MRI if the risk is <1%, and most would recommend a brain MRI if the risk is >10%.

All analyses were conducted in the statistical software R, version 4.1. We used the R packages “pROC” (version 1.18.4) for estimating the optimal cut points of prediction and “rms” (version 6.7-0) for generating the nomograms. P-values less than or equal to 0.05 were regarded as statistically significant.
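The evaluation metrics named above can be illustrated with a minimal sketch. It is written in Python rather than the R packages used in the study, and it makes two assumptions that the text does not confirm: the optimal cut point is taken to be the one maximizing Youden's J (sensitivity + specificity − 1), and candidate cutoffs are the observed predicted probabilities themselves. The function names and the toy data are hypothetical.

```python
from typing import Dict, List


def auc(scores: List[float], labels: List[int]) -> float:
    """Rank-based AUC: chance that a random BM case scores above a random non-case.

    Tied scores count as half a win. Both classes must be present.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(num: int, den: int) -> float:
    """Safe division; undefined ratios (empty denominator) become NaN."""
    return num / den if den else float("nan")


def metrics_at(scores: List[float], labels: List[int], cutoff: float) -> Dict[str, float]:
    """Classify as BM when the predicted risk is >= cutoff, then tabulate."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_bm = s >= cutoff
        if predicted_bm and y == 1:
            tp += 1
        elif predicted_bm and y == 0:
            fp += 1
        elif not predicted_bm and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }


def youden_cutoff(scores: List[float], labels: List[int]) -> float:
    """Assumed criterion: the observed score maximizing sensitivity + specificity - 1."""
    def youden_j(cutoff: float) -> float:
        m = metrics_at(scores, labels, cutoff)
        return m["sensitivity"] + m["specificity"] - 1.0
    return max(sorted(set(scores)), key=youden_j)


if __name__ == "__main__":
    # Toy predicted risks and true BM status (1 = BM present); not study data.
    toy_scores = [0.02, 0.05, 0.10, 0.20, 0.40, 0.60, 0.70, 0.90]
    toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
    cut = youden_cutoff(toy_scores, toy_labels)
    print("AUC:", auc(toy_scores, toy_labels))
    print("cutoff:", cut, metrics_at(toy_scores, toy_labels, cut))
```

Because BM is rare in most of these cancers, PPV at any cutoff depends heavily on prevalence, which is one reason the sketch reports it alongside sensitivity and specificity rather than in place of them.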

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Results

A total of 4,828,305 patients were identified in the NCDB from 2010–2018 (2,095,339 breast cancer, 472,611 melanoma, 627,090 CRC, 407,627 kidney cancer, 164,864 SCLC, and 1,060,774 NSCLC).

The overall proportion of patients with BM at diagnosis was 0.3%, 1.5%, 0.3%, 1.3%, 16.0%, and 10.3% for breast cancer, melanoma, CRC, kidney cancer, SCLC, and NSCLC, respectively. Brain-only metastatic disease (without lung, liver, or bone metastasis) was relatively rare for breast cancer (0.06%), melanoma (0.55%), CRC (0.08%), and kidney cancer (0.28%) but occurred more frequently in patients with SCLC (7.78%) and NSCLC (5.10%). Table 1 shows the demographic and disease-specific data for all patients. The estimated odds ratios with 95% CIs from the logistic regression analyses are shown in Table 2, as well as in the text of the following disease-specific subsections. Supplementary Tables 1 through 6 show the characteristics of patients with each type of cancer stratified by the presence or absence of BM at diagnosis. Figure 1 shows the mean AUC for all models. Figures 2 through 7 show the nomograms developed for the different cancer types, and Supplementary Tables 7 through 18 show the nomogram scores associated with each variable level and a reference table for mapping the total scores to predicted BM risks for each cancer type. Supplementary Tables 19 through 24 show the characteristics of cancer-specific populations that are at low (<1%), intermediate (1–10%), and high risk (>10%) of harboring BM at diagnosis. Supplementary Table 25 shows the mean AUC as well as the 2.5% and 97.5% quantiles for each model based on 100 random splits of the data. Supplementary Table 26 shows model performance metrics for each cancer-specific model. Supplementary Table 27 shows the percentage of patients with brain-only metastatic disease for each of the cancer types. A link to the Webtool for the risk estimation of BM is listed here: https://tinyurl.com/brain-mets.
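The mean AUC and its 2.5% and 97.5% quantiles over 100 random 7:3 splits can be sketched as follows. This is an illustration in Python on synthetic data, not the authors' R code. To stay short it uses a single hypothetical binary predictor, for which the fitted risks of a logistic regression equal the observed BM rate in each predictor level, so no numerical optimizer is needed. The quantile interpolation rule (`method="inclusive"`, which corresponds to R's default quantile type) is an assumption, as are all names and the simulated event rates.

```python
import random
import statistics
from typing import Dict, List, Tuple

Row = Tuple[int, int]  # (hypothetical binary predictor, BM at diagnosis)


def make_synthetic(n: int = 1000, seed: int = 1) -> List[Row]:
    """Simulate patients; the 2% and 30% BM rates are invented for illustration."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.2 else 0
        y = 1 if rng.random() < (0.30 if x else 0.02) else 0
        data.append((x, y))
    return data


def fit_rates(train: List[Row]) -> Dict[int, float]:
    """With one binary predictor, logistic-regression fitted risks are the per-level BM rates.

    Assumes both predictor levels occur in the training data.
    """
    rates = {}
    for level in (0, 1):
        outcomes = [y for x, y in train if x == level]
        rates[level] = sum(outcomes) / len(outcomes)
    return rates


def auc(scores: List[float], labels: List[int]) -> float:
    """Rank-based AUC with ties counted as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_split_auc(data: List[Row], n_splits: int = 100,
                       train_frac: float = 0.7, seed: int = 0) -> List[float]:
    """Shuffle, split 7:3, fit on the training part, score the testing part; repeat."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(round(train_frac * len(shuffled)))
        train, test = shuffled[:cut], shuffled[cut:]
        rates = fit_rates(train)
        aucs.append(auc([rates[x] for x, _ in test], [y for _, y in test]))
    return aucs


def summarize(aucs: List[float]) -> Tuple[float, float, float]:
    """Mean AUC with the 2.5% and 97.5% quantiles (first and last of 39 cut points)."""
    cuts = statistics.quantiles(aucs, n=40, method="inclusive")
    return statistics.mean(aucs), cuts[0], cuts[-1]


if __name__ == "__main__":
    mean_auc, q_low, q_high = summarize(repeated_split_auc(make_synthetic()))
    print(f"mean AUC {mean_auc:.3f} (2.5%: {q_low:.3f}, 97.5%: {q_high:.3f})")
```

Reporting quantiles of the split-to-split AUC distribution, rather than a single held-out AUC, shows how much the estimate depends on which patients fall in the testing set.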

Table 1 Patient Characteristics
Table 2 Odds Ratios for Prediction of Brain Metastases at Diagnosis
Fig. 1: The area under the curve for each model based on 100 random splits of the data.

AUC, area under the curve.

Fig. 2: Nomogram for prediction of brain metastases from breast cancer.

T stage, tumor stage; N stage, nodal stage; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor 2.

Fig. 3: Nomogram for prediction of brain metastases from melanoma.

T stage, tumor stage; N stage, nodal stage.

Fig. 4: Nomogram for prediction of brain metastases from colorectal cancer.

T stage, tumor stage; N stage, nodal stage; CEA, carcinoembryonic antigen.

Fig. 5: Nomogram for prediction of brain metastases from kidney cancer.

T stage, tumor stage; N stage, nodal stage.

Fig. 6: Nomogram for prediction of brain metastases from non-small cell lung cancer.

T stage, tumor stage; N stage, nodal stage.

Fig. 7: Nomogram for prediction of brain metastases from small cell lung cancer.

T stage, tumor stage; N stage, nodal stage.
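How a nomogram of this kind turns a logistic regression into a points scale can be sketched in a few lines of Python. The sketch assumes the usual construction: each predictor's log-odds contribution is rescaled so that the largest effect is worth 100 points, and total points map back to a probability through the logistic function. The three coefficients are the natural logs of the breast cancer odds ratios reported in the Results for bone, lung, and liver metastases; the intercept is hypothetical, and the published nomograms include many more variables, so the output is not a usable risk estimate. The sketch handles only binary predictors with positive coefficients.

```python
import math
from typing import Dict

# Log-odds coefficients taken from the breast cancer odds ratios in the text.
COEFS: Dict[str, float] = {
    "bone_mets": math.log(14.42),
    "lung_mets": math.log(5.35),
    "liver_mets": math.log(2.01),
}
INTERCEPT = -7.0  # hypothetical baseline log-odds; not reported in the paper


def nomogram_points(coefs: Dict[str, float]) -> Dict[str, float]:
    """Rescale coefficients so the strongest predictor is worth 100 points."""
    max_effect = max(coefs.values())
    return {name: 100.0 * beta / max_effect for name, beta in coefs.items()}


def total_points(patient: Dict[str, bool], points: Dict[str, float]) -> float:
    """Sum the points of every predictor that is present for this patient."""
    return sum(points[name] for name, present in patient.items() if present)


def points_to_risk(total: float, coefs: Dict[str, float], intercept: float) -> float:
    """Convert total points back to log-odds, then to a predicted probability."""
    max_effect = max(coefs.values())
    linear_predictor = intercept + total * max_effect / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    pts = nomogram_points(COEFS)
    patient = {"bone_mets": True, "lung_mets": True, "liver_mets": False}
    score = total_points(patient, pts)
    print("points per predictor:", pts)
    print("total points:", score)
    print("illustrative risk:", points_to_risk(score, COEFS, INTERCEPT))
```

The points scale is only a change of units: summing points and converting back gives the same probability as summing the coefficients directly, which is why a paper nomogram and a web calculator built on the same model agree.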

Breast cancer

For patients diagnosed with breast cancer, those with BM were more likely to be black race (18.3% vs. 11.9%), have high grade tumors (G3/4 35.4% vs. 26.0%) and more advanced T (T3/4 36.8% vs. 6.2%) and N stage (N2/3 20.0% vs. 2.6%), as well as metastases to bone (65.0% vs. 2.7%), lung (44.4% vs. 1.1%), and liver at diagnosis (31.1% vs. 0.9%). Patients with BM were also more likely to have estrogen receptor (ER) negative (30.0% vs. 14.1%), progesterone receptor (PR) negative (40.9% vs. 22.0%), and human epidermal growth factor receptor 2 (HER2) positive disease (22.5% vs. 10.3%).

Those with tumor grade 2, 3, and 4 had 1.65 (95% CI: 1.42–1.91), 1.93 (95% CI: 1.66–2.25), and 2.60 (95% CI: 1.79–3.77) times the odds of BM, respectively, compared to those with tumor grade 1. Having T1 disease was associated with lower odds of BM compared to having clinical T0 or T2-4 disease, while patients with N0 disease had lower odds of BM compared to those with N1-N3 disease. Those with bone metastases had 14.42 (95% CI: 13.48–15.44) times the odds of BM compared to those without bone metastases. Those with lung metastases had 5.35 (95% CI: 5.02–5.69) times the odds of BM compared to those without lung metastases. Those with liver metastases had 2.01 (95% CI: 1.88–2.14) times the odds of BM compared to those without. Patients with ER+ disease had lower odds of BM, with an OR of 0.61 (95% CI: 0.56–0.66). Similarly, those with PR+ disease had lower odds of BM, with an OR of 0.67 (95% CI: 0.62–0.73). HER2+ disease was associated with slightly higher odds of BM (OR = 1.13, 95% CI: 1.06–1.21) compared to HER2- disease. Across the 100 7:3 training/testing random splits, the model showed an average AUC of 0.95 (Fig. 1).
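The relationship between a logistic regression coefficient, its odds ratio, and the 95% CI can be checked with a short calculation. The Python sketch below assumes Wald-type intervals, exp(β ± 1.96·SE), which the text does not state explicitly; the helper names are hypothetical. Under that assumption, the interval is symmetric on the log scale, so the coefficient and standard error can be recovered from the reported bounds.

```python
import math
from typing import Tuple

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta: float, se: float, z: float = Z_95) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) for a coefficient and its standard error (Wald interval)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def implied_beta_se(lower: float, upper: float, z: float = Z_95) -> Tuple[float, float]:
    """Recover the coefficient and standard error implied by a reported Wald CI."""
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


if __name__ == "__main__":
    # Reported for bone metastases in breast cancer: OR 14.42 (95% CI: 13.48-15.44).
    beta, se = implied_beta_se(13.48, 15.44)
    print("implied coefficient and SE:", beta, se)
    print("reconstructed OR and CI:", odds_ratio_ci(beta, se))
```

For the bone metastasis estimate, the midpoint of the log-scale interval exponentiates to a value that matches the reported OR of 14.42 up to rounding, which is consistent with an interval of this form.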

Melanoma

For patients diagnosed with melanoma, those with BM were more likely to be male (71.3% vs. 58.1%), to have high-grade tumors (G3/4 1.8% vs. 0.4%), and to have clinical T stage 0 (22.2% vs. 1.0%). Additionally, they were more likely to have more advanced nodal disease (N2/3 5.7% vs. 1.2%), metastatic disease to bone (19.1% vs. 0.6%), lung (52.5% vs. 1.1%), and liver (21.4% vs. 0.6%) at diagnosis (Figs. 37).

Female patients had 0.66 (95% CI: 0.62–0.70) times the odds of BM compared to males. Those with grade 3 tumors had about 2.64 (95% CI: 1.06–6.56) times the odds of BM compared to those with grade 1. Patients with T1 disease had lower odds of BM compared to those with T0, T3, and T4 disease, and those with N0 disease had lower odds of BM compared to those with N1-3 disease. Furthermore, those with bone metastases had 2.01 (95% CI: 1.82–2.22) times the odds of BM compared to those without bone metastases. Those with lung metastases had 23.40 (95% CI: 21.86–25.05) times the odds of BM compared to those without lung metastases. Those with liver metastases had 1.68 (95% CI: 1.52–1.85) times the odds of BM compared to those without. Patients with tumor ulceration had 2.19 (95% CI: 1.92–2.50) times the odds of BM compared to those without ulceration. The average AUC across 100 random splits is 0.94 (Fig. 1).

Colorectal cancer

For patients diagnosed with CRC, those with BM were more likely to be younger (65.4 vs. 67.3) and male (52.5% vs. 49.3%). They were more likely to have higher grade tumors (G3/4 20.6% vs. 15.8%) and higher T (T4 7.1% vs. 4.8%) and N stage (N2 7.7% vs. 2.7%). Additionally, they were more likely to have bone (22.8% vs. 1.0%), lung (50.9% vs. 4.1%), and liver metastases at diagnosis (54.1% vs. 15.0%), as well as positive CEA (45.7% vs. 24.2%).

Black race conferred 0.73 (95% CI: 0.64–0.84) times the odds of BM compared to white race. Those with grade 3 and 4 tumors had 2.70 (95% CI: 2.05–3.56) and 2.72 (95% CI: 1.88–3.91) times the odds of BM compared to those with grade 1 tumors. Patients with T0 disease had 3.25 times the odds of BM compared to those with T1 disease (95% CI: 2.26–4.69), while those with N0 disease had lower odds of BM compared to those with N1-2 disease. In addition, those with bone metastases had 5.38 (95% CI: 4.77–6.07) times the odds of BM compared to those without bone metastases. Those with lung metastases had 9.76 (95% CI: 8.70–10.96) times the odds of BM compared to those without lung metastases. Those with liver metastases had 1.41 (95% CI: 1.26–1.58) times the odds of BM compared to those without liver metastases. Patients with positive CEA had 1.83 (95% CI: 1.54–2.17) times the odds of BM compared to those with normal CEA. The average AUC across 100 random splits is 0.89 (Fig. 1).

Kidney cancer

For patients with kidney cancer, those with BM were more likely to be male (69.0% vs. 62.5%) and white race (87.0% vs. 83.3%). They also were more likely to have higher N stage (N1-3 21.1% vs. 4.4%) as well as have bone (36.2% vs. 4.4%), lung (63.8% vs. 6.1%), and liver metastases (17.8% vs. 2.2%) at diagnosis. A higher proportion of patients with BM had sarcomatoid features (7.2% vs. 2.7%) and grade 4 Fuhrman nuclear grade (8.6% vs. 4.7%). Histology for patients with BM was more likely to be renal cell carcinoma (RCC) as compared to adenocarcinoma/papillary adenocarcinoma or urothelial carcinoma / other (86.7% vs. 70.5%).

Female patients had slightly lower odds of BM compared to males (OR = 0.88, 95% CI: 0.82–0.93), and black patients had lower odds compared to white patients (OR = 0.69, 95% CI: 0.62–0.77). Those with grade 3 tumors had 1.53 (95% CI: 1.08–2.16) times the odds of BM compared to those with grade 1 tumors. Patients with T1 disease had lower odds of BM compared to those with T0 or T2-4 disease, while those with N0 disease had lower odds of BM compared to those with N1 disease. Those with bone metastases had about double the odds of BM (OR = 2.23, 95% CI: 2.08–2.39) compared to those without bone metastases. Those with lung metastases had 10.48 (95% CI: 9.74–11.27) times the odds of BM compared to those without lung metastases. Also, those with liver metastases had 1.26 (95% CI: 1.15–1.37) times the odds of BM compared to those without liver metastases. Patients with sarcomatoid features had 1.36 (95% CI: 1.18–1.56) times the odds of BM compared to those without sarcomatoid features. Having RCC was associated with 2.54 times the odds of BM (95% CI: 2.21–2.90) compared to having adenocarcinoma or papillary adenocarcinoma histology. The average AUC across 100 random splits is 0.91 (Fig. 1).

Non-small cell lung cancer

For patients with NSCLC, those with BM were more likely to be younger (64.9 vs. 69.3) and black race (12.7% vs. 10.8%). BM were more common in patients with unknown grade (66.2% vs. 48.9%), as well as more advanced T (T3/4 36.5% vs. 25.3%) and N stage (N2/3 51.5% vs. 29.9%), and those with bone (33.5% vs. 12.0%), lung (21.6% vs. 9.3%), and liver metastases (16.9% vs. 5.3%) at diagnosis. Patients diagnosed with BM also had a higher proportion of adenocarcinoma histology compared to those without BM (65.4% vs. 48.2%).

Female patients had slightly higher odds of BM compared to males (OR = 1.03, 95% CI: 1.00–1.03). Patients with grade 2 to 4 tumors, respectively, had 2.12 (95% CI: 2.00–2.24), 3.93 (95% CI: 3.72–4.15), and 4.01 (95% CI: 3.69–4.37) times the odds of BM compared to those with grade 1 tumors. Having T1 disease was associated with lower odds of BM compared to having T0 or T2-4 disease, while having N0 disease was associated with lower odds of BM compared to having N1-3 disease. Those with bone metastases had 1.91 (95% CI: 1.88–1.94) times the odds of BM compared to those without bone metastases. Those with lung metastases had 1.53 (95% CI: 1.50–1.56) times the odds of BM compared to those without lung metastases. Also, those with liver metastases had 1.77 (95% CI: 1.74–1.81) times the odds of BM compared to those without liver metastases. Patients with squamous cell carcinoma had about 0.76 times the odds of BM compared to those with adenocarcinoma histology (95% CI: 0.74–0.77). The average AUC across 100 random splits is 0.78 (Fig. 1).

Small cell lung cancer

For patients with SCLC, those with BM were more likely to be younger (65.4 vs. 67.9), males (50.3% vs. 46.8%), and black race (8.8% vs. 7.8%). Patients with BM were also more likely to have an unknown grade (82.2% vs. 79.0%), higher T stage (T4, 25.5% vs. 23.5%), and unknown N stage (23.1% vs. 18.8%). Additionally, patients with BM were more likely to have bone (28.2% vs. 19.9%), liver (32.1% vs. 28.0%), and lung metastases (15.9% vs. 9.9%) at diagnosis.

Female patients had slightly lower odds of BM compared to males (OR = 0.90, 95% CI: 0.87–0.92) and black patients had slightly higher odds compared to white patients (OR = 1.11, 95% CI: 1.06–1.17). Patients with T1 disease had lower odds of BM compared to those with T0 or T2-4 disease, and those with N0 disease had higher odds of having BM compared to those with N3 disease. Those with bone metastases had 1.44 (95% CI: 1.39–1.48) times the odds of BM compared to those without bone metastases. Those with lung metastases had 1.65 (95% CI: 1.59–1.72) times the odds for BM compared to those without lung metastases. In addition, those with liver metastases did not have significantly higher odds for BM compared to those without liver metastases. The average AUC across 100 random splits is 0.62 (Fig. 1).

Discussion

Limited resources exist to estimate the risk of BM at the time of initial cancer diagnosis, and only SCLC and NSCLC have clear recommendations in the NCCN guidelines regarding the use of brain MRI for staging. In this work, we comprehensively studied the presence of brain metastases in multiple cancer types based on clinical and pathologic factors. This study developed and validated disease-specific models to predict the presence of BM in patients with a new cancer diagnosis. The models for breast cancer, melanoma, kidney cancer, and CRC exhibited excellent to outstanding discrimination25, with average AUC values based on random training/testing data splitting all larger than 0.87. The model for SCLC had poor discrimination (average AUC of 0.62), and the model for NSCLC showed acceptable discrimination (average AUC of 0.78). These findings can inform guidelines for cancer staging, and the nomograms and Webtool developed from our models can aid oncologists in the clinic by providing a pre-test probability of the presence of BM when considering brain imaging.

Detailing the multiplicity of cancer type-specific clinical and histological variables that confer a high (>10%) risk of harboring BM in current staging guidelines would be cumbersome. The generated nomograms and associated web application help operationalize our findings and can aid the clinical decision to obtain a brain MRI as part of the initial staging work-up. In addition, Supplementary Tables 19 through 24 show the characteristics of patients with each type of cancer who have either <1%, 1–10%, or >10% estimated risk of having BM. As expected, populations with a >10% risk of BM generally have a higher proportion of patients with bone, liver, and lung metastases as well as more advanced T and N stage.
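The three risk subgroups can be expressed as a simple classification of each patient's predicted probability. The Python sketch below is illustrative only: the function names are hypothetical, and placing risks of exactly 1% and exactly 10% in the intermediate group is an assumption read from the labels "<1%", "1–10%", and ">10%".

```python
from collections import Counter
from typing import Dict, Iterable


def risk_group(predicted_risk: float) -> str:
    """Map a predicted probability of BM (0 to 1) to low / intermediate / high."""
    if not 0.0 <= predicted_risk <= 1.0:
        raise ValueError("predicted_risk must be a probability between 0 and 1")
    if predicted_risk < 0.01:
        return "low"
    if predicted_risk <= 0.10:
        return "intermediate"
    return "high"


def group_counts(predicted_risks: Iterable[float]) -> Dict[str, int]:
    """Count how many patients fall in each risk subgroup."""
    counts = Counter(risk_group(p) for p in predicted_risks)
    return {name: counts.get(name, 0) for name in ("low", "intermediate", "high")}


if __name__ == "__main__":
    # Hypothetical predicted risks for five patients; not study data.
    print(group_counts([0.002, 0.008, 0.03, 0.10, 0.25]))
```

Because the cutoffs are acknowledged in the Methods to be arbitrary, keeping them in one small function makes it easy to see how the subgroup sizes shift if different thresholds are preferred.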

The model developed for SCLC warrants further discussion given its poor discrimination, with an average AUC of 0.62. The authors feel this is representative of the biology of SCLC, as it is known that SCLC has a high propensity for brain metastasis26. This is reflected in Supplementary Table 24, which shows that no individuals in our study had a <1% risk of having BM as predicted by the nomogram. This supports the NCCN recommendation of screening brain MRI for all patients diagnosed with SCLC, since the likelihood of brain metastases is relatively high and a highly accurate nomogram could not be generated to discriminate between the presence and absence of brain metastasis at diagnosis.

The other models developed herein compare favorably to prior models predicting BM. A nomogram to predict BM from newly diagnosed breast cancer utilizing the Surveillance, Epidemiology, and End Results (SEER) database demonstrated an AUC of 0.64 (ref. 27), compared with our AUC of about 0.95 for breast cancer. Zhang et al. used the SEER database to develop and validate a nomogram for squamous cell carcinoma of the lung with an AUC of 0.8 (ref. 28), paralleling our NSCLC model AUC of about 0.78. There are limited data to predict BM at diagnosis beyond these reports, underscoring the utility of our work.

This study has several limitations. First, the NCDB does not capture CNS symptoms at diagnosis. Patients with symptomatic disease in the brain are more likely to receive a brain MRI and be subsequently diagnosed and coded as having BM. Thus, the models and nomograms generated may overestimate the risk of BM in patients who are asymptomatic, particularly for cancers other than SCLC and NSCLC, the two diseases for which the NCCN guidelines recommend brain MRI screening. Conversely, since MRI screening is not utilized across all patients, the models may underestimate the true rate of brain metastases, as some patients may have harbored asymptomatic brain metastases but did not undergo MRI screening. Additionally, we included patients with missing data in this study. As seen in the nomograms, multiple variables include “unknown” as a category, and in general the unknown category is more likely to be associated with BM. The authors propose that this reflects clinical practice when a patient is diagnosed with BM. For example, if a patient presents with BM, primary tumor characteristics such as grade, T stage, and N stage no longer play a strong role in treatment recommendations, and as such may not be documented or coded appropriately and are thus listed as “unknown.” Also, the NCDB does not contain information regarding driver mutations, which may affect biologic aggressiveness and risk of brain metastasis29,30. Furthermore, some variables within the models may be inherently correlated (e.g., triple-negative breast cancer, high grade, and black race), potentially resulting in some variables not being associated with brain metastases. We elected to keep all baseline demographic and tumor variables in the models regardless of their association with brain metastases for coherency of the models across cancer types. Lastly, most patients with brain metastasis in our study also had metastasis to liver, lung, and/or bone. As such, predictive power may be lower in patients without evidence of other metastatic disease, particularly in kidney cancer, breast cancer, colorectal cancer, and melanoma, where brain-only metastatic disease is less common.

In conclusion, we developed and validated models that predict the presence of BM at diagnosis for patients diagnosed with breast cancer, melanoma, CRC, kidney cancer, NSCLC, and SCLC. This work can be referred to in guidelines for cancer staging, and the nomograms and Webtool can guide clinicians in the decision to obtain a brain MRI as part of the staging work-up.