Abstract
Astrocytoma is a common brain tumor that can occur in any part of the central nervous system. It is extremely harmful to patients, yet the risk factors for astrocytoma of the brain have not been clearly studied. This study used the SEER database to determine the risk factors affecting the survival of patients with astrocytoma of the brain. Patients diagnosed with brain astrocytoma from 2004 to 2015 were extracted from the SEER database and screened against inclusion and exclusion criteria. The patients retained were classified as low grade or high grade according to the WHO classification. First, the risk factors affecting survival in low-grade and high-grade brain astrocytoma were analyzed separately using univariate Kaplan–Meier curves and log-rank tests. Second, the data were randomly divided into a training set and a validation set in a 7:3 ratio; the training set was analyzed by univariate and multivariate Cox regression to screen the risk factors affecting survival, and a nomogram was established to predict patient survival at 3 and 5 years. The area under the ROC curve (AUC), the C-index, and calibration curves were used to evaluate the discrimination and calibration of the model. Univariate Kaplan–Meier survival curves and log-rank tests showed that the risk factors affecting the prognosis of patients with low-grade astrocytoma included Age, Primary site, Tumor histological type, Grade, Tumor size, Extension, Surgery, Radiation, Chemotherapy and Tumor number; those affecting the prognosis of patients with high-grade astrocytoma included Age, Primary site, Tumor histological type, Tumor size, Extension, Laterality, Surgery, Radiation, Chemotherapy and Tumor number.
Through Cox regression, the independent risk factors for each grade were screened separately, and nomograms for low-grade and high-grade astrocytoma were established to predict patient survival at 3 and 5 years. For low-grade astrocytoma, the 3-year and 5-year AUC values in the training set were 0.829 and 0.801, with a C-index of 0.818 (95% CI 0.779, 0.857); in the validation set the AUC values were 0.902 and 0.829, with a C-index of 0.834 (95% CI 0.785, 0.883). For high-grade astrocytoma, the AUC values in the training set were 0.814 and 0.806, with a C-index of 0.774 (95% CI 0.758, 0.790); in the validation set the AUC values were 0.802 and 0.823, with a C-index of 0.766 (95% CI 0.752, 0.780). The calibration curves of the training and validation sets were well fitted for both grades. This study used data from the SEER database to identify risk factors affecting the survival prognosis of patients with brain astrocytoma, which can provide some guidance for clinicians.
Introduction
Brain tumors are abnormal proliferations of cells in the brain and are the most common malignant tumors of the central nervous system1,2. Clinically, they are divided into primary brain tumors and metastatic brain tumors3. Primary brain tumors arise from cells of the central nervous system; they account for about 1% of new cancers and about 2% of cancer deaths in the United States, and the main type is glioma4. Previous studies have also found that brain tumors in childhood have a great impact on both morbidity and mortality in children5. At present, the traditional clinical treatments for brain tumors are surgery, radiotherapy and chemotherapy6,7,8. Astrocytoma is an aggressive tumor with a very poor prognosis; reasonable treatment can slightly improve survival, but the risk factors for this tumor have rarely been clearly studied9. This study used data from a U.S. national public database, the SEER database, to analyze risk factors affecting the survival of patients with brain astrocytoma. SEER is currently the largest public cancer database, covering approximately 28% of the U.S. population, and it includes basic demographic information together with information on the relevant cancer characteristics10.
In recent years, nomograms have been widely used in the prediction of various diseases, especially tumors. A nomogram meets the need for an integrated model and plays a very important role in the current "digital medicine" environment, making prognosis prediction easier for clinicians11,12,13. Therefore, this study aims to use data from the SEER database to screen for risk factors affecting the survival of patients with brain astrocytoma, and to establish a nomogram model of patient survival at 3 and 5 years, so as to help clinicians predict patient prognosis.
Materials and methods
Data source
The data for this study were taken from the SEER database established by the National Cancer Institute. We selected the database containing 13 registries with radiotherapy data, which provided the data needed to complete this study. A total of 6154 patients diagnosed with astrocytoma of the brain from 2004 to 2015 were extracted from the database, of whom 2214 were retained after applying the inclusion and exclusion criteria. Five types of astrocytoma were included: diffuse astrocytoma, anaplastic astrocytoma, pilocytic astrocytoma, unique astrocytoma variants, and astrocytoma, NOS.
Inclusion and exclusion criteria
Inclusion criteria
(i) Patients with astrocytoma of the brain diagnosed in 2004–2015; (ii) ICD-O-3 topography code C70.0–C75.3, including the brain, frontal lobe, parietal lobe, temporal lobe, occipital lobe, etc.; (iii) complete clinical information.
Exclusion criteria
(i) Baseline information (e.g., race) is unknown; (ii) tumor size and tumor number are missing; (iii) survival time is unknown; (iv) proven only at autopsy or death.
Grouping methods
For a more intuitive and standardized analysis, the study data were transformed into dichotomous or multi-categorical variables. Age was classified into five groups: < 20, 20–39, 40–59, 60–79, and ≥ 80 years. Race was classified as black, white, and others. Histological type was classified into five categories: diffuse astrocytoma, anaplastic astrocytoma, pilocytic astrocytoma, unique astrocytoma variants, and astrocytoma, NOS. Primary site was divided into brain (C71.0–C71.5), cerebellum (C71.6), brainstem (C71.7), spinal cord (C72.0), and others (C70.0, C71.8, C71.9, C72.3, C72.5, C72.8, C75.1, C75.3). Laterality was unilateral or bilateral, and Grade was I–IV. For the continuous variables Tumor size and Extension, X-tile was used to select the best cut points: Tumor size was classified as ≤ 60 mm and ≥ 61 mm, and Extension as 10–30 mm and 40–75 mm. Surgery, Radiation and Chemotherapy were each coded yes/no, and Tumor number was grouped as 1 and > 1.
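The grouping rules above can be written as a few small mapping functions. The following is a minimal sketch in Python, not the tooling used in the study; the function names and group labels are hypothetical, and only the cut points and site codes stated in the text are used.

```python
def age_group(age_years):
    """Map age in years to the five age groups described in the text."""
    if age_years < 20:
        return "<20"
    if age_years < 40:
        return "20-39"
    if age_years < 60:
        return "40-59"
    if age_years < 80:
        return "60-79"
    return ">=80"


def tumor_size_group(size_mm):
    """Dichotomize tumor size at the reported cut point (<= 60 mm vs >= 61 mm)."""
    return "<=60 mm" if size_mm <= 60 else ">=61 mm"


def tumor_number_group(n_tumors):
    """Group tumor number as 1 versus more than 1."""
    return "1" if n_tumors == 1 else ">1"


def primary_site_group(topography_code):
    """Group an ICD-O-3 topography code into the primary-site categories."""
    if topography_code in {"C71.0", "C71.1", "C71.2", "C71.3", "C71.4", "C71.5"}:
        return "brain"
    if topography_code == "C71.6":
        return "cerebellum"
    if topography_code == "C71.7":
        return "brainstem"
    if topography_code == "C72.0":
        return "spinal cord"
    # Every remaining code listed in the text falls into the "others" group.
    return "others"
```

Keeping each rule in its own function makes the cut points easy to audit against the text and to change if a different grouping is chosen.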
Statistical methods
The data extracted from the SEER database were first organized in Excel according to the inclusion and exclusion criteria and classified into low-grade and high-grade brain astrocytoma according to the WHO classification. Survival rates were calculated by the Kaplan–Meier method using R-studio 4.2.2, separately for low-grade and high-grade patients; the effect of each included factor on patient survival was shown by K–M curves, and the log-rank test was used to compare groups within the same variable. The low-grade and high-grade data were each randomly divided into a training set and a validation set in a 7:3 ratio with R-studio, and χ2 tests comparing the variables between the training and validation sets were performed using SPSS. Univariate and multivariate Cox regression analyses were performed on the training set data using R-studio 4.1.1, and a nomogram of the final selected variables was created using the R packages 'rms', 'foreign', and 'survival'. The area under the ROC curve (AUC) and the C-index were used to evaluate the accuracy of the model; both take values from 0 to 1, and values closer to 1 indicate a more accurate model. The calibration curve was used to evaluate the calibration of the model; the closer the calibration curve lies to the standard curve, the stronger the predictive ability of the model. Differences were considered statistically significant at P < 0.05, except in the univariate Cox regression, where the threshold was P < 0.1.
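To make the Kaplan–Meier step concrete, the sketch below computes the product-limit survival estimate from follow-up times and event indicators. It is an illustrative Python version, not the R code used in the study; the function names and the toy data in the usage note are assumptions.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed death.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def survival_at(curve, t):
    """Read the estimated survival probability at time t from a KM curve."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s
```

For example, with made-up follow-up times `[6, 12, 12, 20, 30, 36]` and event flags `[1, 1, 0, 1, 0, 1]`, `survival_at(kaplan_meier(times, events), 24)` reads off the estimated survival at 24 months. Comparing such curves between the levels of one variable is what the log-rank test formalizes.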
Results
Comparison of patient baseline features
A total of 2214 patients were included in this study: 539 with low-grade astrocytoma and 1675 with high-grade astrocytoma. Using R-studio 4.2.2, each group was randomly split into a training set and a validation set in a 7:3 ratio, giving 379 patients in the low-grade training set and 160 in the low-grade validation set, and 1175 patients in the high-grade training set and 500 in the high-grade validation set. When the variables were compared between the training and validation sets, the χ2 tests gave P > 0.05; the differences were not statistically significant, consistent with random assignment of the two sets. Information on the two grades and the results of the χ2 tests are shown in Tables 1 and 2.
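The split-and-compare step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the R-studio and SPSS procedure used in the study: the function names are hypothetical, the set sizes depend on the rounding rule and seed (so the sketch is not expected to reproduce the counts reported above), and only the Pearson χ2 statistic is computed, without a P value.

```python
import random


def split_train_validation(records, train_fraction=0.7, seed=0):
    """Randomly split records into training and validation lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table given as a list of rows.

    For the baseline comparison, one row holds the category counts in the
    training set and the other row the counts in the validation set.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic
```

A statistic near zero means the category proportions are nearly identical in the two sets, which is the pattern expected after a random split.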
Impact of different factors on patient survival
Risk factor analysis affecting survival in patients with low-grade astrocytoma
By univariate Kaplan–Meier survival curves and log-rank tests, Age (P < 0.0001), Primary site (P < 0.0001), Tumor histological type (P < 0.0001), Tumor size (P = 0.01), Extension (P = 0.00013), Surgery (P = 0.00016), Radiation (P < 0.0001), Chemotherapy (P < 0.0001) and Tumor number (P = 0.015) were risk factors affecting the prognosis of patients with low-grade astrocytoma. The K-M survival curves and log-rank test results showed that Age ≥ 80 years, Primary site at the brainstem, Diffuse astrocytoma, Tumor size ≥ 61 mm, deeper Extension, Bilateral tumors, absence of Surgery, receipt of Radiotherapy or Chemotherapy, and Tumor number > 1 were all related to poorer survival (Fig. 1).
Risk factor analysis affecting survival in patients with high-grade astrocytoma
By univariate Kaplan–Meier survival curves and log-rank tests, Age (P < 0.0001), Primary site (P < 0.0001), Tumor histological type (P < 0.0001), Tumor size (P < 0.0001), Extension (P < 0.0001), Laterality (P = 0.01), Surgery (P < 0.0001), Radiation (P < 0.0001), Chemotherapy (P < 0.0001) and Tumor number (P < 0.0001) were risk factors for the prognosis of patients with high-grade astrocytoma. The K-M survival curves and log-rank test results showed that Age ≥ 80 years, Primary site at the brainstem, Astrocytoma, NOS, Tumor size ≤ 60 mm, deeper Extension, Bilateral tumors, no Surgery, no Radiotherapy or Chemotherapy, and Tumor number > 1 were all associated with poorer survival (Fig. 2).
Single-factor and multi-factor Cox regression results
Univariate and multivariate COX regression results for low-grade astrocytoma
Patient data from the low-grade astrocytoma training set (13 variables) were entered into univariate Cox regression analysis, which excluded the gender variable (P > 0.1). To avoid omitting important variables, the 12 variables with P < 0.1 in the univariate Cox regression were entered into the multivariate Cox regression. A factor with P < 0.1 in the univariate analysis was considered associated with patient survival; a factor with P < 0.05 in the multivariate analysis was considered an independent factor affecting patient survival. The univariate Cox regression results showed that age greater than 40 years, white race, histological type, primary site, bilateral tumors, grade II, larger tumor size, deeper extension into the brain, surgery, radiotherapy, chemotherapy, and the number of tumors were related to patient prognosis and survival. The multivariate Cox regression results showed that older age, bilateral tumors, radiotherapy and chemotherapy were independent factors affecting patient survival (Table 3).
Univariate and multivariate COX regression results for high-grade astrocytoma
The univariate Cox regression results for high-grade astrocytoma showed that age greater than 60 years, diffuse astrocytoma, primary site, bilateral tumors, tumor size, deeper extension into the brain, surgery, radiotherapy, chemotherapy, and tumor number were related to patient prognosis and survival. The multivariate Cox regression results showed that older age, diffuse astrocytoma, primary site, bilateral tumors, tumor size, deeper extension into the brain, surgery, radiotherapy, and chemotherapy were independent factors affecting patient survival (Table 4).
Creation of nomogram
The variables selected by the multivariate Cox regression analysis (P < 0.05) were entered into R-studio to create a nomogram model. Each value of each variable corresponds to a score; the scores of all variables are added to give a total score, and the total score is then used to predict the patient's survival rate at 3 and 5 years (Figs. 3, 4).
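The scoring logic of a nomogram can be illustrated with a short sketch. The Python code below shows one common construction, in which the variable with the widest range of Cox coefficients spans 0 to 100 points and predicted survival follows from a baseline survival raised to the exponentiated linear predictor. It is not the output of the 'rms' package used in the study, and the coefficients and baseline survival value are hypothetical numbers chosen for illustration, not values from Tables 3 or 4.

```python
import math

# Hypothetical log hazard ratios per level (NOT taken from the study's tables).
COEFS = {
    "age": {"<60": 0.0, "60-79": 0.7, ">=80": 1.4},
    "laterality": {"unilateral": 0.0, "bilateral": 0.5},
    "surgery": {"yes": 0.0, "no": 0.6},
}


def points_table(coefs):
    """Rescale coefficients so the widest-ranging variable spans 0-100 points."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    table = {}
    for var, levels in coefs.items():
        lowest = min(levels.values())
        table[var] = {
            level: 100.0 * (beta - lowest) / widest
            for level, beta in levels.items()
        }
    return table


def total_points(patient, table):
    """Add up the points for a patient's level of every variable."""
    return sum(table[var][patient[var]] for var in table)


def predicted_survival(patient, coefs, baseline_survival):
    """S(t | x) = S0(t) ** exp(linear predictor) under a Cox model."""
    linear_predictor = sum(coefs[var][patient[var]] for var in coefs)
    return baseline_survival ** math.exp(linear_predictor)
```

With these made-up numbers, a patient at every reference level scores 0 points and has the baseline survival, while a patient aged ≥ 80 years with bilateral tumors and no surgery accumulates the most points and the lowest predicted survival.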
Validation of nomogram
The area under the ROC curve and C-index were used to evaluate the discrimination of the model, and the calibration curve was used to evaluate the calibration of the model.
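As an illustration of the discrimination measure, the sketch below computes Harrell's C-index, the fraction of comparable patient pairs in which the patient who died earlier was assigned the higher predicted risk. It is a plain Python version written for clarity, not the routine used in the study; the function name and the toy inputs in the usage note are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       : follow-up time for each patient
    events      : 1 if death was observed, 0 if censored
    risk_scores : model output where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

For made-up inputs such as `concordance_index([5, 10, 15, 20], [1, 1, 0, 1], [4, 3, 2, 1])`, the predicted risks are in exactly the reverse order of the survival times, so every comparable pair is concordant. A value of 0.5 corresponds to random ordering.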
Validation of nomogram in patients with low-grade astrocytoma
For patients with low-grade astrocytoma, the AUC values for 3-year and 5-year survival were 0.829 and 0.801 in the training set and 0.902 and 0.829 in the validation set (Fig. 5). The C-index was 0.818 (95% CI 0.779, 0.857) in the training set and 0.834 (95% CI 0.785, 0.883) in the validation set. The predicted 3-year and 5-year survival curves for the training and validation sets in Fig. 6 lie close to the actual curves, indicating that the model is well calibrated.
Validation of nomogram in patients with high-grade astrocytoma
The AUC values of 3-year and 5-year survival rates of patients with high-grade astrocytoma training set were 0.814 and 0.806, respectively, and the AUC values of 3-year and 5-year survival rates of patients in the validation set were 0.802 and 0.823, respectively (Fig. 7). The C-index was 0.774 (95% CI 0.758, 0.790) for patients in the training set and 0.766 (95% CI 0.752, 0.780) for patients in the validation set. The 3-year and 5-year predicted survival curves of patients in the training and validation sets were in line with the true curves (Fig. 8).
Discussion
Under the current trend of "digital medicine", it is important for both doctors and patients to combine clinical diagnosis with intelligent tools to assess the patient's condition and prognosis-related risk factors. On the one hand, this helps doctors understand the patient's condition in time to choose more appropriate treatment; on the other hand, it gives patients a clearer understanding of their own condition, which can greatly promote communication between doctors and patients. In recent years, more and more scholars have conducted tumor research by mining the SEER database, generating a variety of tumor prediction models, which may become a new direction for tumor research14.
Patients with astrocytoma of the brain diagnosed from 2004 to 2015 in the SEER database were included in this study, and 2214 patients were retained after applying the inclusion and exclusion criteria. Patients of each grade were randomly divided into a training set and a validation set in a 7:3 ratio. Univariate Kaplan–Meier survival analysis showed that, with the exception of race and gender, the included factors had an impact on patient survival in both low-grade and high-grade brain astrocytoma. The univariate and multivariate Cox regression analyses of the training set data for both grades showed that not receiving radiotherapy or chemotherapy was a protective factor for patients with low-grade brain astrocytoma, with an OR less than 1, whereas the opposite was true for high-grade tumors. This may indicate that some patients with low-grade tumors survive longer without radiotherapy, while patients with high-grade astrocytoma need radiotherapy to survive longer. This result is consistent with clinical practice and has some clinical significance. The Cox regression results were also consistent with the K-M curves, supporting the reliability of the results.
Age was found to be an important factor affecting patient survival in both low-grade and high-grade brain astrocytoma, consistent with the findings of other scholars. Previous studies have also found a strong relationship between brain tumors and age8, with older age predicting a higher risk of disease15,16. Some scholars studying advanced age and brain tumors have found that elderly people may have slower tumor progression17; however, the log-rank test and Cox regression results for both low-grade and high-grade tumors in this study indicated that older patients are more likely to have lower survival rates. Age is therefore an extremely important factor in the prognosis of patients with brain tumors and deserves further study. The gender distribution in this study was relatively balanced, whereas in terms of race, Whites were overwhelmingly represented. The K-M curves and Cox regression results showed that the differences by sex and by race were not statistically significant (P > 0.5). Other studies have found that the incidence and mortality of brain tumors in both men and women have decreased year by year in recent years, but no significant differences have been found between sexes or races18.
The primary site of the brain astrocytoma is also an important factor affecting survival. Comparing the K-M survival curves of low-grade and high-grade brain astrocytoma shows that the survival rate of patients with low-grade tumors is significantly higher than that of patients with high-grade tumors, which is consistent with clinical reality. In the data of this study, most tumors were concentrated in the cerebrum, whereas experts who have studied children's brain tumors have found that these tumors, especially astrocytoma, are more common in the cerebellum19; the difference may be related to the wider age distribution in the data of this study. The age of the patient may therefore affect the distribution of astrocytoma in the brain. The K–M survival curves by histological type showed that patients with pilocytic astrocytoma, a slow-growing benign tumor that generally does not require radiotherapy, had the highest survival rate in both low-grade and high-grade brain astrocytoma. The Cox regression results showed that diffuse astrocytoma was a major risk factor for patient survival, and astrocytoma has a poor prognosis9,20. The K–M survival curves also showed that patients with high-grade brain astrocytoma and bilateral tumors had lower survival than those with unilateral tumors, and that a greater number of tumors, deeper extension, and sequence number were associated with poorer patient survival.
However, this study found that smaller tumors were associated with lower patient survival, whereas studies on breast cancer21, adult glioma16 and peripheral schwannoma22 have found that larger tumors are related to poor prognosis. This inconsistency with clinical experience may arise because the classification of astrocytic tumors used in this study is not the latest classification standard, and the 2004–2015 database contains no molecular typing-related classification. The current treatment for high-grade or malignant brain tumors23 is surgery combined with radiotherapy and chemotherapy, which can benefit patient survival7,14. In this study, the OR for patients without surgery relative to patients with surgery was greater than 1 for both low-grade and high-grade tumors, indicating that patients who undergo surgery have a better prognosis; this result is consistent with the current conventional treatment of brain tumors in clinical practice. The present study also has some limitations. First, the SEER database provides a limited amount of information and no information on genes, so we could not study the prognostic factors of brain tumors at the genetic level19. Second, with the development of gene sequencing, brain tumors have entered the era of molecular typing, but the data extracted in this study predate 2016 and contain no molecular typing, so molecular typing analysis could not be performed. Different histological types would change the prognosis of patients, and this is worth further research in the future.
In conclusion, this study screened the risk factors for patients with low-grade and high-grade brain astrocytoma separately using univariate Kaplan–Meier survival curves, included the risk factors affecting the prognosis of both grades comprehensively, and successfully established nomograms. The AUC and C-index were high in both the training and validation sets for both grades, and the calibration curves fitted well, indicating that the nomograms have strong ability to predict the 3-year and 5-year survival rates of patients. However, since the data were obtained from the United States, more studies are needed to verify whether these results can be applied to the Chinese population. The results of this study can provide some reference for clinicians.
Data availability
The data that support the findings of this study are available from SEER database but restrictions apply to the availability of these data, which were used under license for the current study (ID: 12533-Nov2021), and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of SEER database.
References
Mcfaline-Figueroa, J. R. & Lee, E. Q. Brain tumors. Am. J. Med. 131(8), 874–882 (2018).
Jelski, W. & Mroczko, B. Molecular and circulating biomarkers of brain tumors. Int. J. Mol. Sci. 22(13), 1 (2021).
Lapointe, S., Perry, A. & Butowski, N. A. Primary brain tumours in adults. The Lancet 392(10145), 432–446 (2018).
Ostrom, Q. T. et al. Risk factors for childhood and adult primary brain tumors. Neuro Oncol. 21(11), 1357–1375 (2019).
Dang, M. & Phillips, P. C. Pediatric Brain Tumors. Continuum (Minneap Minn). 23(6), 1727–1757 (2017).
Tan, A. C. et al. Management of glioblastoma: State of the art and future directions. CA Cancer J. Clin. 70(4), 299–312 (2020).
Rasheed, S., Rehman, K. & Akash, M. S. H. An insight into the risk factors of brain tumors and their therapeutic interventions. Biomed. Pharmacother. 143, 112119 (2021).
Le Rhun, E. et al. Molecular targeted therapy of glioblastoma. Cancer Treat Rev. 80, 101896 (2019).
Hirtz, A. et al. Astrocytoma: A hormone-sensitive tumor?. Int. J. Mol. Sci. 21(23), 1 (2020).
Mao, W. et al. Treatment of advanced gallbladder cancer: A SEER-based study. Cancer Med. 9(1), 141–150 (2020).
Yang, J. et al. Nomogram for predicting the survival of patients with malignant melanoma: A population analysis. Oncol. Lett. 18(4), 3591–3598 (2019).
Wu, J. et al. A nomogram for predicting overall survival in patients with low-grade endometrial stromal sarcoma: A population-based analysis. Cancer Commun. (Lond) 40(7), 301–312 (2020).
Balachandran, V. P. et al. Nomograms in oncology: More than meets the eye. Lancet Oncol. 16(4), e173–e180 (2015).
Lin, H.-S., Kakken, H., Chao, M. & Wang, L. Analysis of prognostic factors related to patients with chondrosarcoma based on SEER database by drawing column line graphs. Mod. Med. Oncol. 30(07), 1292–1299 (2022).
Yang, K. et al. Glioma targeted therapy: Insight into future of molecular approaches. Mol. Cancer. 21(1), 39 (2022).
Sharma, A. & Graber, J. J. Overview of prognostic factors in adult gliomas. Ann. Palliat. Med. 10(1), 863–874 (2021).
Nayak, L. & Iwamoto, F. M. Primary brain tumors in the elderly. Curr. Neurol. Neurosci. Rep. 10(4), 252–258 (2010).
Cronin, K. A. et al. Annual report to the nation on the status of cancer, part I: National cancer statistics. Cancer 124(13), 2785–2800 (2018).
Tabash, M. A. Characteristics, survival and incidence rates and trends of pilocytic astrocytoma in children in the United States; SEER-based analysis. J. Neurol. Sci. 400, 148–152 (2019).
Wang, Z. et al. Development and validation of a nomogram with an autophagy-related gene signature for predicting survival in patients with glioblastoma. Aging (Albany NY). 11(24), 12246–12269. https://doi.org/10.18632/aging.102566 (2019).
Lin, S. et al. Clinicopathological characteristics and survival outcomes in breast carcinosarcoma: A SEER population-based study. Breast 49, 157–164 (2020).
Cai, Z. et al. Prognosis and risk factors for malignant peripheral nerve sheath tumor: A systematic review and meta-analysis. World J. Surg. Oncol. 18(1), 257 (2020).
Wang, W. Increased incidence of second primary malignancy in patients with malignant astrocytoma: a population-based study. Biosci. Rep. 39(6), 1 (2019).
Acknowledgements
We thank SEER database for its openness and accessibility to data, and thank the corresponding author for his help in the process of writing and revising the paper.
Author information
Contributions
R.W. (first author): design, original draft preparation, software analysis, data analysis and interpretation, manuscript writing, final approval of manuscript. J.C.: data interpretation, final approval of manuscript. Y.D.: collection data, final approval of manuscript. C.J.: methodology, final approval of manuscript. Y.C.: methodology, final approval of manuscript. X.L.: corresponding author, guide the revision of the article, final approval of manuscript. X.L.: corresponding author, guide the revision of the article, final approval of manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, R., Cui, J., Diao, Y. et al. Risk factor analysis and nomogram establishment and verification of brain astrocytoma patients based on SEER database. Sci Rep 13, 7754 (2023). https://doi.org/10.1038/s41598-023-33537-w