Introduction

Colorectal cancer (CRC) is a common disease worldwide, and its incidence tends to rise uniformly with increasing human development index in many countries [1, 2]. In Brazil, it is the second most common cancer in both men and women [3]. The rise in CRC incidence rates observed in the last decade is due to population aging, increasing smoking rates, poor dietary habits, low physical activity, and the absence of widespread screening programs [4, 5]. The Brazilian CRC mortality rate is increasing for both sexes, as shown by comparisons of data across recent decades and with data from other Latin American countries [6].

CRC is a molecularly heterogeneous disease, which leads to distinct clinicopathological features and patient outcomes [4]. The cumulative acquisition of genetic alterations leads to a progressive tumorigenesis process [4]. These alterations are linked to three main molecular groups: chromosomal instability (CIN), microsatellite instability (MSI), and CpG island methylator phenotype (CIMP). These genetic pathways are involved in the development of CRC and affect oncogenes, tumor suppressor genes, and DNA repair mechanisms [7, 8].

More recently, CRC has also been classified into four consensus molecular subtypes (CMS) based on gene expression, each with distinguishing features: CMS1 (microsatellite instability immune, 14%), hypermutated, microsatellite unstable and robust immune activation; CMS2 (canonical, 37%), epithelial, marked WNT and MYC signaling activation; CMS3 (metabolic, 13%), epithelial and evident metabolic dysregulation; and CMS4 (mesenchymal, 23%), prominent transforming growth factor-beta activation, stromal invasion and angiogenesis [9].

Microsatellite instability (MSI) is a crucial feature of a subset of CRC [4, 8] and is a molecular marker for defects in the mismatch repair system. It occurs when mismatch repair (MMR) proteins (MLH1, MLH3, MSH2, MSH3, MSH6, PMS1, and PMS2) are absent due to mutations or promoter hypermethylation in hereditary and sporadic cancer, respectively [8, 10]. These proteins are essential to repair base-base mismatches occurring during DNA replication; thus, their loss leads to the accumulation of DNA replication errors, mostly in microsatellite regions of the genome [11]. As alterations occur in a random manner, findings suggest a different progression of each MSI-positive (MSI+) tumor, a model in which the MSI mutator phenotype develops in gradual steps by successive alterations of different MSI-target genes [12, 13].

MSI+ CRCs are associated with clinical features such as proximal location, poorly differentiated histology, intense lymphocytic infiltration, favorable prognosis, and lower risk of metastasis [4, 11, 12, 13]. Moreover, evidence suggests that MSI+ patients are less responsive to 5-fluorouracil (5-FU)-based chemotherapy and more responsive to irinotecan-based regimens [4, 11, 14]. Importantly, MSI was the first agnostic biomarker approved by the FDA (Food and Drug Administration) to select patients for immunotherapy [15].

It is reported that African Americans with CRC are typically diagnosed at a younger age than European Americans and display high mortality rates even at early stages of CRC [16, 17]. Likewise, our group recently assessed the genetic ancestry profile of 1000 patients and observed that patients with high African genetic ancestry proportions developed cancer at a younger age [18]. A recent meta-analysis of MSI frequency and ethnicity in CRC did not observe significant differences among the North American population [19].

Few studies have comprehensively described and characterized the main clinicopathological features of highly admixed populations, such as Brazilian CRC patients [18, 20], especially concerning MSI status and its clinical impact. Therefore, this study aimed to evaluate the long-term outcome according to MSI status of 1002 Brazilian CRC patients and to associate it with genetic ancestry, molecular and clinicopathological features.

Methods

Participants

The present study included 1002 patients diagnosed with CRC between 2010 and 2014 at Barretos Cancer Hospital. The median follow-up of our cases was 62.0 months. During the inclusion period, patients diagnosed with Lynch Syndrome were excluded [21]. Clinicopathological and treatment data were recently reported [18]. The AJCC (American Joint Committee on Cancer) cancer staging system, 8th Edition, was applied. The study was evaluated and approved by the Institutional Ethics Committee (protocol number: 600/2012 - CAAE: 02468812.30000.5437).

DNA isolation and microsatellite instability

Tumor DNA was isolated using the QIAamp DNA Micro Kit (Qiagen, Hilden, Germany). MSI analysis of the current series was recently reported by our group [22]. Briefly, MSI evaluation was performed using a multiplex PCR comprising six quasi-monomorphic mononucleotide repeat markers (BAT25, BAT26, NR21, NR24, NR27, and HSP110). Cases with two or more markers outside the quasi-monomorphic variation range (QMVR) were classified as MSI-positive (MSI+), and cases with no markers outside the QMVR were classified as MSI-negative (MSI−), as reported [23].
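The classification rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function name and the input format (a mapping from marker name to a flag saying whether the marker fell outside the QMVR) are hypothetical. The text states no rule for cases with exactly one marker outside the QMVR, so the sketch returns an explicit placeholder for that case rather than guessing.

```python
# Marker panel named in the text.
MARKERS = ("BAT25", "BAT26", "NR21", "NR24", "NR27", "HSP110")


def classify_msi(out_of_qmvr):
    """Classify one case from a mapping: marker name -> True if outside the QMVR.

    Hypothetical helper; the input format is an assumption made for illustration.
    """
    missing = [m for m in MARKERS if m not in out_of_qmvr]
    if missing:
        raise ValueError(f"missing markers: {missing}")

    n_unstable = sum(bool(out_of_qmvr[m]) for m in MARKERS)
    if n_unstable >= 2:
        return "MSI+"   # two or more markers outside the QMVR
    if n_unstable == 0:
        return "MSI-"   # no marker outside the QMVR
    # Exactly one unstable marker: the text gives no rule for this situation.
    return "not specified"


stable_case = {m: False for m in MARKERS}
unstable_case = dict(stable_case, BAT26=True, NR21=True)
print(classify_msi(stable_case), classify_msi(unstable_case))
```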

MSI-target genes

We analyzed 23 MSI-target genes that contain microsatellite regions and were previously reported to be important in CRC carcinogenesis [12, 13]. Multiplex PCR was performed in MSI+ cases to evaluate a 23-gene panel: TCF4, XRCC2, MBD4, MRE11, ATR, MSH3, RAD50, MSH6, BAX, DNAPkc, BRCA1, BRCA2, WISP3, BLM, PTEN, ATM, TGFBR2, XPO5, TRBP2, EGFR, ABCC5, ROCK1, and GLYR1, as previously described [24].

For multiplex PCR, 5 µL of Qiagen multiplex PCR master mix (Qiagen, Germany), 1 µL of primer mix (1 µM), 3 µL of water, and 1 µL of DNA at a concentration of 50 ng/µL were used, for a final volume of 10 µL. Cycling was carried out under the following conditions: 95 °C for 15 min for initial DNA denaturation, followed by 30 cycles of 94 °C for 30 s, 55 °C for 90 s, and 72 °C for 45 s for denaturation, annealing, and extension, respectively. The final extension was carried out at 72 °C for 40 min, followed by a hold at 4 °C.
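The arithmetic behind this reaction setup can be checked with a short Python sketch. It assumes that the stated "1 µM" refers to the stock concentration of the primer mix; the function names are hypothetical, and the run time counts only the programmed holds (ramp times and the 4 °C hold are ignored).

```python
# Volumes per reaction in microliters, as listed in the text.
VOLUMES_UL = {"master_mix": 5.0, "primer_mix": 1.0, "water": 3.0, "dna": 1.0}

DNA_STOCK_NG_PER_UL = 50.0  # stated DNA concentration
PRIMER_STOCK_UM = 1.0       # assumption: "1 µM" is the primer-mix stock concentration


def final_volume_ul(volumes=VOLUMES_UL):
    """Total reaction volume."""
    return sum(volumes.values())


def final_concentration(stock, volume_added_ul, total_ul):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock * volume_added_ul / total_ul


def shared_mix(n_reactions, volumes=VOLUMES_UL):
    """Volumes of the DNA-free components scaled to n reactions."""
    return {name: vol * n_reactions for name, vol in volumes.items() if name != "dna"}


def programmed_minutes(cycles=30):
    """Programmed hold time: initial denaturation + cycles + final extension."""
    cycle_seconds = 30 + 90 + 45
    return 15 + cycles * cycle_seconds / 60 + 40


total = final_volume_ul()
print("final volume (µL):", total)
print("DNA (ng/µL):", final_concentration(DNA_STOCK_NG_PER_UL, VOLUMES_UL["dna"], total))
print("primers (µM):", final_concentration(PRIMER_STOCK_UM, VOLUMES_UL["primer_mix"], total))
print("programmed time (min):", programmed_minutes())
```

Under these assumptions the components sum to the stated 10 µL, which corresponds to 50 ng of template DNA per reaction.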

PCR products were then prepared for capillary electrophoresis by combining 1 µL of the amplified product, 8.7 µL of Hi-Di Formamide (Applied Biosystems, Foster City, CA, USA), and 0.3 µL of GeneScan 500 ROX size standard (Applied Biosystems, USA). The presence of alterations in the microsatellite regions of these genes was assessed by fragment analysis on a 3500 Genetic Analyzer (Applied Biosystems, USA). Data generated by the 3500 Genetic Analyzer were analyzed using GeneMapper software version 4.0 (Applied Biosystems, USA). Tumor samples whose fragment size was similar to that of normal tissue were classified as not altered. Tumor samples whose fragment size differed from that of normal tissue (microsatellite expansion or contraction) were classified as altered.

Ancestry analysis

Ancestry determination of the present series was recently reported by our group [18]. Briefly, it was performed using 46-plex ancestry-informative markers (AIMs) among the most informative INDELs for four major population groups (African, European, Eastern Asian, and Amerindian) as previously described [25]. Ancestry proportions were then assessed using Structure v2.3.4 software [26, 27], considering each major population group as possible contributors to the current genetic makeup of Brazilians. Supervised analysis was performed to estimate ancestry membership ratios of individuals using HGDP-CEPH panel data as a reference for ancestral populations.

Statistical analysis

The sample was characterized using frequency and/or contingency tables for qualitative variables and measures of central tendency and dispersion (mean, median, standard deviation, minimum and maximum) for quantitative variables.

Chi-square or Fisher’s exact tests were used to assess the association of MSI status with demographic data, clinicopathological characteristics, and genomic ancestry, and multiple comparisons between columns were performed with the Bonferroni method. Variables with p < 0.20 in these tests were entered into a multiple logistic regression model. The final model, from which odds ratios were estimated, comprised all variables that remained jointly significant at the 5% level.
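For readers unfamiliar with these tests, the following Python sketch shows a two-sided Fisher's exact test for a 2×2 table together with the Bonferroni adjustment. The analyses in this study were run in SPSS; this standard-library code is only an illustration of the underlying calculations, and the function names and the example counts are hypothetical and are not taken from Table 1.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts for illustration only.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
print(bonferroni([0.01, 0.04, 0.5]))
```

The small relative tolerance in the comparison guards against floating-point ties between equally likely tables, a common precaution in implementations of this test.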

Survival probability was estimated using the Kaplan-Meier method, and curves were compared using the log-rank test. Characteristics with p < 0.20 in that test were used to fit a Cox proportional hazards model, from which we estimated hazard ratios (HR). The final model comprised the characteristics that remained jointly significant (p < 0.05). Analyses were performed using SPSS software version 27 (IBM Corp, Armonk, NY, USA).
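As an illustration of the Kaplan-Meier product-limit estimator, the sketch below computes a survival curve from follow-up times and event indicators using only the Python standard library. It is not the SPSS procedure used in the study; the function names and the toy data are hypothetical. At tied times, deaths are processed before censored observations, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(time, S(time))] at each distinct event time.

    times: follow-up in months; events: 1 = death from cancer, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored cases both leave the risk set after time t.
        at_risk -= deaths + censored
    return curve


def survival_at(curve, t):
    """Step-function lookup of the estimated survival probability at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


# Toy data for illustration only (months, event indicator).
toy_times = [6, 12, 12, 30, 60, 62]
toy_events = [1, 0, 1, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print(survival_at(toy_curve, 60))
```

A 5-year estimate such as those reported in the Results corresponds to reading the step function at 60 months.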

Results

Association analyses between MSI status and clinicopathological features

Herein, we analyzed the association of MSI status with clinicopathological and ancestry features in 105 MSI+ and 897 MSI− CRC patients (Table 1). In univariate analysis, MSI+ status was significantly associated with tumors located in the right colon, mucinous histological type, clinical stage II, histological grade III/undifferentiated, absence of disease recurrence, and patients alive without cancer (Table 1). No association was observed between MSI status and genetic ancestry components (European, African, Eastern Asian, and Amerindian) or the patients’ Brazilian region of origin (Table 1 and Fig. 1).

Table 1 Description of clinicopathological features for MSI status and association analyses between MSI status and clinicopathological features for CRC.
Fig. 1: Genetic ancestry analysis.

Graphical representation of the genetic ancestry components of each case, separated by MSI status (MSI-positive and MSI-negative). The ancestry analysis was performed using a set of 46 AIMs among the most informative INDELs for each population group, with genetic data from the HGDP-CEPH panel as a reference. A supervised analysis was performed to estimate the ancestry proportions of the individuals. Structure software runs, considering K = 4, consisted of 100 000 burn-in steps followed by 100 000 Markov Chain Monte Carlo iterations. The option ‘Use population Information to test for migrants’ was used with the Admixture model, considering allele frequencies correlated, and updating allele frequencies using only individuals with POPFLAG = 1. The percentage of each ancestry component (Y axis) for each patient (columns) is represented by colors: red – African, green – European, blue – Eastern Asian and yellow – Amerindian.

Next, a multivariate analysis with MSI status as the outcome was performed, including all variables with p < 0.2 in the univariate analysis. We observed that MSI+ was significantly associated with tumors in the right colon, histological grade III/undifferentiated, and clinical stage IV (Supplementary Table I).

MSI-target genes

In MSI+ cases, we further evaluated the status of a panel of 23 MSI-target genes (Fig. 2 and Supplementary Fig. 1). These MSI-target genes can be stratified by their function in DNA repair (XRCC2, MBD4, MRE11, MSH3, MSH6, RAD50, DNAPkc, BRCA1, BRCA2, BLM, ATM, and ATR), apoptosis (BAX), cell signaling (EGFR, PTEN, TCF4, and TGFBRII), microRNA regulation (TRBP2 and XPO5), oxidation-reduction (GLYR1), adhesion/cytoskeleton (WISP3 and ROCK1) and transport (XPO5 and ABCC5). The five most altered genes were ATM, EGFR, MRE11, ROCK1, and TGFBRII (Fig. 2), corresponding to gene functions related to DNA repair, adhesion/cytoskeleton, and cell signaling. No alterations were found in BRCA1, BRCA2, XPO5, and XRCC2 (Fig. 2).

Fig. 2: MSI-target gene analyses.

Upper panel: the columns represent the MSI+ patients, and the rows represent each analyzed microsatellite region of the MSI-target genes. At right, the percentage of altered cases (contraction or expansion, dark blue) is shown for the analyzed microsatellite region of each MSI-target gene, stratified by gene function (at left). Light blue and white rectangles represent not-altered and inconclusive cases, respectively. The most altered genes are at the top of each gene function section. Lower panel: the columns represent the MSI+ patients analyzed for MSI-target genes, and the rows represent the distribution of clinicopathological features (primary tumor site, histological type, and AJCC clinical stage) for each sample.

Impact of MSI status in patient survival

The 5-year cancer-specific survival (CSS) analyses of CRC cases were stratified by MSI status. MSI+ patients had a significantly higher 5-year survival probability than MSI− cases (77.7% vs. 60.5%, respectively) (Supplementary Table II and Supplementary Fig. 2).

Several significant associations were observed between CSS and the features of MSI+ CRC patients, including better survival probability for patients with clinical stage II and III, absence of angiolymphatic and perineural invasion, and no recurrence of disease (Table 2). For MSI− cases, better survival probability was observed for female patients, clinical stage 0/I and II, tumors with mucinous histological type and histological grade I/II, absence of angiolymphatic and perineural invasion, no recurrence of disease, and patients who underwent treatments such as neoadjuvant and adjuvant chemotherapy and radiotherapy (Table 2).

Table 2 Kaplan–Meier estimates of cancer-specific survival for colorectal cancer patients and for MSI status (n = 1002).

Multivariate analysis for CSS showed that, in MSI+ cases, clinical stage IV, presence of perineural invasion, and recurrence of disease were associated with an increased relative risk of death from cancer (p < 0.05). Additionally, patients aged ≥50 to <75 years at diagnosis had a lower risk of death (Table 3). For MSI− cases, clinical stage II, III, and IV, presence of angiolymphatic invasion, and recurrence of disease were associated with an increased relative risk of death from cancer (p < 0.05), whereas female sex and adjuvant chemotherapy were associated with a lower risk of death (Table 3).

Table 3 Multivariate analysis of cancer-specific survival associated with different clinicopathological characteristics and treatment of patients with colorectal cancer by MSI status.

Given the importance of clinical stage as a prognostic factor and treatment determinant, CSS stratified by MSI status was estimated for all stages. A statistically significant difference was observed for clinical stage III, where MSI+ patients showed a better 5-year survival probability than MSI− patients (Table 4 and Supplementary Fig. 3). Kaplan–Meier estimates of CSS by MSI status were also obtained for the ancestry components, and no statistically significant differences were observed (Supplementary Table III).

Table 4 Kaplan–Meier estimates of cancer-specific survival for Clinical stage (AJCC) of colorectal cancer patients by MSI status.

Then, we analyzed the impact of the MSI-target genes on patient survival. For these analyses, a case was considered altered for a given gene function when it showed an alteration in at least one of the MSI-target genes assigned to that function. MSI+ patients with altered MSI-target genes related to the gene functions DNA repair, cell signaling, and adhesion/cytoskeleton were more likely to survive for five years when compared with MSI− patients (Supplementary Table IV).
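The "altered in at least one gene of the function" rule can be expressed compactly in Python. The sketch below is illustrative only: the gene-to-function grouping follows the functional categories listed earlier in the Results (with XPO5 appearing under both microRNA regulation and transport), while the function name and the input format (a list of altered gene symbols per case) are assumptions.

```python
# Functional groups as listed in the Results; XPO5 belongs to two groups.
GENE_FUNCTIONS = {
    "DNA repair": ["XRCC2", "MBD4", "MRE11", "MSH3", "MSH6", "RAD50",
                   "DNAPkc", "BRCA1", "BRCA2", "BLM", "ATM", "ATR"],
    "apoptosis": ["BAX"],
    "cell signaling": ["EGFR", "PTEN", "TCF4", "TGFBRII"],
    "microRNA regulation": ["TRBP2", "XPO5"],
    "oxidation-reduction": ["GLYR1"],
    "adhesion/cytoskeleton": ["WISP3", "ROCK1"],
    "transport": ["XPO5", "ABCC5"],
}


def altered_functions(altered_genes):
    """Map each gene function to True if at least one of its genes is altered.

    altered_genes: iterable of gene symbols altered in one MSI+ case
    (hypothetical input format).
    """
    altered = set(altered_genes)
    return {
        function: any(gene in altered for gene in genes)
        for function, genes in GENE_FUNCTIONS.items()
    }


# Hypothetical case with alterations in ATM and ROCK1.
status = altered_functions(["ATM", "ROCK1"])
print([function for function, is_altered in status.items() if is_altered])
```

Because the rule is applied per function, a single case can count as altered in several functional groups at once.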

Finally, based on the reported role of MSI in the response to 5-fluorouracil (5-FU) and the widespread use of 5-FU and oxaliplatin in CRC treatment, we estimated the CSS stratified by MSI status for neoadjuvant and adjuvant protocols in cases with clinical stage II and III. We observed no impact of MSI status on patient survival for any of the treatments used (Supplementary Table V).

Discussion

The current study performed the most extensive characterization of the clinicopathological aspects of Brazilian CRC patients. Regarding MSI status, it analyzed the MSI mutator phenotype through alterations in different MSI-target genes and assessed whether MSI status could influence patients’ clinicopathological features and disease outcomes.

With the advent of personalized medicine, the need for and importance of prognostic or diagnostic biomarkers has increased. MSI status has been used as a biomarker for several purposes in CRC, such as (I) screening hereditary cases; (II) prognostic marker, where cases with MSI have a better prognosis than those who do not; (III) resistance to 5-FU therapy and sensitivity to irinotecan and recently, (IV) immunotherapy response.

The frequency of MSI+ among CRC varies in the literature, ranging from 6% to 20% [11, 19, 28, 29, 30]. According to TCGA (The Cancer Genome Atlas Network) data for CRC, 16% of analyzed samples showed MSI+ [31]. We observed MSI+ in approximately 10% of sporadic CRC patients, a frequency similar to that reported in other studies. Differences in the MSI+ proportion between our results and other works can be caused by several factors, such as methodological differences in MSI determination; inclusion of hereditary and sporadic CRC cases; distinct AJCC stages among the included patients; different patient ethnicities; and environmental factors that may affect the presence of MSI in CRC. The present work provides a robust estimate of the MSI frequency by analyzing a large number of Brazilian CRC cases originating from distinct Brazilian regions and representing all clinical stages.

In multivariate analyses, MSI status was significantly associated with tumors in the right colon, histological grade III/undifferentiated, and clinical stage IV. These results are in accordance with the international literature [11, 14, 29, 30, 32, 33], highlighting the distinct genetic background associated with tumor location in the colon and reinforcing the role of MSI as a prognostic biomarker in CRC.

The survival analysis of the present study showed a 5-year survival probability of 62.3% for CRC patients. When we stratified the cases by MSI status (MSI− and MSI+), we observed a statistically significant difference between the two curves, with a 5-year survival probability of 60.5% for MSI− and 77.7% for MSI+ patients. Significant differences between MSI statuses have been reported in the international literature: Yoon et al. demonstrated that MSI+ patients had higher disease-free survival and overall survival rates than MSI− patients [29]. In contrast, Nam et al. found no statistical difference between the survival curves when analyzing cases of advanced CRC stratified by MSI status [33]. MSI status is often associated with survival in CRC. The meta-analysis by Guastadisegni et al. showed that MSI+ CRC was associated with a 40% higher survival rate than MSI− CRC [34]. Another study with >7000 cases reported that MSI+ patients have a significantly better prognosis than those with MSI− tumors [35].

Significant associations were observed between survival and the features of MSI+ CRC patients, including better survival probability for patients with clinical stage II and III, absence of angiolymphatic and perineural invasion, and no recurrence of the disease. The clinical stage variable showed a significant difference between the survival curves in both MSI statuses. Given the importance of staging as a prognostic factor, survival stratified by MSI status was estimated for stages 0/I, II, III, and IV. A statistically significant difference was observed for clinical stage III, where MSI+ patients showed a better 5-year survival probability than MSI− patients. These results are consistent with studies reporting that MSI+ patients have a better prognosis than MSI− patients when matched by stage [36, 37]. However, these findings are not unanimous, and other studies found no differences when patients in stages II and III were considered separately according to the location of the tumor [29, 38].

The association of MSI status with chemotherapy response is unclear. In the present study, we estimated survival stratified by MSI status and by treatment based on 5-FU and oxaliplatin. There was no statistically significant difference in any of the analyses. In a clinical trial with stage II–III colon cancer patients, those who were MSI+ had a better prognosis, but there was no association between MSI status and the benefits of chemotherapy [39]. 5-FU is used to treat CRC, being recommended in high-risk stage II cases and as first-line treatment for stages III and IV [40]. Several studies have analyzed the influence of MSI on the response to 5-FU [37] and have described that MSI status is not a predictive biomarker of response to 5-FU [41, 42]. Webber et al. carried out a meta-analysis involving 9212 patients treated or not with 5-FU and concluded that the therapeutic regimen improved disease-free survival and overall survival, but MSI status did not influence the response to 5-FU treatment [40]. Moreover, Alex et al. suggest that the MSI+ phenotype is predictive of resistance to oxaliplatin-based chemotherapy, suggesting biological heterogeneity among metastatic MSI+ CRC patients [43]. Currently, at Barretos Cancer Hospital, stage II MSI-positive cases are not treated with 5-FU-based chemotherapy, following NCCN (National Comprehensive Cancer Network) guidelines and Ribic et al. [44].

The MSI phenotype may affect any microsatellite region across the genome, leading to a characteristic mutation burden, and each altered microsatellite region contributes to MSI+ CRC heterogeneity. The present work was the first to analyze the frequency of MSI-target gene alterations in a representative number of Brazilian CRC patients. We observed that the five most altered genes were ATM, EGFR, MRE11, ROCK1, and TGFBRII, and that no alterations occurred in BRCA1, BRCA2, XPO5, and XRCC2. Mutation in MRE11, namely at the 11(T) microsatellite tract located in intron 4, is observed in approximately 80% of MSI tumors and leads to aberrant splicing and a truncated protein [45]. Interestingly, Vilar et al. showed that MRE11 deficiency could increase sensitivity to PARP inhibitors [46]. Another reported MSI-target gene is TGFBRII, which is mutated in 69–90% of cases [47, 48]. Inactivation of TGFB signaling is one of the steps in CRC progression. A possible prognostic role for TGFBRII has also been reported, based on the observation that CRC patients with stage III, MSI+, and mutation in TGFB treated with 5-FU-based chemotherapy had a better prognosis [49, 50]. We also estimated the cancer-specific survival for MSI-target gene function by MSI status. We observed a statistically significant difference between the survival curves for MSI-target genes involved in the gene functions DNA repair, cell signaling, and adhesion/cytoskeleton. MSI+ cases with altered MSI-target genes linked to these gene functions were more likely to survive for 5 years when compared to MSI− cases. In general, these gene functions are crucial for tumor development and progression, and genes involved in them have been targeted in the development of new drugs that contribute to better response and patient survival. Data on alterations in these pathways are therefore essential to guide clinical conduct and demonstrate the presence and importance of genetic heterogeneity among MSI cases.

The genetic ancestry of the present cases was previously analyzed, and a high degree of admixture was observed: African 12.7% (SD = 15.7%), European 74.2% (SD = 20.6%), Eastern Asian 6.5% (SD = 11.3%), and Amerindian 6.6% (SD = 7.1%) [18]. In the present study, there was no correlation between the different ancestry proportions and MSI status, nor any association between MSI status frequency and the patients’ Brazilian region of origin, despite the divergence of ancestral components across the country’s regions.

The present pioneering study determined the association of MSI status with 5-year survival and with clinicopathological and molecular features in >1000 Brazilian CRC patients. We observed an MSI+ frequency of 10%, with tumors preferentially located in the right colon and clinicopathological characteristics associated with less aggressiveness, and a significant difference in the survival of these patients. MSI+ cases showed changes in several MSI-target genes, the most frequently altered being related to functions such as DNA repair, DNA damage sensing, and cell signaling. The present study demonstrated the genetic heterogeneity present in MSI+ CRC patients and may contribute to the clinical management strategies of these patients.