A clinico-pathological and molecular analysis reveals differences between solitary (early and late-onset) and synchronous rectal cancer

Rectal cancer (RC) appears to behave differently compared with colon cancer. We aimed to analyze the existence of different subtypes of RC depending on distinct features (age of onset and the presence of synchronous primary malignant neoplasms). We compared the clinicopathological, familial and molecular features of three different populations diagnosed with RC (early-onset RC [EORC], late-onset RC, and synchronous RC [SRC]). Eighty-five RCs were identified and were evaluated according to their microsatellite instability, CpG Island Methylator Phenotype (CIMP) and chromosomal instability, as assessed by Next Generation Sequencing and microarray-based comparative genomic hybridization approaches. The results were subjected to cluster analysis. SRCs displayed the most specific characteristics, including a trend for the development of multiple malignant neoplasms, a greater proportion of CIMP-High tumors (75%) and more frequent genomic alterations. These findings were confirmed by a clustering analysis that stratified RCs according to their genomic alterations. We also found that EORCs exhibited their own features, including an important familial cancer component and a remarkable rate of mutations in TP53 (53%). Together, the heterogeneity in RC characteristics by age of disease onset and SRC status warrants further study to optimize tailored prevention, detection and intervention strategies, particularly among young adults.


Scientific Reports | (2021) 11:2202 | https://doi.org/10.1038/s41598-020-79118-z

24%, for individuals younger than 50 years of age, between 50 and 64, and older than 64 years, respectively 13. No data have been published analyzing synchronous CRCs. Synchronous colorectal cancer (SCRC) represents a disease profile wherein more than one primary CRC is detected in a single patient at the time of diagnosis 14. Several studies have demonstrated molecular concordance between tumors within the same individual, although definitive results have not yet been achieved 15-19. Indeed, our group recently published the possible clonal origin of a subset of SCRCs diagnosed in patients without hereditary forms of CRC 20, and highlighted the significance of epigenomic patterns in the development of multiple primary neoplasms 21. Because pathogenesis and clinical features largely depend on the location of the tumor in the colonic mucosa, a correlation between the hypothetical molecular basis of SCRC and the anatomical location in the colon has been proposed 22.
Although some studies include RC within the left-sided tumors, we have recently confirmed the importance of defining three different locations for CRC (right side, left side and rectum) and the key role of age at diagnosis 23. Given the importance of anatomic location for CRC presentation and prognosis, together with the existence of distinct disease types, including early-onset rectal cancers (EORC) and synchronous tumors, we hypothesized that molecular and clinical patterns would vary among RC cases by age of disease onset and by the presence of synchronous malignant neoplasms. The purpose of our study was to define the clinical, pathologic, demographic/familial and molecular patterns of RC by age of onset and among synchronous rectal tumors, and to add a clustering analysis as confirmation, in order to gain a deeper understanding of RC.

Methods
Families, samples and data collection. We defined three different subsets of RC among cases diagnosed at a single institution in Spain between January 2002 and December 2008: early-onset RC (EORC; individuals diagnosed with RC at an age < 50 years), late-onset RC (LORC; diagnosis at ≥ 69 years) and synchronous RC (SRC; 2 or more histologically distinct CRCs identified in the same patient at the same time or within six months of the first diagnosis, with at least one of them located in the rectum). Cases with severely dysplastic tumors with 'in-situ carcinoma', hereditary polyposis and inflammatory bowel disease were excluded.
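The subset definitions above can be written as a small decision rule. The following is a minimal sketch, not the authors' code: the function name, the argument names and the choice to assign synchronous cases to SRC regardless of age are assumptions made for illustration.

```python
from typing import Optional

EARLY_ONSET_MAX_AGE = 50  # EORC: diagnosis at an age < 50 years
LATE_ONSET_MIN_AGE = 69   # LORC: diagnosis at an age >= 69 years


def classify_rc_subset(age_at_diagnosis: int,
                       has_synchronous_crc: bool) -> Optional[str]:
    """Return 'SRC', 'EORC' or 'LORC', or None when no definition applies.

    has_synchronous_crc is expected to be True only when 2 or more
    histologically distinct CRCs were found at the same time or within
    six months, with at least one of them located in the rectum.
    """
    # Assumption: synchronous cases form their own subset at any age.
    if has_synchronous_crc:
        return "SRC"
    if age_at_diagnosis < EARLY_ONSET_MAX_AGE:
        return "EORC"
    if age_at_diagnosis >= LATE_ONSET_MIN_AGE:
        return "LORC"
    # Solitary cases diagnosed between 50 and 68 years match no subset.
    return None
```

Under these assumptions, a solitary RC diagnosed at 60 years would not be assigned to any of the three subsets.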
Demographic and clinicopathological information was obtained for each patient with EORC, LORC and SRC. Regarding SRC, tumor staging was defined by the tumor with the highest stage at diagnosis, and tumor location was defined as previously reported 14. Tumor relapse was defined either as regrowth at the anastomosis site (± 5 cm) or as the detection of new metastatic disease. Family history of cancer (including at least three generations) and paraffin-embedded tumor tissue samples were obtained from all patients. Regarding family history of cancer, all families were classified into four groups as we previously published 4. All patients, or a first-degree relative in the case of death of the index case, provided written consent, and the study was approved by the Ethics Committee of the "12 de Octubre" University Hospital. Disease-free survival (DFS) was defined as the time from diagnosis to the first event: disease recurrence or death from any cause. Recurrence risk, defined as the time from diagnosis to the date of recurrence, was also assessed. Recurrence diagnosis criteria included histological confirmation or radiologic evidence with subsequent clinical progression. Date of recurrence was defined as the date of confirmatory imaging or the date of biopsy, as applicable. All cases had a complete colonoscopy at disease diagnosis or, if this was not possible (e.g. stenotic tumor), an intraoperative colonoscopic exploration. After primary surgical resection of the rectal tumor, patients underwent an endoscopic procedure within 9-12 months.
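The DFS definition above (time from diagnosis to the first of recurrence or death from any cause) can be illustrated as follows. This is a hedged sketch: the function name, the use of days as the unit, and the censoring of event-free patients at a last follow-up date are assumptions that the text does not specify.

```python
from datetime import date
from typing import Optional, Tuple


def disease_free_survival_days(diagnosis: date,
                               recurrence: Optional[date] = None,
                               death: Optional[date] = None,
                               last_follow_up: Optional[date] = None
                               ) -> Tuple[int, bool]:
    """Return (days from diagnosis, event_observed).

    The event is the earliest of recurrence or death from any cause.
    Assumption: patients with no event are censored at last follow-up.
    """
    events = [d for d in (recurrence, death) if d is not None]
    if events:
        return (min(events) - diagnosis).days, True
    if last_follow_up is None:
        raise ValueError("last_follow_up is required when no event occurred")
    return (last_follow_up - diagnosis).days, False
```

The pair returned by this hypothetical helper is the usual input format for survival estimators, which need both a duration and an event indicator.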

Assessment of microsatellite instability (MSI) and CpG Island Methylator Phenotype (CIMP) status.
Microscopic inspection of paraffin-embedded samples was performed by a pathologist. The acceptable proportion of tumor cells in the neoplastic material as well as the protocol for DNA isolation were previously reported 24. We used the Bethesda panel to investigate the MSI status, considering a tumor MSI+ when 2 or more markers were altered. For the evaluation of CIMP, we studied the methylation status of the promoter regions of the CACNA1G, CDKN2A, CRABP1, IGF2, MLH1, NEUROG1, RUNX3 and SOCS1 genes, and patients were categorized as CIMP-High, CIMP-Low or CIMP-0. The procedures for the evaluation and classification of MSI and CIMP statuses have been previously described 14.
Mutational status analysis by Next Generation Sequencing (NGS). We used the Ion Torrent PGM platform with a commercial panel including 207 amplicons from 50 oncogenes and tumor suppressor genes (Supplementary Table S1). The protocols for the NGS library preparation, emulsion PCR, sequencing analysis, bioinformatics processing and data analysis were as previously reported 20. For SRC cases, only rectal tumors were included for sequencing.
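The MSI criterion described above (MSI+ when 2 or more Bethesda markers are altered) reduces to a simple count. The sketch below is illustrative only: the function name and input format are hypothetical, and the marker names are those of the standard Bethesda reference panel, which the text does not list.

```python
# Standard Bethesda reference panel (marker names are not given in the text).
BETHESDA_MARKERS = ("BAT25", "BAT26", "D2S123", "D5S346", "D17S250")
MSI_POSITIVE_MIN_ALTERED = 2  # threshold stated in the Methods


def is_msi_positive(altered_by_marker: dict) -> bool:
    """Return True when 2 or more Bethesda markers are altered.

    altered_by_marker maps a marker name to True (altered) or False;
    markers missing from the mapping are treated as not altered.
    """
    n_altered = sum(
        1 for marker in BETHESDA_MARKERS
        if altered_by_marker.get(marker, False)
    )
    return n_altered >= MSI_POSITIVE_MIN_ALTERED
```

The CIMP categories are not sketched here because the text defers their cut-offs to a previous publication.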
Chromosomal instability. The analysis of copy number alterations (CNA) was carried out using two microarray-based comparative genomic hybridization (aCGH) platforms. Thus, from the total of 67 samples that could be satisfactorily processed, 17 were hybridized to the OncoScan Affymetrix array (SRC) and 50 (17 EORC, and 33 LORC) to the NimbleGen Human Whole Genome CGH array (Roche NimbleGen, Inc., Reykjavik, Iceland). Similar to NGS analysis, we excluded tumors not located in the rectum from the SRC group.
The protocol employed for the assessment of CNA for the NimbleGen platform has been previously described 24 . Data from these arrays were analyzed using the NimbleScan software (v.2.6), performing a spatial correction using the LOESS method, normalizing data values by qspline fit normalization and finally extracting the log2 ratio feature values. Both datasets have been included in the Gene Expression Omnibus (GEO): for LORC (GSE108166) and EORC (GSE108220). The other 17 SRCs were analyzed using the OncoScan CNV FFPE Assay Kit according to the manufacturer's instructions. Intensities from the OncoScan array were normalized and scaled using the ChAS console (v. 4 20 and have been also included in GEO database (GSE110026). In both platforms, log2 ratio data were segmented through the Piecewise Constant Fits segmentation method and then adjusted to reduce the outlier effect through Winsorization using the copynumber package (v.1.22.0) in R (v.3.5.1). Segments from all the samples analyzed with both microarray platforms were combined using the intersect option of the bedtools toolset (v.2.17.0). The association between the three groups and the copy number status was assessed using the Fisher's test in R through the function from the stats package (v.3.5.1).
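To illustrate what Winsorization does to outlying log2 ratios, the following sketch clips values that lie far from the median. It is a simplified stand-in and not the algorithm implemented in the R package used in the study: the global (rather than local) median, the MAD-based scale and the default threshold `tau` are all assumptions made for this example.

```python
from statistics import median
from typing import List, Sequence


def winsorize(values: Sequence[float], tau: float = 2.5) -> List[float]:
    """Clip values lying more than tau robust SDs from the median.

    The robust SD is estimated as 1.4826 * MAD (median absolute
    deviation), which matches the SD for normally distributed data.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        # No spread to measure: return the data unchanged.
        return list(values)
    lower, upper = center - tau * scale, center + tau * scale
    return [min(max(v, lower), upper) for v in values]
```

Extreme probes are pulled back to the clipping bounds rather than discarded, so the number of data points per sample is preserved.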
Multidimensional scaling was performed in the SIMFIT (v.7.5.1) statistical program using the Euclidean distance as the distance measure and the group average as the linkage method. Data coordinates retrieved from this plot were used to group samples through the MClust package (v. 5.4.5). The model with the best log-likelihood value was model VII (spherical with varying volume) with 5 components.
Statistical analyses. Continuous variables were expressed as mean values plus/minus standard deviation (SD) and categorical variables were expressed as number of cases and their percentage. For the assessment of associations between age-of-onset and discrete variables either Pearson's Chi Square (χ²) (parametric variables) or Fisher's Exact Test (non-parametric variables) were used. Comparison of continuous variables was performed using Student's t-test. Statistical analysis was performed using SPSS 17.0 (SPSS Inc., Chicago, IL, USA) and differences were considered statistically significant when p < 0.05. In order to identify potentially differentially altered regions for each population group, univariate analyses were carried out by performing an unconditional logistic regression for each candidate region. A false discovery rate (FDR) was calculated for each p value and regions with an FDR < 0.05 were considered statistically significant.
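The text does not state which FDR procedure was applied. As one common possibility, the Benjamini-Hochberg adjustment can be sketched as follows; the choice of this procedure and the function name are assumptions, not the authors' stated method.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values for five candidate regions.
fdr = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.2])
significant = [q < 0.05 for q in fdr]
```

With these hypothetical inputs, only the first two regions fall below the 0.05 FDR threshold, even though four of the raw p values are below 0.05.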
Ethical approval. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent. Written informed consent was obtained from all individual participants included in the study or a first-degree relative in the case of death of the index case.

Results
Comparative analysis between all the subsets of rectal cancers. When we compared the three groups, SRCs exhibited the most distinct clinicopathological features. Overall, these patients were diagnosed at earlier tumor stages (75% in stage I; p < 0.001) and with a higher frequency of polyps (93%; p = 0.004), along with a higher mean number of associated polyps (9.8; p = 0.004). Moreover, this subset of patients also had the highest rate of development of metachronous CRCs (MCRC) (33%; p < 0.001). As expected, the EORC group had the most frequent family history of cancer, including Lynch-related and unrelated neoplasms, with a concurrent lower frequency of polyps during follow-up (43%). Finally, the LORC group resembled EORC except for its higher sporadic component (92%; p < 0.001) and a higher association with the presence of polyps. Additional clinicopathological and familial characteristics of the studied cases are shown in Table 1.
Comparative analysis of mutational status. Sixty-five samples were successfully processed by NGS (33 LORC, 17 EORC and 15 SRC) to explore mutation patterns of RC across groups. Considering all tumors, the most frequently mutated genes were KRAS (38%), APC (32%) and TP53 (28%). Interestingly, the KRAS mutation rate was significantly higher in the SRC group (60%; p = 0.0048), whereas TP53 showed its highest rate of mutation in EORC (53%; p = 0.012). It is also worth noting that SMAD4, which was differentially altered in the three populations, had a higher mutation frequency in SRC (20%; Table 2).
Comparative analysis of chromosomal instability. A total of 67 samples were analyzed by aCGH. The most recurrently altered regions are shown in Supplementary Table S2. The group showing the highest number of genomic changes was SRC, where the most recurrent alterations were losses at 1q21 (65%) and 1p36 (59%), and gains at 8q24 and 20q13 (65%). The other two groups, on the contrary, seemed to be more heterogeneous; thus, losses at 9p13, 9q12 and 9q13 (all 41%) and gains at 19p13 and 19q11 (both 47%) were frequent in EORC, and gains at 9p13, 9q12, 19p12 and 19q11 (all 42%) were frequent in LORC. With respect to the altered regions occurring with statistically significantly different frequencies in the three groups, it is important to emphasize that most of them belonged to the SRC group, the most notable being gains at 8q24.3 and 20q13.12 (p = 0.001 and p = 0.004, respectively) and losses at 1q21.1 (p < 0.001), among others (Tables 3 and 4).
There were no clinical differences between groups except for the distribution of SRC. Thus, all cases with both tumors confined to the rectum were categorized within G-III. Considering EORC and LORC collectively, 2 of the 5 RC patients who had synchronous high-degree dysplasia in other colonic locations were also included within G-III. Interestingly, of the 7 RCs showing synchronous cancers in organs other than the colon, the only 2 associated with digestive tract malignancies (gastric cancers) were categorized within G-III. All tumors stratified within G-IV showed aneuploidy (defined as alteration, gain or loss, of whole chromosomes: at least 50% of any arm of each chromosome), while this feature was present in 87%, 78%, 78% and 20% of G-II, G-V, G-III and G-I cases, respectively (Supplementary Table S3). With regard to genomic alterations, we observed a progressive increase as we moved from G-I to G-V (Supplementary Table S4).
Accordingly, G-I exhibited the lowest percentage of genomic alterations (only 15% of the cases) as well as the lowest mutation rate obtained by NGS (Supplementary Table S5), whereas G-V was the group harboring the highest frequency of recurrent chromosomal alterations, with gains at 3q26-q29 being the most prevalent. In contrast, G-II and G-IV were very homogeneous groups in terms of the number of recurrent alterations, with gains at 19p and 19q prevailing in G-II, and gains at 7q, 16p, 19p, 19q and 20q prevailing in G-IV. G-III was a group with some specific alterations, although with low frequencies, which may indicate its larger sporadic component. Recurrent chromosomal abnormalities, with the percentage of cases affected in each group, are detailed in Table 5.

Discussion
In contrast with the progressive decrease in diagnoses in older populations, the incidence of CRC in patients aged < 50 years has increased during the last two decades. In the context of EOCRC, the proportion of patients having cancer within the rectum seems to be experiencing an even more remarkable increase, since RC represents up to 18% of all CRC diagnosed in young patients 12. Several studies have highlighted the importance of the anatomical location in pathogenesis and therapeutic response, suggesting that colorectal tumors should be evaluated as separate entities depending on the location of the lesion 25,26. Moreover, some studies have focused specifically on RC subtypes and suggested that EORC may differ from LORC on a biological basis and in response to multimodality therapy 27. In view of this premise, and considering the high prevalence of RC in young adults, comparative analyses of rectal tumors are required as a suitable approach to advance the individualized management of RC. Of the three subsets of RC studied in our series (EORC, LORC and SRC), SRCs presented with the most dissimilar characteristics, both from a clinical and a molecular viewpoint. In comparison with the other groups, SRCs were characterized by an earlier age of diagnosis, possibly because of the greater number of symptoms in these patients. In contrast, EORC was the group with the greatest delay in diagnosis, probably due to the low suspicion rate derived from the infrequency of CRC in young adults 28. Another interesting point regarding the SRC group was its higher association with the development of polyps during follow-up as well as with the development of MCRCs, which would support the hypothesis that a field effect is operative in SCRC 29,30. In this sense, the large rate of CIMP-High tumors within this population (75%) might also confirm the previously proposed contribution of epigenetic alterations to field effects 29,31.
These findings are also in accordance with our previous studies, in which we concluded that CIMP+ tumors were more frequent in patients diagnosed with SCRC than in patients with an isolated CRC or MCRC 20,21. Interestingly, we observed some molecular alterations that appeared to be linked with the SRC subset. Losses at 1q21 have been previously described in relation to EOCRC, whereas 8q24 is a locus associated with CRC susceptibility polymorphisms that harbors MYC, an important proto-oncogene over-expressed in numerous tumors, and gains at this locus have been related to the presence of synchronous adenomas elsewhere in the colon 24,32-34. Finally, it is worth mentioning the locus 20q13, which has been related to worse prognosis in RC 35, and the locus 3q, whose gains have been correlated with some types of neoplasia, including anal cancers associated with human papillomavirus infection 36,37.
In our series, the subset of RC diagnosed at an early age also showed some specific features, such as a remarkable absence of MSI. In accordance with this, the EORC group revealed an absence of Lynch syndrome cases, although the familial cancer component underlying the group confirms family history as a risk factor for the development of CRC even without a known hereditary component 38. This group also displayed the highest rate of TP53 mutations and a lower rate of KRAS mutations. We have recently published the relationship between EOCRC and TP53 mutations in patients without associated polyps 39, proposing that TP53 mutations may be an indicator of worse prognosis, in line with what other authors have reported specifically for RC 40. Regarding KRAS mutations, our findings suggest that the young population might need different therapeutic approaches, and corroborate the variability in the mutation rate of this gene according to age and tumor location that other studies have also observed 41.
Apart from the already known fact that RC may be understood as a different entity from colon cancer, our findings suggest that there might even be different types of RC when criteria such as the age of diagnosis or the presence of multiple synchronous CRCs are considered. In this sense, in our series SRC appears to be different from other RCs. From a clinical point of view, SRC showed a higher tendency to develop malignant neoplasms in other colonic and extracolonic locations. Molecularly, SRC showed distinct features, which included a larger proportion of CIMP-High tumors as well as a greater number of genomic alterations, some of which seemed to be specific to this type of RC. Although EORCs also demonstrated their own features, they did so in a less pronounced way, the most striking findings being their greater familial component and the high rate of mutations in TP53.
Although the sample size of this study is limited, our findings may serve as a starting point for larger studies. Two points should be underlined regarding our findings: first, the consistent results that position SRC as a separate group of RC, since these tumors were mainly confined within two of the five groups obtained by clustering analysis; and second, the higher likelihood of developing future malignancies (including not only CRC but also other digestive tumors) in patients with multiple primary neoplasia, which may serve as another useful approach to this type of case. Further studies with larger sample sizes are recommended to confirm and substantiate our findings, especially regarding synchronous rectal tumors.