Introduction

Patients with ulcerative colitis (UC) are at an increased risk of developing and dying from colorectal cancer (CRC), which increases as a function of the extent, duration, and severity of inflammation of the colorectum [1]. The intensity of microscopic inflammation in colonic biopsy specimens has been implicated as another independent risk factor for the development of CRC [2, 3]. Furthermore, male sex, young age at UC onset and concomitant primary sclerosing cholangitis have been reported to increase CRC risk [4, 5].

Key phenotypic characteristics of UC-CRC, as opposed to sporadic CRC, include younger patient age at diagnosis, a higher fraction of carcinomas with mucinous and/or signet ring cell histology, and a higher incidence of synchronous or metachronous (precursor) lesions [6]. Although colonoscopic surveillance and more effective treatment options have reduced incidence rates of advanced UC-CRC, current estimates still indicate a 38% increased risk of developing CRC and a 25% higher stage-corrected mortality for UC patients compared to non-UC patients [7].

Despite advanced endoscopic imaging techniques, not all dysplastic lesions are endoscopically detectable and histologic identification of dysplasia is still a major factor for cancer surveillance in UC patients [8, 9]. However, microscopic appearance of dysplasia can be heterogeneous ranging from the “conventional” intestinal-type histology to so-called “non-conventional” morphological patterns such as hypermucinous and serrated lesions (see [10] for review) and stringent criteria for the identification and grading of dysplasia are lacking [11].

The multistep histologic progression from UC to dysplasia and ultimately invasive carcinoma is driven by the accumulation of genomic alterations, in analogy to the sporadic adenoma-carcinoma sequence, though differences in frequency and timing of genomic alterations exist [12, 13]. In silico modeling suggests that half or more of the somatic mutations present in a tumor may originate prior to tumor initiation, i.e., may occur in a histologically occult manner before the development of morphologically recognizable lesions [14]. In other words, a significant proportion of the genomic evolution of a tumor may take place in histologically normal appearing tissue without histologic evidence of dysplasia. Compared to sporadic CRC, UC-CRCs are characterized by higher frequencies of TP53 mutations (70% vs. 61%) and lower frequencies of KRAS (34% vs. 43%) and APC mutations (12% vs. 81%) [15,16,17,18,19]. TP53 mutations appear to be early events in UC-related colorectal carcinogenesis, in contrast to KRAS mutations, which have rarely been identified as founder events [20, 21]. In UC, the process of CRC development might also occur simultaneously and independently in multiple locations in the colon of UC patients as inferred by distinct mutational landscapes [22]. Another prevalent genomic characteristic of UC-CRC is aneuploidy, i.e., chromosomal copy number alterations (CNAs) [23, 24]. Aneuploidy can already be present in non-dysplastic, inflamed colonic epithelium, and has the potential to trigger dysplasia and/or cancer development [25, 26]. This indicates a role for chromosomal instability in UC-CRC development and a possibility to utilize the presence of CNAs as marker for early detection of histologically occult neoplastic clones. Despite genomic instability in cancer genomes [27], cancer cell populations as a whole faithfully maintain tumor-specific patterns of genomic imbalances, which are strongly influenced by the tissue of origin [28,29,30,31]. 
Also, etiology-driven distinctive genomic alterations have been reported such as MYC amplification in radiation-induced angiosarcomas [32]. Therefore, the landscape of CNAs in UC-CRC warrants investigation. However, unlike analyses of mutations by targeted (or exome) sequencing, only few studies have investigated genome-wide CNA patterns in UC-CRC [24, 33, 34]. In particular, studies using high resolution cytogenetic techniques such as array-based comparative genomic hybridization (aCGH) or low-pass whole genome sequencing are sparse [35].

Here, we analyzed formalin-fixed paraffin-embedded (FFPE) tissue samples from 19 patients with long-standing UC (median 18 years, range 3–34) who had developed CRC as a consequence of ongoing chronic inflammation of the large intestine. We performed microsatellite instability testing, copy number analysis by aCGH, mutation analysis by targeted next generation sequencing (48-gene panel) and TP53 immunostaining. The results suggest an etiology-driven entity-specific landscape of mutations and CNAs in UC-CRC, which is distinct from sporadic CRC.

Materials and methods

Patients and tissue samples

We collected FFPE tissue samples from 19 patients with UC-CRC diagnosed between 2003 and 2016 from the archive of the Institute of Pathology in Mannheim. All samples were bowel resection specimens, from either partial colectomy or proctocolectomy. UC-related etiology of a CRC was assumed if (i) the patient had long-standing UC at the time of CRC diagnosis, (ii) inflammation in the colon involved the large bowel segment in which the CRC was located, and (iii) inflammation and/or chronic inflammatory changes were visible in adjacent mucosa. All intestinal tissue samples were screened for inflammation, regenerative, i.e., inflammatory, changes, and dysplasia according to the histopathologic criteria defined by the Inflammatory Bowel Disease-Dysplasia Morphology Study Group [36]. In addition to the initial diagnosis, all samples were re-evaluated by two pathologists (DH, TG). Tumor staging was performed according to the current American Joint Committee on Cancer (AJCC)/Union for International Cancer Control staging system (8th edition) [37]. The study was approved by the local ethics committee of the Medical Faculty Mannheim of Heidelberg University (2016-819R-MA) and the National Institutes of Health (OHSRP#13229/MTA#41436).

DNA isolation

One to three representative blocks were selected per tumor, depending on tumor size and tumor block availability. DNA was isolated from 20 µm sections from the FFPE blocks after determining the tumor areas on H&E sections, avoiding foci of inflammation as well as foci of high mucin and low cellular content. Samples were deparaffinized in 300 µl of mineral oil (cat # 69794, Sigma-Aldrich, St. Louis, MO, USA) in a thermomixer at 800 rpm at 65 °C overnight before 100 µl of ATL lysis buffer (Qiagen, Hilden, Germany) and 20 µl of Proteinase K were added, followed by agitation at 450 rpm at 56 °C. After 8 h, another 20 µl of Proteinase K was added, followed by incubation overnight. When digestion of the tissue pieces was complete, 120 µl of the lysate, i.e., the aqueous phase, was transferred to a fresh 1.5 ml tube, treated with 2 µl RNAse A (100 mg/ml; Qiagen) for 5 min at room temperature and mixed with 600 µl PM buffer (Qiagen) and 10 µl of 3 M sodium acetate (pH 5.2; cat # R1181, Thermo Fisher Scientific, Waltham, MA, USA). Subsequent spin column purification was done using the QIAquick PCR Purification kit (Qiagen) according to the manufacturer’s protocol with minor modifications. DNA concentration and purity were measured by spectrophotometry (NanoDrop 1000 Spectrophotometer, NanoDrop products, Wilmington, DE). In addition, double-stranded DNA was quantified by a Qubit 3.0 fluorometer (Life Technologies, Thermo Fisher Scientific, Waltham, MA, USA) using the Qubit dsDNA BR (Broad Range) Assay Kit (Life Technologies). A Biomed2 multiplex-PCR assay of differently sized amplicons (targeting 400, 300, 200, and 100 bp fragments of AFF1, PLZF, RAG1, and TBXAS1, respectively) was performed to visualize DNA fragment size distribution [38].
In addition, amplifiability of the isolated FFPE DNA was assessed by a qPCR assay (Illumina FFPE QC Kit for TruSeqAmplicon—Cancer Panel, Illumina Inc., San Diego, CA, USA) and delta Cq values were determined according to the manufacturer’s instructions.

Microsatellite PCR

Microsatellite PCR was done using a panel of five mononucleotide markers (BAT25, BAT26, NR-21, NR-24, MONO-27; cf. MSI Analysis System, Promega, Madison, WI, USA), and a panel of two mononucleotide (BAT25, BAT26) and three dinucleotide markers (D5S346, D2S123, D17S250; so-called Bethesda panel) as described previously [39].

Array-based comparative genomic hybridization

Array-based comparative genomic hybridization (aCGH) was performed as previously described using ULS labeling (Agilent) and SurePrint G3 CGH 4X180K microarrays (Agilent) [40]. Data were visualized and analyzed using Nexus Copy Number software version 9.0 (BioDiscovery, Inc., El Segundo, CA, USA). Arm-level somatic copy number alterations were defined as a single alteration or an aggregate of alterations encompassing half or more (≥50%) of a chromosome arm [29].
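The arm-level definition above can be illustrated with a minimal sketch. It is not the Nexus Copy Number implementation: the segment representation, the function names `covered_length` and `is_arm_level`, and the toy coordinates are hypothetical and chosen only for illustration. The sketch assumes that all segments passed in are of the same alteration type (all gains or all losses) and lie on one chromosome.

```python
from typing import Iterable, Tuple

# Hypothetical segment representation: (start, end) in base pairs, end exclusive.
Segment = Tuple[int, int]


def covered_length(segments: Iterable[Segment], arm_start: int, arm_end: int) -> int:
    """Return the number of bases of the arm covered by the union of the segments."""
    # Clip each segment to the arm and drop segments that do not overlap it.
    clipped = sorted(
        (max(start, arm_start), min(end, arm_end))
        for start, end in segments
        if end > arm_start and start < arm_end
    )
    total = 0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # The previous merged interval is complete: count it, start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent segment: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def is_arm_level(segments: Iterable[Segment], arm_start: int, arm_end: int,
                 threshold: float = 0.5) -> bool:
    """True if the altered segments together span >= threshold of the arm."""
    arm_length = arm_end - arm_start
    if arm_length <= 0:
        raise ValueError("arm_end must be greater than arm_start")
    return covered_length(segments, arm_start, arm_end) / arm_length >= threshold


# Toy arm of length 100: three gained segments whose union covers 55 bases.
print(is_arm_level([(0, 30), (20, 45), (80, 90)], 0, 100))
```

Merging overlapping segments before summing avoids counting the same bases twice when an aggregate of alterations is evaluated against the 50% threshold.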

The Cancer Genome Atlas (TCGA) copy number data retrieval and processing

SNP array-based copy number data from The Cancer Genome Atlas (TCGA) colorectal adenocarcinoma cohort [41] were retrieved for cases, for which the histotype was known (intestinal-type versus mucinous) and which were microsatellite stable/non-hypermutated (n = 144; n = 129 intestinal-type and n = 15 mucinous). Detailed selection criteria have been described previously [39]. Copy number data were subsequently analyzed and visualized using Nexus Copy Number software version 9.0 (BioDiscovery, Inc., El Segundo, CA, USA).

Targeted next generation sequencing

Libraries of FFPE isolated DNA were prepared using the TruSeq Amplicon—Cancer Panel (Illumina). TruSeq Amplicon—Cancer Panel spans mutational hotspots in >35 kilobases (kb) of target genomic sequence in the following genes: ABL1, AKT1, ALK, APC, ATM, BRAF, CDH1, CDKN2A, CSF1R, CTNNB1, EGFR, ERBB2, ERBB4, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, JAK2, JAK3, KDR, KIT, KRAS, MET, MLH1, MPL, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RB1, RET, SMAD4, SMARCB1, SMO, SRC, STK11, TP53, VHL. The PhiX control library (Illumina) was spiked in each run at a final concentration of 10 pM to estimate the sequencing error rate. The pooled libraries were paired-end (2 × 151) sequenced on a MiSeq instrument (Illumina). The mean read depth for targeted regions (mean coverage) was 642X. Alignment and variant calling were done using the BaseSpace TruSeq Amplicon App, Version 3.0.0 (Illumina) using the somatic variant caller. Annotation was done based on RefSeqGene (https://www.ncbi.nlm.nih.gov/refseq/rsg/) and variants were further evaluated based on dbSNP 147 (NCBI; https://www.ncbi.nlm.nih.gov/projects/SNP/), COSMIC database (Catalogue of Somatic Mutations in Cancer; http://cancer.sanger.ac.uk/cosmic) and ClinVar database [42,43,44]. Only non-synonymous variants were considered. Specific filtering criteria included (1) passed variant caller filters, (2) read depth ≥10 with a fraction of alternative reads ≥0.1, (3) ExAC frequency <0.001%. Reads were visualized using the Integrative Genomics Viewer (IGV, Broad Institute) [45, 46]. Mutation data was used to determine the phylogenetic relationship between different tumor samples from individual patients as described previously [39, 47]. All phylogenetic trees were drawn with a common trunk, representing the normal, i.e., diploid, genome.
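The filtering criteria listed above can be expressed as a short sketch. This is not the BaseSpace pipeline: the `Variant` record, its field names, and the function `keep_variant` are hypothetical. The sketch stores the ExAC frequency in percent to match the threshold as written in the text, and it assumes that a variant absent from ExAC (frequency `None`) passes the population-frequency criterion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical minimal variant record used only for this illustration."""
    gene: str
    passed_caller_filters: bool
    depth: int                          # total read depth at the position
    alt_reads: int                      # reads supporting the alternative allele
    exac_freq_percent: Optional[float]  # ExAC frequency in percent; None if absent
    nonsynonymous: bool


def keep_variant(v: Variant, min_depth: int = 10, min_alt_fraction: float = 0.1,
                 max_exac_percent: float = 0.001) -> bool:
    """Apply the criteria from the text: non-synonymous, caller filters passed,
    read depth >= 10, alternative read fraction >= 0.1, ExAC frequency < 0.001%."""
    if not v.nonsynonymous or not v.passed_caller_filters:
        return False
    if v.depth < min_depth:
        return False
    if v.alt_reads / v.depth < min_alt_fraction:
        return False
    # Assumption: variants not listed in ExAC are treated as sufficiently rare.
    if v.exac_freq_percent is not None and v.exac_freq_percent >= max_exac_percent:
        return False
    return True


# Toy example with invented read counts: only the first record passes every criterion.
candidates = [
    Variant("TP53", True, 600, 180, None, True),
    Variant("KRAS", True, 8, 4, None, True),    # read depth too low
    Variant("APC", True, 500, 40, None, True),  # alternative fraction 0.08
]
print([v.gene for v in candidates if keep_variant(v)])
```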

Immunohistochemistry

Immunohistochemical staining of tumors and adjacent mucosa was performed as described previously [39]. Briefly, the following primary antibodies were used: MLH1 (1:25; clone ES05, cat # M3640, Dako, Agilent Pathology Solutions, Agilent), MSH2 (ready-to-use; clone FE11, cat # IR085, Dako), MSH6 (ready-to-use; clone EP49, cat # IR086, Dako), PMS2 (1:50; clone EP51, cat # M3647, Dako) and TP53 (1:50; clone DO-7, cat # M7001, Dako). Detection was done using the EnVision Detection System, Peroxidase/DAB, Rabbit/Mouse (cat # K5007, Dako). Immunohistochemical stainings were evaluated by two pathologists (DH, TG). Tumor samples lacking nuclear staining for one or more of the following four DNA mismatch repair proteins (MLH1, MSH2, MSH6, PMS2) were considered microsatellite unstable, while tumor samples with retained expression of all four markers in the tumor cells were considered microsatellite stable. For TP53 in carcinomas and dysplastic lesions, strong or completely absent nuclear staining of tumor cells was considered a mutant pattern, while weak and heterogeneous positivity of tumor cell nuclei was considered a wild-type pattern. In non-dysplastic mucosa, TP53 staining was classified according to nuclear staining intensity and distribution of positive cells as published by Sato et al. [48] and Noffsinger et al. [49]. Microscopy images were acquired with a digital slide scanner (MoticEasyScan One, Motic, Hong Kong).

Results

Clinical and pathologic characteristics of UC-CRCs

Our cohort comprised 19 patients with UC who had developed CRC (Table 1). UC patients were diagnosed with CRC at a relatively young age (median 45 years) and had a long history of UC (median 18 years). Four of the 19 patients had synchronous CRCs, resulting in a total of 24 UC-CRCs (Supplementary Table S1). The UC-CRCs were distributed relatively evenly throughout the colorectum (42% right hemicolon, 25% left hemicolon, 33% rectum). Mucinous and/or signet ring cell histology was observed in 7 of 24 (29%) of the UC-CRCs (Supplementary Fig. S1). All UC-CRCs in our cohort were microsatellite stable as assessed by immunohistochemistry and PCR, respectively.

Table 1 Clinico-pathologic characteristics of cases of UC-associated CRCs.

Landscape of chromosomal aneuploidies in UC-CRCs

To determine genome alterations that characterize UC-CRCs, we performed aCGH and sequence analysis of a panel of 48 cancer-related genes. All 23 UC-CRC samples that could be successfully analyzed by aCGH showed genomic imbalances (Fig. 1a). The fraction of copy number altered genome in UC-CRC was comparable to sporadic CRC based on microsatellite stable/non-hypermutated cases with comparable histology, i.e., intestinal or mucinous type (P = 0.51; Mann–Whitney U test), but significantly higher than in Crohn’s disease-associated CRC (CD-CRC) (P = 0.0012; Mann–Whitney U test) (Fig. 1b) [39]. The most common arm-level copy number gains mapped to chromosome arms 20q (17 of 23, 74%), 7p (14 of 23, 61%), 13q (13 of 23, 57%), 5p (12 of 23, 52%), 7q (10 of 23, 43%) and 8q (8 of 23, 35%) (Fig. 1c, Supplementary Fig. S2). Recurrent arm-level losses were observed on chromosome arms 18q (17 of 23, 74%), 22q (17 of 23, 74%), 17p (14 of 23, 61%), 5q (12 of 23, 52%), 8p (11 of 23, 48%), 4p (9 of 23, 39%), 4q (9 of 23, 39%) and 17q (9 of 23, 39%). Irrespective of histology (intestinal-type or mucinous), the gain of 5p (which harbors, e.g., TERT) was uncommon in sporadic CRC (21 of 144, 15%), making it a distinctive feature of inflammation-related CRC (FDR adjusted P = 0.006; Fisher’s exact test) (Supplementary Fig. S2). Also, losses of 5q (which harbors, e.g., APC) were less frequent in sporadic CRC (27 of 144, 19%) than in UC-CRC (FDR adjusted P = 0.04; Fisher’s exact test).
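The 5p comparison above (12 of 23 UC-CRCs versus 21 of 144 sporadic CRCs) can be illustrated with a minimal two-sided Fisher's exact test written with the Python standard library only. This is a sketch, not the analysis code used in the study: the function name `fisher_exact_two_sided` is hypothetical, and the value it returns is an unadjusted P value, so it is not expected to match the FDR-adjusted values reported in the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Rows: UC-CRC, sporadic CRC; columns: 5p gain present, 5p gain absent.
p_5p = fisher_exact_two_sided(12, 11, 21, 123)
print(f"unadjusted P for 5p gain: {p_5p:.2g}")
```

Correction for multiple testing across chromosome arms (the FDR adjustment mentioned in the text) would be applied to the set of such unadjusted P values and is not shown here.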

Fig. 1: Extent and pattern of genome-wide copy number alterations in ulcerative colitis–associated colorectal carcinomas.

a Per sample analysis of CNAs in UC-CRCs, represented as heatmap showing arm-level gains (red) and losses (blue). All UC-CRCs were microsatellite stable. b Fraction of copy number altered genome of UC-CRCs compared to CRCs (both intestinal-type and mucinous) from the TCGA cohort [41] and CD-CRCs from our previous study [39]. All samples included in this comparison were microsatellite stable/non-hypermutated. **P ≤ 0.01; Mann–Whitney U test. c Cumulative copy number frequencies for UC-CRCs compared to CRCs from the TCGA cohort. All samples shown are microsatellite stable/non-hypermutated. Numbers below the graph (x-axis) denote chromosomes, frequency (y-axis) denotes the proportion of samples with a gain or a loss at the respective chromosomal position. CD Crohn’s disease, CNA copy number alteration, CRC colorectal carcinoma, UC ulcerative colitis.

Landscape of somatic mutations in UC-CRCs

Our panel-based sequencing analysis revealed TP53 as the most frequently mutated gene, with mutations in 87% (20 of 23) of UC-CRCs (Fig. 2a). APC, PIK3CA, KRAS, and SMAD4 mutations were each detected in 22% (5 of 23) of UC-CRCs. Mutations in other genes were detected in less than 10% of samples. Of note, one sample harbored an IDH1 mutation (p.R132C). The number of mutations per tumor varied from 1 to 6 (median: 2 mutations per tumor). The majority of TP53 mutations were missense mutations, which were located predominantly in the DNA binding domain of the protein (Supplementary Fig. S3). In line with the sequencing results, tumors with TP53 missense mutations showed strong nuclear positivity for TP53, while a tumor with a TP53 truncating mutation presented with a complete absence of staining (Fig. 2b). Taken together, the mutational landscape of UC-CRC was dominated by TP53 mutations while mutations in other genes were rare.

Fig. 2: Mutational landscape of ulcerative colitis–associated colorectal carcinomas.

a “Oncoprint” showing all genes, in which mutations could be detected. Each row represents a clinicopathologic feature or gene, each column denotes an individual carcinoma sample. b Immunohistochemical staining of TP53 confirms TP53 mutation status. Abnormal overexpression, i.e., strongly intense staining in tumor cell nuclei (UC01CA, UC02CAis, UC07CA, UC18CA), or abnormal complete absence of expression within tumor cell nuclei (UC04CA) indicates TP53 mutation. AJCC American Joint Committee on Cancer, CRC colorectal cancer, UC ulcerative colitis.

Tumor evolution in ulcerative colitis

Using the four patients with synchronous UC-CRC from our cohort as an example, we aimed to gain further insight into the evolution of synchronous UC-CRCs and potential underlying field defects. In addition to genome-wide copy number profiling and targeted sequencing, we performed TP53 immunostaining of the transition zone from carcinoma to adjacent colorectal mucosa and mucosa samples from other colonic sites if available (Supplementary Fig. S4). This allowed us to confirm the carcinomas’ TP53 mutational status and to evaluate the presence of TP53 aberrant fields in adjacent and spatially separated colonic mucosa.

Interestingly, eight of nine synchronous UC-CRCs from the four individual patients had a mutation in TP53 though the specific TP53 mutations of the respective tumors were distinct with the exception of patient UC08 (Supplementary Fig. S5). This patient (44-year-old female, exact UC duration not known) had two carcinomas located in the transverse colon (Fig. 3). Both carcinomas shared an identical truncating mutation in TP53 (p.R213*), indicating that the two carcinomas emerged from the same mutant field. Carcinoma 1 (UC08CA1) subsequently acquired another TP53 mutation, this time as a missense mutation (p.R273H), which led to a nuclear accumulation of TP53. Nuclear accumulation could be visualized not only in the carcinoma itself but also in adjacent dysplasia (with villous transformation) and inflamed mucosa, again indicating tumor progression through field cancerization. Carcinoma 2 (UC08CA2) did not acquire another TP53 mutation but instead a mutation in APC. In contrast to patient UC08, carcinomas from the other patients with synchronous UC-CRCs (UC05, UC10, UC12) did not share any mutation, indicating multifocal disease and independent tumor progression, despite prevalent convergent evolution for mutations in TP53 (Supplementary Fig. S5). For instance, patient UC05 (41-year-old male, UC duration 21 years) harbored three carcinomas in the right hemicolon (Fig. 4). Two of the three carcinomas displayed a TP53 mutation (UC05CA1: p.P177L, UC05CA2: p.V197E) along with other mutations involving the TGF-beta, PI3K and RTK/RAS signaling pathways. In contrast, carcinoma 3 (UC05CA3) was solely characterized by an APC mutation targeting the WNT signaling pathway. Patient UC10 (45-year-old female, UC duration 26 years) had two carcinomas located in the rectum and sigmoid colon which both were TP53-mutated though the mutations were distinct (Fig. 5). 
While carcinoma 2 (UC10CA2) apparently emerged from a preexisting inflamed and dysplastic TP53 mutant field as revealed by histology and TP53 immunostaining, carcinoma 1 (UC10CA1) appeared to have developed de novo (i.e., without preceding dysplasia) or, alternatively, might have grown fast and rapidly displaced its original localized dysplastic field. The latter would be consistent with the aggressive histologic phenotype of a primarily solid growth pattern and an invasive margin of the infiltrative type with tumor budding. Patient UC12 (58-year-old male, UC for 33 years) also harbored two carcinomas with distinct TP53 mutations in the rectum and rectosigmoid junction 1.2 cm apart from each other (Fig. 6). In carcinoma 2 (UC12CA2), small remnants of the original localized TP53 mutant field could be identified at the transition zone from carcinoma to adjacent mucosa, while in carcinoma 1 no abnormal TP53 staining was seen in the adjacent mucosa. Of note, 3.5 cm distal from carcinoma 2 (UC12CA2), another TP53 altered field was revealed by TP53 immunostaining, which had no direct contact with either carcinoma 1 or 2. This highlights the prevalence of multifocal tumor progression in UC patients.

Fig. 3: Tumor heterogeneity: spatial and inferred phylogenetic relationship of lesions of patient UC08.

a (Left) The anatomic location of the analyzed samples is depicted along the colon frame. Selected somatic mutations and genome-wide copy number alterations are displayed with the respective histology. The outer circle indicates arm-level chromosomal gains (red) and losses (blue); chromosomes are labeled at the centromere. (Right) Corresponding inferred phylogenetic relationship of analyzed lesions. The different colors in the circles correspond to the color of the lesions on the left part. b Representative images of histology (H&E, left) and TP53 immunohistochemistry (right) of carcinoma 1 with adjacent dysplasia with villous architecture and inflamed mucosa, showing a ‘nested/diffuse pattern’ characterized by strongly positive nuclei aggregated in most areas of the crypts, sometimes confined to the basal half of the crypts, indicating TP53 mutation. Colors correspond to panel (a). CA carcinoma, DYS dysplasia, INF inflamed colonic mucosa.

Fig. 4: Tumor heterogeneity: spatial and inferred phylogenetic relationship of lesions of patient UC05.

a (Left) The anatomic location of the analyzed samples is depicted along the colon frame. Selected somatic mutations and genome-wide copy number alterations are displayed with the respective histology. The outer circle indicates arm-level chromosomal gains (red) and losses (blue); chromosomes are labeled at the centromere. (Right) Corresponding inferred phylogenetic relationship of analyzed lesions. The different colors in the circles correspond to the color of the lesions on the left part. b Representative images of histology (H&E, left) and TP53 immunohistochemistry (right) of inflamed, non-dysplastic colonic mucosa from the left hemicolon, showing a “sporadic/scattered pattern” with only a few positive nuclei dispersed/focused in the crypts, indicating TP53 wild-type status. Arrows (→) indicate Paneth cells. CA carcinoma.

Fig. 5: Tumor heterogeneity: spatial and inferred phylogenetic relationship of lesions of patient UC10.

a (Left) The anatomic location of the analyzed samples is depicted along the colon frame. Selected somatic mutations and genome-wide copy number alterations are displayed with the respective histology. The outer circle indicates arm-level chromosomal gains (red) and losses (blue); chromosomes are labeled at the centromere. (Right) Corresponding inferred phylogenetic relationship of analyzed lesions. The different colors in the circles correspond to the color of the lesions on the left part. b Representative images of histology (H&E, left) and TP53 immunohistochemistry (right) of carcinoma 1 and 2 with adjacent mucosa, showing a “diffuse pattern” characterized by strongly positive nuclei aggregated in most areas of the crypts, indicating TP53 mutation and emergence of invasive carcinoma 2 from an associated TP53 aberrant field. Colors correspond to panel (a). CA carcinoma, DYS dysplasia.

Fig. 6: Tumor heterogeneity: spatial and inferred phylogenetic relationship of lesions of patient UC12.

a (Left) The anatomic location of the analyzed samples is depicted along the colon frame. Selected somatic mutations and genome-wide copy number alterations are displayed with the respective histology. The outer circle indicates arm-level chromosomal gains (red) and losses (blue); chromosomes are labeled at the centromere. (Right) Corresponding inferred phylogenetic relationship of analyzed lesions. The different colors in the circles correspond to the color of the lesions on the left part. b Representative images of TP53 immunohistochemistry of carcinoma 1 and 2 with adjacent mucosa, showing a ‘diffuse pattern’ characterized by strongly positive nuclei aggregated in most areas of the crypts, indicating TP53 mutation; remnants of single dysplastic crypts with abnormal TP53 staining indicate emergence of carcinoma 2 from a TP53 aberrant field. c Rectal mucosa with aberrant TP53 staining characterized by a ‘diffuse pattern’ with strongly positive nuclei aggregated in most areas of the crypts, however, without direct contact to carcinoma 1 or 2. CA carcinoma, DYS dysplasia, INF inflamed mucosa.

Similarly, we looked for field defects based on TP53 immunostaining in patients with single UC-CRCs (UC02, UC03, UC06, UC07, UC09, UC18). In all of these patients except for patient UC09, we could also observe gland formations with aberrant TP53 staining patterns in the colonic mucosa in spatial proximity to the invasive carcinoma (Supplementary Figs. S6–S11). Aberrant TP53 staining presented either as a ‘nested’ pattern, i.e., an aggregation of moderately to strongly positive cells confined to the basal half of the glands, or as a ‘diffuse’ pattern, i.e., the presence of strongly positive cells in most areas of the glands, as described by Sato [48] and Noffsinger [49]. Taken together, we detected aberrant TP53 staining in tumor-adjacent colonic mucosa with or without dysplasia in eight of ten (80%) UC-CRC patients from our cohort. Unfortunately, for the other nine patients no tissue samples from non-dysplastic, inflamed mucosa were available for immunohistochemical analysis.

Discussion

While genomic alterations during carcinogenesis have been thoroughly studied in sporadic CRC, less is known about genomic alterations in UC-CRC development. In contrast to sporadic CRC, which typically arises from a well-circumscribed polypoid adenoma, UC-CRC usually develops from flat dysplastic lesions with ill-defined margins in a background of inflammation accompanied by epithelial regeneration, scarring and pseudopolyposis [12]. The macroscopic and microscopic heterogeneity of inflammatory and reactive changes complicates endoscopic detection and histological diagnosis of potential UC-CRC precursors or early malignant lesions [8, 9]. Several molecular (bio)markers have been proposed for early detection of neoplastic processes in UC including TP53 mutations [50], nuclear aneuploidy [25], chromosomal copy number gains and losses (i.e., chromosomal aneuploidies) [26], telomere shortening [51], epigenetic alterations [52] and microRNA alterations [53]. However, as of now, none of these markers is routinely used in the clinic and a deeper knowledge of the molecular alterations underlying UC-CRC development is needed to refine potentially useful molecular diagnostic markers.

Here, we applied high-resolution aCGH and targeted sequencing to analyze the genomic landscape of UC-CRCs. In addition, TP53 immunostaining was used to visualize the extent of morphologically occult tumor evolution.

While the presence of nuclear aneuploidy in non-dysplastic colonic mucosa from UC patients has been associated with an increased risk of progression to UC-CRC [25], specific CNAs or CNA burden might allow an improved risk assessment. The advent of molecular cytogenetics such as CGH [54, 55], spectral karyotyping [56] and multiplex fluorescence in situ hybridization [57] revealed that most solid tumors do not only display nuclear aneuploidy, but moreover, are characterized by tumor-type specific non-random distributions of chromosomal aneuploidies, despite ongoing genomic instability [28, 30]. Chromosomal aneuploidies, i.e., losses and gains of chromosomes or chromosome arms, were present in all UC-CRC samples of our cohort. Overall, the distribution of arm-level CNAs resembled the spectrum in sporadic CRC and chromosomal aneuploidies mainly clustered on the same chromosomes as described for sporadic CRC [41, 58], although some differences in the frequency of individual CNAs were observed. Specifically, the gain of chromosome arm 5p was more prevalent in UC-CRC (12 of 23, 52%), often combined with a loss of 5q. In contrast, a gain of chromosome arm 5p was rarely observed in sporadic CRC (21 of 144, 15%) and the gain of 5p appears to be a distinctive feature of inflammation-related colorectal carcinogenesis. The gain of 5p is prevalent in both UC- and CD-related carcinogenesis and can already be detected in dysplastic lesions with reported frequencies of up to about 50% [15, 35, 59] and can sometimes even be observed in non-dysplastic mucosa from IBD patients [39]. Sporadic adenomas, on the other hand, not only have a lower CNA burden compared to IBD-related dysplastic lesions but, in particular, virtually never display gains of chromosome arm 5p (<5%) [35, 60, 61].
Therefore, the tumor type specific non-random chromosomal aneuploidy patterns in tumors appear not only to be influenced by the tissue of origin [30, 31, 62] but also by the mode of tumor induction, i.e., disease etiology, such as chronic inflammation or radiation, and metabolic stress [63]. In line with older data on nuclear aneuploidy [25], preliminary low-pass whole genome sequencing data indicate that CNAs appear to be a potent predictor of UC progression risk from low-grade dysplasia to invasive carcinoma [64].

In addition to chromosomal aneuploidies, another frequently encountered genetic alteration in UC progression to UC-CRC is mutation of TP53. Our genetic profiling of UC-CRC development confirmed TP53 mutations as a key event of chronic inflammation-driven colorectal carcinogenesis occurring in 87% (20 of 23) of UC-CRC samples. This is in concordance with previous next generation sequencing based publications reporting TP53 mutation frequencies in UC-CRC of 57% (n = 7) [15], 63% (n = 16) [16], 53% (n = 15) [17], and 86% (n = 29) [18], respectively. Comparable TP53 mutation frequencies have also been observed in CD-CRC with 100% (n = 3) [15], 88% (n = 8) [16], 76% (n = 29) [39], 69% (n = 13) [17] and 94% (n = 18) [18], respectively. Mutations in other genes were infrequent in both UC-CRC and CD-CRC. Therefore, mutational alterations in UC and CD centered on the TP53 pathway while other pathways such as the WNT (e.g., APC, FBXW7, CTNNB1), TGFβ (e.g., SMAD4), and RTK/RAS (e.g., KRAS, BRAF) were rarely affected.

Aneuploidy is a widespread phenomenon in IBD-related carcinogenesis and can sometimes be observed in colonic mucosa before histologically evident dysplasia [25]. Furthermore, it is closely correlated with TP53 mutation [50], and we presume that selection for TP53 alterations is driven by the disruption of the G1 checkpoint upon TP53 inactivation, which permits survival of cells with chromosomal aneuploidies [65, 66].

Multifocal neoplasia is a common phenomenon in UC-CRC with ~38% of patients having synchronous CRCs and 16% having a spatially distinct dysplastic lesion in the colon resection specimen [67]. In our cohort, four of 19 patients (21%) had synchronous UC-CRCs. Although we only provide a limited picture of the genetic make-up based on panel sequencing, our data indicate that synchronous UC-CRCs from individual patients display a considerable genetic heterogeneity. In only one of the four patients, synchronous UC-CRCs shared a common ancestral mutation even though in all four patients the synchronous tumor lesions were located within the same or within neighboring colon segments, i.e., in relatively close spatial proximity. Despite the lack of a common ancestral mutation in three of the four patients, synchronous UC-CRCs showed a predilection for mutations in TP53, which were detected in eight of nine synchronous UC-CRCs. To summarize, we largely observed two different evolutionary patterns of genetic CRC progression in UC, which resemble our previous results from CD [39]. While some patients develop large precancerous fields with a common ancestral mutation (field cancerization) from which tumors can emerge, other patients develop multifocal tumors without shared mutations (tumor progression through clonal mosaicism). Despite multifocality, we observed a strong convergent evolution for TP53 mutations shaping the genomic alterations in UC-CRC in addition to chromosomal aneuploidies, underlining the critical role of TP53 in UC-related colorectal carcinogenesis. Interestingly, synchronous CRCs not associated with IBD show more variation in terms of affected driver genes, most frequently affecting APC, KRAS, TP53, and PIK3CA, and molecular alterations of lesions from the same patient are usually distinct, indicating independent development of synchronous non-IBD-related CRCs [68,69,70].

Immunostaining for TP53 confirmed the TP53 mutation status of UC-CRCs. In addition, TP53 immunostaining allowed screening for the presence of potential histologically occult tumor clones in non-dysplastic mucosa. An aberrant TP53 staining pattern according to Sato/Noffsinger [48, 49] was observed in colonic mucosa without dysplasia or with low-grade dysplasia in eight of ten UC-CRC patients from whom tissue samples were available for immunostaining. Among UC patients who underwent immediate colectomy due to dysplasia, the chance of identifying an endoscopically undetected invasive UC-CRC is about 20–25% for low-grade and 40–50% for high-grade dysplasia [67, 71]. This highlights the multifocality of CRC development in UC, presumably as a consequence of field cancerization and increased genomic instability in the colonic mucosa of these patients.

In conclusion, our data show that inflammatory bowel disease (IBD)-related CRC, whether in a background of UC or CD, follows similar oncogenetic pathways. The major genomic alterations are TP53 mutations along with cancer-type specific chromosomal aneuploidies, including the gain of chromosome arm 5p, which is prevalent in inflammation-related colorectal carcinogenesis. As described above, mounting evidence demonstrates that, during IBD progression to cancer, molecular changes precede the morphological changes associated with dysplasia. This offers the chance to improve current diagnostic algorithms and may assist in planning colonoscopic surveillance intervals better than conventional histology alone [72]. However, how these molecular field defects may best be detected in UC patients remains to be established. In addition to UC-CRC-typical early genomic alterations (e.g., TP53) [73], more neutral markers identifying clonal expansions, such as polyguanine tract genotyping, have been proposed [74]. In the foreseeable future, whole genome sequencing will provide a comprehensive picture of both random and selected genomic alterations and of the clonal expansions underlying UC progression to cancer. For now, though not perfect, TP53 immunohistochemistry can be a useful, broadly available, easy-to-implement tool in routine diagnostics to support histopathological discrimination between UC-associated neoplasia and inflammatory regenerative colonic epithelium and to determine the extent of potential histologically occult neoplastic fields.