Genomic instability genes in lung and colon adenocarcinoma indicate organ specificity of transcriptomic impact on Copy Number Alterations

Rao, Chinthalapally V.; Xu, Chao; Zhang, Yuting; Asch, Adam S.; Yamada, Hiroshi Y.

doi:10.1038/s41598-022-15692-8

Article
Open access
Published: 11 July 2022

Scientific Reports volume 12, Article number: 11739 (2022)

Abstract

Genomic instability (GI) in cancer facilitates cancer evolution and is an exploitable target for therapy. However, specific genes involved in cancer GI remain elusive. Genes that cause GI through altered expression have not been comprehensively identified in colorectal cancers (CRCs). To fill this gap in knowledge, we developed a data mining strategy (Gene Expression to Copy Number Alterations; “GE-CNA”). Here we applied the GE-CNA approach to 592 TCGA CRC datasets and identified 500 genes whose expression levels associate with CNA. Among these, 18 were survival-critical (i.e., expression levels correlate with significant differences in patients’ survival). Comparison with previous results indicated striking differences between lung adenocarcinoma and CRC: (a) less involvement of overexpression of mitotic genes in generating genomic instability in the colon and (b) the presence of CNA-suppressing pathways, including immune-surveillance, was only partly similar to those in the lung. The following 13 genes (TIGD6, TMED6, APOBEC3D, EP400NL, B3GNT4, ZNF683, FOXD4, FOXD4L1, PKIB, DDB2, MT1G, CLCN3, CAPS) were evaluated as potential drug development targets (hazard ratio [> 1.3 or < 0.5]). Identification of specific CRC genomic instability genes enables researchers to develop GI-targeting approaches. The new results suggest that the “targeting genomic instability and/or aneuploidy” approach must be tailored for specific organs.

Introduction

Genomic instability in cancer affects cancer development and evolution, causing drug resistance and poor prognosis, and thus affects therapy outcomes in the clinic^1,2,3. Hence, the “targeting genomic instability and/or aneuploidy for cancer therapy” concept has been proposed⁴. For contemporary targeted drug development, genomics information is critical⁵. Although some signatures for genomic instability in select organs have been identified [e.g.,⁶], genes involved in genomic instability in cancer have been elusive, preventing researchers from designing specific agents for targeted therapies. Gene expression analysis of pan-cancer datasets indicated that increases in mitotic signatures and decreases in immune signatures were characteristics of high-CNA cancers⁷, suggesting roles for mitotic mis-regulation in generating CNA and for immune functions in antagonizing cancer cells with CNA. Although the notion of immunosurveillance of genomic instability and aneuploidy has long been proposed, few of the genes involved have been identified and the molecular mechanisms remain to be determined^8,9.

Results with transgenic mouse models from our and other laboratories have indicated dual effects of genomic instability in the body on cancer, for both tumor suppression and oncogenesis^10,11. Mitosis-targeting genomic instability models (Chromosome instability [CIN] models; e.g., Mad2, BubR1, Sgo1) have demonstrated the role of genomic instability as a disease modifier, resulting in tumor proneness in organs including the colon, lung, and liver later in life^{12,13,14,15,16,17}. Although genomic instability is prevalent in most solid tumors, based on the tumor profile in genomic instability transgenic mice, we hypothesized that genomic instability has prominent effects for cancer development and/or disease modification in the colon, liver, and lung¹⁸. To identify specific genes involved in genomic instability in human lung adenocarcinoma, we developed a novel data mining strategy, GE-CNA, which is an approach to identify all genes whose expression associates with increased or decreased tumor CNA¹⁸. Pathway analysis revealed that (a) amplification/insertion CNA is facilitated by over-expressions of DNA replication stressors and suppressed by a broad range of immune cells (T-, B-, NK-cells, leukocytes), and (b) deletion CNA is facilitated by over-expressions of mitotic regulator genes and suppressed predominantly by leukocytes guided by leukocyte extravasation signaling. Among the 39 CNA- and survival-associated genes, purine metabolism (PPAT, PAICS), immune-regulating CD4-LCK-MEC2C and CCL14-CCR1 axes, and ALOX5 emerged as survival-critical pathways. These pathways/genes are potential therapy drug targets for lung adenocarcinoma¹⁸.

Building on the lung cancer results, we continued the GE-CNA analysis with cancers of the liver and colon, anticipating that a similar gene profile, and thus common genes for targeting genomic instability, would emerge. Because naturally occurring polyploidization in the liver complicates the CNA datasets and their analysis, we focused on colon cancer. In the United States, colorectal cancer (CRC) is expected to cause about 52,580 deaths during 2022, and is the second most common cause of cancer deaths when cancer deaths for men and women are combined¹⁹. Thus, CRCs remain a major target for prevention and therapy development. In CRCs, tumor development is associated with progressive mutational accumulation, as indicated in the “Vogelgram”²⁰. Functional analysis of the frequently mutated genes indicated that mutations in each of these genes (e.g., APC, TP53, FBXW7/hCDC4, PI3K-PTEN, K-RAS) can cause genomic instability, directly or indirectly²¹. Thus, a part of genomic instability in CRCs is linked to mutations in key oncogenic/tumor-suppressing genes. In addition, epigenetic modulations, environmental challenges from microbiota, and transcriptomic and microRNA changes, which are also suggested to affect genomic instability, were reported [e.g.,^{22,23,24,25,26}]. Among these events impacting genomic instability, transcriptomic alterations, especially over-expressions, are the most feasible to manipulate with drugs, while restoring mutated genes is technically difficult. However, transcriptomic alterations associated with genomic instability in CRCs have not been comprehensively identified, and our understanding of the impact of the transcriptomic landscape on genomic instability in CRCs remains incomplete. Hence, we set out to apply the GE-CNA data mining approach to identify genes and pathways involved in genomic instability in CRCs via transcriptomic mis-regulations.

Materials and methods

GE-CNA analysis

We downloaded the Colorectal Adenocarcinoma (TCGA, PanCancer Atlas, 2018) datasets from cBioportal (https://www.cbioportal.org/study/summary?id=coadread_tcga_pan_can_atlas_2018)^27,28, a publicly available database. All following methods were carried out in accordance with relevant guidelines and regulations. The datasets included survival and clinical data for 594 patients. Among these patients, we also collected the available gene expression profiles and copy number alterations of 592 patients, and the whole exome sequencing (WES) mutation profiles of 528 patients. The batch normalized gene expression Z-scores by RSEM²⁹ from Illumina HiSeq_RNASeqV2 were used. The downloaded copy-number alteration (CNA) was estimated by GISTIC 2.0³⁰. Neutral or no change CNA was indicated by 0. Gain/amplification CNA was indicated by a positive value, while a negative value indicated deletion CNA. Amplification CNAs and deletion CNAs were analyzed jointly and separately.
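
As an illustration of the sign convention just described, per-subject CNA counts could be tallied as in the minimal Python sketch below. This is not the code used in the study (the analyses were implemented in R). The function name `cna_counts` is hypothetical, and counting each non-zero gene-level call as one CNA is an assumption made only for this example.

```python
def cna_counts(gistic_values):
    """Count amplification, deletion, and total CNA calls for one subject.

    gistic_values: per-gene GISTIC-style calls, where 0 means neutral,
    a positive value means gain/amplification, and a negative value
    means deletion (the sign convention described in the text).
    Assumption: every non-zero call counts as one CNA.
    """
    values = list(gistic_values)  # allow any iterable, including generators
    amplification = sum(1 for v in values if v > 0)
    deletion = sum(1 for v in values if v < 0)
    return {
        "amplification": amplification,
        "deletion": deletion,
        "total": amplification + deletion,
    }


# Toy subject with seven genes (values are illustrative, not study data).
toy_subject = [0, 1, -1, 2, 0, -2, 1]
print(cna_counts(toy_subject))
```

Keeping the three tallies side by side mirrors the statement that amplification and deletion CNAs were analyzed both jointly and separately.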

The gene expression file contained 20,471 genes for 592 subjects. We excluded 3073 genes that were missing in more than 1/3 of subjects. The included genes were complete in all subjects. For each gene, we sorted all subjects by expression of that gene and selected the top 10 and bottom 10 subjects. The selected subjects were assigned to a high expression group and a low expression group, accordingly. Next, we extracted the CNA counts of the subjects in the high and low expression groups from the CNA file. Student’s t-test was used to examine the difference in CNA counts between the high group and the low group. Multiple testing was corrected using the q-value³¹. The significance level was 0.05.
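
The per-gene comparison described above can be sketched in Python as follows. This is a hedged illustration, not the authors' R implementation: the function names, the toy data, and the group size `n` are assumptions, and only the pooled-variance Student's t statistic and its degrees of freedom are computed (a p-value would still have to be obtained from the t distribution, and the q-value correction is not shown).

```python
from math import sqrt
from statistics import mean, variance


def extreme_groups(expression, n):
    """Return (high_ids, low_ids): the n subjects with the highest and
    lowest expression of one gene. `expression` maps subject id -> value."""
    ranked = sorted(expression, key=expression.get)  # ascending by expression
    return ranked[-n:], ranked[:n]


def student_t(x, y):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2


def ge_cna_stat(expression, cna_count, n):
    """Compare CNA counts between the high- and low-expression groups of one gene."""
    high, low = extreme_groups(expression, n)
    return student_t([cna_count[s] for s in high], [cna_count[s] for s in low])


# Toy data for one hypothetical gene (not study data).
expression = {"s1": -2.0, "s2": -1.5, "s3": 0.1, "s4": 1.8, "s5": 2.2}
cna_count = {"s1": 10, "s2": 14, "s3": 30, "s4": 40, "s5": 44}
t_stat, dof = ge_cna_stat(expression, cna_count, n=2)
print(t_stat, dof)
```

In this sketch a positive t statistic means that the high-expression group carries more CNAs, the direction the text associates with a CNA facilitator; a negative value corresponds to the suppressor direction.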

Further, we divided the significant genes into two groups: higher expression that resulted in more CNAs and higher expression that resulted in fewer CNAs. We employed the bioinformatics tool IPA (Ingenuity Pathway Analysis, QIAGEN, Inc., https://www.qiagenbioinformatics.com/products/ingenuity-pathway-analysis) to conduct the gene set enrichment analyses³². The Benjamini–Hochberg corrected p-value³³ provided by IPA was reported and evaluated at the significance level of 0.05. Also, we presented the pathway graphs from IPA.
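
For readers unfamiliar with the Benjamini–Hochberg adjustment mentioned above, a minimal Python sketch of the standard procedure is given below. It is illustrative only: in the study the corrected values were provided by IPA, and the gene-level correction used the q-value method, which is a different procedure. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four pathways.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

With these toy inputs only the first pathway stays below the 0.05 threshold after adjustment, although three of the four raw p-values were below 0.05.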

The association between gene alteration and overall survival was examined with the Cox Proportional-Hazards (CoxPH) model. All available variables (age, sex, race, and tumor stage) were considered as covariates; age and tumor stage were selected for adjustment because their univariate CoxPH p-values were < 0.05. The race groups with small numbers of patients were combined, so the race variable analyzed in the CoxPH model had two levels: White and Other. The sub-levels within each of tumor stages 1 to 4 were combined, which resulted in four levels used in the analysis. We excluded patients with incomplete data. The Hazard Ratio (HR) and p-value of each gene were reported. The definitions of “altered” and “unaltered” subjects were from cBioportal. Briefly, an altered subject was a subject having any type of high-level CNA amplification, CNA homozygous deletion, or WES mutation. Otherwise, a subject was considered an unaltered subject. We compared gene expression levels between the altered and unaltered groups using the Wilcoxon rank sum test. The significance level was 0.05. We presented the survival curves and boxplots by altered/unaltered group. We implemented all statistical analyses using R (v4.0.3) and R packages.

The major reason for using only the extreme high and low gene expression groups is to increase statistical power by enriching the presence, and increasing the effect size, of the causal genetic factors. A sample of 592 is not large enough to split, so we used all samples to maximize the study power.

To gauge the magnitude of an HR, we used the following approximate benchmarks for HRs comparing two groups: small (not trivial, but possibly inconsequential), about 1.3; medium (likely consequential), about 1.9; and large (very likely consequential), about 2.8³⁴.
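
These benchmarks can be turned into a simple labelling rule, sketched in Python below. The sketch rests on two assumptions that are not stated in the text: the three approximate values are treated as lower cut points, and a protective HR (below 1) is judged by its reciprocal. The function name `hr_magnitude` is hypothetical.

```python
def hr_magnitude(hr):
    """Label a hazard ratio using the approximate 1.3 / 1.9 / 2.8 benchmarks.

    Assumptions (not from the source): the benchmarks act as lower cut
    points, and HRs below 1 are compared via their reciprocal.
    """
    if hr <= 0:
        raise ValueError("hazard ratio must be positive")
    effect = hr if hr >= 1 else 1 / hr
    if effect >= 2.8:
        return "large"
    if effect >= 1.9:
        return "medium"
    if effect >= 1.3:
        return "small"
    return "below small"


# Illustrative inputs only.
for value in (4.55, 1.957, 1.426, 1.058, 0.455):
    print(value, hr_magnitude(value))
```

Under these assumptions an HR of 0.455 is labelled the same way as an HR of about 2.2, since the two describe effects of equal size in opposite directions.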

Availability of data and materials

We obtained original tumor data from the cBioportal (https://www.cbioportal.org/study/summary?id=coadread_tcga_pan_can_atlas_2018)^27,28, which is a publicly available database. The data were openly available for download. Main data generated or analyzed during this study are included in this published article and its supplementary information files. All the datasets used and/or analyzed during the current study will be available from the corresponding author on reasonable request.

Results

We applied GE-CNA to 592 CRCs in the TCGA database (Fig. 1). Supplementary Table 1 shows 247 genes whose high expression associates with high tumor CNA, and which are thus annotated as CNA facilitators. Functional annotation and pathway analysis indicated that (i) the genes are functionally diverse and (ii) there was no statistically significant enrichment (corrected P < 0.05) of a specific pathway. The lack of specific enrichment is a major difference from the previous results from lung adenocarcinoma, which showed enrichment of mitotic regulators and DNA replication pathways¹⁸.

Supplementary Table 2 shows 253 genes whose high expression associates with low tumor CNA, and thus are annotated as CNA suppressors. The enriched pathways (corrected P < 0.05) were: Interferon Signaling (BAK1, BCL2, IFIT3, IFNG, JAK2, STAT2), Antigen Presentation Pathway (CLIP, MHC II-alpha), Heme Biosynthesis II (ALAS1, CPOX, FECH), Natural Killer Cell Signaling (HSPA5, IFNG, IL15, JAK2, KIR2DL4, MAP2K1, MTOR, NCR1, ULBP3), Retinoic acid Mediated Apoptosis Signaling (TRAIL-R, PARP), JAK/Stat Signaling (JAK2, MAP2K1, MTOR, PIAS2, SOCS6, STAT2), Glucocorticoid Receptor Signaling (HSP90, HSP70, NCOR, TFIIA, OXPHOS), Heme Biosynthesis from Uroporphyrinogen-III I (CPOX, FECH), and Glutathione Redox Reactions II (GSR, PDIA3) (Fig. 2. pathway analysis of CNA suppressors). The functions of the pathways are (i) immune function and its regulation (Interferon signaling, Antigen Presentation, Natural Killer cell signaling); (ii) growth signaling (JAK/STAT, Glucocorticoid receptor); (iii) apoptosis (Retinoic acid); (iv) Heme biosynthesis II (ALAS1, CPOX, FECH); and (v) Glutathione redox signaling.

To obtain further mechanistic insight on CNA generation/suppression in CRC, we questioned whether amplification/insertion CNA and deletion CNA are differentially affected by different sets of genes. In lung adenocarcinoma, amplification/insertion CNA was facilitated by 161 genes whose main functions are involved in the DNA replication and repair pathways, suggesting that amplification/insertion CNA is predominantly driven by MIN or CIN caused by DNA replication stress¹⁸. In contrast, deletion CNA was associated with 187 genes that were enriched with known mitotic regulators, suggesting a link between mitotic errors and deletion CNA in lung adenocarcinoma. In CRCs, we identified 28 genes associated with amplification/insertion CNA increases (Amp/ins CNA facilitators; Supplementary Table 3), and 20 genes associated with deletion CNA increases (Deletion CNA facilitators; Supplementary Table 4). The number of identified genes is several-fold smaller than in the lung, and the genes were not significantly concentrated in particular pathways, nor were the same genes identified in lung adenocarcinoma, indicating organ specificity in the profile. Yet, there are limited similarities; a few of the genes in Supplementary Tables 3 and 4 are indeed involved in DNA metabolism and/or mismatch repair. For example, ASTE1/HT001 encodes a nuclease associated with MIN^35,36,37. Recently, ASTE1 was identified as a downstream effector of the shieldin complex and a structure-specific DNA endonuclease that specifically cleaves single-stranded DNA and 3′ overhang DNA³⁸. DNASE1 encodes Deoxyribonuclease 1, which may be involved in clearance of cell-free DNA that serves as a circulating tumor marker, as well as playing a role in SLE pathogenesis³⁹. Genes involved in RNA metabolism are also noted. DDX27 encodes a putative RNA helicase. PRPF6 encodes pre-mRNA processing factor 6. RPS6KA6 encodes ribosomal protein S6 kinase A6, a kinase downstream of the ERK/MAPK pathway, and is being investigated as an inhibition target for various cancers⁴⁰. SMG5 encodes SMG5 nonsense-mediated mRNA decay factor, which is thought to provide a link to the mRNA degradation machinery involving exonucleolytic pathways⁴¹. Therefore, nucleic acid metabolism emerged as a factor affecting CNA in CRC.

The CNA suppressor genes in Supplementary Table 2 were further subcategorized to amplification/insertion CNA suppressors (Supplementary Table 5) and deletion CNA suppressors (Supplementary Table 6). Supplementary Table 5 includes only 23 genes, and Supplementary Table 6 includes 253 genes, suggesting that CRC cells with amplification/insertion CNA and deletion CNA may be suppressed through different modalities, which agrees with results from lung adenocarcinoma. Pathway analysis indicated that (a) amplification/insertion CNA suppressor genes show enrichment in Maturity Onset Diabetes of Young (MODY) Signaling (FABP2, GAPDH), NADH Repair (GAPDH), and Heme Biosynthesis from Uroporphyrinogen-III I (FECH) pathways; and (b) deletion CNA suppressor genes show enrichment in Antigen Presentation Pathway (Fig. 2A), Interferon Signaling (Fig. 2B), Heme Biosynthesis II, Natural Killer Cell Signaling, Retinoic acid Mediated Apoptosis Signaling, JAK/Stat Signaling (Fig. 2C), Glucocorticoid Receptor Signaling, Heme Biosynthesis from Uroporphyrinogen-III I, and Glutathione Redox Reactions II pathways. The enrichment profiles suggest that cells with amplification/insertion CNA are suppressed with metabolic modulations, while cells with deletion CNA are targeted by immune cells and/or by growth and cell death-related signaling, also affected by redox signaling.

The notable differences in pathway profiling results between lung adenocarcinoma and CRC led us to hypothesize that the total number of CNA is different between lung adenocarcinoma and CRC; one of the cancer types would show higher CNA. We compared total CNA numbers by cancer stages (Fig. 3A). In both cancers, cancer CNA increases over stages. In all types of CNA, in all stages, lung adenocarcinoma showed higher CNA than did CRC. The differences were significant in stages 1, 2, and 3 (corrected P < 0.05). Only in stage 4, due to an increase of CNA in CRC, did the gap in CNA numbers shrink to a non-significant level (Bonferroni corrected p-value = 0.13). The results were the same for amplification/insertion CNA (Fig. 3B) and for deletion CNA (Fig. 3C); CNA were consistently higher in lung adenocarcinoma than in CRC, regardless of the type. Based on the gene profile differences and CNA numbers between lung adenocarcinoma and CRC, we suspect that (a) major CNA generation mechanisms vary among cancers; (b) a transcriptome-driven mechanism is dominant in lung adenocarcinoma, while a mutation-driven mechanism is prominent in CRC; and (c) a transcriptome-driven mechanism of CNA generation is more aggressive than a mutation-driven mechanism.
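
The Bonferroni correction referred to in the stage-wise comparison can be sketched in Python as below. The raw p-values and the number of comparisons in this example are hypothetical; the text reports only corrected values, so this block illustrates the adjustment itself rather than reproducing the analysis.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Hypothetical raw p-values for four stage-wise comparisons.
raw_by_stage = {"stage 1": 0.001, "stage 2": 0.004, "stage 3": 0.010, "stage 4": 0.300}
adjusted = dict(zip(raw_by_stage, bonferroni(list(raw_by_stage.values()))))
for stage, p in adjusted.items():
    print(stage, p, "significant" if p < 0.05 else "not significant")
```

Because each p-value is multiplied by the number of comparisons, a raw value that looks borderline on its own can become clearly non-significant once several stages are tested together.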

The genes whose expression levels are associated with CNA are all potential targets for modulating genomic instability, which would affect therapy outcome. However, even if modulation of gene expression can curtail genomic instability, the approach would be futile if the modulation does not affect patients’ survival. With this reasoning, we applied a secondary screening, searching for genes whose expression levels are also significantly associated with patients’ survival rate (P < 0.05). The secondary screening to identify genes whose expression levels were associated with both CNA and survival rate (i.e., “survival-critical”) yielded 11 genes from the 247 CNA facilitators in Supplementary Table 1, and 16 genes from the 253 CNA suppressors in Supplementary Table 2 (Table 1, Table 2). As indicated in Table 1, all 27 selected “survival-critical” genes showed significant differences in average CNA/CNV between high expressors and low expressors.

Table 1 Data for Gene Expression and Copy Number Alteration (GE-CNA) on initially-identified 27 “survival critical” genes.

Table 2 List of 18 (27) survival critical genes.

The 11 CNA facilitator-survival critical genes were CAPS, CCDC115, ATP6AP1, NBEAP1, SPANXC, TIGD6, C7ORF13, TMEM184A, F8A1, LZTS3, and OLMALINC. Notably, three of these (CAPS/calcyphosin, CCDC115/coiled-coil domain containing 115, ATP6AP1/ATPase H+ transporting accessory protein 1) are involved in ion transport and/or vacuolar ATPase (V-ATPase), and two (TMEM184A/Transmembrane protein 184A, F8A1/Coagulation Factor VIII Associated 1) are involved in vesicle transport. Together, these genes suggest a novel survival-critical role of Golgi trafficking in CRC and in CNA management. Two (SPANXC/SPANX family member C, and C7ORF13 [LINC01006]/long intergenic non-protein coding RNA 1006) are normally expressed in a testis-specific manner, and their expression in gastric cancers is associated with EMT, migration, and metastasis^41,42,43. TIGD6 (Tigger Transposable Element derived 6) is a DNA-mediated transposon with similarity to the centromere component Cenp B. Based on the Cenp B homology, TIGD6 expression was suspected to interfere with mitotic fidelity and structural integrity of the genome. However, no strong centromere binding of TIGD6-EGFP fusion protein was observed, although binding on the chromosome arms and a low level of binding at centromeres were seen⁴⁴. Thus, how TIGD6 affects genomic stability currently remains unclear.

The 16 CNA suppressor-survival critical genes were WARS, FOXD4L1, VWA5B2, DDB2, EPOR, ROBO3, PKIB, TMED6, APOBEC3D, B3GNT4, CLCN3, FOXD4, ZNF683, EP400P1, KLHDC7B, and MT1G. Among these, involvement of EPOR (Erythropoietin receptor; involved in JAK2-MAPK/PI3K/STAT signaling), DDB2 (Damage specific DNA binding protein 2; involved in UV damage repair and Xeroderma), ROBO3 (Roundabout guidance receptor 3; involved in migration or neurite outgrowth), and MT1G (Metallothionein 1G; involved in protection against oxidative stress and metals) in various cancers is well-documented with hundreds of publications. Three are transcription factors (FOXD4L1; Forkhead Box D4 Like 1, FOXD4; Forkhead Box D4, ZNF683; Zinc Finger Protein 683). Three are transmembrane proteins involved in trafficking (TMED6; Transmembrane p24 trafficking protein 6, B3GNT4; UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 4, CLCN3; Chloride voltage-gated channel 3). Three are immunomodulators (ZNF683, WARS; Tryptophanyl-tRNA synthetase 1, APOBEC3D; Apolipoprotein B mRNA editing enzyme catalytic subunit 3D). The APOBEC family of enzymes are single-stranded DNA (ssDNA) cytosine-to-uracil (C-to-U) deaminases and are involved in HIV-1 restriction and in mutational generation in cancer. As such, APOBEC enzymes have been proposed as targets for virus and cancer therapy via hypomutation, and small molecule inhibitors are under development⁴⁵. Four are involved in growth regulation (EPOR, PKIB, KLHDC7B, MT1G).

Next, we used tumor data to analyze expression alteration (“altered” vs. “not altered”; definition in Methods section) and hazard ratio (HR), and tested whether expression alteration correlates with survival (see Methods for the estimate of HR magnitude³⁴; generally, a medium-to-large HR is > 1.3). The correlations were categorized as (a) lower altered expression with improved survival, (b) higher altered expression with improved survival, (c) lower altered expression with decreased survival, and (d) higher altered expression with decreased survival (Fig. 4). From the standpoint of drug development, developing inhibitor(s) for genes in category (a) or (d) would be most feasible, while developing enhancer(s) of a gene or its function to target categories (b) or (c) remains difficult. For category (a), decreased TIGD6 or TMED6 expression was associated with improved survival (HR 1.16204E-07 [TMED6], 0.455 [TIGD6]) (Table 1; Fig. 4A). For category (b), higher altered expression of DDB2 (HR 2.86E-06), WARS (HR 0.788), or KLHDC7B (HR 0.881) was associated with improved survival (Fig. 4B). As DDB2, WARS, and KLHDC7B are assessed functionally as CNA suppressors, increased expression may be antagonizing high genomic instability. For category (c), decreased MT1G (HR 2.478), CLCN3 (HR 3.564), or CAPS (HR 1.908) expression was associated with poorer survival (Fig. 4C). For category (d), with APOBEC3D (HR 4.55), EP400NL (HR 3.792), B3GNT4 (HR 2.354), ZNF683 (HR 1.957), FOXD4 (HR 1.788), FOXD4L1 (HR 1.426), or PKIB (HR 1.468), higher altered expression was associated with decreased survival (Fig. 4D). On the other hand, ROBO3 is a gene whose overexpression was consistently observed in CRC, and its possible involvement in EMT and malignant progression has been reported^46,47. Yet, overexpression of ROBO3 showed only small effects on survival in CRCs (HR 1.058). This finding suggests that the amount of ROBO3 expression alone may not be a strong indicator of benefit or disadvantage for survival in CRCs (Fig. 4E). Overall, this analysis identified nine potential target genes (medium-large HR [> 1.3]; TIGD6, TMED6, APOBEC3D, EP400NL, B3GNT4, ZNF683, FOXD4, FOXD4L1, PKIB) for inhibitor development, and four genes (DDB2, MT1G, CLCN3, CAPS) for enhancer development.

Discussion

At the onset of this project, we anticipated that a similar profile between lung and colon would emerge and that a set of genomic instability genes common among cancers would be identified. This expectation was based on (a) pan-cancer analysis of oncogenes that indicated recurring sets of oncogenic pathways common among various cancers (e.g., KRAS, TP53), and (b) extrapolation from previous pan-cancer analysis of CNA-associated pathways⁷. However, the results were surprising: (a) there was less involvement of over-expression of mitotic genes in generating genomic instability in the colon, and (b) the CNA-suppressing pathways present, including immune-surveillance, were only partly similar to those in the lung. The results suggest that generation and suppression mechanisms of tumor genomic instability depend on the organ, and that therapeutic modalities targeting genomic instability must be tailored for the target organ.

Although CNA suppression pathways were only partly similar, common to lung and colon were the Antigen Presentation, Interferon Signaling, and Natural Killer Cell Signaling pathways, suggesting the presence of both common/non-organ specific and organ-specific immune components for genomic instability surveillance. This observation may extend to a basis for developing highly organ-specific cancer immuno-prevention or therapies.

This study identified RNA metabolism regulators (e.g., DDX27, PRPF6, SMG5) as influencers of genomic instability in CRC. A mechanistic link between RNA regulators and genomic instability had not been fully explained. Recently, in pancreatic cancer, mRNA regulators/RNA-binding splicing factors were identified as methylation targets of PRMT1 (Protein Arginine Methyl Transferase 1). Inhibition of the methylation via specific inhibitor affects splicing site selection and functional protein expression of the downstream targets. Many of the downstream target proteins, including Cyclin D, were cell cycle and proliferation regulators. Thus, PRMT1 inhibition indirectly caused growth-static effects and genomic instability⁴⁸. We speculate that transcriptomic disturbance of RNA metabolism genes may affect genomic stability in CRC in a similar, indirect mechanism.

Supporting the validity of this GE-CNA approach, many of the identified pathways have also been identified in cancer (chemo) prevention and therapy studies, including apoptosis, Redox signaling, JAK-STAT signaling, and inflammation pathways. The Heme biosynthesis pathway, however, is under-investigated in cancer. As it is newly identified with this unbiased approach, further study is warranted. Regarding MODY signaling, the potential link between diabetes and cancer has been a subject of interest. Meta-analysis indicated that type 2 diabetes (T2D) was associated with incidence of several cancers, especially prostate and liver cancer, and with mortality from pancreatic cancer. In bias analyses, the proportion of studies with a true effect size larger than a RR of 1.1 (i.e., 10% increased risk in individuals with T2D) was nearly 100% for liver, pancreatic, and endometrial cancer; 86% for gallbladder cancer; 67% for kidney cancer; 64% for colon cancer; and 62% for colorectal cancer⁴⁹, indicating a modest level of positive association between CRC and diabetes. However, microsatellite instability was reported to be inversely associated with T2D in CRC⁵⁰. The inverse association between diabetes and MIN-CRC is consistent with our discovery of MODY signaling as a suppressor of amplification/insertion CNA, a MIN trait.

Other genes/pathways of interest include APOBEC3D (HR 4.6), due to the strong HR, and B3GNT4 (HR 2.4), due to its relation to mucin function. APOBEC3D encodes a double-domain deaminase and is a member of the APOBEC3 family genes⁵¹. APOBEC3 proteins form the Apolipoprotein B Editing Complex and mediate intrinsic responses to infection by retroviruses (e.g., HIV⁵²), but can also act as a strong mutagenic factor⁵³. In breast cancer, expression of APOBEC3B is increased and associated with mutation load and poor outcome, while high APOBEC3C-H expression was linked to favorable prognostic benefit for both cancer progression and mortality⁵⁴. A recent study showed a causal relationship between APOBEC3B induction and DNA replication stress and CIN in early breast and lung cancer evolution⁵⁵. Our results with APOBEC3D likely indicate a parallel with APOBEC3B in breast cancer, namely a mutagenic activity of APOBEC3D in CRCs, and suggest a survival benefit from a specific inhibitor of APOBEC3D.

B3GNT4 is a member of the B3GNT family of transmembrane Golgi enzymes, which catalyze the transfer of N-acetyl glucosamine from UDP-GlcNAc onto Gal beta 3 (GlcNAc beta 6) GalNAc-mucin. The enzymes function in the elongation and branching of O-linked oligosaccharide chains of mucin glycoproteins, and thus in the complete functional maturation of mucins. Mucins play pivotal mucosal barrier functions in the intestine, and their dysfunction is associated with colitis and CRC^56,57. However, only limited reports portray the importance of mucin maturation enzymes or their value in cancer drug development⁵⁸. B3GNT3 was reported as a novel marker correlated with metastasis and poor clinical outcome in cervical cancer⁵⁹, but to our knowledge this is the first report of potential clinical significance for B3GNT4 in cancers.

Overall, the present study identified genomic instability genes via transcriptomic alterations in CRC, providing an unbiased portrait of genes that may or may not have been identified through previous hypothesis-driven studies. Indeed, this study identified CIN and MIN genes as predicted, as well as a number of genes whose mechanism of generating genomic instability is yet to be investigated. The new results from CRC allow us to compare the profile with that of lung adenocarcinoma. The comparison indicated organ specificity in genes influencing tumor genomic instability and suggests the value of a tailored approach for targeting genomic instability. We identified nine genes whose inhibition may lead to better survival (HR > 1.3; TIGD6, TMED6, APOBEC3D, EP400NL, B3GNT4, ZNF683, FOXD4, FOXD4L1, PKIB) and four genes for which an enhancer may benefit CRC patients’ survival (DDB2, MT1G, CLCN3, CAPS) via genomic instability modulation. These 13 genes with potential clinical relevance carry diverse functions, thus implicating multiple pathways leading to genomic instability rather than a single central network affecting genomic instability. With promising target genes identified, further drug development is warranted.

References

McGranahan, N. & Swanton, C. Clonal heterogeneity and tumor evolution: Past, present, and the future. Cell 168, 613–628 (2017).
Article CAS PubMed Google Scholar
Bakhoum, S. F. & Cantley, L. C. The multifaceted role of chromosomal instability in cancer and its microenvironment. Cell 174, 1347–1360 (2018).
Article CAS PubMed PubMed Central Google Scholar
Turajlic, S., Sottoriva, A., Graham, T. & Swanton, C. Resolving genetic heterogeneity in cancer. Nat. Rev. Genet. 20, 404–416 (2019).
Article CAS PubMed Google Scholar
Santaguida, S. et al. Chromosome mis-segregation generates cell-cycle-arrested cells with complex karyotypes that are eliminated by the immune system. Dev. Cell 41, 638–651.e5 (2017).
Zeggini, E., Gloyn, A. L., Barton, A. C. & Wain, L. V. Translational genomics and precision medicine: Moving from the lab to the clinic. Science 365, 1409–1413 (2019).
Ren, Z., Wang, Z., Gu, D., Ma, H., Zhu, Y., Cai, M., et al. Genome instability and long noncoding RNA reveal biomarkers for immunotherapy and prognosis and novel competing endogenous RNA mechanism in colon adenocarcinoma. Front. Cell Dev. Biol. 9, 740455. https://doi.org/10.3389/fcell.2021.740455 (2021).
Davoli, T., Uno, H., Wooten, E. C. & Elledge, S. J. Tumor aneuploidy correlates with markers of immune evasion and with reduced response to immunotherapy. Science 355, eaaf8399 (2017).
López-Soto, A., Gonzalez, S., López-Larrea, C. & Kroemer, G. Immunosurveillance of malignant cells with complex karyotypes. Trends Cell Biol. 27, 880–884 (2017).
Senovilla, L. et al. An anticancer therapy-elicited immunosurveillance system that eliminates tetraploid cells. Oncoimmunology 2, e22409 (2013).
Shoshani, O. et al. Transient genomic instability drives tumorigenesis through accelerated clonal evolution. Genes Dev. 35, 1093–1108 (2021).
Silk, A. D. et al. Chromosome missegregation rate predicts whether aneuploidy will promote or suppress tumors. Proc. Natl. Acad. Sci. 110, E4134–E4141 (2013).
Dai, W. et al. Slippage of mitotic arrest and enhanced tumor development in mice with BubR1 haploinsufficiency. Cancer Res. 64, 440–445 (2004).
Schvartzman, J.-M., Sotillo, R. & Benezra, R. Mitotic chromosomal instability and cancer: Mouse modelling of the human disease. Nat. Rev. Cancer 10, 102–115 (2010).
Simon, J. E., Bakker, B. & Foijer, F. CINcere modelling: What have mouse models for chromosome instability taught us? Recent Results Cancer Res. 200, 39–60. https://doi.org/10.1007/978-3-319-20291-4_2 (2015).
Yamada, H. et al. Systemic chromosome instability in Shugoshin-1 mice resulted in compromised glutathione pathway, activation of Wnt signaling and defects in immune system in the lung. Oncogenesis 5, e256 (2016).
Yamada, H. Y. et al. Haploinsufficiency of SGO1 results in deregulated centrosome dynamics, enhanced chromosomal instability and colon tumorigenesis. Cell Cycle 11, 479–488 (2012).
Yamada, H. Y. et al. Tumor-promoting/progressing role of additional chromosome instability in hepatic carcinogenesis in Sgo1 (Shugoshin 1) haploinsufficient mice. Carcinogenesis 36, 429–440 (2015).
Rao, C. V. et al. Survival-critical genes associated with copy number alterations in lung adenocarcinoma. Cancers 13, 2586 (2021).
Siegel, R. L., Miller, K. D., Fuchs, H. E. & Jemal, A. Cancer statistics, 2022. CA: A Cancer J. Clin. 72(1), 7–33. https://doi.org/10.3322/caac.21708 (2022).
Fearon, E. R. & Vogelstein, B. A genetic model for colorectal tumorigenesis. Cell 61, 759–767 (1990).
Rao, C. V. & Yamada, H. Y. Genomic instability and colon carcinogenesis: From the perspective of genes. Front. Oncol. 3, 130 (2013).
Carethers, J. M. & Jung, B. H. Genetics and genetic biomarkers in sporadic colorectal cancer. Gastroenterology 149, 1177–1190.e3 (2015).
Fiorentini, C. et al. Gut microbiota and colon cancer: A role for bacterial protein toxins? Int. J. Mol. Sci. 21, 6201 (2020).
Grady, W. M. & Carethers, J. M. Genomic and epigenetic instability in colorectal cancer pathogenesis. Gastroenterology 135, 1079–1099 (2008).
Kerachian, M. A. & Kerachian, M. Long interspersed nucleotide element-1 (LINE-1) methylation in colorectal cancer. Clin. Chim. Acta 488, 209–214 (2019).
Wang, X., Yang, Y. & Huycke, M. M. Microbiome-driven carcinogenesis in colorectal cancer: Models and mechanisms. Free Radical Biol. Med. 105, 3–15 (2017).
Gao, J. et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signaling 6, pl1 (2013).
Cerami, E. et al. The cBio cancer genomics portal: An open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2(5), 401–404. https://doi.org/10.1158/2159-8290.CD-12-0095 (2012).
Li, B., Ruotti, V., Stewart, R. M., Thomson, J. A. & Dewey, C. N. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26, 493–500 (2010).
Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, 1–14 (2011).
Storey, J. D., Bass, A. J., Dabney, A. & Robinson, D. qvalue: Q-value estimation for false discovery rate control. R package version 2.10.0. http://github.com/jdstorey/qvalue (2015).
Krämer, A., Green, J., Pollard, J. Jr. & Tugendreich, S. Causal analysis approaches in ingenuity pathway analysis. Bioinformatics 30, 523–530 (2014).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Stat. Soc.: Ser. B (Methodol.) 57, 289–300 (1995).
Azuero, A. A note on the magnitude of hazard ratios. Cancer 122, 1298–1299 (2016).
Tougeron, D. et al. Tumor-infiltrating lymphocytes in colorectal cancers with microsatellite instability are correlated with the number and spectrum of frameshift mutations. Mod. Pathol. 22, 1186–1195 (2009).
Staffa, L. et al. Mismatch repair-deficient crypt foci in Lynch syndrome–molecular alterations and association with clinical parameters. PLoS ONE 10, e0121980 (2015).
Zhao, F. et al. ASTE1 promotes shieldin-complex-mediated DNA repair by attenuating end resection. Nat. Cell Biol. 23, 894–904 (2021).
Han, D. S. & Lo, Y. D. The nexus of cfDNA and nuclease biology. Trends Genet. 37, 758–770 (2021).
Fang, Y.-Y. et al. Clinicopathological significance of ribosomal protein S6 kinase A6 in lung squamous cell carcinoma: An immunohistochemical and RNA-seq study. Int. J. Clin. Exp. Pathol. 11, 1318 (2018).
Boehm, V. et al. SMG5-SMG7 authorize nonsense-mediated mRNA decay by enabling SMG6 endonucleolytic activity. Nat. Commun. 12, 1–19 (2021).
Zhang, Y., Liu, H., Zhang, Q. & Zhang, Z. Long noncoding RNA LINC01006 facilitates cell proliferation, migration, and epithelial-mesenchymal transition in lung adenocarcinoma via targeting the MicroRNA 129-2-3p/CTNNB1 axis and activating Wnt/β-Catenin signaling pathway. Mol. Cell. Biol. 41, e00380-20 (2021).
Yang, P., Huo, Z., Liao, H. & Zhou, Q. Cancer/testis antigens trigger epithelial-mesenchymal transition and genesis of cancer stem-like cells. Curr. Pharm. Des. 21, 1292–1300 (2015).
Song, Y., Wang, S. & Cheng, X. LINC01006 regulates the proliferation, migration and invasion of hepatocellular carcinoma cells through regulating miR-433-3p/CBX3 axis. Ann. Hepatol. 25, 100343 (2021).
Marshall, O. J. & Choo, K. Putative CENP-B paralogues are not present at mammalian centromeres. Chromosoma 121, 169–179 (2012).
Olson, M. E., Harris, R. S. & Harki, D. A. APOBEC enzymes as targets for virus and cancer therapy. Cell Chem. Biol. 25, 36–49 (2018).
Han, S. et al. ROBO3 promotes growth and metastasis of pancreatic carcinoma. Cancer Lett. 366, 61–70 (2015).
Jiang, Z. et al. Targeting the SLIT/ROBO pathway in tumor progression: Molecular mechanisms and therapeutic perspectives. Ther. Adv. Med. Oncol. 11, 1758835919855238 (2019).
Giuliani, V. et al. PRMT1-dependent regulation of RNA metabolism and DNA damage response sustains pancreatic ductal adenocarcinoma. Nat. Commun. 12, 1–19 (2021).
Ling, S. et al. Association of type 2 diabetes with cancer: A meta-analysis with bias analysis for unmeasured confounding in 151 cohorts comprising 32 million people. Diabetes Care 43, 2313–2322 (2020).
Nakayama, Y. et al. Microsatellite instability is inversely associated with type 2 diabetes mellitus in colorectal cancer. PLoS ONE 14, e0215513 (2019).
Ikeda, T., Yue, Y., Shimizu, R. & Nasser, H. Potential utilization of APOBEC3-mediated mutagenesis for an HIV-1 functional cure. Front. Microbiol. 12, 1417 (2021).
Anderson, J. L. & Hope, T. J. APOBEC3G restricts early HIV-1 replication in the cytoplasm of target cells. Virology 375, 1–12 (2008).
Swanton, C., McGranahan, N., Starrett, G. J. & Harris, R. S. APOBEC enzymes: Mutagenic fuel for cancer evolution and heterogeneity. Cancer Discov. 5, 704–712 (2015).
Asaoka, M., Patnaik, S. K., Ishikawa, T. & Takabe, K. Different members of the APOBEC3 family of DNA mutators have opposing associations with the landscape of breast cancer. Am. J. Cancer Res. 11, 5111 (2021).
Venkatesan, S. et al. Induction of APOBEC3 exacerbates DNA replication stress and chromosomal instability in early breast and lung cancer evolution. Cancer Discov. 11, 2456–2473 (2021).
Grondin, J. A., Kwon, Y. H., Far, P. M., Haq, S. & Khan, W. I. Mucins in intestinal mucosal defense and inflammation: Learning from clinical and experimental studies. Front. Immunol. 11, 2054 (2020).
Pelaseyed, T. et al. The mucus and mucins of the goblet cells and enterocytes provide the first defense line of the gastrointestinal tract and interact with the immune system. Immunol. Rev. 260, 8–20 (2014).
Cullen, P. J. Post-translational regulation of signaling mucins. Curr. Opin. Struct. Biol. 21, 590–596 (2011).
Zhang, W., Hou, T., Niu, C., Song, L. & Zhang, Y. B3GNT3 expression is a novel marker correlated with pelvic lymph node metastasis and poor clinical outcome in early-stage cervical cancer. PLoS ONE 10, e0144360 (2015).

Funding

This work was supported by the Kerley-Cade chair fund (OUHSC) to CVR and the research support fund (Stephenson Cancer Center) and the bridge grant (Presbyterian Health Foundation of Oklahoma City) to HYY.

Author information

Chinthalapally V. Rao
Present address: 975 NE 10th St., BRC1203, Oklahoma City, OK, 73104, USA
Chao Xu
Present address: 801 Northeast 13th Street, Room 321, P.O. Box 26901, Oklahoma City, OK, 73190, USA
Yuting Zhang
Present address: 975 NE 10th St., BRC1209, Oklahoma City, OK, 73104, USA
Adam S. Asch
Present address: 800 NE 10th St., 6th Floor, Oklahoma City, OK, 73104, USA
Hiroshi Y. Yamada
Present address: 975 NE 10th St., BRC1207, Oklahoma City, OK, 73104, USA

Authors and Affiliations

Department of Medicine, Hematology/Oncology Section, Center for Cancer Prevention and Drug Development, University of Oklahoma Health Sciences Center (OUHSC), Oklahoma City, OK, USA
Chinthalapally V. Rao, Yuting Zhang & Hiroshi Y. Yamada
Hudson College of Public Health, University of Oklahoma Health Sciences Center (OUHSC), Oklahoma City, OK, USA
Chao Xu
Stephenson Cancer Center, University of Oklahoma Health Sciences Center (OUHSC), Oklahoma City, OK, USA
Chinthalapally V. Rao, Adam S. Asch & Hiroshi Y. Yamada

Contributions

C.V.R.: acquisition, analysis, and interpretation of data; substantive revision of the draft. C.X.: conception or design of the work; acquisition, analysis, and interpretation of data; creation of new software used in the work; drafting of the work. Y.Z.: acquisition and analysis of data. A.S.A.: acquisition and analysis of data. H.Y.Y.: conception or design of the work; acquisition, analysis, and interpretation of data; creation of new software used in the work; drafting and substantive revision of the work.

Corresponding authors

Correspondence to Chinthalapally V. Rao or Hiroshi Y. Yamada.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information 1.

Supplementary Information 2.

Supplementary Information 3.

Supplementary Information 4.

Supplementary Information 5.

Supplementary Information 6.

Supplementary Information 7.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Rao, C.V., Xu, C., Zhang, Y. et al. Genomic instability genes in lung and colon adenocarcinoma indicate organ specificity of transcriptomic impact on Copy Number Alterations. Sci Rep 12, 11739 (2022). https://doi.org/10.1038/s41598-022-15692-8

Received: 21 March 2022
Accepted: 28 June 2022
Published: 11 July 2022
DOI: https://doi.org/10.1038/s41598-022-15692-8
