Abstract
Genomic instability (GI) in cancer facilitates cancer evolution and is an exploitable target for therapy. However, specific genes involved in cancer GI remain elusive. Causal genes for GI via expression changes have not been comprehensively identified in colorectal cancers (CRCs). To fill this gap in knowledge, we developed a data mining strategy (Gene Expression to Copy Number Alterations; “GE-CNA”). Here we applied the GE-CNA approach to 592 TCGA CRC datasets and identified 500 genes whose expression levels associate with CNA. Among these, 18 were survival-critical (i.e., expression levels correlate with significant differences in patients’ survival). Comparison with previous results indicated striking differences between lung adenocarcinoma and CRC: (a) less involvement of overexpression of mitotic genes in generating genomic instability in the colon and (b) CNA-suppressing pathways, including immune surveillance, that were only partly similar to those in the lung. The following 13 genes (TIGD6, TMED6, APOBEC3D, EP400NL, B3GNT4, ZNF683, FOXD4, FOXD4L1, PKIB, DDB2, MT1G, CLCN3, CAPS) were evaluated as potential drug development targets (hazard ratio [> 1.3 or < 0.5]). Identification of specific CRC genomic instability genes enables researchers to develop GI-targeting approaches. The new results suggest that the “targeting genomic instability and/or aneuploidy” approach must be tailored for specific organs.
Introduction
Genomic instability in cancer affects cancer development and evolution, causing drug resistance and poor prognosis, and thus impacts therapy outcomes in the clinic1,2,3. Hence, the “targeting genomic instability and/or aneuploidy for cancer therapy” concept has been proposed4. For contemporary targeted drug development, genomics information is critical5. Although some signatures for genomic instability in select organs have been identified [e.g.,6], the genes involved in genomic instability in cancer have been elusive, preventing researchers from designing specific agents for targeted therapies. Gene expression analysis of pan-cancer datasets indicated that mitotic signature increases and immune signature decreases were characteristics of high CNA cancers7, suggesting roles for mitotic mis-regulation in generating CNA and for immune functions in antagonizing cancer cells with CNA. Although the notion of immunosurveillance of genomic instability and aneuploidy has long been proposed, few of the genes involved have been identified and the molecular mechanisms remain to be determined8,9.
Results with transgenic mouse models from our and other laboratories have indicated dual effects of genomic instability in the body on cancer, for both tumor suppression and oncogenesis10,11. Mitosis-targeting genomic instability models (Chromosome instability [CIN] models; e.g., Mad2, BubR1, Sgo1) have demonstrated the role of genomic instability as a disease modifier, resulting in tumor proneness in organs including the colon, lung, and liver later in life12,13,14,15,16,17. Although genomic instability is prevalent in most solid tumors, based on the tumor profile in genomic instability transgenic mice, we hypothesized that genomic instability has prominent effects on cancer development and/or disease modification in the colon, liver, and lung18. To identify specific genes involved in genomic instability in human lung adenocarcinoma, we developed a novel data mining strategy, GE-CNA, which is an approach to identify all genes whose expression associates with increased or decreased tumor CNA18. Pathway analysis revealed that (a) amplification/insertion CNA is facilitated by overexpression of DNA replication stressors and suppressed by a broad range of immune cells (T-, B-, NK-cells, leukocytes), and (b) deletion CNA is facilitated by overexpression of mitotic regulator genes and suppressed predominantly by leukocytes guided by leukocyte extravasation signaling. Among the 39 CNA- and survival-associated genes, purine metabolism (PPAT, PAICS), immune-regulating CD4-LCK-MEF2C and CCL14-CCR1 axes, and ALOX5 emerged as survival-critical pathways. These pathways/genes are potential therapy drug targets for lung adenocarcinoma18.
Building on the lung cancer results, we continued the GE-CNA analysis with cancers of the liver and colon, anticipating that a similar gene profile, and thus common genes for targeting genomic instability, would emerge. Because naturally occurring polyploidization in the liver complicates the CNA datasets and analysis, we focused on colon cancer. In the United States, colorectal cancer (CRC) is expected to cause about 52,580 deaths during 2022, and is the second most common cause of cancer deaths when cancer deaths for men and women are combined19. Thus, CRCs remain a major target for prevention and therapy development. In CRCs, tumor development is associated with progressive mutational accumulation, as indicated in the “Vogelgram”20. Functional analysis of the frequently mutated genes indicated that mutations in each of these genes (e.g., APC, TP53, FBXW7/hCDC4, PI3K-PTEN, K-RAS) can cause genomic instability, directly or indirectly21. Thus, a part of genomic instability in CRCs is linked to mutations in key oncogenic/tumor-suppressing genes. In addition, epigenetic modulations, environmental challenges from microbiota, and transcriptomic and microRNA changes, which are also suggested to affect genomic instability, were reported [e.g.,22,23,24,25,26]. Among these events impacting genomic instability, transcriptomic alterations, especially overexpression, are the most feasible to manipulate with drugs, while restoring mutated genes is technically difficult. However, transcriptomic alterations associated with genomic instability in CRCs have not been comprehensively identified, and our understanding of the impact of the transcriptomic landscape on genomic instability in CRCs remains incomplete. Hence, we set out to apply the GE-CNA data mining approach to identify genes and pathways involved in genomic instability in CRCs via transcriptomic mis-regulations.
Materials and methods
GE-CNA analysis
We downloaded the Colorectal Adenocarcinoma (TCGA, PanCancer Atlas, 2018) datasets from cBioportal (https://www.cbioportal.org/study/summary?id=coadread_tcga_pan_can_atlas_2018)27,28, a publicly available database. All following methods were carried out in accordance with relevant guidelines and regulations. The datasets included survival and clinical data for 594 patients. Among these patients, we also collected the available gene expression profiles and copy number alterations of 592 patients, and the whole exome sequencing (WES) mutation profiles of 528 patients. The batch-normalized gene expression Z-scores by RSEM29 from Illumina HiSeq_RNASeqV2 were used. The downloaded copy-number alteration (CNA) was estimated by GISTIC 2.030. Neutral or no change CNA was indicated by 0. Gain/amplification CNA was indicated by a positive value, while a negative value indicated deletion CNA. Amplification CNAs and deletion CNAs were analyzed jointly and separately.
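The sign convention above can be illustrated with a minimal Python sketch (Python is used here for illustration only; the analyses in this study were run in R). It assumes that a subject's CNA count is the number of genes with a nonzero GISTIC value, which is an interpretation rather than a definition stated in the text. The function name and the toy subjects are hypothetical.

```python
from typing import Dict, List


def count_cna(profile: List[float]) -> Dict[str, int]:
    """Count CNA events in one subject's per-gene GISTIC values.

    Assumption: each gene with a positive value counts as one
    amplification event, each gene with a negative value as one
    deletion event, and 0 means neutral / no change.
    """
    amplification = sum(1 for value in profile if value > 0)
    deletion = sum(1 for value in profile if value < 0)
    return {
        "amplification": amplification,
        "deletion": deletion,
        "total": amplification + deletion,  # joint analysis
    }


# Hypothetical toy data: two subjects, five genes each.
cna_by_subject = {
    "S1": [0, 1, 2, -1, 0],
    "S2": [0, 0, 0, -2, -1],
}
counts = {sid: count_cna(values) for sid, values in cna_by_subject.items()}
```

Keeping the amplification and deletion counts separate, alongside the total, mirrors the joint and separate analyses described above.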
In the gene expression file, we had 20,471 genes for 592 subjects. We excluded 3073 genes that were missing in more than 1/3 of subjects. The included genes were complete in all subjects. We sorted each gene by its expression in all subjects and selected the top 10 and bottom 10 subjects. The selected subjects were assigned to a high expression group and a low expression group, accordingly. Next, we extracted the subjects’ CNA counts in the high and low expression groups from the CNA file. Student’s t-test was used to examine the difference in CNA counts in the high group vs. the low group. Multiple testing was corrected by q-value31. The significance level was 0.05.
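The per-gene screening step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the R code used in the study: the function names, the `n_extreme` parameter, and the toy values are hypothetical. Only the pooled-variance (Student's) t statistic and its degrees of freedom are computed; a p-value would then come from a t distribution with those degrees of freedom (for instance via `scipy.stats.ttest_ind` with `equal_var=True`), followed by the multiple-testing correction.

```python
from math import sqrt
from statistics import mean, variance
from typing import Dict, List, Optional, Tuple


def passes_missingness(values: List[Optional[float]],
                       max_missing_frac: float = 1 / 3) -> bool:
    """Keep a gene unless it is missing in more than 1/3 of subjects."""
    missing = sum(1 for v in values if v is None)
    return missing <= max_missing_frac * len(values)


def split_extremes(expr: Dict[str, float],
                   n_extreme: int) -> Tuple[List[str], List[str]]:
    """Return (high, low) subject IDs ranked by one gene's expression."""
    ranked = sorted(expr, key=lambda sid: expr[sid])
    return ranked[-n_extreme:], ranked[:n_extreme]


def student_t(a: List[float], b: List[float]) -> Tuple[float, int]:
    """Two-sample Student's t statistic (pooled variance) and its df."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical toy data for one gene: expression Z-scores and CNA counts.
expr = {"S1": -1.2, "S2": -0.8, "S3": 0.1, "S4": 0.9, "S5": 1.5, "S6": 2.0}
cna_counts = {"S1": 10, "S2": 14, "S3": 20, "S4": 22, "S5": 30, "S6": 34}

high, low = split_extremes(expr, n_extreme=2)
t_stat, df = student_t([cna_counts[s] for s in high],
                       [cna_counts[s] for s in low])
```

In this toy case a positive `t_stat` means the high-expression group carries more CNAs, which is the pattern used to label a gene a CNA facilitator; a negative value corresponds to a CNA suppressor.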
Further, we divided the significant genes into two groups: higher expression that resulted in more CNAs and higher expression that resulted in fewer CNAs. We employed the bioinformatics tool IPA (Ingenuity Pathway Analysis, QIAGEN, Inc., https://www.qiagenbioinformatics.com/products/ingenuity-pathway-analysis) to conduct the gene set enrichment analyses32. The Benjamini–Hochberg corrected p-value33 provided by IPA was reported and evaluated at the significance level of 0.05. Also, we presented the pathway graphs from IPA.
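For readers unfamiliar with the Benjamini–Hochberg adjustment, a minimal Python sketch of the standard step-up procedure is shown here. It illustrates the general technique only; it does not reproduce IPA's internal implementation, nor the q-value method used for the gene-level screen, and the example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw pathway p-values.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

With these toy inputs only the first pathway remains below the 0.05 level after adjustment, even though three raw p-values were below 0.05.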
The survival analysis of the gene alteration with regard to the overall survival was examined by the Cox Proportional-Hazards (CoxPH) Model. Age and tumor stage were adjusted as covariates, which were selected by their univariate CoxPH analysis p-value < 0.05. All available variables, such as age, sex, race, and tumor stage, were considered. The race groups with small numbers of patients were combined. The race variable analyzed in CoxPH model had two levels: White and Other. The sub-levels of tumor stage under each stage of stages 1 to 4 were combined, which resulted in four levels used in the analysis. We excluded patients with incomplete data. The Hazard Ratio (HR) and p-value of the gene were reported. The definitions of “altered” and “unaltered” subjects were from cBioportal. Briefly, an altered subject was a subject having any type of high-level CNA amplification, CNA homozygous deletion, or WES mutation. Otherwise, a subject was considered an unaltered subject. We compared the difference in gene expression levels in the altered and unaltered groups using the Wilcoxon rank sum test. The significance level was 0.05. We presented the survival curves and boxplots by altered/unaltered group. We implemented all statistical analyses using R (v4.0.3) and R packages.
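A minimal sketch of the adjusted model described above can be written as follows. The notation is illustrative and is not taken from the original R analysis: x_g is assumed to be the altered (1) vs. unaltered (0) indicator for a gene, and stage 1 is assumed to be the reference level for the four-level tumor stage variable.

```latex
h(t \mid x) = h_0(t)\,\exp\!\Big(\beta_g\, x_g + \beta_{\mathrm{age}}\, x_{\mathrm{age}}
  + \sum_{k=2}^{4} \beta_k\, \mathbf{1}[\mathrm{stage} = k]\Big),
\qquad
\mathrm{HR}_g = e^{\beta_g}
```

Under this form, an HR above 1 indicates a higher adjusted hazard of death in the altered group than in the unaltered group, and an HR below 1 indicates a lower one.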
The major reason to use only the extreme high and low gene expression groups is to increase the statistical power by enriching the presence and increasing the effect size of the causal genetic factors. Because 592 is not a large sample size to separate, we used all samples to maximize the study power.
To estimate the magnitude of HR, we employed the following categories: small (not trivial, but possibly inconsequential), medium (likely consequential), and large (very likely consequential), corresponding to HRs of approximately 1.3, 1.9, and 2.8, respectively, when comparing two groups34.
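These categories can be expressed as a small helper, shown here as a Python sketch. Two choices in it are assumptions rather than part of the cited scheme: the approximate values 1.3, 1.9, and 2.8 are treated as hard lower cut-offs, and a protective HR below 1 is judged by its reciprocal. The function name is hypothetical; the example HRs are values reported in the Results.

```python
def hr_magnitude(hr: float) -> str:
    """Label a hazard ratio as small, medium, or large.

    Assumptions: 1.3 / 1.9 / 2.8 are used as lower cut-offs, and an
    HR below 1 is evaluated through its reciprocal (1 / HR).
    """
    if hr <= 0:
        raise ValueError("hazard ratio must be positive")
    effect = hr if hr >= 1 else 1 / hr
    if effect >= 2.8:
        return "large"
    if effect >= 1.9:
        return "medium"
    if effect >= 1.3:
        return "small"
    return "below small"


# HRs reported in the Results for selected genes.
labels = {
    "APOBEC3D": hr_magnitude(4.55),
    "ZNF683": hr_magnitude(1.957),
    "FOXD4L1": hr_magnitude(1.426),
    "ROBO3": hr_magnitude(1.058),
    "TIGD6": hr_magnitude(0.455),
}
```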
Availability of data and materials
We obtained original tumor data from the cBioportal (https://www.cbioportal.org/study/summary?id=coadread_tcga_pan_can_atlas_2018)27,28, which is a publicly available database. The data were openly available for download. Main data generated or analyzed during this study are included in this published article and its supplementary information files. All the datasets used and/or analyzed during the current study will be available from the corresponding author on reasonable request.
Results
We applied GE-CNA to 592 CRCs in the TCGA database (Fig. 1). Supplementary Table 1 shows 247 genes whose high expression associates with high tumor CNA, and which are thus annotated as CNA facilitators. Functional annotation and pathway analysis indicated that (i) the genes are functionally diverse and (ii) there was no statistically significant enrichment (corrected P < 0.05) of a specific pathway. The lack of specific enrichment is a major difference from the previous results from lung adenocarcinoma, which showed enrichment of mitotic regulators and DNA replication pathways18.
Supplementary Table 2 shows 253 genes whose high expression associates with low tumor CNA, and which are thus annotated as CNA suppressors. The enriched pathways (corrected P < 0.05) were: Interferon Signaling (BAK1, BCL2, IFIT3, IFNG, JAK2, STAT2), Antigen Presentation Pathway (CLIP, MHC II-alpha), Heme Biosynthesis II (ALAS1, CPOX, FECH), Natural Killer Cell Signaling (HSPA5, IFNG, IL15, JAK2, KIR2DL4, MAP2K1, MTOR, NCR1, ULBP3), Retinoic acid Mediated Apoptosis Signaling (TRAIL-R, PARP), JAK/Stat Signaling (JAK2, MAP2K1, MTOR, PIAS2, SOCS6, STAT2), Glucocorticoid Receptor Signaling (HSP90, HSP70, NCOR, TFIIA, OXPHOS), Heme Biosynthesis from Uroporphyrinogen-III I (CPOX, FECH), and Glutathione Redox Reactions II (GSR, PDIA3) (Fig. 2; pathway analysis of CNA suppressors). The functions of the pathways are (i) immune function and its regulation (Interferon signaling, Antigen Presentation, Natural Killer cell signaling); (ii) growth signaling (JAK/STAT, Glucocorticoid receptor); (iii) apoptosis (Retinoic acid); (iv) heme biosynthesis; and (v) glutathione redox signaling.
To obtain further mechanistic insight on CNA generation/suppression in CRC, we questioned whether amplification/insertion CNA and deletion CNA are differentially affected by different sets of genes. In lung adenocarcinoma, amplification/insertion CNA was facilitated by 161 genes whose main functions are involved in the DNA replication and repair pathways, suggesting that amplification/insertion CNA is predominantly driven by MIN or CIN caused by DNA replication stress18. In contrast, deletion CNA was associated with 187 genes that were enriched with known mitotic regulators, suggesting a link between mitotic errors and deletion CNA in lung adenocarcinoma. In CRCs, we identified 28 genes associated with amplification/insertion CNA increases (Amp/ins CNA facilitators; Supplementary Table 3), and 20 genes associated with deletion CNA increases (Deletion CNA facilitators; Supplementary Table 4). The number of identified genes is several-fold smaller than that in the lung, and the genes were not significantly concentrated in particular pathways, nor were the same genes identified in lung adenocarcinoma, indicating organ specificity in the profile. Yet, there are limited similarities; a few of the genes in Supplementary Tables 3 and 4 are indeed involved in DNA metabolism and/or mismatch repair. For example, ASTE1/HT001 encodes a nuclease associated with MIN35,36,37. Recently, ASTE1 was identified as a downstream effector of the shieldin complex and a structure-specific DNA endonuclease that specifically cleaves single-stranded DNA and 3′ overhang DNA38. DNASE1 encodes deoxyribonuclease 1, which may be involved in clearance of cell-free DNA that serves as a circulating tumor marker, as well as playing a role in SLE pathogenesis39. Genes involved in RNA metabolism are also noted. DDX27 encodes a putative RNA helicase. PRPF6 encodes pre-mRNA processing factor 6. RPS6KA6 encodes ribosomal protein S6 kinase A6, a kinase downstream of the ERK/MAPK pathway, and is being investigated as an inhibition target for various cancers40. SMG5 encodes SMG5 nonsense-mediated mRNA decay factor, which is thought to provide a link to the mRNA degradation machinery involving exonucleolytic pathways41. Therefore, nucleic acid metabolism emerged as a factor affecting CNA in CRC.
The CNA suppressor genes in Supplementary Table 2 were further subcategorized to amplification/insertion CNA suppressors (Supplementary Table 5) and deletion CNA suppressors (Supplementary Table 6). Supplementary Table 5 includes only 23 genes, and Supplementary Table 6 includes 253 genes, suggesting that CRC cells with amplification/insertion CNA and deletion CNA may be suppressed through different modalities, which agrees with results from lung adenocarcinoma. Pathway analysis indicated that (a) amplification/insertion CNA suppressor genes show enrichment in Maturity Onset Diabetes of Young (MODY) Signaling (FABP2, GAPDH), NADH Repair (GAPDH), and Heme Biosynthesis from Uroporphyrinogen-III I (FECH) pathways; and (b) deletion CNA suppressor genes show enrichment in Antigen Presentation Pathway (Fig. 2A), Interferon Signaling (Fig. 2B), Heme Biosynthesis II, Natural Killer Cell Signaling, Retinoic acid Mediated Apoptosis Signaling, JAK/Stat Signaling (Fig. 2C), Glucocorticoid Receptor Signaling, Heme Biosynthesis from Uroporphyrinogen-III I, and Glutathione Redox Reactions II pathways. The enrichment profiles suggest that cells with amplification/insertion CNA are suppressed with metabolic modulations, while cells with deletion CNA are targeted by immune cells and/or by growth and cell death-related signaling, also affected by redox signaling.
The notable differences in pathway profiling results between lung adenocarcinoma and CRC led us to hypothesize that the total number of CNA is different between lung adenocarcinoma and CRC; one of the cancer types would show higher CNA. We compared total CNA numbers by cancer stages (Fig. 3A). In both cancers, cancer CNA increases over stages. In all types of CNA, in all stages, lung adenocarcinoma showed higher CNA than did CRC. The differences were significant in stages 1, 2, and 3 (corrected P < 0.05). Only in stage 4, due to an increase of CNA in CRC, did the gap in CNA numbers shrink to a non-significant level (Bonferroni corrected p-value = 0.13). The results were the same for amplification/insertion CNA (Fig. 3B) and for deletion CNA (Fig. 3C); CNA were consistently higher in lung adenocarcinoma than in CRC, regardless of the type. Based on the gene profile differences and CNA numbers between lung adenocarcinoma and CRC, we suspect that (a) major CNA generation mechanisms vary among cancers; (b) a transcriptome-driven mechanism is dominant in lung adenocarcinoma, while a mutation-driven mechanism is prominent in CRC; and (c) a transcriptome-driven mechanism of CNA generation is more aggressive than a mutation-driven mechanism.
The genes whose expression levels are associated with CNA are all potential targets to modulate genomic instability, which would affect therapy outcome. However, even if modulation of gene expression can curtail genomic instability, the modulation approach would be futile if it does not affect patients’ survival. With this reasoning, we applied secondary screening, searching for genes whose expression levels are also significantly associated with survival rate of patients (P < 0.05). The secondary screening to identify genes whose expression levels were associated both with CNA and survival rate (i.e., “survival-critical”) yielded 11 genes from 247 CNA facilitators in Supplementary Table 1, and 16 genes from 253 CNA suppressors in Supplementary Table 2 (Table 1, Table 2). As indicated in Table 1, all 27 selected “survival-critical” genes showed significant differences in average CNA/CNV between high expressors and low expressors.
The 11 CNA facilitator-survival critical genes were CAPS, CCDC115, ATP6AP1, NBEAP1, SPANXC, TIGD6, C7ORF13, TMEM184A, F8A1, LZTS3, and OLMALINC. Notably, three of these (CAPS/calcyphosin, CCDC115/coiled-coil domain containing 115, ATP6AP1/ATPase H+ transporting accessory protein 1) are involved in ion transport and/or vacuolar ATPase (V-ATPase), and two (TMEM184A/Transmembrane protein 184A, F8A1/Coagulation Factor VIII Associated 1) are involved in vesicle transport. Together, these genes suggest a novel survival-critical role of Golgi trafficking in CRC and in CNA management. Two (SPANXC/SPANX family member C, and C7ORF13 [LINC01006]/long intergenic non-protein coding RNA 1006) are normally expressed in a testis-specific manner, and their expressions in gastric cancers are associated with EMT, migration, and metastasis41,42,43. TIGD6 (Tigger Transposable Element derived 6) is derived from a DNA-mediated transposon and has similarity to the centromere component CENP-B. Based on the CENP-B homology, TIGD6 expression was suspected to interfere with mitotic fidelity and structural integrity of the genome. However, no strong centromere binding of TIGD6-EGFP fusion protein was observed, although binding on the chromosome arms and a low level of binding at centromeres were seen44. Thus, how TIGD6 affects genomic stability currently remains unclear.
The 16 CNA suppressor-survival critical genes were WARS, FOXD4L1, VWA5B2, DDB2, EPOR, ROBO3, PKIB, TMED6, APOBEC3D, B3GNT4, CLCN3, FOXD4, ZNF683, EP400P1, KLHDC7B, and MT1G. Among these, involvement of EPOR (Erythropoietin receptor; involved in JAK2-MAPK/PI3K/STAT signaling), DDB2 (Damage specific DNA binding protein 2; involved in UV damage repair and xeroderma pigmentosum), ROBO3 (Roundabout guidance receptor 3; involved in migration or neurite outgrowth), and MT1G (Metallothionein 1G; involved in protection against oxidative stress and metals) in various cancers is well-documented with hundreds of publications. Three are transcription factors (FOXD4L1; Forkhead Box D4 Like 1, FOXD4; Forkhead Box D4, ZNF683; Zinc Finger Protein 683). Three are transmembrane proteins involved in trafficking (TMED6; Transmembrane p24 trafficking protein 6, B3GNT4; UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 4, CLCN3; Chloride voltage-gated channel 3). Three are immunomodulators (ZNF683, WARS; Tryptophanyl-tRNA synthetase 1, APOBEC3D; Apolipoprotein B mRNA editing enzyme catalytic subunit 3D). The APOBEC family of enzymes are single-stranded DNA (ssDNA) cytosine-to-uracil (C-to-U) deaminases and are involved in HIV-1 restriction and in mutational generation in cancer. As such, APOBEC enzymes have been proposed as targets for virus and cancer therapy via hypomutation, and small molecule inhibitors are under development45. Four are involved in growth regulation (EPOR, PKIB, KLHDC7B, MT1G).
Next, we used tumor data to analyze expression alteration (“altered” vs. “not altered”; definition in Methods section) and hazard ratio (HR), and tested whether expression alteration correlates with survival (see Methods for estimate on HR magnitude34. Generally, medium-large HR is > 1.3). The correlations were categorized as (a) lower altered expression with improved survival, (b) higher altered expression with improved survival, (c) lower altered expression with decreased survival, and (d) higher altered expression with decreased survival (Fig. 4). From the standpoint of drug development, developing inhibitor(s) for genes in category (a) or (d) would be most feasible, while developing enhancer(s) of a gene or its function to target categories (b) or (c) remains difficult. For category (a), decreased TIGD6 or TMED6 expression was associated with improved survival (HR 1.16204E-07 [TMED6], 0.455 [TIGD6]) (Table 1; Fig. 4A). For category (b), higher altered expression of DDB2 (HR 2.86E-06), WARS (HR 0.788), or KLHDC7B (HR 0.881) was associated with improved survival (Fig. 4B). As DDB2, WARS, and KLHDC7B are assessed functionally as CNA suppressors, increased expression may be antagonizing high genomic instability. For category (c), decreased MT1G (HR 2.478), CLCN3 (HR 3.564), or CAPS (HR 1.908) expression was associated with poorer survival (Fig. 4C). For category (d), with APOBEC3D (HR 4.55), EP400NL (HR 3.792), B3GNT4 (HR 2.354), ZNF683 (HR 1.957), FOXD4 (HR 1.788), FOXD4L1 (HR 1.426), or PKIB (HR 1.468), higher altered expression was associated with decreased survival (Fig. 4D). On the other hand, ROBO3 is a gene whose overexpression was consistently observed in CRC, and its possible involvement in EMT and malignant progression has been reported46,47. Yet, overexpression of ROBO3 showed only small effects on survival in CRCs (HR 1.058). This finding suggests that the amount of ROBO3 expression alone may not be a strong indicator of benefit or disadvantage for survival in CRCs (Fig. 4E). Overall, this analysis identified nine potential target genes (medium-large HR [> 1.3]; TIGD6, TMED6, APOBEC3D, EP400NL, B3GNT4, ZNF683, FOXD4, FOXD4L1, PKIB) for inhibitor development, and four genes (DDB2, MT1G, CLCN3, CAPS) for enhancer development.
Discussion
At the onset of this project, we anticipated that a similar profile between lung and colon would emerge and a set of genomic instability genes common among cancers would be identified. This expectation was based on (a) pan-cancer analysis of oncogenes that indicated recurring sets of oncogenic pathways common among various cancers (e.g., KRAS, TP53), and (b) extrapolation from previous pan-cancer analysis of CNA-associated pathways7. However, the results were surprising: (a) there was less involvement of overexpression of mitotic genes in generating genomic instability in the colon, and (b) the CNA-suppressing pathways, including immune surveillance, were only partly similar to those in the lung. The results suggest that generation and suppression mechanisms of tumor genomic instability depend on the organ, and that therapeutic modalities targeting genomic instability must be tailored for the target organ.
Although CNA suppression pathways were only partly similar, common to lung and colon were the Antigen Presentation, Interferon Signaling, and Natural Killer Cell Signaling pathways, suggesting the presence of both common/non-organ specific and organ-specific immune components for genomic instability surveillance. This observation may extend to a basis for developing highly organ-specific cancer immuno-prevention or therapies.
This study identified RNA metabolism regulators (e.g., DDX27, PRPF6, SMG5) as influencers of genomic instability in CRC. A mechanistic link between RNA regulators and genomic instability had not been fully explained. Recently, in pancreatic cancer, mRNA regulators/RNA-binding splicing factors were identified as methylation targets of PRMT1 (Protein Arginine Methyl Transferase 1). Inhibition of the methylation via specific inhibitor affects splicing site selection and functional protein expression of the downstream targets. Many of the downstream target proteins, including Cyclin D, were cell cycle and proliferation regulators. Thus, PRMT1 inhibition indirectly caused growth-static effects and genomic instability48. We speculate that transcriptomic disturbance of RNA metabolism genes may affect genomic stability in CRC in a similar, indirect mechanism.
Supporting the validity of this GE-CNA approach, many of the identified pathways have also been identified in cancer (chemo) prevention and therapy studies, including apoptosis, Redox signaling, JAK-STAT signaling, and inflammation pathways. The Heme biosynthesis pathway, however, is under-investigated in cancer. As it is newly identified with this unbiased approach, further study is warranted. Regarding MODY signaling, the potential link between diabetes and cancer has been a subject of interest. Meta-analysis indicated that type 2 diabetes (T2D) was associated with incidence of several cancers, especially prostate and liver cancer, and with mortality from pancreatic cancer. In bias analyses, the proportion of studies with a true effect size larger than a RR of 1.1 (i.e., 10% increased risk in individuals with T2D) was nearly 100% for liver, pancreatic, and endometrial cancer; 86% for gallbladder cancer; 67% for kidney cancer; 64% for colon cancer; and 62% for colorectal cancer49, indicating a modest level of positive association between CRC and diabetes. However, microsatellite instability was reported to be inversely associated with T2D in CRC50. The inverse association between diabetes and MIN-CRC corroborates our discovery of MODY signaling as a suppressor of amplification/insertion CNA, a MIN trait.
Other genes/pathways of interest include APOBEC3D (HR 4.6), due to the strong HR, and B3GNT4 (HR 2.4), due to its relation to mucin function. APOBEC3D encodes a double-domain deaminase and is a member of the APOBEC3 family genes51. APOBEC3 proteins form Apolipoprotein B Editing Complex and mediate intrinsic responses to infection by retroviruses [e.g., HIV52], but also can act as a strong mutagenic factor53. In breast cancer, expression of APOBEC3B is increased and associated with mutation load and poor outcome, while high APOBEC3C-H expression was linked to favorable prognostic benefit for both cancer progression and mortality54. A recent study showed a causal relationship between APOBEC3B induction and DNA replication stress and CIN in early breast and lung cancer evolution55. Our results with APOBEC3D likely indicate a parallel with APOBEC3B in breast cancer, that is, a mutagenic activity of APOBEC3D in CRCs, and suggest a survival benefit with a specific inhibitor of APOBEC3D.
B3GNT4 is a member of the B3GNT family, which is a transmembrane Golgi enzyme that catalyzes the transfer of N-acetyl glucosamine from UDP-GlcNAc onto Gal beta 3 (GlcNAc beta 6) GalNAc-mucin. The enzymes function in the elongation and branching of O-linked oligosaccharide chains of mucin glycoproteins, thus the complete functional maturation of mucins. Mucins play pivotal mucosal barrier functions in the intestine, and their dysfunction is associated with colitis and CRC56,57. However, only limited reports portray the importance of mucin maturation enzymes or their value in cancer drug development58. B3GNT3 was reported as a novel marker correlated with metastasis and poor clinical outcome in cervical cancer59, but to our knowledge this is the first report of potential clinical significance for B3GNT4 in cancers.
Overall, the present study identified genomic instability genes via transcriptomic alterations in CRC, which is an unbiased portrait of genes that may or may not have been identified through previous hypothesis-driven studies. Indeed, this study identified CIN and MIN genes as predicted, as well as a number of genes whose mechanism of generating genomic instability is yet to be investigated. The new results from CRC allow us to compare the profile with that of lung adenocarcinoma. The comparison indicated organ specificity in genes influencing tumor genomic instability and suggests the value of a tailored approach for targeting genomic instability. We identified nine genes whose inhibition may lead to better survival (HR > 1.3; TIGD6, TMED6, APOBEC3D, EP400NL, B3GNT4, ZNF683, FOXD4, FOXD4L1, PKIB) and four genes for which an enhancer may benefit CRC patients’ survival (DDB2, MT1G, CLCN3, CAPS) via genomic instability modulation. These 13 genes with potential clinical relevance carry diverse functions, thus implicating multiple pathways leading to genomic instability rather than a single central network affecting genomic instability. With promising target genes identified, further drug development is warranted.
References
McGranahan, N. & Swanton, C. Clonal heterogeneity and tumor evolution: Past, present, and the future. Cell 168, 613–628 (2017).
Bakhoum, S. F. & Cantley, L. C. The multifaceted role of chromosomal instability in cancer and its microenvironment. Cell 174, 1347–1360 (2018).
Turajlic, S., Sottoriva, A., Graham, T. & Swanton, C. Resolving genetic heterogeneity in cancer. Nat. Rev. Genet. 20, 404–416 (2019).
Santaguida, S. et al. Chromosome mis-segregation generates cell-cycle-arrested cells with complex karyotypes that are eliminated by the immune system. Dev. Cell 41, 638–651.e5 (2017).
Zeggini, E., Gloyn, A. L., Barton, A. C. & Wain, L. V. Translational genomics and precision medicine: Moving from the lab to the clinic. Science 365, 1409–1413 (2019).
Ren, Z. et al. Genome instability and long noncoding RNA reveal biomarkers for immunotherapy and prognosis and novel competing endogenous RNA mechanism in colon adenocarcinoma. Front. Cell Dev. Biol. 9, 740455. https://doi.org/10.3389/fcell.2021.740455 (2021).
Davoli, T., Uno, H., Wooten, E. C. & Elledge, S. J. Tumor aneuploidy correlates with markers of immune evasion and with reduced response to immunotherapy. Science 355, eaaf8399 (2017).
López-Soto, A., Gonzalez, S., López-Larrea, C. & Kroemer, G. Immunosurveillance of malignant cells with complex karyotypes. Trends Cell Biol. 27, 880–884 (2017).
Senovilla, L. et al. An anticancer therapy-elicited immunosurveillance system that eliminates tetraploid cells. Oncoimmunology 2, e22409 (2013).
Shoshani, O. et al. Transient genomic instability drives tumorigenesis through accelerated clonal evolution. Genes Dev. 35, 1093–1108 (2021).
Silk, A. D. et al. Chromosome missegregation rate predicts whether aneuploidy will promote or suppress tumors. Proc. Natl. Acad. Sci. 110, E4134–E4141 (2013).
Dai, W. et al. Slippage of mitotic arrest and enhanced tumor development in mice with BubR1 haploinsufficiency. Cancer Res. 64, 440–445 (2004).
Schvartzman, J.-M., Sotillo, R. & Benezra, R. Mitotic chromosomal instability and cancer: Mouse modelling of the human disease. Nat. Rev. Cancer 10, 102–115 (2010).
Simon, J. E., Bakker, B. & Foijer, F. CINcere modelling: What have mouse models for chromosome instability taught us? Recent Results Cancer Res. 200, 39–60. https://doi.org/10.1007/978-3-319-20291-4_2 (2015).
Yamada, H. et al. Systemic chromosome instability in Shugoshin-1 mice resulted in compromised glutathione pathway, activation of Wnt signaling and defects in immune system in the lung. Oncogenesis 5, e256 (2016).
Yamada, H. Y. et al. Haploinsufficiency of SGO1 results in deregulated centrosome dynamics, enhanced chromosomal instability and colon tumorigenesis. Cell Cycle 11, 479–488 (2012).
Yamada, H. Y. et al. Tumor-promoting/progressing role of additional chromosome instability in hepatic carcinogenesis in Sgo1 (Shugoshin 1) haploinsufficient mice. Carcinogenesis 36, 429–440 (2015).
Rao, C. V. et al. Survival-critical genes associated with copy number alterations in lung adenocarcinoma. Cancers 13, 2586 (2021).
Siegel, R. L., Miller, K. D., Fuchs, H. E. & Jemal, A. Cancer statistics, 2022. CA: A Cancer J. Clin. 72, 7–33. https://doi.org/10.3322/caac.21708 (2022).
Fearon, E. R. & Vogelstein, B. A genetic model for colorectal tumorigenesis. Cell 61, 759–767 (1990).
Rao, C. V. & Yamada, H. Y. Genomic instability and colon carcinogenesis: From the perspective of genes. Front. Oncol. 3, 130 (2013).
Carethers, J. M. & Jung, B. H. Genetics and genetic biomarkers in sporadic colorectal cancer. Gastroenterology 149, 1177–1190.e3 (2015).
Fiorentini, C. et al. Gut microbiota and colon cancer: A role for bacterial protein toxins? Int. J. Mol. Sci. 21, 6201 (2020).
Grady, W. M. & Carethers, J. M. Genomic and epigenetic instability in colorectal cancer pathogenesis. Gastroenterology 135, 1079–1099 (2008).
Kerachian, M. A. & Kerachian, M. Long interspersed nucleotide element-1 (LINE-1) methylation in colorectal cancer. Clin. Chim. Acta 488, 209–214 (2019).
Wang, X., Yang, Y. & Huycke, M. M. Microbiome-driven carcinogenesis in colorectal cancer: Models and mechanisms. Free Radical Biol. Med. 105, 3–15 (2017).
Gao, J. et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal. 6, pl1 (2013).
Cerami, E. et al. The cBio cancer genomics portal: An open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2, 401–404. https://doi.org/10.1158/2159-8290.CD-12-0095 (2012).
Li, B., Ruotti, V., Stewart, R. M., Thomson, J. A. & Dewey, C. N. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26, 493–500 (2010).
Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, 1–14 (2011).
Storey, J. D., Bass, A. J., Dabney, A. & Robinson, D. qvalue: Q-value estimation for false discovery rate control. R package version 2.10.0. http://github.com/jdstorey/qvalue (2015).
Krämer, A., Green, J., Pollard, J. Jr. & Tugendreich, S. Causal analysis approaches in ingenuity pathway analysis. Bioinformatics 30, 523–530 (2014).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Stat. Soc.: Ser. B (Methodol.) 57, 289–300 (1995).
Azuero, A. A note on the magnitude of hazard ratios. Cancer 122, 1298–1299 (2016).
Tougeron, D. et al. Tumor-infiltrating lymphocytes in colorectal cancers with microsatellite instability are correlated with the number and spectrum of frameshift mutations. Mod. Pathol. 22, 1186–1195 (2009).
Staffa, L. et al. Mismatch repair-deficient crypt foci in Lynch syndrome–molecular alterations and association with clinical parameters. PLoS ONE 10, e0121980 (2015).
Zhao, F. et al. ASTE1 promotes shieldin-complex-mediated DNA repair by attenuating end resection. Nat. Cell Biol. 23, 894–904 (2021).
Han, D. S. & Lo, Y. D. The nexus of cfDNA and nuclease biology. Trends Genet. 37, 758–770 (2021).
Fang, Y.-Y. et al. Clinicopathological significance of ribosomal protein S6 kinase A6 in lung squamous cell carcinoma: An immunohistochemical and RNA-seq study. Int. J. Clin. Exp. Pathol. 11, 1318 (2018).
Boehm, V. et al. SMG5-SMG7 authorize nonsense-mediated mRNA decay by enabling SMG6 endonucleolytic activity. Nat. Commun. 12, 1–19 (2021).
Zhang, Y., Liu, H., Zhang, Q. & Zhang, Z. Long noncoding RNA LINC01006 facilitates cell proliferation, migration, and epithelial-mesenchymal transition in lung adenocarcinoma via targeting the MicroRNA 129-2-3p/CTNNB1 axis and activating Wnt/β-Catenin signaling pathway. Mol. Cell. Biol. 41, e00380-20 (2021).
Yang, P., Huo, Z., Liao, H. & Zhou, Q. Cancer/testis antigens trigger epithelial-mesenchymal transition and genesis of cancer stem-like cells. Curr. Pharm. Des. 21, 1292–1300 (2015).
Song, Y., Wang, S. & Cheng, X. LINC01006 regulates the proliferation, migration and invasion of hepatocellular carcinoma cells through regulating miR-433-3p/CBX3 axis. Ann. Hepatol. 25, 100343 (2021).
Marshall, O. J. & Choo, K. Putative CENP-B paralogues are not present at mammalian centromeres. Chromosoma 121, 169–179 (2012).
Olson, M. E., Harris, R. S. & Harki, D. A. APOBEC enzymes as targets for virus and cancer therapy. Cell Chem. Biol. 25, 36–49 (2018).
Han, S. et al. ROBO3 promotes growth and metastasis of pancreatic carcinoma. Cancer Lett. 366, 61–70 (2015).
Jiang, Z. et al. Targeting the SLIT/ROBO pathway in tumor progression: Molecular mechanisms and therapeutic perspectives. Ther. Adv. Med. Oncol. 11, 1758835919855238 (2019).
Giuliani, V. et al. PRMT1-dependent regulation of RNA metabolism and DNA damage response sustains pancreatic ductal adenocarcinoma. Nat. Commun. 12, 1–19 (2021).
Ling, S. et al. Association of type 2 diabetes with cancer: A meta-analysis with bias analysis for unmeasured confounding in 151 cohorts comprising 32 million people. Diabetes Care 43, 2313–2322 (2020).
Nakayama, Y. et al. Microsatellite instability is inversely associated with type 2 diabetes mellitus in colorectal cancer. PLoS ONE 14, e0215513 (2019).
Ikeda, T., Yue, Y., Shimizu, R. & Nasser, H. Potential utilization of APOBEC3-mediated mutagenesis for an HIV-1 functional cure. Front. Microbiol. 12, 1417 (2021).
Anderson, J. L. & Hope, T. J. APOBEC3G restricts early HIV-1 replication in the cytoplasm of target cells. Virology 375, 1–12 (2008).
Swanton, C., McGranahan, N., Starrett, G. J. & Harris, R. S. APOBEC enzymes: Mutagenic fuel for cancer evolution and heterogeneity. Cancer Discov. 5, 704–712 (2015).
Asaoka, M., Patnaik, S. K., Ishikawa, T. & Takabe, K. Different members of the APOBEC3 family of DNA mutators have opposing associations with the landscape of breast cancer. Am. J. Cancer Res. 11, 5111 (2021).
Venkatesan, S. et al. Induction of APOBEC3 exacerbates DNA replication stress and chromosomal instability in early breast and lung cancer evolution. Cancer Discov. 11, 2456–2473 (2021).
Grondin, J. A., Kwon, Y. H., Far, P. M., Haq, S. & Khan, W. I. Mucins in intestinal mucosal defense and inflammation: Learning from clinical and experimental studies. Front. Immunol. 11, 2054 (2020).
Pelaseyed, T. et al. The mucus and mucins of the goblet cells and enterocytes provide the first defense line of the gastrointestinal tract and interact with the immune system. Immunol. Rev. 260, 8–20 (2014).
Cullen, P. J. Post-translational regulation of signaling mucins. Curr. Opin. Struct. Biol. 21, 590–596 (2011).
Zhang, W., Hou, T., Niu, C., Song, L. & Zhang, Y. B3GNT3 expression is a novel marker correlated with pelvic lymph node metastasis and poor clinical outcome in early-stage cervical cancer. PLoS ONE 10, e0144360 (2015).
Funding
This work was supported by the Kerley-Cade chair fund (OUHSC) to C.V.R., and by the research support fund (Stephenson Cancer Center) and the bridge grant (Presbyterian Health Foundation of Oklahoma City) to H.Y.Y.
Author information
Contributions
C.V.R.: acquisition, analysis and interpretation of data; substantive revision of the draft. C.X.: conception or design of the work; acquisition, analysis and interpretation of data; creation of new software used in the work; drafting of the manuscript. Y.Z.: acquisition and analysis of data. A.S.A.: acquisition and analysis of data. H.Y.Y.: conception or design of the work; acquisition, analysis and interpretation of data; creation of new software used in the work; drafting and substantive revision of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Rao, C.V., Xu, C., Zhang, Y. et al. Genomic instability genes in lung and colon adenocarcinoma indicate organ specificity of transcriptomic impact on Copy Number Alterations. Sci Rep 12, 11739 (2022). https://doi.org/10.1038/s41598-022-15692-8
DOI: https://doi.org/10.1038/s41598-022-15692-8