Association of rs6983267 at 8q24, HULC rs7763881 polymorphisms and serum lncRNAs CCAT2 and HULC with colorectal cancer in Egyptian patients

The impact of HULC rs7763881 on colorectal cancer (CRC) susceptibility is not yet known, and the biological function of the cancer-related rs6983267 remains unclear. We investigated the association of these SNPs with the risk of CRC and adenomatous polyps (AP), their correlation with CCAT2 and HULC expression, and the potential of serum CCAT2 and HULC as biomarkers for CRC. The study included 120 CRC patients, 30 AP patients, and 96 healthy controls. Genotyping and serum lncRNA quantification were performed by qPCR. The studied SNPs were not associated with AP susceptibility. rs6983267 GG was associated with increased CRC risk, whereas rs7763881 AC was protective. The rs7763881-rs6983267 CT haplotype was also protective. Serum CCAT2 and HULC were upregulated in CRC and AP patients versus controls and discriminated these groups by ROC analysis. Patients with rs6983267 GG and rs7763881 AA had higher serum CCAT2 and HULC than those with GT/TT and AC, respectively. rs6983267 and serum HULC predicted CRC diagnosis among non-CRC groups (AP + controls) by multivariate analysis. Neither the studied SNPs nor the serum long noncoding RNAs (lncRNAs) were correlated with nodal or distant metastasis. In conclusion, rs6983267 and rs7763881 are potential genetic markers of CRC predisposition and correlate with serum CCAT2 and HULC, two novel potential non-invasive diagnostic biomarkers for CRC.


Association of rs7763881 (A/C) and rs6983267 (G/T) with the risk of CRC and AP.
Genotyping was performed without knowledge of the subjects' case-control status. The concordance rate of repeated analyses using the same assay was 100% for both SNPs. Minor allele frequencies (MAFs) in controls were similar to those in Ensembl GRCh37 release 89 (2017) for both SNPs (Table S1). The distribution of the rs6983267 genotypes in the CRC group did not significantly deviate from Hardy-Weinberg equilibrium (P = 0.86) (Table S2).
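The Hardy-Weinberg check mentioned above can be illustrated with a short sketch. The Python snippet below assumes a 1-df chi-square goodness-of-fit test, which is a common choice; the text does not state which test was actually used. The function name and the genotype counts are hypothetical and are not the study data.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic SNP.
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1 - p                        # frequency of the second allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical genotype counts, for illustration only (not the study data).
chi2, p_value = hwe_chi_square(40, 45, 15)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 in such a test is usually read as no significant deviation from equilibrium.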
The genotype and allele frequencies for rs7763881 and rs6983267 are shown in Table 2. For rs7763881, the genotype and allele frequencies were not significantly different between the AP patients and controls (P > 0.05). In CRC patients, the frequency of rs7763881 AC genotype was significantly lower than in healthy controls (73.3% vs 87.5%, respectively, crude P = 0.011) and associated with decreased CRC risk (AC vs AA, adjusted OR = 0.335, 95% CI = 0.157-0.716, P = 0.005) with adjustment for age and sex. However, there was no significant difference in the A and C allele frequencies between CRC patients and healthy controls (C vs A, OR = 0.744, 95% CI = 0.505-1.097, P = 0.14).
For rs6983267, there were no significant differences in the genotype and allele frequencies between AP patients and controls (P > 0.05). The genotype frequencies differed significantly between CRC patients and controls (P = 0.03); however, the allele frequencies were not significantly different between the two groups (T vs G, OR = 0.845, 95% CI = 0.574-1.245, P = 0.43). In the additive model, the GT genotype was protective against CRC (GT vs GG, adjusted OR = 0.39, 95% CI = 0.203-0.749, P = 0.005), while the minor homozygote TT genotype did not significantly affect the CRC risk (TT vs GG, adjusted OR = 0.863, 95% CI = 0.327-2.275, P = 0.766). In the dominant model, the common homozygote GG genotype was significantly associated with the risk of CRC (GG vs GT/TT, adjusted OR = 2.13, 95% CI = 1.146-3.937, P = 0.017). In the recessive model, the TT genotype was not a significant risk factor for CRC (TT vs GG/GT, adjusted OR = 1.5, 95% CI = 0.665-3.462, P = 0.41).
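To make the genotype-model comparisons concrete, the following Python sketch computes a crude odds ratio with a log-based (Woolf) 95% CI for a GG vs GT/TT contrast. It is only an illustration: the ORs reported above are adjusted for age and sex by logistic regression, which this snippet does not reproduce, and the function names and genotype counts are hypothetical rather than taken from Table 2.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    a, b: cases with / without the genotype of interest
    c, d: controls with / without the genotype of interest
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def gg_vs_carriers(cases, controls):
    """Collapse genotype counts into GG vs GT/TT and compute the crude OR."""
    a = cases["GG"]
    b = cases["GT"] + cases["TT"]
    c = controls["GG"]
    d = controls["GT"] + controls["TT"]
    return odds_ratio_ci(a, b, c, d)

# Hypothetical genotype counts, for illustration only (not the study data).
cases = {"GG": 45, "GT": 35, "TT": 20}
controls = {"GG": 30, "GT": 50, "TT": 20}
or_, lo, hi = gg_vs_carriers(cases, controls)
print(f"GG vs GT/TT: OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 corresponds to a significant association at roughly the 5% level; the other contrasts (for example TT vs GG/GT) follow by changing which genotypes are collapsed.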
Joint effect and results of haplotype analysis. We examined the joint effect of the studied gene polymorphisms in patients with CRC compared with the control group (Table 4). Combined heterozygosity for rs7763881 and rs6983267 was associated with decreased CRC risk (AC + GT vs AA + GG, OR = 0.217, 95% CI = 0.057-0.817, P = 0.024). In addition, the rs7763881-rs6983267 CT haplotype was protective against CRC (CT vs AG, OR = 0.583, 95% CI = 0.388-0.877, P = 0.012).

Serum levels of HULC and CCAT2 in CRC and AP. Serum HULC was significantly upregulated in CRC and AP patients compared with healthy controls, with mean fold changes of 6.76 (P < 0.0001) and 4.6 (P = 0.0002), respectively (Fig. 1A). Serum HULC levels were numerically higher in CRC than in AP patients, but the difference did not reach statistical significance (P = 0.15). Serum HULC was significantly higher in CRC than in non-CRC groups (AP + healthy controls), with a mean fold change of 4.1 (P = 0.007) (Fig. 1B). Serum CCAT2 was significantly upregulated in CRC and AP patients compared with healthy controls, with mean fold changes of 4.4 (P = 0.024) and 15.87 (P < 0.0001), respectively (Fig. 1A). Interestingly, serum CCAT2 levels were significantly higher in AP patients than in CRC patients (P < 0.0001), while serum CCAT2 was not significantly different between CRC and non-CRC groups (mean fold change 1.97, P = 0.1) (Fig. 1B).
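The fold changes above come from the 2^(−ΔΔCt) relative quantification described in Materials and Methods, with GAPDH as the internal control. A minimal Python sketch of that arithmetic follows. The function names and Ct values are made up for illustration, and using the mean ΔCt of the control group as the calibrator is an assumption, since the text does not specify the calibrator.

```python
def delta_ct(ct_target, ct_reference):
    """Delta Ct: target lncRNA Ct minus reference gene (e.g. GAPDH) Ct."""
    return ct_target - ct_reference

def relative_expression(ct_target, ct_reference):
    """Expression relative to the internal control, 2^(-delta Ct)."""
    return 2 ** -delta_ct(ct_target, ct_reference)

def fold_change(dct_sample, dct_calibrator):
    """Fold change versus a calibrator, 2^(-delta delta Ct)."""
    return 2 ** -(dct_sample - dct_calibrator)

# Made-up Ct values, for illustration only (not the study data).
patient_dct = delta_ct(ct_target=28.0, ct_reference=20.0)           # 8.0
control_dcts = [delta_ct(30.5, 20.5), delta_ct(29.5, 19.5)]         # 10.0 each
calibrator = sum(control_dcts) / len(control_dcts)                  # assumed: control-group mean

print(relative_expression(28.0, 20.0))       # 2^-8
print(fold_change(patient_dct, calibrator))  # 2^-(8 - 10)
```

Because each cycle ideally doubles the product, a sample whose ΔCt is two cycles lower than the calibrator corresponds to a fourfold higher relative level.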
Effect of rs7763881 and rs6983267 genotypes on serum HULC and CCAT2 expression. Serum HULC was significantly higher in CRC patients with the rs7763881 AA genotype than in those with the AC genotype (P = 0.02) (Fig. 1C). CRC patients with the rs6983267 GG genotype had higher serum CCAT2 expression than those with the GT genotype (P = 0.023) or the GT/TT genotypes (P = 0.014); however, serum CCAT2 levels did not differ significantly between patients with GG and TT genotypes or between those with GT and TT genotypes (P > 0.05) (Fig. 1D).

Results of logistic regression analysis.
We conducted univariate and multivariate logistic regression analyses to select the predictor variables associated with CRC risk among non-CRC groups (Table 5). Serum HULC, rs6983267 and rs7763881 genotypes were selected as significant predictor variables in the univariate analysis, with adjustment for age and sex. In multivariate analysis, only serum HULC and rs6983267 turned out to be significant independent predictors of CRC susceptibility.
Correlation of rs7763881 and rs6983267 genotypes, serum HULC and CCAT2 levels with clinicopathological data. We examined the prognostic role of studied SNPs and serum lncRNAs in CRC patients (Table 6). No significant correlations were found for rs7763881, rs6983267 genotypes, serum HULC and CCAT2 with tumor-related data, nodal and distant metastases, except a negative correlation between rs7763881 AC genotype and the presence of mucinous adenocarcinoma (adjusted OR = 0.152, 95% CI = 0.035-0.666, P = 0.016) after adjustment for age and sex. Serum CCAT2 was positively correlated with serum HULC (r = 0.67, P = 0.0007).

Discussion
The present study revealed association of rs6983267 at 8q24 and HULC rs7763881 SNPs with the susceptibility for CRC, but not adenomatous polyps. These SNPs were functionally correlated with serum CCAT2 and HULC expression, respectively. In addition, combined presence of these two polymorphisms in our studied population significantly altered the CRC susceptibility, confirming that CRC is predisposed by combination of low-penetrance susceptibility alleles. These results may add to the complex heterogeneity and pathology of CRC and implicate these SNPs, through functional modulation of lncRNAs expression, as potential genetic susceptibility markers for sporadic CRC.
To the best of our knowledge, we provide the first evidence of an association of the HULC rs7763881 polymorphism with decreased CRC susceptibility. The rs7763881 AC genotype was associated with decreased CRC risk compared with the AA genotype. This protective role was evident among male patients and younger patients in our study. Similarly, HULC rs7763881 AC/CC genotypes conferred a lower risk for HBV-related hepatocellular carcinoma 17, and the AC genotype was a protective factor against esophageal squamous cell carcinoma in a Chinese population compared with the AA genotype 18. However, the exact mechanism of rs7763881 and its role in regulating HULC expression was not known.
In this study, we are the first to investigate the functional role of HULC rs7763881. We found that the AC genotype was associated with lower HULC levels than the AA genotype. This could explain the protective role of rs7763881, through reduced levels of the oncogenic HULC. These reduced levels could also explain the observed negative association between the AC genotype and mucinous adenocarcinoma, a histologic variant characterized by abundant extracellular mucus and associated with advanced CRC 19. Notably, HULC expression is regulated by several mechanisms, including promoter methylation, transcription factors, RNA destabilization, lncRNA-lncRNA interaction, and post-transcriptional regulation by miR-203 20. Our results propose the rs7763881 SNP as an additional mechanism regulating HULC expression in cancer.
For rs6983267, homozygosity for the G allele contributed to increased CRC risk. Conversely, heterozygosity (GT) was protective, while the TT genotype was not significantly associated with CRC risk compared with the GG genotype in our study. The GG-associated risk was evident among male patients and older patients (>50 years) in a dominant model. In addition, the GG genotype independently predicted a 2.5-fold increase in CRC risk among non-CRC groups in multivariate analysis. Similarly, rs6983267 GG was associated with increased sporadic CRC susceptibility in an Iranian population, and individuals with the GT genotype had a lower risk for CRC 21. The homozygous G allele of rs6983267 was also associated with a high CRC risk, while the homozygous T allele was a non-risk allele 22. Furthermore, several studies showed that rs6983267 GG is related to increased risk of various cancers in several populations 7,23-25. These findings suggest that the rs6983267 GG genotype is a multicancer susceptibility marker. Conversely, rs6983267 GT was associated with an increased gastric cancer risk in a Chinese population compared with the GG genotype 26. This discrepancy may be due to differences in the population studied and the tissue type.
Although rs6983267 has been established as a cancer-related SNP, its biological function remains unclear. An interaction of the risk GG allele with increased MYC expression has been described 10,13. The GG genotype is highly homologous to the binding site of the transcription factor TCF4/LEF1, which enhances MYC transcription 13, whereas the TT genotype cannot bind the TCF4/LEF protein 27. However, other studies did not clearly find this association 14,28. A correlation of rs6983267 with CCAT2 expression in CRC tissues has also been reported 10,11. However, one study unexpectedly demonstrated higher CCAT2 levels with the non-risk TT genotype 11. In our study, we further assessed the correlation of rs6983267 with CCAT2 expression. We found that rs6983267 GG was associated with higher serum CCAT2 than the GT, TT or GT/TT variants, although statistical significance was not reached for TT. Our results are consistent with a previous report 10 and support the view that the SNP-conferred CRC risk may act through regulation of CCAT2.
Regarding the prognostic role of rs6983267, we found no association of this SNP with tumor-related characteristics, nodal and distant metastases. Our results are consistent with previous reports in CRC and prostate cancer 11,29,30, while contrasting others that showed a correlation of rs6983267 with node metastasis in endometrial cancer and distant metastasis in inflammatory breast cancer 24,30. It seems probable that the relation of the cancer-associated rs6983267 with tumor aggressiveness may depend on the tissue type.
Aberrantly expressed lncRNAs in tumor tissues have emerged as promising biomarkers for cancer diagnosis and prognosis 5, but the invasive nature of biopsies may limit their use. Few studies have addressed circulating lncRNAs as non-invasive markers for CRC 31,32. Herein, we demonstrated that serum CCAT2 and HULC were differentially expressed between CRC patients and controls and/or adenoma patients, and discriminated CRC from other groups with moderate sensitivity and specificity, suggesting serum HULC and CCAT2 as novel potential early biomarkers for CRC diagnosis. However, serum HULC, but not CCAT2, was significantly upregulated in CRC vs non-CRC groups and distinguished the two groups by ROC analysis. Interestingly, serum HULC independently predicted the risk of CRC diagnosis among non-CRC groups in multivariate analysis. These results implicate serum HULC as a reliable non-invasive early biomarker and a possible therapeutic target for CRC. Combining HULC with other tumor markers may improve the early diagnosis of CRC; however, this needs further investigation.
The observed upregulation of serum CCAT2 and HULC in CRC is consistent with their oncogenic roles 10,16. This upregulation agreed with previous reports in CRC tumor tissues and cell lines 10,16, which could be reflected in the serum. Indeed, lncRNAs are packaged into secreted microparticles, specifically exosomes 33. We also found a significant positive correlation between serum CCAT2 and HULC, suggesting their concomitant expression in CRC. However, we found no correlations between these lncRNAs and tumor-related data, nodal and distant metastases. While several studies reported that CCAT2 and HULC were associated with nodal and/or distant metastases in CRC and several cancers 10,16,34-37, others found no association 11,38,39. This controversy could be due to differences in the samples used (tissue or serum), the type of lesion analyzed (primary or metastatic), patients' tumor stages, and the number of patients with metastasis. In our study, serum samples were collected from patients during CRC screening, when most tumors were localized, with no nodal involvement (63.3%) and no distant metastasis (83.3%).
As sporadic CRC mostly develops in AP through the adenoma-carcinoma sequence 40, we tested whether the studied SNPs or lncRNAs contribute to CRC through the development of adenoma. We found no association of either rs6983267 or rs7763881 with susceptibility to AP. Conversely, rs6983267 was significantly associated with colorectal adenoma in a large case-control study of 1,477 individuals with colorectal adenoma and 2,136 controls 8. We could not verify this association in the studied Egyptian population, perhaps because of the comparatively small sample size of our study. Intriguingly, serum HULC and CCAT2 were upregulated in AP patients compared with controls and were differentially expressed between CRC and AP groups, although statistical significance was not reached for HULC. These results implicate HULC and CCAT2 dysregulation as key pro-tumorigenic factors in CRC initiation in colorectal adenoma. AP is a precancerous condition, and its progression to cancer requires additional molecular modifications, particularly increased cell proliferation 41. Indeed, HULC and CCAT2 promote different pro-tumorigenic phenotypes, such as cell survival, proliferation and invasion in vitro and in vivo in many cancers 16,20,42,43. Specifically, the paradoxically higher CCAT2 levels in AP than in CRC in our study may suggest that AP needs a high level of carcinogenic factors such as CCAT2 to convert to carcinoma.
Colonoscopy is the gold standard for CRC screening, but it is an invasive method. Our study proposed new blood-based non-invasive genomic markers based on PCR techniques, which appear both reproducible and cost-effective. However, our study is limited by the relatively small sample size, and, as it is a hospital-based case-control study, selection bias might inevitably have occurred. Extended studies with independent larger samples are required to validate our results, and large-scale investigations across different populations should be undertaken. Finally, the interaction of the studied genes with environmental risk factors of CRC should be evaluated. Nevertheless, our results implicate rs6983267 at 8q24 and HULC rs7763881 as potential genetic markers of CRC susceptibility that correlate with CCAT2 and HULC expression, respectively. Serum CCAT2 and HULC could serve as novel potential early diagnostic biomarkers for CRC, and rs6983267 and serum HULC could predict the risk of CRC diagnosis among non-CRC groups. Our data have potential implications for CRC screening and genetic counseling, and hold promise for large-scale application.

Materials and Methods
Patients. This case-control study included 150 adult (>18 years old) Egyptian patients who attended the Gastrointestinal Endoscopy Unit in Kasr AL-Ainy Hospital, Cairo University, and were referred for colonoscopic examination for one of the following: lower GIT symptoms, including chronic diarrhea, chronic constipation, alternating bowel habits and bleeding per rectum; alarming symptoms and signs for CRC, including significant unexplained weight loss and unexplained anemia; screening for CRC; or metastases proven to be adenocarcinoma in patients suspected to have CRC.
Patients were divided into 120 CRC cases and 30 cases with AP based on positive colonoscopy, and the diagnosis was confirmed by pathology results. All patients were subjected to full history taking and clinical examination, including routine laboratory investigations: complete blood count, erythrocyte sedimentation rate (ESR), carcinoembryonic antigen (CEA) assay, stool analysis, fecal occult blood test, and liver biochemical profile; full colonoscopy; and imaging using abdominal ultrasound and computed tomography to stage CRC according to the American Joint Committee on Cancer (AJCC, 2010) 44. Patients who had previously received chemo- and/or radiotherapy for CRC, had been diagnosed with inflammatory bowel disease (IBD), or had cancer at any other site at the time of selection or a history of recurrent tumors were excluded.
A total of 96 apparently healthy controls were age- and sex-matched to the patient population. Controls had negative colonoscopy results for malignancy, polyps or IBD and had no personal or family history of familial adenomatous polyposis or hereditary non-polyposis CRC.
Written informed consent was obtained from all patients and controls. The study protocol and informed consent were approved by the ethics committee of the Faculty of Pharmacy, Cairo University (BC2058) and conformed to the ethical guidelines of the Declaration of Helsinki.
DNA extraction and genotyping. Genomic DNA was extracted from whole EDTA blood samples from all subjects using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA) according to the manufacturer's instructions. The yield was measured by NanoDrop 2000 (Thermo Scientific, USA). Genotyping was performed using real-time PCR with the TaqMan allelic discrimination assay using predesigned primer/probe sets for rs6983267 (G/T)
Serum lncRNAs assay. Total RNA was extracted from serum by miRNeasy extraction kit (Qiagen, Valencia, CA) using QIAzol lysis reagent according to the manufacturer's instructions. RNA quality was determined using NanoDrop 2000 (Thermo Scientific, USA). Reverse transcription (RT) was carried out on 60 ng of total RNA in a final reaction volume of 20 µl using the RT2 First Strand Kit (Qiagen, Valencia, CA) according to the manufacturer's instructions. The RT products were diluted with 50 µl RNase-free water before real-time PCR. Serum expression levels of the studied lncRNAs were evaluated with GAPDH as the internal control, using customized primers and the Maxima SYBR Green PCR kit (Thermo, USA) according to the manufacturer's protocol. The primer sequences were as follows: CCAT2-forward 5′-CCCTGGTCAAATTGCTTAACCT-3′, CCAT2-reverse 5′-TTATTCGTCCCTCTGTTTTATGGAT-3′, HULC-forward 5′-TCATGATGGAATTGGAGCCTT-3′, HULC-reverse 5′-CTCTTCCTGGCTTGCAGATTG-3′, GAPDH-forward 5′-CCCTTCATTGACCTCAACTA-3′, and GAPDH-reverse 5′-TGGAAGATGGTGATGGGATT-3′. Briefly, real-time PCR was done on a 20 µl reaction mixture prepared by mixing 10 µl master mix, 1 µl forward primer, 1 µl reverse primer, 2.5 µl diluted cDNA, and 5.5 µl RNase-free water, using the Rotor-Gene Q system (Qiagen) with the following conditions: 95 °C for 10 min, followed by 45 cycles at 95 °C for 15 s and 60 °C for 60 s. The cycle threshold (Ct) is the number of cycles required for the fluorescent signal to cross the threshold in real-time PCR. Gene expression relative to the internal control, 2^(−ΔCt), was calculated. Fold change was calculated using 2^(−ΔΔCt) for relative quantification 45.
Statistical analysis. Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS, Chicago, IL) software version 15 for Microsoft Windows and GraphPad Prism 5.0 (GraphPad Software, CA). Values were expressed as mean ± standard deviation (SD), mean (95% confidence interval, CI) or number (percentage) when appropriate. Categorical data were compared by chi-square or Fisher's exact test when appropriate. Continuous variables were compared using Student's t-test or one-way ANOVA followed by Tukey's post-hoc test when appropriate. The diagnostic accuracy of lncRNAs was evaluated by receiver-operating-characteristic (ROC) analysis and the area under the curve (AUC) was calculated. AUC <0.6 was considered non-significant, AUC between 0.7 and 0.89 was considered a potential discriminator, and AUC >0.9 was considered a significant discriminator. Univariate and multivariate logistic regression analyses were done to identify predictor variables associated with the risk of CRC. To adjust for confounding factors, age and sex were included as covariates. Significant predictor variables in the univariate analysis were included in a stepwise forward multivariate analysis (P < 0.05 for entering the model and P < 0.1 for removal from the model) to determine the final predictor variables for the probability of being diagnosed with CRC. An internal 10-fold cross-validation was conducted to confirm the reproducibility of the results. Correlations were determined by Pearson correlation. P < 0.05 was considered significant, with a 95% CI.
Data availability. All data generated or analyzed during this study are included in this article and its supplementary information files.