Identification of oxytocin-related lncRNAs and assessment of their expression in breast cancer

Oxytocin is a neuropeptide released by the central nervous system. A number of studies have demonstrated the role of this neuropeptide in the pathogenesis of breast cancer. In the present project, we have identified mRNA coding genes and long non-coding RNAs (lncRNAs) that are associated with this pathway through an in-silico strategy, and measured their expression in a cohort of Iranian females affected with this type of malignancy. Expression levels of OXTR, FOS, ITPR1, RCAN1, CAMK2D, CACNA2D and lnc_ZFP161 were significantly down-regulated in breast cancer tissues compared with nearby non-cancerous tissues. On the other hand, expression of lnc_MTX2 was higher in breast cancer tissues compared with controls. Expression of lnc_TNS1 and lnc_FOXF1 were not different between these two kinds of samples. Expression of CACNA2D was associated with mitotic rate and PR status (P values = 3.02E−02 and 2.53E−02, respectively). Expression of other oxytocin-related genes was not associated with clinicopathological parameters. FOS and ITPR1 had the highest AUC value among the oxytocin-related genes. Combination of expression profiles of all oxytocin-related genes increased the AUC value to 0.75. However, the combinatorial sensitivity and specificity values were lower than some individual genes. In the breast cancer tissues, the most robust correlations have been detected between lnc_ZFP161/ lnc_FOXF1, CAMK2D/ lnc_ZFP161 and CAMK2D / lnc_FOXF1 (r = 0.86, 0.71 and 0.64 respectively). In the non-cancerous tissues, the strongest correlation was detected between lnc_FOXF1/lnc_MTX2 and lnc_ZFP161/CAMK2D respectively (r = 0.78 and 0.65). Taken together, oxytocin-associated genes have been dysregulated in breast cancer tissues. Moreover, the correlation ratio between these genes is connected with the existence of cancer.

Oxytocin is a neuropeptide secreted from the central nervous system and has similar functions with the antidiuretic hormone vasopressin 1 . In addition to its functions in the physiology of uterus and milk secretion, oxytocin has been shown to affect carcinogenesis 1 . A former in vitro study has demonstrated the mitogenic effects of oxytocin on MCF7 cells indicating the possible role of this neuropeptide in the growth of breast cancer cells 2 . Yet, another study in MCF7 and T47D breast cancer cells has shown the inhibitory effect of oxytocin on estrogen-associated cell growth. This neuropeptide has also been shown to promote the suppressive impact of tamoxifen on cell proliferation. Moreover, expression of oxytocin receptor has been detected in these cell lines and MDA-MB-231 cells 3 . Subsequent investigations have verified anti-proliferative effects of oxytocin and have demonstrated the role of cyclic adenosine monophosphate protein kinase A in the mediation of these effects 4 . Further experiments in animal models of breast cancer have also verified such effects 5 . As a G protein-coupled receptor, oxytocin receptor exemplifies a fascinating target for cancer treatment since it partakes in the development of in breast cancer and is expressed by numerous breast cancer cell lines 6 . Yet, the underlying mechanisms of involvement of oxytocin receptor and its related pathways are not completely understood. In the present project, we have identified mRNA coding genes and long non-coding RNAs (lncRNAs) which are associated with this pathway through an in-silico strategy, then measured their expression in a cohort of Iranian females affected with this type of malignancy 7 . We hypothesized that oxytocin-related lncRNAs are involved in the pathogenesis of different histopathological types of breast cancer.

Materials and methods
Bioinformatics methods. GSE54002 dataset was downloaded from Gene Expression Omnibus database and preprocessed in R version 3.6.1 using limma package version 3.40. 6. This dataset was selected as it contains expression data of an appropriate number of clinical samples prepared by laser capture microdissection (417 patients with breast cancer and 16 non-tumor tissues) (https ://www.ncbi.nlm.nih.gov/geo/query /acc. cgi?acc=GSE54 002). Gene expression matrix was obtained using the log2 values. Then, data was normalized using limma package. Differentially expressed genes (DEGs) between tumoral and normal tissues were assessed using Bayes methods and limma package. Raw P values were corrected using Benjamini and Hochberg methods. Cut-off criteria for identification of DEGs were P < 0.05 and logFC > 2 for up-regulated genes and logFC < -2 for down-regulated genes. Pathway Enrichment Analyses of DEGs were performed using https ://amp.pharm .mssm. edu/Enric hr and KEGG database. PPI network was depicted and hub genes were recognized using STRING (https ://strin g-db.org) and Cytoscape v3.8.1. Then, from the down-regulated genes, those being associated with oxytocin pathway were selected. Finally, lncRNAs associated with these genes were chosen based on the results of Khalil et al. study (GSE16226) 8 .
Enrolled individuals. Expression of oxytocin-related genes were assessed in 69 pairs of breast cancer specimens and their matched nearby tissues. Samples were gathered from Farmanieh and Sina hospitals during 2017-2020, Tehran, Iran. The study protocol was approved by the ethical committee of Shahid Beheshti University of Medical Science and the study protocol was performed in accordance with the relevant guidelines (IR.SBMU. MSP.REC.1398.1010). Patients' samples were excised before any chemotherapy or radiotherapy. Medical records were gathered to obtain histopathological and clinical data. Informed written consent forms were obtained from study participants.
Expression assays. All tissue sections were subjected to RNA extraction using the RiboEx kit (Gene-All, Seoul, South Korea). Afterwards, 70-100 ng of RNA was used for production of cDNA using the ExcelRT Reverse Transcription Kit II (SMOBIO, Taiwan). Expressions of genes in breast cancer samples and nearby noncancerous tissues were measured in the ABI step one plus PCR machine. Expression levels were normalized to transcripts of GAPDH. RealQ Plus 2 × PCR Master Mix (Ampliqon, Odense, Denmark) was used for making the reactions. Primers and amplicons characteristics are shown in Table 1.

Statistical analyses.
Statistical analyses were executed in the R environment. Transcript quantities of oxytocin-related genes were measured in relation to the HPRT1 reference gene using the equation:  OXTR (F)  GGA CGC CTT TCT TCT TCG TG  20  128 bp  OXTR (R)  CAT GTA GAT CCA GGG GTT GCAG  22   CAMK2D (F)  AGA AGA GAC TCG TGT GTG GC  20  100 bp  CAMK2D (R)  AAT ACA GGG TGG CTT GAT GGG  www.nature.com/scientificreports/ A comparison was made between non-cancerous and tumor tissues of patients, and the significance of the difference between mean values was appraised using the paired t-test. Correlations between expression levels of oxytocin-related genes were appraised through the calculation of Spearman correlation coefficients. In order to appraise of the diagnostic power of genes, receiver operating characteristic (ROC) curves were depicted. ROC curves were depicted using the methods described previously 9,10 . For this purpose, Bayesian Generalized Linear Model (BayesGLM), Generalized Linear Model (GLM), and Linear Discriminant Analysis (LDA) were used to compute the sensitivity and specificity of each model. GLM is a generalization of linear regression with no constraint on the distribution models of response variables. BayesGLM is an approach to GLM using Bayesian inference, and LDA aims to find a linear combination of features that separates two or more classes of objects or events. Log 2 values of transcript quantities of all genes were used as the predictive features to train three machine learning models with tenfold cross validation to avoid overfitting. Area under curve (AUC) metric was computed to pick the best model. Finally, BayesGLM model was selected based on the previous test, and the model was trained for each gene separately to test the distinguishing power of specific genes. Chi-square test was used to assess the association between demographic/clinical data and transcript levels of oxytocin-associated genes. Genes with log2FC ≥ 1 (tumor tissues vs. non-cancerous tissues) were considered as up-regulated and those with log2FC ≤ − 1 were considered as down-regulated. The level of significance was set at P value < 0.05.

Results
Bioinformatics step. The in-silico method has led to identification of a number of down-regulated genes in cancerous tissues compared with non-cancerous tissues (Fig. 1).
KEGG pathways analysis revealed oxytocin signaling pathway as the most significant enriched pathway of the down-regulated genes (Fig. 2). Figure 3 depicts the relative expression levels of oxytocin related genes in breast cancer samples and nearby non-cancerous tissues.

Expression assays.
Expression levels of OXTR, FOS, ITPR1, RCAN1, CAMK2D, CACNA2D and lnc_ZFP161 were significantly down-regulated in the breast cancer tissues compared with nearby non-cancerous tissues. On the other hand, expression of lnc_MTX2 was higher in breast cancer tissues compared with controls. Expressions of lnc_TNS1 and lnc_FOXF1 were not different between these two kinds of samples (Table 2).
Association between expression of genes and clinical data. Then, we appraised the association between expression levels of oxytocin-associated genes and a number of clinical and demographic data such as cancer stage and grade, age, mitotic rate, tumor size and hormone receptor status. Expression of CACNA2D was associated with mitotic rate and PR status (P values = 3.02E−02 and 2.53E−02, respectively). Expression of other oxytocin-related genes was not associated with these parameters (Table 3). Figure 4 demonstrates the efficacy of three predictive models in predicting the diagnostic power of oxytocin-related genes and the obtained AUC values for each gene. ROC curves were depicted using Log 2 values of transcript quantities of all genes as the predictive features to train three machine learning models www.nature.com/scientificreports/ (LDA, BayesGLM and GLM) with tenfold cross validation. AUC metric was computed to pick the best model. Finally, BayesGLM model was selected based on the previous test, and the model was trained for each gene separately to test the distinguishing power of specific genes. FOS and ITPR1 had the highest AUC value among the oxytocin-related genes. Combination of expression profile of all oxytocin-related genes increased the AUC value to 0.75. However, the combinatorial sensitivity and specificity values were lower than some individual genes (Table 4).

Discussion
Breast cancer is a complex disorder in which several molecular mechanisms are involved. Immunology regulations may also affect breast cancer development and immunodeficiency may promote adaptive alterations of host gut-and tissue-based microbiome 12 . LncRNAs can affect several aspects in this regard. Several lines of evidence such as the structural and genomic relation to vasopressin, co-expression of oxytocin and vasopressin, and the mitogenic effects of these hormones connected oxytocin to carcinogenesis 1 . Moreover, breastfeeding has been shown to decrease the risk of a number of cancers and particularly breast cancer, with elongated periods of breastfeeding being associated with a progressive reduction in the risk of this cancer 13,14 . Meanwhile, oxytocin has been shown to affect immune regulation 15 17 . The observed down-regulation of FOS in breast cancer samples is in line with the study of Fisler, which reported association between higher FOS expression and better survival of patients with breast cancer. Moreover, higher levels of FOS target apoptosis-effector gene have been associated with improved survival of these patients. Based on these results, authors have suggested that FOS is a pro-apoptotic protein 18 . In addition to the functional association with oxytocin-related pathways, ITPR1 has a regulatory role on autophagy and sensitivity to chemotherapeutic agents in cancer cells 19 . Therefore, its down-regulation in breast cancer cells might influence several aspects of breast carcinogenesis. RCAN1 has been suggested as a super-enhancer-driven tumor suppressor whose down-regulation enhances the malignant features of breast cancer cells 20 . CAMK2D is a kinase that regulates several cellular processes, such as proliferation, differentiation and apoptosis. Chi et al. have reported higher levels of CAMK2D expression and phosphorylation in breast cancer samples compared with non-cancerous samples 21 . This finding is in contrast with the reported expression pattern of CAMK2D mRNA in the current study. Further assessment of expression levels of this gene at both mRNA and protein levels is necessary for solving this controversy. We also detected down-regulation of the calcium channel coding gene CACNA2D in breast cancer samples and its association with mitotic rate and PR status. Former studies have reported that breast cancer cells can attain a selective growth advantage through modulating ion channel expression or function. These channels have also been shown to participate in the prominent features of this cancer 22 . However, the specific role of CACNA2D has not been elucidated. Future functional studies are required to clarify this point. We also assessed the diagnostic value of oxytocin-related genes in breast cancer. FOS and ITPR1 had the highest AUC value among the oxytocin-related genes. Combination of expression profile of all oxytocin-related genes increased the AUC value to 0.75. However, the combinatorial sensitivity and specificity values were lower than some individual genes. We recommend appraisal of expression of these genes in the peripheral blood of patients with breast cancer to unravel their diagnostic potential.
Finally, appraisal of correlation between expression levels of oxytocin-related genes has led to identification of specific patterns in cancerous and non-cancerous tissues. In breast cancer tissues, the most robust correlations have been detected between lnc_ZFP161/lnc_FOXF1, CAMK2D/lnc_ZFP161 and CAMK2D/lnc_FOXF1. In the non-cancerous tissues, the strongest correlation was detected between lnc_FOXF1/Lnc_MTX2 followed by lnc_ZFP161 and CAMK2D. Taken together, oxytocin-associated genes have been dysregulated in breast cancer tissues. Moreover, the correlation between these genes is influenced by the presence of cancer, as correlation coefficients between gene pairs were different in tumoral and non-tumoral tissues.
The current study used a combination of bioinformatics and gene expression methods. Bioinformatics methods have been extensively used to find appropriate targets for experimental assessment of gene expressions. A Table 2. Detailed parameters of expression analysis of oxytocin-related genes in breast cancer samples compared with nearby non-cancerous tissues.    www.nature.com/scientificreports/ common strategy is to collect all related public expression-profiling of microarray and RNA-sequencing data using appropriate criteria and to combine them to construct co-expression network to identify hub mRNA/ lncRNAs along with using PPI network analysis 23,24 . However, in the current study, we only selected one dataset. Although selection of this dataset was based on the appropriateness of included samples and methods, additional datasets could also be used for this purpose. So, we proposed future assessment of the results of this study using these datasets. Although deep learning method is a very promising way to predict prognosis for cancer based on biomarkers, an important prerequisite for efficient deep learning models is the large number of samples in proportion of the number of parameters in the model. Here, in the statistical part of the study, we aimed to validate the selected markers in a case-control study with 69 specimens. So, some simpler machine learning methods were used to examine the efficacy of markers. Finally, the potential causal effects behind the association of the oxytocinrelated lncRNA biomarkers with breast cancer should be verified using a statistical approach named Mendelian Randomization 25 .
Taken together, our study demonstrates abnormal expression levels of oxytocin-related genes in breast cancer tissues versus non-cancerous tissues and influence of cancer on the correlation network between these genes, potentiating these genes as biomarkers for breast cancer. www.nature.com/scientificreports/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.