Long non-coding RNA-associated competing endogenous RNA axes in the olfactory epithelium in schizophrenia: a bioinformatics analysis

The etiology of schizophrenia (SCZ), as a serious mental illness, is unknown. The significance of genetics in SCZ pathophysiology is yet unknown, and newly identified mechanisms involved in the regulation of gene transcription may be helpful in determining how these changes affect SCZ development and progression. In the current work, we used a bioinformatics approach to describe the role of long non-coding RNA (lncRNA)-associated competing endogenous RNAs (ceRNAs) in the olfactory epithelium (OE) samples in order to better understand the molecular regulatory processes implicated in SCZ disorders in living individuals. The Gene Expression Omnibus database was used to obtain the OE microarray dataset (GSE73129) from SCZ sufferers and control subjects, which contained information about both lncRNAs and mRNAs. The limma package of R software was used to identify the differentially expressed lncRNAs (DElncRNAs) and mRNAs (DEmRNAs). RNA interaction pairs were discovered using the Human MicroRNA Disease Database, DIANA-LncBase, and miRTarBase databases. In this study, the Pearson correlation coefficient was utilized to find positive correlations between DEmRNAs and DElncRNAs in the ceRNA network. Eventually, lncRNA-associated ceRNA axes were developed based on co-expression relations and DElncRNA-miRNA-DEmRNA interactions. This work found six potential DElncRNA-miRNA-DEmRNA loops in SCZ pathogenesis, including, SNTG2-AS1/hsa-miR-7-5p/SLC7A5, FLG-AS1/hsa-miR-34a-5p/FOSL1, LINC00960/hsa-miR-34a-5p/FOSL1, AQP4-AS1/hsa-miR-335-5p/FMN2, SOX2-OT/hsa-miR-24-3p/NOS3, and CASC2/hsa-miR-24-3p/NOS3. According to the findings, ceRNAs in OE might be promising research targets for studying SCZ molecular mechanisms. This could be a great opportunity to examine different aspects of neurodevelopment that may have been hampered early in SCZ patients.


Scientific Reports
| (2021) 11:24497 | https://doi.org/10.1038/s41598-021-04326-0 www.nature.com/scientificreports/ delusions, affect and mood dysfunctions, self-disorder, neurocognitive deficits, and somatic symptoms. Nearly a half of SSD patients have functional impairments, which raise the likelihood of constant unemployment and the difficulty to form and sustain lasting relationships 2,3 . Neuronal, psychological, social, environmental, and genetic variables might lead to SSD formation and maintenance 3,5,6 . Modifications in the morphology of neurons and the brain are thought to be linked to SSD symptomatology 5,6 . Furthermore, studies show that being at younger ages at the onset of disease, experiencing suicide attempts, having a progressive disease onset, and facing difficulty adhering to treatment are all risk factors for recurrence in SSD people 3 . Based on animal studies and imaging data obtained from the patients, it is improbable that schizophrenia is caused by a traditional degenerative process 2 . Defective oligodendrocyte functions, synaptogenesis, and perhaps decreased neurogenesis, with accompanying deficiencies in structural and functional micro-and macroconnectivity, indicate a disruption in the human brain's regenerative potential in schizophrenia 2 . As regards possible underlying neurophysiological and genetic factors, it appears possible that SDD may be considered a failed neuro-regeneration 2 . Aberrant gene expression and protein production are associated with SCZ pathophysiology, and these alterations in SCZ patients occur in multiple brain regions and have temporal variation during disease progression [7][8][9] . More importantly, a growing body of evidence has indicated alterations in non-coding RNAs (ncRNAs) in SCZ patients 10 . These findings help elucidate the molecular mechanisms underlying the dysregulation of gene expression and protein production. The ncRNAs comprise various classes of RNA transcripts with different lengths 11 . Accumulating evidence has revealed the aberrant expression of microRNAs (miRNAs) (20-22 nucleotides) 12,13 and long non-coding RNAs (lncRNAs), with over 200 nucleotides, in the brain of SCZ patients, which implicates in the occurrence and development of SCZ 14,15 . Pandolfi et al. in 2011 proposed competing endogenous RNA (ceRNA) theory as a new regulatory mechanism. They suggested that cross-talk between coding and ncRNAs (including lncRNAs, circular RNAs (circRNAs), and pseudogenes) forms a massive regulatory network across the diverse components of the transcriptome through the miRNA response elements. The ceRNA theory posits that the expression level of two RNA transcripts inversely correlates with target miRNAs levels. Moreover, the expression levels of these two RNA transcripts correlate positively with each other 16 . Many studies have corroborated the ceRNA theory, as emerging evidence verified that ceRNA cross-talk imbalance associates with various diseases 17 .
The role of genetic in SCZ pathophysiology is still vague. The recently identified molecular mechanisms regulating gene transcription could help elucidate how the alterations in gene expression could affect SCZ development and progression. Thereby more efficient therapeutic and diagnostic approaches could be found. Comparative analysis of gene expression profiles between patients and controls provides insights into exploring pathophysiological mechanisms and helps identify potential biomarkers 18 . Uncovering the molecular mechanisms behind psychiatric disorders' pathogenesis is challenging due to the difficulty of accessing central nervous system tissues and cells from live patients. While post-mortem brain samples are invaluable for molecular studies, these samples do not appear to provide reliable molecular information related to the onset or course of cognitive deficits within the same living subjects. Thereby, reliable biological specimens and also samples that can be obtained longitudinally are required 19,20 . Although blood sample is frequently utilized and easy to obtain, it has been revealed that blood cells and brain cells differ in gene expression profile in SCZ studies. It is noteworthy that olfactory epithelium (OE) contains olfactory receptor neurons, which show similar expression patterns to developing brain cells [19][20][21] . One interesting element of employing OE tissues in the field of psychiatric illness is their link with the wider olfactory neurocircuitry: olfactory mucosa → olfactory bulb → olfactory cortex 22 . Alterations in the OE as a result of the disease may not arise in isolation and may represent brain abnormalities in the wider olfactory neurocircuitry. Many structural and functional changes linked with neuropsychiatric illnesses have been observed in most of these brain areas. Therefore, OE is a better choice for molecular studies of SCZ patients [19][20][21][22] .
In this study, we performed a bioinformatics analysis to identify lncRNA-associated ceRNA axes in the OE of live SCZ patients to elucidate molecular regulatory mechanisms related to the disease.

Methods
In the present study, we utilized a bioinformatics approach for data mining of the microarray dataset (GSE73129) with the olfactory epithelium (OE) biopsy samples from SCZ patients and matched controls. We intended to identify differentially expressed mRNAs (DEmRNAs) as well as lncRNAs (DElncRNAs) and construct lncRNAassociated ceRNA regulatory axes. Figure 1 summarizes the stages performed in the bioinformatics strategy.
Gene expression profile data collection. The gene expression profile mentioned above was collected from the NCBI Gene Expression Omnibus database (GEO, https:// www. ncbi. nlm. nih. gov/ geo/). The microarray dataset was based on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array). The GPL570 contains both mRNA and lncRNA information. The GSE73129 dataset contains 19 OE tissues collected from SCZ patients and 19 OE tissues from healthy individuals 20 .

Data preprocessing and DEmRNAs and DElncRNAs identification. For background correction
and quantile normalization of all primary data records, Robust Multichip Average (RMA) was applied 23 . An interquartile range filter (IQR across the samples on the log base two scale greater than median IQR) was carried out, which was accompanied by an intensity filter (a minimum of > 100 expression signals in a minimum of 25% of the arrays) intending to eliminate insignificant probe sets that are not expressed 24 . For quality control, the AgiMicroRna bioconductor package was applied. We performed principal component analysis (PCA) to conduct a dimensional reduction analysis 25 , aiming to find similarities between each sample group using the ggplot2 package of R software version 4.0.3 (https:// www.r-proje ct. org/) 26 . Differential expression gene analysis (DEGA) was done between SCZ and normal samples using the linear models for microarray data (limma)  28 . We utilized the previously used approach to identify lncRNA probes 29 . Then the complete list of lncRNA genes with HUGO Gene Nomenclature Committee (HGNC) approved symbols were retrieved from the HGNC database (https:// www. genen ames. org/) 30 . Afterward, we compared the lncRNA gene list with our dataset gene symbols and chose the overlapped genes. We used the student t-test and the aberrantly expressed RNAs cut-off were set as follows: (1) a false discovery rate (adjusted P value) < 0.001, and (2) |log2 fold change (log2FC) |> 1.585. There is currently no gold standard for selecting fold change and adjusted P value cut-off. We filtered DEGs using stringent criteria to minimize false positives and analyze genes that had a drastic increase or decrease. It has been proven that biological changes caused by ceRNA regulation are only observable when the miRNA/ceRNA levels increase or decrease drastically in certain physiological states 31 . The Pheatmap and Enhanced Volcano R packages were used to draw the DEGs' heat map and volcano plot.

Prediction of RNA interaction pairs. The experimentally validated interactions between miRNAs and
DElncRNAs were identified using DIANA-LncBase v3 32 . Homo Sapiens "Species" and high "miRNA Confidence Levels" were considered as criteria for the DIANA-LncBase query. SCZ-related miRNAs were collected from the Human microRNA Disease Database (HMDD) v3.2 database 33 . Furthermore, we retrieved the interactions between miRNAs that were collected using the HMDD and target mRNAs from miRTarBase 34 , supported by strong experimental evidence. After comparing the retrieved mRNAs and the previously obtained mRNAs, the duplicated mRNAs were utilized to construct the DElncRNA-miRNA-DEmRNA regulatory axes.
Correlation analysis between DElncRNAs and DEmRNAs, and lncRNA-associated ceRNA axes construction. The Pearson correlation analysis was performed in order to assess positive correlations between DElncRNAs and DEmRNAs in the ceRNA regulatory axes. DELncRNAs, targeted DEmRNAs, and the interacted miRNAs were removed from the ceRNA network in the opposite expression pattern between the targeted DEmRNAs and DElncRNAs. The Hmisc and corrplot packages were applied to calculate the correlations and visualization. Pearson correlation coefficient > 0.5 and P < 0.001 were used as inclusion criteria. Cytoscape software (version 3.8.0) (https:// cytos cape. org/) 35 was applied to construct the ceRNA regulatory axes.

Results
DEmRNAs and DElncRNAs identification. Background adjustment, normalization, gene filtering, and batch adjustment were done before performing DEGA. To control the quality, the AgiMicroRna bioconductor package was used. Following normalizing, box plots for the gene expression data were illustrated to analyze data distribution (Supplementary File S1). Separate arrays in the box plots showed identical medians of expression level, indicating correct adjustment. Furthermore, PCA plot was used to show the spatial distribution of samples (Supplementary File S1). The details of the examined data structure are displayed in PCA. Also, it helps assess similarities between samples. Two control samples were removed due to being spatially far from other control samples.
Based on the stringent criteria (adjusted P value < 0.001, and (2) |log2 fold change (log2FC) |> 1.585), a total of 19 DElncRNAs and 303 DEmRNAs were identified in GSE73129 between SCZ and healthy control OE samples. Hierarchical clustering heatmap of DElncRNAs and volcano plot of DEmRNAs are shown in Fig. 2 Prediction of RNA interaction pairs. We applied the DIANA-LncBase v3 online tool and predicted DElncRNA-miRNA interaction pairs based on the DElncRNAs and consequently revealed that 13 of the 19 DEl-ncRNAs might target candidate miRNAs. Subsequently, miRTarBase was used to find the interactions between www.nature.com/scientificreports/ miRNAs that were collected using the HMDD and candidate mRNAs. Following a comparison between the candidate mRNAs with 304 DEmRNAs, we identified seven overlapping genes.
Correlation analysis between DElncRNAs and DEmRNAs, and lncRNA-associated ceRNA axes construction. The Pearson correlation analysis was performed between DElncRNAs and DEmRNAs to verify the ceRNA axes hypothesis, which mRNA expression is positively regulated by lncRNA through interaction with miRNA (Fig. 3). Based on the co-expression relationships and DElncRNA-miRNA-DEmRNA interactions,

Discussion
Emerging evidence shows deficits in olfactory function in various neuropsychiatric disorders such as SCZ, Alzheimer's, and Parkinson's disease. This might be related to cellular or molecular alterations in the OE 22 , which harbors neuronal lineage cells at various stages of maturation 36,37 . The OE model provides a lot of advantages. First, ex-vivo OE tissues were exposed previously to the in-vivo neurohormonal milieu, having in-vivo neurobiological hallmarks. Moreover, they can act as a reference for both in vitro and in vivo OE results, bridging the gap between the two techniques. Second, while restricted by the amount of tissue accessible, OE biopsies may safely be taken numerous times from the same patients and can be integrated with longitudinal clinical research designs. OE biopsies are collected in certain stages of the disease in this procedure, while patients' clinical circumstances are meticulously documented 22 . It is a unique chance in neuropsychiatric research to interpret neurobiological variables regarding clinical alterations. Examples of such applications comprise, but are not confined to, collecting and evaluating OE biopsies from people at risk for the onset of schizophrenia and their intact relatives prior to and after the development of disease 22 .
A number of studies have demonstrated the activity of ceRNA regulatory loops and the related networks in several pathological conditions and developmental processes, e.g., tumorigenesis, neurodegenerative diseases 38 , and mental disorders 39,40 . There may be various ceRNAs, including mRNAs, pseudogenes, circRNAs, and lncR-NAs, in a network 31 . LncRNAs, as a major group of RNAs within the ceRNA machinery, have a key role in the regulation of pathological and physiological cellular mechanisms. The expression of lncRNAs found to be affected under various mental conditions 41 . Interestingly, lncRNAs expression depends on developmental level, cell and tissue type. Subcellular distributions and tissue specificity indicate intensive regulation of lncRNAs expression 42 . According to the above-mentioned theoretical concepts, the lncRNA-related ceRNA regulatory network can substantially affect SCZ pathogenesis. Studies on SCZ-associated ceRNA regulatory loops remain yet to be extended, and it is necessary to further examine the corresponding mechanisms and patterns of expression in SCZ. In the present study, the OE sample expression profiles were downloaded from a public database to evaluate DEmRNAs and DElncRNAs in SCZ and normal tissues in order to construct DElncRNA-miRNA-DEmRNA regulatory loops. We identified six possible DElncRNA-miRNA-DEmRNA loops in the pathogenesis of SCZ: SNTG2-AS1/ hsa-miR-7-5p/SLC7A5, FLG-AS1/hsa-miR-34a-5p/FOSL1, LINC00960/hsa-miR-34a-5p/FOSL1, AQP4-AS1/hsa-miR-335-5p/FMN2, SOX2-OT/hsa-miR-24-3p/NOS3, and CASC2/hsa-miR-24-3p/NOS3.
A number of studies have suggested that dysregulation of lncRNAs might participate in the pathogenesis of SCZ 43 . The current study identified several DElncRNAs, among which only the association between SOX2-OT and SCZ was identified in previous studies. SOX2-OT is an evolutionarily conserved lncRNA. SOX2 gene, an www.nature.com/scientificreports/ essential embryonic stem cell pluripotency regulator, is embedded in intronic region of SOX2-OT. The SOX2-OT, as an important ceRNA, has been identified to influence the progression of multiple cancers. According to genome-wide association studies (GWAS), there are associations between mental disorders (e.g., general cognitive disorders, SCZ, eating disorders, insomnia, anorexia nervosa, and night sleep phenotypes) and SOX-OT-mapped single nucleotide polymorphisms (SNPs). Mental conditions account for over half of SOX2-OTrelated disorders 44 . FLG-AS1 and AQP4-AS1 are two other DElncRNAs that were identified by our analysis. Dysregulation of FLG-AS1 was reported in some cancers, but the specific function and detailed mechanisms of FLG-AS1 are still unknown 45,46 . A previous integrative analysis showed that FLG-AS1 acts as a ceRNA in adipose tissue from obese individuals with type 2 diabetes 47 . Earlier studies have found that structural variant within FLG-AS1 may play an important role in attention-deficit hyperactivity disorder (ADHD) development 48 . The AQP4-AS1 gene transcribes an antisense lncRNA with an unknown function 49 . AQP4-AS1 was identified as a ceRNA in gastric cancer via bioinformatics analysis 50 . Moreover, a previous study reported that AQP4-AS1 is associated with depression as a mental disorder 51 . To the best of our knowledge, the association between SNTG2-AS1, CASC2, and LINC00960 lncRNAs and mental disorders has not been studied thus far. SNTG2-AS1 is an antisense lncRNA, and its function is still unknown 52 . Since the host transcript could be regulated by the same number of antisense transcripts 53 , the contributions of this lncRNA may be realized by the nearby syntrophin gamma 2 (SNTG2) gene. SNTG2 encoded protein is a member of the syntrophin family. Syntrophins are crucial scaffolding proteins, due to binding to the dystrobrevin and dystrophin 54 . The interrupted 2p25.3 duplication encompassing SNTG2 was identified in two SCZ patients 55 . The biological functions of LINC00960, as a newly discovered lncRNA, in human diseases remain to be elucidated. It was indicated that LINC00960 acts as a ceRNA in diabetic nephropathy and pancreatic ductal adenocarcinoma 56,57 . As a lncRNA, CASC2 suppresses tumors in various tissues and influences multiple signaling pathways and genes. It plays a role as a ceRNA for some miRNAs and influences the activity of their targets 58 . MiRNAs bind to the untranslated region of the target gene, consequently controlling the target gene expression. It has been identified that miRNAs can affect signal transduction and biological pathways within the cell and induce SCZ progression 59 . The current study revealed that the key DElncRNAs could sponge four important miRNAs (hsa-miR-34a-5p, hsa-miR-7-5p, hsa-miR-335-5p, and hsa-miR- 24-3p) that are associated with SCZ, resulting in the regulation of key DEmRNAs. In line with our findings, increased expressions of hsa-miR-34a-5p and hsa-miR-7-5p in the periphery of SCZ patients [60][61][62] and decreased expressions of hsa-miR-24-3p in the prefrontal cortex of affected individuals 63 have been reported previously. Besides, previous studies of the miRNA-derived network analysis reported the fine-tuning of the genes involved in the SCZ biological pathway by hsa-miR-335-5p. Hence, they supported the ceRNA emerging theory 64 . The findings of the present work are supported by most of such works; however, molecular methods (e.g., PCR, co-immunoprecipitation assays, and luciferase reporter systems) are yet to be employed to validate the predicted ceRNA loops.
Four key DEmRNAs (FOSL1, SLC7A5, FMN2, and NOS3) were reported in this study. FOS, FOSB, FOSL1, and FOSL2 are the FOS family members. This family encodes the leucine zipper proteins, which have the capability of dimerization with JUN, JUND, and JUNB (the JUN family members) in order to form the AP-1 transcription factor. Hence, the FOS family regulates the proliferation, transformation, differentiation, and apoptotic death of cells 65 . Based on upstream transcription factor analysis and high-throughput gene studies, FOSL1 implicates in SCZ through various mechanisms, such as influencing accessible chromatin regions and connecting to neuroinflammation signaling cascades [66][67][68] . SLC7A5, also known as LAT1, encodes a large amino acid transporter located in the blood-brain barrier. It is essential in the maintenance of brain branched-chain amino acids' normal levels 69 . It was revealed that SCZ patients have an aberrant amino acid transport activities, such as the aberrant tyrosine transport through the plasma membrane 70 and excitatory amino acid transport 71 . Moreover, genetic and functional studies of the SLC7A5 gene have suggested an association between SLC7A5 SNP rs9936204 genotype with SCZ vulnerability in humans, and also, the SLC7A5 isoform was identified to be a significant transporter of tyrosine in SCZ patients 72 . The formin family is a significant effector class for not only microtubule regulation but also actin regulation. This family includes proteins containing the formin homology 1 (FH1) and formin homology 2 (FH2) domains. FMN2 is a member of the formin family and is highly expressed in several regions of the developing and adult brain as well as the spinal cord. The major characteristic of FMN2 is its contribution to actin dynamics regulation. It is an interesting regulator of actin in Wnt canonical pathway homeostatic regulation within neural progenitors. FMN2 not only regulates actin dynamics and bundling but also associates with axon growth cone maintenance and pathfinding. Previous findings have associated high FMN2 expression maintenance within differentiated neurons with the maintenance and plasticity of the synapse 73 . It has been reported that formin genes are involved in a number of neural disorders (e.g., amyotrophic lateral sclerosis 74 and SCZ 75,76 ). The FMN2 SNPs rs6050455 and rs6656902 were known to be associated with SCZ 77,78 . NOS3 belongs to the nitric oxide synthase (NOS) enzyme family and participates in the nitric oxide (NO) generation. Research has shown that NO contributes to SCZ pathogenesis. There is solid evidence of the effects of the NO metabolism on SCZ processes, including nerve cell migration, synapse formation and maintenance, N-methyl-D-aspartic acid receptor-mediated neurotransmission, cognitive abilities, membrane pathology, and hippocampal neurogenesis 79 . Moreover, there is evidence that the SNPs of NOS3 are associated with SCZ 80,81 .
It is noteworthy that several technical factors, such as different methodologies, patient characteristics, preparation of samples, analysis of data, and platforms, could affect the gene expression profiles. Of course, confirmative experimental works and comparisons to reanalysis modified microarray gene expression are required to validate our findings.