According to GLOBOCAN 2018, pancreatic cancer (PC) is the 4th leading cause of cancer-related death and is associated with high mortality and poor prognosis1. PC is a malignant condition with little/no warning signs before the disease reaches its ultimate stages2. The majority of patients with PC are already reached to either locally advanced or metastatic level in the asymptomatic phase before referring to the clinic and as many as 80% are categorized in unresectable group3. The average survival rate for the PC is reported to be less than one year. Some studies reported that the 5-year survival rate would increase notably if PC patients were diagnosed at initial stages and subjected to surgical resection followed by chemotherapy4,5. Currently early detection of PC is very difficult because most patients are found with non-specific symptoms leading to postponing the correct diagnosis6. On the other hand, sometimes pancreatic mass is indistinguishable from chronic pancreatitis or benign pancreatic cysts, so the results of pathological assessment of biopsies obtained from the lesion could be not informative7. Moreover, cytological analysis of the sample taken by endoscopic ultrasound-guided fine needle aspiration (EUS-FNA) may be non-precise because of sampling complications, inflammation coexistence or other conditions8. Therefore, finding possible non-invasive biomarkers at early stages of PC progression is crucial for evaluation of high-risk subjects to establish follow-up strategies and surgical resection of primary malignancy. In this regard, numerous scientists tend to identify biomarkers that could help gastroenterologists and pathologists in PC detection, and finding potential biomarkers with the possibility to be accessible in a less invasive method have become a research trend. The ultimate biomarkers must be easily detectable with fine sensitivity and specificity and also must discriminate PC from other benign pancreatic diseases9. Blood is a simply reachable and rather steady sample to find alerting biomarkers. Technological advances in the recent years have provided possibilities to detect circulating biomarkers based on “omics” research, relying on proteins, cell-free DNAs, non-coding RNAs, circulating tumor cells (CTCs), and exosomes molecular contents10. MicroRNAs (miRs) are small (~22 nucleotides) non-coding RNAs that have gene regulatory roles via targeting the 3′-untranslated region (3′-UTR) of their target mRNA and finally cause either translational repression or mRNA degradation11,12. Up to now, a number of biomarkers have been introduced as PC biomarkers such CEA, CA19-9, CA125 and CA72-4. Nonetheless, none of these tumor markers has shown efficient sensitivity or specificity for diagnosing PC at primary stages and have been used for post resection monitoring rather than earlier detection purposes. MiRNAs seem to be truly stable in blood and several authors reported that miRNAs show dysregulation in pancreatic diseases being able to differentiate PC from pancreatitis, pancreatic benign masses as well as normal subjects13. Furthermore, owing to advanced technologies in high-throughput molecular methods the understanding of the pathophysiology of pancreatic cancer have been improved. Various genome-wide mRNA and miRNA expression profiling studies using microarray-based and NGS approaches have provided important insights into the phenotypic characteristics of pancreatic cancer14. In this study, using multiple bioinformatics tools, we integrated various serum expression profiles of miRNAs to find the most significant potential miRNA signatures helpful in the diagnosis of PC and constructed a novel miRNA- mRNA regulatory network in PC using bioinformatics approaches. Next, we investigated the molecular mechanisms downstream of the captured miRNA signatures and their predicted target genes correlated to PC progression and analyzed them in a logistic model.

Material and method

Microarray datasets search

In order to find proper miRNA expression profiles in microarray datasets, we conducted a systematic search in Gene Expression Omnibus (GEO) database ( Using the keywords “Pancreatic cancer” and “Serum”, at the first step we reached 900 datasets. Then, we limited the results using ‘Homo sapiens’ and ‘Non-coding RNA profiling by array’ filters, so we reached to 16 datasets. Finally, by setting the sample count on more than 200 samples, 4 final datasets were achieved.

Differentially expressed miRNAs (DEMs) detection

The DEMs were obtained using the online tool GEO2R in the GEO database15, which makes evaluations using the GEOquery and limma R packages from the Bioconductor project to compare two or more groups of samples in a GEO dataset. Normalization has been carried out using the RMA algorithm. To keep away from the false-positive/negative and the differences between microarray platforms, the microarray gene-expression profiles of PC groups was compared to the normal groups of each dataset separately. |log2 (RMA signal intensity fold change) | ≥1 and p value > 0.01 was set as cut off to identify the significant DEMs. After that, significantly expressed DEMs in each study were listed respectively.

Combination of the data

A Venn diagram creator tool in Bioinformatics & Evolutionary Genomics source was used to combine all datasets and find the overlapping DEMs ( In order to cover more miRNAs with differential expression and to prevent missing of critical genes that may not have shown differences in expression in one study for any reason, it was decided to select the DEMs overlapped between at least 3 of datasets.

Area under curve (AUC) analysis

The expression values of all overlapping DEMs were extracted and imported to GraphPad Prism software. After normalization of the values, Receiver-operating characteristic (ROC) curves and area under the ROC curve (AUC) were used to assess the detection ability of each miRNA in discriminating PC patients from the control group based on the sensitivity and specificity of each DEM.

Hierarchical clustering analysis

Individual expression values of significantly up/down regulated DEMs in PC (27 DEMs) were logarithm transformed and were used as input values for the hierarchical clustering algorithm. The following criteria was applied: The distance chose “Pearson Correlation”, and the linkage selected “average”. The result is demonstrated as a Heatmap.

Expression correlation analysis

The expression values of the DEMs were clustered using the k-means method. The median of expression values of all miRNAs for each sample was used as the representative expression for the cluster. Using R programing software16 and the corrplot package17, Pearson’s correlation for 27 DEMs expression profiles were analyzed and demonstrated as a Corrplot.

MiRNA-miRNA interaction network

MiRNet is an online tool suite designed for precise analysis and functional interpretation of miRNAs and xeno-miRNAs18. This tool holds numerous high-quality science-base to link miRNAs to their targets and other correlated molecules. Network analysis was carried out using functional annotations based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, and by the use of the hypergeometric algorithm for enrichment analysis of loaded data, a table of validated miRNA target genes was achieved using MiRNet.

MiRNA-mRNA interaction

MirDIP, a miRNA target prediction online tool, supplies almost 152 million human miRNA–target predictions, which were gathered from 30 different resources. It also provides an integrative score, which was statistically concluded from the acquired predictions, and was assigned to each unique miRNA–target interaction to give a unified measure of confidence19. Using this tool we reached to predicted target genes lists of each considered miRNA.

Protein-protein interaction (PPI) network and functional enrichment analysis

Interactions between the target genes of selected miRNAs were predicted using the STRING database v 9.0.5. Confident interaction score was set on ≥0.7. The PPI networks were uploaded and visualized to Cytoscape software20 and top modules of PPI network were picked using Molecular Complex Detection (MCODE)21 with the inclusion criteria as follow: degree =2, node score =0.2, k-core=2, max. The average degrees of MCODE score and nodes in modules were chosen as the threshold, thus we set MCODE scores ≥9 and hub nodes ≥9 as criteria. Functional enrichment analysis was performed using DAVID for all targets and top modules respectively.

Multivariate regression analysis

The expression of each miRNA was in continuous format and described as mean and standard deviation (SD). Candidate miRNA expressions were examined on the basis of their univariable association with PC. The full model with the covariate effects was built according to five clusters obtained from bioinformatics analysis, using the stepwise inclusion method. The estimation of the coefficients in each regression model was checked by multicollinearity by analyzing variance inflation factor (VIF). A variable whose VIF values were more than 10 may need further investigation. The significance of covariates in each full model was further tested by the backward elimination method. The akaike information criterion (AIC) and bayesian information criterion (BIC) were computed each time a variable was included/excluded and the model with lower AIC or BIC was preferred. We also determined the performances of each model by examining measures of calibration and discrimination. Calibration points out how nearly the predicted probability of having PC agrees with the observed PC status. This was assessed by Hosmer-Lemeshow test. Discrimination expresses the ability of the clinical decision rule to differentiate between individuals with and without PC. This was assessed by calculating the area under the ROC curve (AUC) statistic. We considered an AUC value of 0.5 as no discrimination, and 1 as perfect discrimination. All analyses were performed by using Stata software (version 14). Results were statistically significant by p < 0.05 levels.


As a validation set, an independent cohort composed of serum miRNA expression profiles from patients with pancreatic cancer and healthy controls was provided. The subjects were chosen from two other GEO datasets (GSE112264 & GSE124158) consisting of serum miRNA profiles from PC patients and healthy controls (70 controls and 81 PC). The diagnostic performances of the considered models were checked through determining the combined AUCs for each considered panel. For this aim, the performances of each miRNA individually and also together as panels were analyzed and AUC scores and ROC curves for each model were obtained.


Microarray datasets search results

The flow chart of the datasets selection procedure and the features of the datasets are shown in Fig. 1. Some of the datasets contain a large set of expression profiles of various cancers, including PC. For these studies, a large number of healthy controls were included as normal controls, and we selected control samples based on the number of PC samples. GSE106817 consists of more than 2000 normal serum samples and 115 serum samples from PC patients (Yokoi et al., 2018). GSE113486 includes 200 normal and 40 PC serum samples (Usuba et al., 2019). GSE59856 includes 300 healthy controls and 100 serum samples from PC patients (Kojima et al., 2015). The fourth dataset, GSE85589 consisted 29 healthy subjects and 88 PC patients’ sera. It should be added that just the two later datasets included some available demographic features such as cancer stage, age, gender and CA19-9 levels of the patients, so it was not possible to evaluate the associations of the final DEMs to these kinds of features. Two other datasets (GSE112264 & GSE124158) consisting of serum miRNA profiles from PC patients and healthy controls (70 controls and 81 PC) were also considered as the validation set.

Figure 1
figure 1

The selection procedure and information of the datasets. Note: Due to the increased number of datasets in the GEO database from the date of this investigation (February, 2019), different results may be obtained by applying the same criteria.

Differentially expressed miRNAs (DEMs) and overlapped DEMs among 4 groups

A total of 1346, 1471, 127 and 93 miRNAs showed significant up/down regulation in GSE106817, GSE113486, GSE85589 and GSE59856 microarray datasets, respectively. After integration of the results, 105 miRNAs that were common in at least 3 of datasets were captured (Table 1). 5 miRNAs (hsa-let-7b-5p, hsa-miR-4721, hsa-miR-122-5p, hsa-miR-1290 and hsa-miR-125a-3p) were common between all four analyzed datasets. Figure 2 represents the number of miRNAs which are shared between the datasets.

Table 1 The overlapped DEMs among 4 different datasets.
Figure 2
figure 2

A VENN diagram representing the number of DEMs that are in common between different analyzed datasets.

AUC analysis

After extraction of the expression values for all 105 considered DEMs, in order to find the most reliable ones in discriminating PC from healthy controls, AUC analysis was performed and the ROC curves for all 105 DEMs were prepared. Finally, 27 DEMs showed AUC score >80% (Table 2) and the ROC curves of them is represented in Fig. 3. The list of those DEMs with AUC >80% are also listed in Table 3. The most promising miRNA, miR-1469, was able to distinguish the two groups with 91% specificity and 100% sensitivity. ROC curve analysis showed that the AUC value for this miRNA was 0.98 (95% CI 0.95–1.06) and this value for miR-4530 was 0.93 (95% CI 0.91–0.97). Based on these results, miR-1469 and miR-4530 may be the strongest individual signatures for differentiating PC patients from healthy controls.

Table 2 The results of AUC analysis for considered DEMs.
Figure 3
figure 3

The ROC curves for DEMs with AUC > 80%.

Table 3 GO Enrichment of Target DEGs in PC.

Co-expression correlations

Figure 4A demonstrates a hierarchically clustered Heatmap built up using the expression values of all 27 captured DEMs. Six miRNAs showed down-regulation and 21 miRNA showed up-regulation in PC patients compared with healthy subjects. Pearson’s correlation coefficients were determined among all the 27 miRNA captured signatures (Fig. 4B). Altogether, 6 miRNAs including miR-5100, miR-8073, miR-642b-3p, miR-1246, miR-1469 and miR-663a showed the strongest positive correlations. The highest positive correlation coefficient was found between miR-5100 and miR-8073 (Pearson’s correlation = 0.893, p < 0.001), followed by miRNA-642b-3p with miR-663a and miR-1469 (Pearson’s correlations = 0.864, p < 0.001 and 0.851, p < 0.001).

Figure 4
figure 4

(A) The hierarchically clustered Heatmap built up using the expression values of all 27 captured DEMs. (B) Pearson’s correlation plot for 27 DEMs expression profiles. Pearson’s correlations calculated for all 27 DEMs values are demonstrated as circles whose sizes are representative of the certain correlation value, with colors ranging from dark red (coefficient -1), to dark blue (coefficient 1), as described in the color scale.

MiRNA-miRNA interaction network

Using MiRNet online software, the interactions between the 27 DEMs were analyzed based on their target genes and downstream molecular pathways. After uploading the miRNA IDs and setting the cut off degree on 2, a networks based on the following parameters was acquired (Fig. 5A): number of queries: 27, number of nodes: 2931 (miRNAs: 29, Targets: 2826) and number of edges:18640 (Fig. 5A). A module consisting of PC related genes and their connected miRNAs was extracted from this network and the result is represented in Fig. 5B.

Figure 5
figure 5

(A) The miRNA-mRNA interactions among all 27 considered DEMs. The red circles represent all predicted target genes and the green ones represent the genes implicated in cancer related pathways. (B) Well-known target genes for pancreatic cancer in associations to the considered DEMs.

Target genes of miRNAs and functional enrichment analysis

Overall 1147 target genes for all 27 DEMs were predicted. The full list of those DEMs in addition to their predicted target genes are available in Supplementary File 1. Through DAVID online tool, GO and KEGG pathway enrichment of the identified target genes were performed. GO biological process (BP) analysis showed that the target genes were mostly involved in response to drug, regulation of translation and cellular membrane organization (Table 3). KEGG analysis showed that the target genes were mainly implicated in cancer-related pathways including chronic myeloid leukemia, glioma, prostate cancer and also pancreatic cancer (Table 4).

Table 4 KEGG pathway enrichment of Target DEGs in PC.

PPI network construction and top module selection

The PPI network was composed of 506 nodes and 10805 edges. Using the plug-in MCODE in Cytoscape software, three significant modules were selected with MCODE score ≥10 (Fig. 6). Functional enrichment analysis indicated that the genes of these modules were significantly enriched in ubiquitination, gene expression regulation, and spliceosome complexes.

Figure 6
figure 6

Three top modules (clusters) extracted from PPI network interactions between all of the target genes of considered miRNAs. Blue circles and the green diamonds represent the predicted target genes and the associated miRNAs respectively.

Top clustered DEMs and modules

Five clusters of most related miRNAs amongst all of the 27 considered DEMs were defined based on the different aspects of their trends in the expression and functional properties. The list of all clusters components is available in Supplementary File 2.

Multivariate regression models

The results of univariable logistic regression showed the significant role of each miRNA in the distinction of PC (Supplementary File 2, Crude model). Univariable logistic regression analysis in each cluster showed that all selected miRNAs significantly increase the risk of PC and these associations were statistically highly significant (P < 0.0001; Supplementary File 2, Crude Model).

The multivariable logistic regression of miRNA expression was based on five cultures that resulted by clustering method in bioinformatics analysis. In each cluster when all predictors were included in the model, the association of them with PC changed. In cluster 1, while the strength of association among miR-125a-3p, miR-92a-2-5p and miR-4530 and PC decreased, but remained statistically significant (P = 0.031, P < 0.0001 and P < 0.0001, respectively; Supplementary Table 1, Model A). However, the association between miR-125b-1-3p, miR-6893-5p and miR-4476 and PC were statistically non-significant and omitted in further analysis. In cluster 2, the associations between miR-8073 and miR-663a and PC were decreased but remained statistically significant (P = 0.005 and P = 0.032, respectively; Supplementary Table 1, Model A). Although the interaction between miR-5100 and miR-642b-3p were found to be non-significant (P = 0.082 and P = 0.061, respectively; Supplementary Table 1, Model A), but they were forced into the final diagnostic model because of their trend relevance. However, the association between miR-1246 and PC was statistically non-significant and omitted from further analysis. In cluster 3, while the strength of association among PC and miR-125a-3p (P = 0.008; Supplementary Table 2, Model A) was increased and associations were decreased for miR-5100 and miR-642b-3p (P = 0.003 and P = 0.005, respectively; Supplementary Table 1, Model A), but remained statistically significant. However, the associations between miR-606, miR-4668 and miR-3910 and PC were statistically non-significant and omitted for further analysis. In cluster 4, the strength of association between PC and miR-4668, miR-663a and miR-125a-3p were increased (P = 0.078, P = 0.058 and P = 0.080 respectively; Supplementary Table 2, Model A), but remained statistically non-significant. Yet, they were forced into the final diagnostic model because of their trend relevance. However, associations between PC and other miRNAs of the cluster were statistically non-significant and omitted frm further analysis. In cluster 5, although the strength of association between miR-8073, miR-92a-25p and miR-5100 and PC was decreased (P = 0.004, P < 0.0001 and P = 0.006 respectively; Supplementary 2. Table 2, Model A), it remained statistically significant. However, association between PC and miR-1246 was statistically non-significant and omitted from further analysis.

The results of univariable and multivariable logistic regression along with the AIC and BIC values corresponding to the inclusion/exclusion of each predictor (Supplementary Table 2) were used to select predictors of the full diagnostic (logistic) model.

The optimum model in each cluster was selected by both methods corresponded to the model consisting of different predictors (Table 5). The final prediction models in different clusters leading to better diagnosis of PC are presented in Table 3. The Hosmer-Lemeshow statistic suggested that fit of model was adequate for each cluster dataset (Table 5). In this regard, the cluster 3 was the most fitted model that the ROC analysis on the predicted probabilities of PC derived from this model yielded an AUC of 0.95, Sensitivity 0.98 and Specificity 0.97, which segregated PC patients from controls. Our suggested model, based on miRNA expression indices provides a molecular screening strategy, suitable for application prior to subsequent invasive methods of risk monitoring, such as surgery. However, it should be noted that this paper is basically bioinformatics research. Without any new sample being prospectively collected and examined to prove the validity of the proposed signatures, the suggested models will not be appropriate for clinical use.

Table 5 Full diagnostic (logistic) model for pancreatic cancer, including the intercept.


The performance evaluations of the resulted diagnostic models showed that model 2 demonstrates the best performance in discriminating PC patients from healthy controls (AUC: 0.978; sensitivity: 0.986; specificity: 0.875) (Fig. 7 & Table 6). The 4 other models also showed AUC >80% which are statistically significant values (Table 6 & Supplementary File 3).

Figure 7
figure 7

Performance of the model 2 on the validation cohort. The combinatorial multivariable ROC curve of the model is demonstrated. Colored lines represent the ROCs of each marker individually. Black line represents the combinatorial ROC for all of the DEMs as model 2.

Table 6 The performance of the considered models on the validation set.


In order to early diagnosis of patients with PC, there is a comprehensive need to find biomarkers with high specificity and sensitivity. By using bioinformatics methods, this study describes 5 new miRNA panels for PC diagnosis using a combination of 27 miRNAs in serum of PC patients. To find the miRNA candidates, 4 GEO microarray datasets containing the miRNA expression profiles from the serum of PC and healthy controls were chosen for differential expression analysis. DEMs from all of the datasets were extracted and at the result, 105 miRNAs that were common between 4 or 3 categories were captured out. In the next step, the expression values for all these DEMs were extracted and normalized. Using ROC curve analysis, the most powerful DEMs in discriminating PC patients from healthy controls were picked out. Overall, 27 DEMs showed AUC >80% and were considered as suitable candidates for further analysis. Although recent findings have introduced circulating miRNAs as diagnostic cancer markers, none have led to the utilization of these markers for clinical plans due to the insufficiency of individual miRNA biomarkers in clinical testing22,23. An increasing interest to combine biomarkers into unique panels tackles the problem of tumor heterogeneity and low specificity and sensitivity of single miRNAs to diagnose certain cancers. For this reason, multiple mathematical models have been employed to weigh the efficiency of combinations of miRNAs as cancer diagnostic biomarkers. These methods comprise threshold-based methods, logistic regression, decision trees and support vector machine24. In this study, we used logistic regression method to find the most promising miRNA combinations as diagnostic models for PC. Figure 8 demonstrates all steps of finding the hub DEMs and clustering processes.

Figure 8
figure 8

The summary of the whole steps of the hub DEMs selection and clustering of them.

To find the hub genes, multiple bioinformatics and statistical analyses were performed to specify the fittest miRNA combination among all 27 DEMs, as a diagnostic model. Five clusters of miRNAs were introduced based on different approaches in grouping genes such as clustering based on co-expression (cluster 1-2), correlations of the expressions (cluster 3), association of the target genes to PC (cluster 4) and functional enrichment analysis of the target genes (cluster 5). The first 2 clusters were determined based on the results of hierarchically clustered genes, where based on the co-expression of the DEMs, 2 clusters could be separated. The 3th cluster is consisted of the genes that have the highest positive or negative correlation (>0.8) of the expression to each other. Based on the functional enrichment analyses that were performed on the predicted target genes of considered DEMs, two more clusters were specified. One of them included the DEMs that target well-known genes contributed to PC (cluster 4) and the second one consisted the DEMs that are associated with the top modules extracted from the PPI network of whole target genes of 27 DEMs (cluster 5). Finally, in order to find the most promising models amongst the determined clusters, multivariate cox regression analysis was performed on all clusters and the fittest models as panels of miRNAs were identified to discriminate PC patients from the healthy subjects. Overall, 2 models as the co-expression models, 1 model as the correlation model and 2 models as the gene functional models were introduced.

As can be seen in Fig. 4, two groups of co-expressed DEMs are prominent. The first cluster is down-regulated and the second one is up-regulated and both of them consist the DEMs with the most similarity in expression to each other. The downregulated co-expressed cluster consists of miR-125b-1-3p, miR-125a-3p, miR-92a-5p, miR-4530, miR-6893-5p and miR-4476. While all of these DEMs may be strong discriminators of PC patients from healthy controls solitarily, but the fittest model extracted from this cluster included miR-125a-3p, miR-92a-5p and miR-4530. These miRNAs have shown differential expression in a variety of cancers especially gastro Intestinal cancers in multiple studies and are known as tumor suppressor miRNAs25,26,27,28. The other DEMs of this cluster also have documented traces in association with GI cancers. For example, Yamada A et al. have introduced liquid biopsy markers for early detection of colorectal cancer and, in a cohort of 237 patients, circulating levels miR-125b independently showed differentiated expression in colorectal neoplasms in comparison to healthy controls. However, they showed that this miRNA in combination with some other miRNAs as a panel has improved the accuracy of detection29. The second cluster included a set of up-regulated miRNAs such as miR-1469, miR-1246, miR-5100, miR-8073, miR-642b-3p and miR-663a that showed significant co-expression in PC patients. MiR-1469 was the most powerful individual marker in the diagnosis of PC in the results of our analyses. Multiple studies have reported the effect and differential expression of this miRNA in various cancers such as lung, gastric, rectal and also pancreatic cancer30,31,32,33. Similarly, miR-1246 has been introduced as a significant serum/plasma marker in the diagnosis of a variety of cancers including esophageal squamous cell carcinoma, lung, prostate, colorectal and eventually pancreatic cancer34,35,36,37,38. MiR-5100 has been associated with risk of PC as 2 studies have reported differential expression of this miRNA in PC patients’ saliva and cell lines39,40. Despite that miR-8073 has been confirmed as a tumor suppressor miRNA in some cancers such as breast, ovarian and colorectal41,42,43, in this study we found significant up-regulation of this miRNA in the serum of PC patients. In line with the results of our study, mir-642b-3p have records of significant overexpression in the serum of PC patients in some recent studies44,45,46. For miR-663a, some studies have reported significant down-regulation in a variety of cancers such as colorectal and non-small cell lung as well as PC33,47,48 that may show a tumor suppressive effect of this gene, but in this study we detected significant up-regulation of this miRNA through the analyzed datasets. Stablished model from cluster 2, consisted miR-5100, miR-8073, miR-642b-3p and miR-663a. Although miR-1469 and miR-1246 individually showed high power to differentiate PC patients from healthy controls, they did not fit into the logistic model along with other miRNAs. The correlation model was also the top model introduced from the third cluster including the DEMs with the strongest negative or positive correlation of expression. MiR-125a-3p, miR-606, miR-4668, miR-3910, miR-5100, miR-642b-3p and miR-532 formed this cluster and most of them have been reported as feasible diagnostic biomarkers for a variety cancers (Bibi et al., 2016b; Cheng, Wang, Han, & sciences, 2017; Song, Wang, Jin, Wang, & Duan, 2015), but here we introduced them as a unified model for this aim. The acquired logistic model from this cluster includes miR-125a-3p, miR-5100, miR-642b-3p that were discussed earlier.

In order to find miRNAs that are the most related to each other and can be used as a diagnostic panel, we performed multiple functional enrichment analyses so we could categorize the considered miRNAs based on the functions of their target genes. First, using an online miRNA target gene prediction software, all feasible target genes were predicted. Afterwards, the functional status of all target gene were analyzed in regard of GO and KEGG functional enrichment. GO analysis showed that a significant number of all target genes have role in the processes such as cellular membrane organization, cell-cell junction organization and. These kinds of cellular and molecular functions are clearly understood as critical procedures in tumorigenesis and metastasis49,50. On the other hand, the results of KEGG pathway analysis demonstrated that most of the target genes of considered miRNAs are implicated in various cancer-related pathways including chronic myeloid leukemia, glioma, prostate cancer and also PC. Figure 5A represents all miRNA-target genes interactions while the green circles represents the target genes implicated in cancer related pathways. Figure 5B is the extracted module consists of known genes correlated to PC as well as their contributed miRNAs. MiRNAs of this module were considered as a cluster and miR-4668, miR-663a, miR-3128, miR-125a-3p, miR-3910, miR-3152, miR-606, miR-3927 and miR-6893-5p were the compartments of it. After performing multivariate logistic regression analysis miR-4668, miR-663a and miR-125a-3p were unified as the fifth model. Amongst the 3 compartments of this model, miR-4668 has been shown to have over-expression in serum and tissue samples from in hepatocellular carcinoma and gastric cancer51,52. As it is shown in Fig. 5B, a group of considered miRNAs in this study may target some known genes associated with pancreatic cancer. For example, Schutte et al. detected improper hypermethylation of the p16/CDKN1A gene in a group of PC patients53. In this study, we found significant overexpression of miR-4668 and miR-663a that both are strongly predicted to target the CDKN1A gene. Another gene of this module is RHOB, that is a known tumor suppressor gene in various cancers54,55, nevertheless, it is not much known in PC. The revival of suppressed RHOB leads to tumor regression in different types of cancers56 and may be used as a critical target in cancer therapy57. Yonggang Tan et al. demonstrated significant down-regulation of RHOB in human PC and showed that this gene suppresses the progression of PC by inhibiting proliferation, migration, and invasion, as well as by inducing apoptosis58. In the present study, we showed that miR-92a-5p and miR-663a are predicted to target this gene and may have associations to PC progression.

In the aspect of PPI network construction through the target genes of considered DEMs and analysis of the top modules, we identified 3 top modules that had the strongest PPIs through all the target genes. Later, we found that a set of miRNAs are connected to all these top 3 modules, as it is shown in Fig. 8 (miR-8073, miR-92a-2-5p, miR-5100, miR-1246, miR-1469 and miR-642b-3p), so we assumed them as a cluster and the fifth model consisting of miR-8073, miR-92a-5p and miR-5100 were extracted from this cluster. The functional enrichment analyses showed that the most component of this three modules are implicated in ubiquitination, gene expression regulation and spliceosome complexes.

In conclusion, this study supports the accuracy of some formerly proposed biomarkers for PC and also has suggested new candidate miRNAs which can be used as diagnostic or prognostic means or as therapeutic targets. We introduced 5 diagnostic models consisting of different combinations of miRNAs, based on their significant expression algorithms and functional properties. The aim of this study was to identify appropriate miRNA biomarkers in serum samples that could differentiate PC from healthy individuals. For this matter, we have to first test the sides of the coin (ie, PC vs healthy controls) because if panels of microRNA could not discriminate these two extremes, it would not be possible to develop a diagnostic microRNA for early detection of primary tumors or early stages of the disease. However, it should be considered that none of these models have been tested in experimental studies up to now and they need to be validated in such investigations. Even though this bioinformatics study presented some additional biomarkers or panels for possible consideration in future research, the analyses in these datasets do not support the immediate clinical use of these biomarkers without more rigorous testing in large case-control and cohort studies. Besides, in order to reach to compatible results, researchers should avoid contaminations. As miRNAs can be found in the serum in different forms such as free, associated to HDL or enclosed in exosomes or micro vesicles, researchers should be careful in the isolation step of miRNAs from the desired samples.