Identification of hub genes for adult patients with sepsis via RNA sequencing

To screen out potential prognostic hub genes for adult patients with sepsis via RNA sequencing and construction of a microRNA–mRNA–PPI network and investigate the localization of these hub genes in peripheral blood monocytes. The peripheral blood of 33 subjects was subjected to microRNA and mRNA sequencing using high-throughput sequencing, and differentially expressed genes (DEGs) and differentially expressed microRNAs (DEMs) were identified by bioinformatics. Single-cell transcriptome sequencing (10 × Genomics) was further conducted. Among the samples from 23 adult septic patients and 10 healthy individuals, 20,391 genes and 1633 microRNAs were detected by RNA sequencing. In total, 1114 preliminary DEGs and 76 DEMs were obtained using DESeq2, and 454 DEGs were ultimately distinguished. A microRNA–mRNA–PPI network was constructed based on the DEGs and the top 20 DEMs, which included 10 upregulated and 10 downregulated microRNAs. Furthermore, the hub genes TLR5, FCGR1A, ELANE, GNLY, IL2RB and TGFBR3, which may be associated with the prognosis of sepsis, and their negatively correlated microRNAs, were analysed. The genes TLR5, FCGR1A and ELANE were mainly expressed in macrophages, and the genes GNLY, IL2RB and TGFBR3 were expressed specifically in T cells and natural killer cells. Parallel analysis of mRNAs and microRNAs in patients with sepsis was demonstrated to be feasible using RNA-seq. Potential hub genes and microRNAs that may be related to sepsis prognosis were identified, providing new prospects for sepsis treatment. However, further experiments are needed.

Sepsis is a complex syndrome involving host response malfunction and life-threatening organ damage caused by infection 1 . It is a common but severe emergency affecting approximately 19 million patients worldwide each year 2 . The incidence rate of sepsis in the ICUs of hospitals in China is approximately 20.6%, and the mortality rate of sepsis remains high despite the timely use of antibiotics and other beneficial adjutant therapies. The fatality rate of patients with severe sepsis can reach 50% or higher 3 . The pathogenesis of sepsis is complex and has long been the focus of medical research. The Multiple Organ Dysfunction (MOD) score and Sequential Organ Failure Assessment (SOFA) score are widely used for evaluation of organ damage and prognosis in patients with sepsis, but research has shown that do not adequately predict death or survival in individuals with sepsis 4 . Rapid diagnosis is an important means to improve the survival rate of sepsis and reduce sepsis-related organ dysfunction. At present, the diagnosis of sepsis mainly depends on recognition of clinical infection symptoms and blood cultures, for which there is a lack of rapid and sensitive biomarkers 5 . Currently, studies on biomarkers of sepsis are increasing. The level of procalcitonin (PCT) in patients with sepsis or septic shock at admission is considered to be a better prognostic indicator than other inflammatory markers 2 . Nevertheless, the critical value of PCT for determining death or survival in patients with sepsis cannot be determined 6 , and PCT cannot be used for sepsis prognosis prediction. C-reactive protein and white blood cells also exhibit poor specificity and sensitivity for the diagnosis of bacterial infection 7 . In particular, under the standard of Sepsis 3.0 standards, further study of the pathogenesis of sepsis and discovery of new biomarkers related to sepsis are urgently needed to provide a theoretical basis for clinical diagnosis and treatment.
High-throughput sequencing methods such as RNA sequencing (RNA-seq) have gained increasing attention with advances in science and technology and are now widely used in the transcriptomics field for applications such as including gene expression profiling, novel transcript discovery, and sequence variation detection 8  www.nature.com/scientificreports/ Differential gene and miRNA expression analysis. The data were divided into the sepsis group and the healthy control group. Bioinformatics analysis was performed according to the specific workflow of the online platform iDEP0.9 18 . The raw data were normalized with edgeR (V4.0) 19 , the expression values were converted by the log2(CPM + 4) values, genes with low-quality values were removed, and the preliminary DEGs and differentially expressed miRNAs (DEMs) were analysed using DESeq2 20 iDEP0.9 with thresholds of a false discovery rate (FDR) < 0.05 and a log2 (fold change) (log2FC) value > 2. Afterwards, the top 20 miRNAs in ascending order of by FDR were selected as DEMs in sepsis. To further explore the potential miRNA regulators of core target genes, samples from the same patients were subjected to RNA-seq analysis to facilitate calculation of the relationships between miRNAs and target genes. For a regulatory relationship to be accepted, at least two conditions had to be met. First, complementary base pairing had to occur between the miRNA and mRNA. miRWalk3.0 21 (http:// mirwa lk. umm. unihe idelb erg. de/) was used to predict the potential target genes of DEMs in view of the principle of base pairing. Second, according to the mechanism of mRNA inhibition by miRNAs, there had to be a negative correlation between the expression values of the miRNA and the mRNA, the screening criteria were defined as a P value < 0.05 and a cor value < − 0.4. Among the preliminary DEGs, genes with a negative correlation with the DEMs were identified by OmicShare (https:// www. omics hare. com/), and intersected with the predicted target genes of DEMs. The intersecting genes were defined as the final DEGs (hereafter referred to as DEGs).

Enrichment analysis of DEG function.
To further understand the integrated information on the DEGs from a broad perspective, the Metascape database (https:// metas cape. org/) was employed to conduct DEG Gene Ontology (GO) analysis, gene-disease association analysis 22 , and gene-organization distribution analysis 23 . The Metascape database integrates multiple authoritative data resources, such as the GO, Kyoto Encyclopedia of Genes and Genomes (KEGG), UniProt and DrugBank. Metascape provides provide comprehensive and detailed information about each gene and can be used to complete not only pathway enrichment and bioprocess annotation but also gene-related protein network analysis, gene-disease association analysis, and gene-organization distribution analysis.
Construction of a miRNA-mRNA-PPI network. A PPI network was constructed based on previous research. In a PPI network, proteins with an interaction relationship are connected. If a specific target protein has more connections than other proteins, it is located at the core of the network. Therefore, researchers can infer whether a gene has potential research value on the basis of the network. The DEGs were subjected to PPI analysis using the STRING database (https:// string-db. org/) 24 . In this study, the lowest value of the connection strength parameter between two proteins was 0.4. To ensure the reliable prediction, only experimentally verified results were included. The potential miRNA-mRNA relationships were predicted according to the posttranscriptional regulation mechanism and the negative correlation between miRNA and mRNA. On the basis of the above PPI network, the core genes were added to the miRNA-mRNA regulation relationships to provide potential clues for follow-up mechanistic research. The miRNA-mRNA-PPI network was visualized with OmicShare (https:// www. omics hare. com/) to screen potential core genes.
Hub gene survival analysis. The GSE65682 dataset 25 was downloaded from the Gene Expression Omnibus (GEO) database (https:// www. ncbi. nlm. nih. gov/ geo/) and included clinical information such as gene expression data and survival time for 479 patients with sepsis in the ICU, including 365 sepsis survivors. The patients were divided into a high-expression group and a low-expression group according to the specific gene expression values. The survival data of patients with sepsis in GSE65682 were applied to conduct survival analysis for the core genes in the miRNA-mRNA-PPI network, and the survival curve was generated with GraphPad Prism (version 7.0) software, and the potential hub genes related to the prognosis of sepsis were selected. The log rank test was used for statistical analysis, and P < 0.05 was considered to indicate statistical significance.
Negative regulation of miRNA-mRNA pairs. To understand the mechanism of the six hub genes in sepsis and the miRNAs involved in their regulation, directed network analysis between the core genes and the top 20 DEMs was performed using OmicShare tools (https:// www. omics hare. com/ tools). The regulatory relationships between six specific hub genes and upstream miRNAs were derived from the local network module described above (see above for relevant screening conditions).
10 × single-cell RNA sequencing and data analysis. Five PBMC were collected from 2 healthy controls, 1 systemic inflammatory response syndrome (SIRS) patient and 2 septic patients, and were subjected to single-cell RNA-seq analysis (10 × Genomics). The 10 × Genomics platform applied microfluidic technology according to the manufacturer's protocol. The experimental data were further quality controlled for quality based on a preliminary quality control step with Cell Ranger to exclude data from mutliplets, doublets, or unbound cells. Cells with gene numbers and unique molecular identifier (UMI) numbers within the mean ± 2 standard deviations (SDs) and cells with fewer than 20% UMIs mapped to mitochondrial genes were considered high-quality cells. The results were visualized in two-dimensional space by t-distributed stochastic neighbour embedding (tSNE; nonlinear dimensionality reduction).

Statistical analysis.
The clinical data of the subjects were analysed with SPSS 22.0 software. Continuous variables are expressed as the mean ± SD and were analysed with unpaired Student's t test; the chi-square test was adopted for categorical variables, and a P value < 0.05 was considered to indicate statistical significance.

Results
Subjects' clinical characteristics. The experimental flow chart of this study is shown in Fig. 1. A total of 23 patients with sepsis and 10 healthy volunteers were recruited for the current study. As the clinical information shown in Table 1, there were no significant differences in age, sex, or platelets (PLT) counts, alaninetransaminase (ALT), aspartate aminotransferase (AST), or creatinine (Crea) between the septic patients and the healthy controls, but considerable differences were found in white blood cell (WBC) counts, neutrophils counts, neutrophil/ lymphocyte ratios (NLRs) and haemoglobin (HGB) levels. In addition, the levels of procalcitonin (PCT), prothrombin time/international normalization ratio (PT-INR), and lactic acid (LAC) were considerably higher than normal in the sepsis group. Blood cultures were positive in 12 (52%) sepsis cases, and gram-negative bacteria were the predominant microorganism (8, 66%). In addition, surgical sites were the major primary sites of infection (10, 43%). Compared with the healthy controls, the Charlson Comorbidity Index (CCI) that was used to predict 10-year survival in patients with multiple comorbidities, was higher and statistically significant in sepsis.

DEGs and DEMs in sepsis.
To identify plasma prognostic biomarkers and potential mechanisms of sepsis, RNA-seq including mRNA and miRNA sequencing was utilized for each specimen simultaneously, mRNA sequencing for 33 subjects yielded the relative transcript levels of 20,391 genes. Furthermore, 1633 miRNAs were detected via miRNA sequencing. Afterwards, bioinformatics was used to analyse the above sequencing data to obtain the DEGs and DEMs. During normalizing, principal component analysis (PCA) was applied to remove the heterogeneous samples among the sepsis samples and control samples ( Fig. 2A,B). A total of 1114 preliminary DEGs (767 upregulated) and 76 DEMs (45 downregulated) were initially screened out by DESeq2 (Fig. 2C,D) between the sepsis group and the healthy control group. According to the FDR values in ascending order, the top 10 upregulated and 10 downregulated miRNAs were considered as DEMs in sepsis ( Table 2). Intersection analysis of the DEGs that were negatively associated with DEMs and the predicted target genes based on the DEMs with miRwalk3.0 yielded 454 Figure 1. Experimental flow chart of this study. RNA-seq was applied to sequence mRNAs and miRNAs in the peripheral blood of a total of 33 subjects. Combined with bioinformatics, this analysis was used to screen out the key genes and corresponding miRNAs in sepsis.

Enrichment analysis of DEG function.
We deem it essential to comprehensively elucidate the biological functions of the DEGs. Enrichment analysis showed that the identified DEGs were significantly related to neutrophil degranulation, regulation of defence, response to bacteria, and chemotaxis (Fig. 3A). Gene-disease association analysis revealed that these genes may be involved in inflammation, immunosuppression, pneumonia, and bacterial infection (Fig. 3B). Furthermore, gene-organization distribution analysis revealed that these genes are mainly distributed in natural killer (NK) cells, the bone marrow, the spleen, adipocytes, the blood, and the liver (Fig. 3C).
miRNA-mRNA-PPI regulatory network. The DEMs and DEGs were submitted to the STRING database (https:// string-db. org/), and the OmicShare (https:// www. omics hare. com/). A miRNA-mRNA-PPI regulatory network for sepsis was constructed (Fig. 4A,B), and the 30 potential key genes located at the centre of the miRNA-mRNA-PPI network were screened out as the potential core genes involved in sepsis and presented in a heatmap (Fig. 4C). In the two groups of different expression trend modules, the core genes such as CD160, GATA2, GNLY, IL2RB and TGFBR3 were downregulated in the sepsis group, while genes such as ELANE, IL1R1, TLR5, FCGR1A, MAPK14 and PCSK9 were upregulated.
Hub gene survival analysis. Based on the data from GSE65682, we explored associations between the potential core genes and sepsis outcomes, Survival analysis showed that the genes TLR5, FCGR1A and ELANE  Red nodes, upregulated genes or miRNAs in sepsis; blue nodes, downregulated genes or miRNAs; gray nodes, genes or miRNAs with no considerable difference between the sepsis group and the NC group. Log2 fold change (log2FC) = 2, FDR < 0.05. NC the healthy control group, SEPSIS the sepsis group, FDR false discovery rate. www.nature.com/scientificreports/ whose expression was upregulated in sepsis, and the genes GNLY, IL2RB and TGFBR3, whose expression was downregulated in sepsis, were significantly associated with sepsis outcomes, Higher expression of the genes TLR5, FCGR1A, GNLY, IL2RB and TGFBR3 was associated with the better the clinical outcomes in patients with sepsis (P < 0.05). The relationship between the gene ELANE and sepsis outcomes showed the opposite trend (P < 0.05) (Fig. 5A-F). Consequently, the genes TLR5, FCGR1A, ELANE, GNLY, IL2RB and TGFBR3 genes were ultimately identified as hub genes in sepsis.
Negative regulation of hub miRNA-mRNA pairs. Directed network analysis with OmicShare (https:// www. omics hare. com/) was used to analyse the regulatory relationships between hub genes and miRNAs, and a directed network diagram was drawn. The miRNAs associated with these core biomarkers were identified and are presented in Fig. 6A-F.

Location of hub genes in cell clusters.
We compared gene expression patterns in PBMCs among healthy controls, SIRS patients and septic patients and identified 9 transcriptionally distinct cell clusters (Fig. 7A), The markers CD14and CD3E represented monocytes and NK-T cells, respectively (Fig. 7B,C). The genes TLR5, FCGR1A and ELANE genes were mainly expressed in macrophages (Fig. 7D-F), while the genes GNLY, IL2RB and TGFBR3 genes were expressed specifically in T cells and NK cells (Fig. 7G-I). The expression abundance values and ratios of each hub gene in different cell clusters are shown in Fig. 7J. These findings lay a foundation for subsequent mechanistic studies.

Discussion
RNA-seq, also referred to as transcriptome sequencing, is a newly developed technique for transcriptome analysis that use deep sequencing technology and can quantitatively detect RNA expression levels 12 . RNA-seq can be applied to identify DEGs in healthy and diseased tissues and provide a platform for further study of the mechanism of sepsis. Sepsis is a health-and life-threatening condition. Despite early administration of antibiotics and the improvements in organ support, the rates of mortality remain high among patients with sepsis. The development of immune response and the prognosis of sepsis are the focuses of medical research. The aim of the present study was to distinguish molecular differences between patients with sepsis and healthy controls and to determine associations with sepsis outcomes. We conducted RNA-seq (including mRNA and miRNA sequencing) on 23 patients with sepsis and 10 healthy controls and distinguished 454 DEGs (361 upregulated) and 20 DEMs in patients with sepsis compared to healthy volunteers. Based on clinical phenomena, hundreds of DEGs were screened, and functional enrichment analysis was carried out to understand the overall change characteristics of the genes. To further explore which genes play keys role in sepsis, a miRNA-mRNA-PPI regulatory network was constructed through integrated transcriptomics analysis for the above DEGs and 20 DEMs. We obtained dozens of potential core targets and determined their mutual regulatory relationships via network analysis. Moreover, survival curves were analysed for the potential core targets, and six hub genes were ultimately identified, including the upregulated genes TLR5, FCGR1A and ELANE and the downregulated genes GNLY, IL2RB and TGFBR3. www.nature.com/scientificreports/ Moreover, these genes are highly related to the prognosis of sepsis. In addition, the biological functions of the core genes were probably associated with bacterial infection, inflammation, immunosuppression and chemotaxis. The localization of these hub genes in PBMCs was further clarified by single-cell sequencing. miRNAs have been proposed as to be good biomarkers because they are stably present in biofluids and biospecimens, including blood, urine, and saliva; this availability enables relatively easy collection and analysis using different methods, such as RNA-seq and qPCR 26,27 . Abnormal miRNA expression is correlated with the severity of sepsis and may serve as a potential diagnostic and prognostic biomarker in sepsis 16,28 . A variety of miRNAs, including miR-150, miR-133a, miR-223 and miR-23a have been described to play roles in sepsis or sepsis-related organ injury [29][30][31] . Furthermore, based on integrated analyses of miRNAs and mRNA, miR-106b-5p, miR-128-3p, and miR-144-3p and their mRNA targets are new potential diagnostic and therapeutic indicators 13 . In the present study, we assessed the hub genes that bind with DEMs and found that they were negatively regulated by different miRNAs, providing beneficial opportunities for studying the pathogenesis of sepsis. We observed that the www.nature.com/scientificreports/ downregulated miRNAs miR-20a, miR-101-3p, let-7f and miR-196B-3p and the upregulated miRNAs miR-212a, miR-129-5p, miR-149-5p, miR-219a, miR-124-3p and miR-9-5p potentially regulate these hub genes. TLR5, a core member of the Toll-like receptors (TLRs) family, is a receptor of bacterial motile components that targets extracellular flagellin and thus induces inflammatory cytokines production and the immune response 32 . Previous studies have demonstrated that TLR5 not only participates in H. pylori infection 33 and SIRS 34 , but also prevents systemic inflammation and liver damage caused by B. pseudomallei infection. TLR5 deficiency facilitates bacterial growth and dissemination 35 . Another paper has indicated that increased TLR5 expression on monocytes is associated with mortality in patients with sepsis 36 , which is not consistent with our www.nature.com/scientificreports/ founding that elevated TLR5 expression in peripheral blood is associated with attenuated mortality of septic patients. This discordance might be explained by the different samples used; more experiments are needed to confirm this hypothesis. FCGR1A, also named CD64, is expressed on most myeloid cells and is a high-affinity Fc receptor (FcγRI), that binds to monomeric IgG 37 . It participates in a number of functions, including phagocytosis, antigen presentation, and cytokine production 38 . Neutrophils FCGR1A is involved in tuberculosis (TB), regardless of HIV infection 39,40 , and can be used to differentiate TB from latent TB infection 41 . A previous study has indicated that neutrophil CD64 expression is an important diagnostic marker of infection and sepsis in hospital patients 42 . In the current study, elevated expression of FCGR1A in peripheral blood was associated with a good outcome in sepsis, and FCGR1A was mainly expressed in macrophages. The gene ELANE encodes an elastase that exists in neutrophils, and plays an important role in pathogen killing. A previous study found that ELANE was overexpressed in septic patients via gene expression profile analysis 43 , which is in accordance with our conclusions. Moreover, inhibition of ELANE-mediated histone H3 proteolysis contributes to mononuclear macrophage differentiation 44 . GNLY is a cytotoxic granular protein secreted by cytotoxic T lymphocytes and NK cells 45 that exerts toxic effect on bacteria, fungi, parasites, and tumors 46 , GNLY acts as an immune alarmin and promotes antigen-presenting cell activation through TLR4 47 . GNLY is associated with the efficacy of pegylatedinterferon-alpha therapy in Chinese patients with HBeAg-positive chronic hepatitis 48 , rheumatoid arthritis 49 , and Mycoplasma pneumoniae pneumonia 50 . IL2RB, a subunit of IL-2R, mediates signal transduction for IL-2R and IL-15R 51 , Mutations in human IL2RB result in immune dysregulation, cytomegalovirus (CMV) susceptibility 52 , reduced T reg frequency, and an abnormal NK compartment 53 . Gene expression profiling and bioinformatics analysis have indicated that IL2RB is weakly expressed in sepsis, which indicates that IL2RB may be a potential diagnostic tool for sepsis 54,55 . Our findings are consistent with this possibility. TGFBR3, also known as betaglycan, is a coreceptor of the TGF-β superfamily, and plays important roles in cardiomyocyte apoptosis 56 , renal cell carcinoma 57 , keratinocyte proliferation 58 , cervical carcinoma 59 , and angiogenesis 60 . To our knowledge, there is little evidence of a role of TGFBR3 in sepsis.
In summary, we comprehensively analyzed miRNA-seq, mRNA-seq and single-cell sequencing profiling and established an integrated miRNA-mRNA-PPI network to screen hub genes in sepsis, the potential hub genes www.nature.com/scientificreports/ TLR5, FCGR1A, ELANE, GNLY, IL2RB and TGFBR3 and miRNAs that are possible posttranscriptional and regulatory factors related to sepsis prognosis were screened out, the findings provide new prospects for exploration of the physiopathologic mechanisms, diagnosis, and treatment of sepsis. Nevertheless, some limitations of this study should be mentioned. First, this study was performed in a single centre with a small sample size; and we will increase the sample size in further research. Second, the mechanisms of the hub genes in sepsis must be validated in subsequent experiments, as they were not confirmed in the current study.

Data availability
The RNA-seq dataset analysed during the current study is available in the China National GeneBank DataBase (CNGBdb) and can be found below: https:// db. cng. org/, under the accession: CNP0002611.