Introduction

Epstein-Barr virus (EBV) is considered to be one of eight human herpesviruses that contain protein capsids around the double-stranded linear DNA genome1. EBV infects over 90% of people worldwide. The oral cavity is assumed to be the site of primary EBV infection2. Lymphocytes and epithelial cells make up the majority of host cells for EBV3. In adolescents, primary EBV infection often causes infectious mononucleosis (IM). Several autoimmune disorders, including systemic lupus erythematosus (SLE), have been linked to EBV4. In addition, as the first tumorigenic virus to be identified, EBV causes approximately 200,000 new instances of cancer each year, comprising cancers originating from B cells like Burkitt lymphoma and diffuse large B-cell lymphoma (DLBCL)5.

SLE is an autoimmune disorder that usually occurs among women of childbearing age with multi-systemic involvement6. The phenotypes of SLE can vary from minor mucocutaneous signs to serious central nervous or kidney damage, which results in a high risk of death for patients7. The characteristic feature of SLE is the occurrence of autoantibodies against nuclear antigens (ANA), which can be detected up to 10 years before disease onset8. SLE is considered to be brought on by a confluence of hereditary and environmental factors. Among environmental factors, viral infection, especially EBV infection, is closely related to SLE9. Previous studies have shown that, compared to healthy controls, higher viral loads and elevated titers of antibodies against EBV are detected among SLE patients10. Moreover, a growing body of research suggests that EBV-infected B cells may become resistant to apoptosis, leading to the proliferation, activation and antibody production of autoreactive B cells, which may result in tissue damage for SLE11. However, the underlying genetic molecular mechanism of EBV infection on the development of SLE is still not fully elucidated.

DLBCL refers to diffuse growth tumors with nuclei greater than 2 normal lymphocytes. It is the most prevalent kind of non-Hodgkin's lymphoma, making up roughly 40% of all B-cell lymphomas12. The R-CHOP therapy can cure approximately 60% of DLBCL. However, many patients still develop resistant to this treatment or experience a recurrence and eventually die13. At present, the etiology of DLBCL is not particularly clear, which is usually associated with gene abnormalities, EBV infection and other reasons14. In 2016, the World Health Organization classified EBV-associated DLBCL as a new subtype of DLBCL (EBV + DLBCL, NOS)15. Although a number of researches have demonstrated that EBV play extremely crucial roles in the process of inducing malignant transformation of lymphocytes16,17, the specific molecular mechanism of EBV regulating B cell signaling pathway still needs to be further studied.

Some studies have found an elevated incidence of malignancies, notably lymphoma, in people with SLE18. DLBCL composes 37–62% of all lymphomas identified in SLE19, but there is currently a lack of knowledge on the pathophysiology between DLBCL and SLE. Several determinants could be behind this altered risk, such as genetic factors, immunologic derangements and viruses, among which EBV infection is suggested to be the link between SLE and DLBCL20. Recent studies have shown that persistent EBV infection in SLE patients can promote malignant transformation of B cells due to the use of immunosuppressive drugs and other reasons21. However, molecular mechanisms underlying the contribution of EBV infection to the development of DLBCL in SLE patients are yet unknown.

In this study, we attempted to identify the shared gene signatures between EBV infection, SLE and DLBCL and explore the possible biological effect of EBV infection to the pathogenesis of DLBCL in the context of SLE. Firstly, we employed three datasets in this investigation to explore the biological link between EBV infection, SLE, and DLBCL. GSE49628, GSE61635, and GSE32018 were chosen from the Gene Expression Omnibus (GEO) database for EBV infection, SLE, and DLBCL, respectively. We first identified the differentially expressed genes (DEGs) in each dataset before identifying the common DEGs in these three datasets. To comprehend genome-based biological processes, shared DEGs were employed as the key experimental genes throughout the study, including gene ontology (GO) analyses, pathway enrichment analyses, and protein–protein interaction (PPI) network construction. In addition, 6 hub genes were extracted utilizing Cytoscape software for gene regulatory investigation, such as transcription factors (TFs)-gene network and miRNAs-gene network generation and immune landscape assessment. Finally, receiver operating characteristic (ROC) curves and candidate drugs prediction for SLE and DLBCL were performed based on the hub genes. Taken together, our study explored the molecular mechanisms by which EBV infection contribute to the susceptibility of DLBCL in SLE patients for the first time and identified potential biomarkers and therapeutic targets for patients with SLE and DLBCL.

Results

Identification of DEGs among EBV infection, SLE and DLBCL

Figure 1 depicts the whole work flow of this project. Firstly, 5016 genes, comprising 2883 up-regulated and 2133 down-regulated genes from the GSE49628 dataset, were differentially expressed for EBV infection (Fig. 2A). A total of 1549 DEGs were discovered using the SLE dataset (GSE61635), of which 992 genes showed up-regulation and 557 genes showed down-regulation (Fig. 2B). We discovered 1802 DEGs for the DLBCL dataset (GSE32018), comprising 791 up-regulated genes and 1011 down-regulated genes (Fig. 2C). With the use of the cutoff criteria (P-value < 0.05 and |logFC|> 1), all significant DEGs were retrieved. The summarized information of these datasets was listed in Table 1. Then by taking the intersection of DEGs of EBV, SLE and DLBCL datasets, 72 common DEGs were identified and visualized by Venn diagrams (Fig. 2D).

Figure 1
figure 1

Flowchart for this investigation.

Figure 2
figure 2

Volcano diagrams and Venn diagram. The volcano maps of GSE49628 (A), GSE61635 (B), and GSE32018 (C). Up-regulated genes are shown in red, while down-regulated genes are shown in blue. (D) There were 72 DEGs shared by these three datasets.

Table 1 Summary of the datasets used in this investigation, together with their geo-features and quantitative metrics.

GO and pathway enrichment analysis

We used the Enrichr online tool to analyze GO and pathway enrichment of the common DEGs. The rank of significant terms was determined by the P-value. GO analysis consists of three categories, including biological process, cellular component and molecular function. The top 10 significant terms of different categories were summarized in Table 2 and listed as bar graphs in Fig. 3. The most impacted pathways of the common DEGs among EBV infection, SLE and DLBCL were identified from four databases (WikiPathways, Reactome, KEGG and BioCarta). The top 10 pathways gathered from the above datasets were listed in Table 3 and also shown precisely in bar graphs in Fig. 4.

Table 2 Ontological analysis of shared DEGs between EBV, SLE, and DLBCL.
Figure 3
figure 3

GO terms of common genes between EBV infection, SLE and DLBCL. (A) Biological Processes, (B) cellular component, (C) molecular function.

Table 3 Pathway enrichment analysis of shared DEGs between EBV, SLE, and DLBCL.
Figure 4
figure 4

Pathway enrichment analysis of common genes between EBV infection, SLE and DLBCL. (A) Wikipathway, (B) Reactome Pathway, (C) KEGG Human Pathway, (D) BioCarta Pathway.

PPI network and submodule analysis

The PPI network of 72 common DEGs was analyzed by STRING platform, and the result was further visualized with Cytoscape software. The 42 nodes and 91 edges of the PPI network were shown in Fig. 5A. The stronger the connection of the gene with other genes, the redder the node was in the PPI network. Besides, we applied the MCODE plug-in of Cytoscape to construct a key gene module, which contained 8 common DEGs (Fig. 5B).

Figure 5
figure 5

PPI network and gene module analysis. (A) PPI network diagram. (B) Significant gene module. The redder the color of the gene in the network, the higher the connectivity of the gene with other genes.

Hub gene identification and functional analysis

Using 7 algorithms of cytoHubba plug-in of Cytoscape, the top 10 hub genes were screened. By applying the intersection of Venn diagrams, we finally identified 6 common hub genes, including CDK1, KIF23, NEK2, TOP2A, NEIL3 and DEPDC1 (Fig. 6A). The expression of the six hub genes were further verified in the EBV infection dataset (GSE49628), SLE dataset (GSE61635) and DLBCL datasets (GSE32018 and TCGA_GTEx-DLBC) (Fig. 6B–E). All of the six genes expressions were higher in EBV infection, SLE and DLBCL group compared to the control group. Through the GeneMANIA database, a complex gene interaction network was constructed for understanding the biological roles of the common hub genes, with the co-expression of 92.3%, co-localization of 3.98%, physical interactions of 3.52%, pathway of 0.12% and shared protein domains of 0.08% (Fig. 6F). Based on the 6 hub genes, 20 related genes were identified, showing that they were mainly associated with mitotic nuclear division, chromosome segregation and cell cycle checkpoint. Besides, GO and pathway enrichment of the hub genes was analyzed, showing the similar results from DEGs (Supplementary Figs. 2 and 3).

Figure 6
figure 6

Venn diagram and hub gene co-expression network. (A) The Venn diagram displayed six hub genes that were filtered by seven algorithms. (B) Differential expression of hub genes in EBV-transformed B cells and resting B cells in the GSE49628 dataset. (C) Differential expression of hub genes in SLE patients and healthy donors in the GSE61635 dataset. (D) Differential expression of hub genes between DLBCL samples and healthy controls in the GSE32018 dataset. (E) Differential expression of hub genes between DLBCL samples and healthy controls in the TCGA_GTEx-DLBC dataset. (F) GeneMANIA was used to assess hub genes and the co-expressed genes. *P < 0.05, **P < 0.01, and ***P < 0.001.

Determination of regulatory signatures

To identify the regulatory molecules of hub genes at the transcriptional level, we constructed TFs-gene and miRNAs-gene networks using NetworkAnalyst platform, which were visualized by Cytoscape. As shown in Fig. 7, the TFs-gene interaction network contains 50 nodes and 52 edges. CDK1 was modulated by 18 TF genes, and NEK2 was modulated by 17 TF genes. The TFs such as SIN3A, GABPA, ZNF18 and ZNF24 regulated many hub genes in the network. The miRNA-gene regulatory network was also predicted by Networkanalyst and created by Cytoscape, including159 nodes and 185 edges (Fig. 8). It has been ascertained that 30 miRNAs regulated with more than one hub gene, which demonstrated the high interaction between them.

Figure 7
figure 7

The TFs-gene regulatory network. Hub genes are represented by the pink nodes, and TF genes are represented by the other blue nodes.

Figure 8
figure 8

The miRNAs-gene regulatory network. Hub genes are represented by the pink nodes, and miRNAs are represented by the other green nodes.

ROC curves of hub genes

The diagnostic effectiveness of the 6 hub genes was evaluated using ROC curves. To identify SLE patients from healthy controls in the SLE dataset (GSE61635), CDK1 (AUC: 0.923) and TOP2A (AUC: 0.904) showed preferable diagnostic effectiveness (Fig. 9A). For separating DLBCL patients from healthy controls in the DLBCL dataset (GSE32018), TOP2A (AUC: 0.935), DEPDC1 (AUC: 0.935), NEIL3 (AUC: 0.922), and NEK2 (AUC: 0.916) showed good diagnostic performance (Fig. 9B). To distinguish DLBCL patients from healthy controls in the TCGA_GETx dataset (Fig. 9C), TOP2A (AUC: 0.825), CDK1 (AUC: 0.799) and NEK2 (AUC: 0.782) ranked the top 3 genes.

Figure 9
figure 9

Verification of common diagnostic biomarkers. (A) The ROC curve used to verify the diagnostic efficacy in GSE61635. (B) The ROC curve used to verify the diagnostic efficacy in GSE32018. (C) The ROC curve used to verify the diagnostic efficacy in TCGA_GTEx dataset.

Immune infiltration assessment

We investigated the relationship between the expression of hub genes and the infiltration of immune cells in DLBCL based on the TIMER database (Fig. 10A). The results showed that the expressions of CDK1, KIF23, NEK2, TOP2A, NEIL3 and DEPDC1 were all positively associated with T helper 2 (Th2) cells and T helper cells. On the contrary, the expressions of these six hub genes were negatively linked with plasmacytoid dendritic cells (pDCs) and NK CD56bright cells. In addition, immune checkpoint correlation analysis showed that hub genes were closely associated with various immune checkpoint factors, such as NRP1, TNFSF18 and LGALS9 (Fig. 10B).

Figure 10
figure 10

Immune landscape assessment. (A) Relationship of hub genes expression and immune cell subtypes in DLBCL patients. (B) Heatmap of the correlation between hub genes expression and immune checkpoint. *P < 0.05, **P < 0.01, and ***P < 0.001.

Prediction of candidate drugs

Using the Enrichr platform, which is based on the DSigDB database, the top 10 candidate therapeutic compounds were extracted and sorted by their P-value in the areas of hub genes as prospective pharmacological targets for EBV infection, SLE, and DLBCL (Table 4). It was discovered that the three pharmacological molecules that interacted with the majority of genes were testosterone (CTD 00006844), resveratrol (CTD 00002483), and calcitriol (CTD 00005558).

Table 4 Prediction of top 10 candidate drugs for EBV, SLE, and DLBCL.

Discussion

Several studies have indicated that EBV infections could trigger the occurrence of SLE and DLBCL. Additionally, SLE patients are more likely to develop DLBCL, suggesting that EBV infection may be the link between SLE and DLBCL. However, the underlying mechanism is still unclear. In this study, we performed a series of bioinformatics analysis and attempted to reveal the possible molecular machanisms by which EBV infection contribute to the pathogenesis of DLBCL in SLE patients for the first time and discovered potential biomarkers and therapeutic targets for SLE and DLBCL.

GO analysis was performed based on the 72 common DEGs identified by the three datasets of EBV infection, SLE and DLBCL. In terms of biological process terms, regulation of glial cell differentiation, serine family amino acid biosynthetic process, cellular response to peptide, myoblast differentiation and serine family amino acid metabolic process were the most significant. Serine is a significant supply of one-carbon units, which are necessary building blocks for the production of nucleotides. EBV upregulates import and synthesis of serine through its encoded protein EBNA2, which is essential for EBV-driven B cell survival and proliferation22. Moreover, reducing the amount of extracellular serine available or preventing the production of serine from glycolytic intermediates have also been proposed as innovative therapeutic approaches for the treatment of autoimmune diseases23 and B-cell lymphomas24. Regarding the molecular function, double-stranded RNA binding, hydrolase activity, cadherin binding were the top terms. EBER-1 and EBER-2, two short RNAs that EBV encodes, are highly expressed in latently infected cells. EBERs can bind to the double-stranded RNA-activated protein kinase (PKR) and prevent it from being phosphorylated, leading to the resistance of IFN-alpha-induced apoptosis in EBV-positive lymphoma25. As for cellular components, intracellular membrane-bounded organelle, secretory vesicle and nucleus ranked the top 3. Secretory vesicles released from EBV-infected cells are capable of promoting inflammation and immune dysfunction, which may contribute to autoimmune diseases and numerous malignancies26.

The KEGG pathway enrichment analysis showed that p53 signaling pathway, aldosterone-regulated sodium reabsorption and mRNA surveillance pathway are the top 3 significant pathways. Mutations that inactivate p53 promote genomic instability and are hallmark of cancer27. EBV-encoded oncoprotein latent membrane protein 1 (LMP1) is reported to promote the growth of lymphoma cells by the degradation of p5328. Interestingly, mutations in p53 have been shown to increase autoimmune susceptibility in multiple strains of mice29,30. The autoantibodies against p53 have been detected in the serums of patients with SLE, which could functionally block p53 activation and affect apoptosis31. Results from WikiPathways indicated that trans-sulfuration and one-carbon metabolism, white fat cell differentiation and prion disease pathway are the most significant. The trans-sulfuration route is a biochemical process that connects methionine metabolism to the production of cellular redox-controlling molecules, which contributes to atherosclerosis and tumor development32. Reactome analysis showed that the most interacted gene pathways are establishment of sister chromatid cohesion, mitotic telophase/cytokinesis and depolymerisat of nuclear lamina. According to recent research, abnormal sister chromatid cohesion leads to chromosomal instability, which in turn promotes the growth of cancer33. As for BioCarta, the top pathways are as follows: cdc25 and chk1 regulatory pathway in response to DNA damage, TSP-1 induced apoptosis in microvascular endothelial cell and sonic hedgehog receptor Ptc1 regulates cell cycle.

Based on the PPI network, CDK1, KIF23, NEK2, TOP2A, NEIL3 and DEPDC1 were selected as hub genes. CDK1 is able to regulate cell cycle progression and transcription34. CDK1 expression was up-regulated in EBV-positive DLBCL and nasal natural killer/T-cell lymphoma (NNKTL). Treatment with CDK1 inhibitors causes the death of EBV-transformed cells35,36,37. In addition, type I interferon (IFN) signaling is also thought to be a major pathogenic route in SLE. The excessively increased type I IFN signaling in SLE may be caused by overexpression of CDK1. KIF23, a kinesin 6 family member that is found at the interzone of mitotic spindles, is essential for cytokinesis38. KIF23 expression is increased in DLBCL and is a risk factor for this disease39. NEK2, a member of NIMA-related kinase family that regulates cell cycle, is up-regulated in a variety of malignancies, including DLBCL40,41. TOP2A is a multifunctional nuclear enzyme required during DNA replication, transcription and DNA damage repair42. It was identified that TOP2A-nucleolin interaction is essential for regulating Top2A targeting agent induced DLBCL cell death43. TOP2A is also proposed as potential biomarker for SLE diagnosis by comparative analysis44. For the remaining two hub genes, NEIL3 and DEPDC1, there are no publications describing their role in EBV, SLE or DLBCL, which highlights its significance for future research.

We also performed the TFs-gene and miRNAs-gene connection to discover the transcriptional and post-transcriptional regulators of the hub genes. The TFs such as SIN3A and ZNF18 regulated most hub genes in the network. SIN3A is hypothesized to control gene expression by acting as histone deacetylases, which is linked to tumor progression45. EBV nuclear antigen 3C is reported to recruit SIN3A to repress CDKN2A, which is important for immortal human B-lymphoblastoid cell line proliferation46. Furthermore, three miRNAs, including hsa-miRNA-186, hsa-miRNA-192 and hsa-miRNA-215 were identified to regulate most hub genes. These miRNAs play critical role in several types of cancers, such as breast cancer, gastric cancer and colorectal cancer47,48,49. However, their function in SLE or DLBCL has not been reported.

With the ROC curves, we found that six hub genes exhibit good diagnostic performance in SLE and DLBCL. Furthermore, we used a cohort of DLBCL patients from TCGA dataset to investigate the association between the expression levels of the six hub genes and overall survival. By ultilizing GEPIA250 tool, Kaplan–Meier survival analysis and Cox proportional hazards regression to assess the association between gene expression and patient survival were conducted (Supplementary Fig. 1). However, we observed that the expression levels of these hub genes were not associated with overall survival in DLBCL patients. These findings suggest that while these hub genes may be useful for diagnostic purposes, they may not be reliable prognostic markers for DLBCL.

With assessment of immune infiltration, we observed that hub genes expressions are positively correlated with Th2 cell infiltration and negatively linked with pDC in DLBCL. Th2 cells have been primarily associated with promoting tumor growth and suppressing anti-tumor immunity51, while pDCs can contribute to anti-tumor immunity by promoting the activation and expansion of effector immune cells, and by producing type I interferons52. In addition, we found that these 6 hub genes had the highest positive correlation with NRP1 expression. NRP1 can promote tumor angiogenesis, tumor cell migration and invasion, and the infiltration of immunosuppressive cells into the tumor microenvironment53,54. This finding suggested that hub genes may play a role in remodeling immune landscape of tumor microenvironment (TME) to promote DLBCL.

Six hub genes were applied to the DSigDB database for the prediction of potential medicines. The top 3 chemical molecules were listed as follow: lucanthone, testosterone and trifluridine. Recently, it was discovered that the anti-schistosomal drug lucanthone, which may pass the blood–brain barrier, inhibit autophagy and suppress the growth of breast cancer and glioblastoma55,56. In peripheral blood mononuclear cells from SLE patients, testosterone, one of the male hormones, is able to reduce the production of anti-DNA antibodies by inhibiting B cell hyperactivity, supporting the therapeutic effects for SLE57,58. Trifluridine, a thymidine-based nucleoside analog, is a novel oral cytotoxic chemotherapy licensed for the treatment of metastatic colorectal cancer refractory to standard therapies59. It is also an antiviral agent for topical use in the eye, which could effectively inhibit the replication of herpes simplex virus type 160.

Certainly, our study has some limitations. On the one hand, there was a lack of information on patients with DLBCL secondary to SLE with EBV infection to further verify our finding.. On the other hand, our findings were obtained by pure bioinformatics analysis, so the function of hub genes and prospective medicines is required to be further confirmed by scientific investigation in vitro and in vivo.

Conclusion

To the best of our knowledge, our study is the first to reveal shared DEGs, GO and pathway enrichment and PPI network for EBV infection, SLE and DLBCL using bioinformatic analysis to explore the potential molecular mechanisms underlying the contribution of EBV infection to the development of DLBCL in SLE patients. Our research also identified immune-related biomarkers and future therapeutic targets for patients with SLE and DLBCL, which will help better manage SLE patients and provide early diagnosis and treatment for DLBCL.

Methods

Compilation of datasets

GEO (www.ncbi.nlm.nih.gov/geo) is a big database providing gene expression profiles for a variety of disorders. It is free of charge and publicly available61. GSE4962862 dataset contains resting and EBV transformed B cells from 3 donors and GSE61635 dataset consists of 79 SLE patients and 30 healthy donors, which were both sequenced using the Affymetrix Human Genome U133 Plus 2.0 Array platform. GSE3201863 is of 21 DLBCL samples and 7 healthy controls, which was sequenced by Agilent-014850 Whole Human Genome Microarray 4 × 44 K G4112F platform. Gene expression data and patient clinical information were downloaded from TCGA database (project ID: TCGA_GTEx-DLBC) (https://portal.gdc.cancer.gov/). In total, information for 41 DLBCL tissues and 447 adjacent normal tissues was obtained.

Identification of shared DEGs between EBV infection, SLE and DLBCL

An online program called GEO2R (www.ncbi.nlm.nih.gov/geo/geo2r/) can be used in the comparison and analysis between the groups with different gene expression64. GEO2R was used to identify DEGs for GSE49628, GSE61635 and GSE32018. DEGs were defined as genes with |log2 fold change (log2FC)|> 1.0 and an adjusted P-value < 0.05. The volcano map of DEGs from each dataset and the Venn diagram of shared DEGs between these three datasets were both plotted by Bioinformatics (https://www.bioinformatics.com.cn).

Investigation of GO and pathways enrichment

In order to find the functions of the shared DEGs or hub genes in EBV infection, SLE and DLBCL, we utilized Enrichr65, a useful gene set enrichment online platform (https://maayanlab.cloud/Enrichr/), to conduct a series of enrichment analyses. Biological process, cellular component, and molecular function are the three elements of GO66. The shared pathways among DEGs or hub genes were identified based on four databases, including WikiPathways67, Reactome68, Kyoto Encyclopedia of Genes and Genomes (KEGG)69,70,71,72, and BioCarta73.

PPI network construction and module analysis

The assessment of PPI network is the cornerstone of cellular biology for understanding protein function and the mechanism of cellular machinery operations. Search Tool for the Retrieval of Interacting Genes (STRING, http://string-db.org) is a database for studying PPI network with physical and functional interactions74. We used STRING to create the PPI network of shared DEGs with an interaction score greater than 0.4 and displayed it by Cytoscape (Version 3.9.1)75. The core functional module was produced using a Cytoscape plug-in—Molecular Complex Detection (MCODE)76.

Recognition of hub genes

The hub genes of this study were identified using CytoHubba77, a plug-in of Cytoscape. Subsequently, the final hub genes were confirmed by seven algorithms (MCC, MNC, EPC, Closeness, Degree, Radiality and Stress) and visualized by Venn diagram. Based on the final hub genes, co-expression networks were constructed by GeneMANIA (http://genemania.org), an online platform for gene interactions prediction78.

Validation of the hub genes in EBV infection, SLE and DLBCL

To confirm hub shared genes in EBV infection, SLE and DLBCL, we conducted the differentially expressed gene analysis on validation datasets (GSE49628, GSE61635 and GSE32018). GEPIA250 (http://gepia2.cancer-pku.cn/#index) is an analysis tool containing RNA sequence expression data of tumors and normal tissue samples. We used GEPIA2 to analyze TCGA and GTEx databases of gene expression profiles in DLBCL. Unpaired comparisons of hub gene expression between two groups were analyzed by Wilcoxon rank-sum test. We also performed prognostic analysis of hub genes in DLBCL with the “survival plots” module using a Kaplan–Meier curve with GEPIA2.

Identification of TFs and miRNAs interactions with hub genes

Proteins called TFs control how quickly genes are transcribed by binding to certain DNA regions. TFs-gene networks were created by utilizing ENCODE database79 on NetworkAnalyst platform (https://www.networkanalyst.ca/)80. Furthermore, miRNAs are a group of short non-coding RNAs, which can impede translation or degrade the target mRNA. The network of miRNAs-gene was acquired from mirTarbase81 via NetworkAnalyst. Cytoscape was used to display TFs-gene and miRNAs-gene regulatory networks.

Receiver operating characteristic curves of hub genes

Receiver operating characteristic (ROC) curves of the hub genes on SLE (GSE61635) and DLBCL (GSE32018 and TCGA_GTEx-DLBC) were both plotted by Bioinformatics platform. The diagnostic ability of each hub gene was assessed using the calculation of area under ROC curve (AUC).

Assessment of the immune landscape

We used immune cell infiltration and gene expression data from the TIMER database82 to identify relationships between the expression of hub genes and immune cell abundance in DLBCL, and plotted bubble plots to show these results. Spearman’s correlation analysis was used to analyze the relationship between hub genes and a range of immune-related genes, such as immune checkpoint-associated genes and immune cell subpopulation-associated genes.

Prediction of candidate drugs

It is important to evaluate protein-drug interactions in this research. On the basis of hub genes, drug molecules were obtained from Drug Signatures database (DsigDB)83 on Enrichr platform. P-value was used to rank the candidate medications from small to large.

Copyright permission of KEGG

We have contacted Kanehisa Laboratories. We do not directly use these KEGG Pathway map “images” in the article, we need not obtain copyright permission of KEGG. However, they believe that we have written our article using their data, they kindly ask us to cite the following articles in it69,70,71,72.