Mechanisms of indigo naturalis on treating ulcerative colitis explored by GEO gene chips combined with network pharmacology and molecular docking

Oral administration of indigo naturalis (IN) can induce remission in ulcerative colitis (UC); however, the underlying mechanism remains unknown. The main active components and targets of IN were obtained by searching three traditional Chinese medicine network databases, including TCMSP, and five target-fishing databases, including PharmMapper. UC disease targets were obtained from three disease databases, including DrugBank, combined with four GEO gene chips. IN-UC targets were identified by matching the two sets. A protein–protein interaction network was constructed, and the core targets were screened according to the network topology. GO and KEGG enrichment analyses and BioGPS localization were performed, and an Herbs-Components-Targets network, a Compound Targets-Organs location network, and a Core Targets-Signal Pathways network were established. Molecular docking was used to verify the main compound-target pairs. Ten core active components and 184 IN-UC compound targets, of which 43 were core targets, were identified and analyzed by BioGPS, GO, and KEGG. The therapeutic effect of IN on UC may involve activation of systemic immunity, which participates in the regulation of nuclear transcription, protein phosphorylation, cytokine activity, reactive oxygen metabolism, epithelial cell proliferation, and cell apoptosis through Th17 cell differentiation, the Jak-STAT and IL-17 signaling pathways, toll-like and NOD-like receptors, and other cellular and innate immune signaling pathways. The molecular mechanism by which IN induces UC remission was predicted using a network pharmacology method, thereby providing a theoretical basis for further study of the effective components and mechanism of IN in the treatment of UC.

Potential targets of IN. The active components of a drug perform related biological functions through the relevant targets. In addition to obtaining the targets of the core active ingredients of indigo directly from the TCMSP database, information on the small-molecule structure of each core active ingredient (Canonical SMILES) was used for target identification with the Similarity Ensemble Approach (https://sea.bkslab.org/) 23, STITCH (https://stitch.embl.de/) 24, Swiss Target Prediction (https://new.swisstargetprediction.ch/) 25, and PharmMapper (https://lilab-ecust.cn/pharmmapper/) 26.

Construction of a UC-related targets database. First, microarray data of differentially expressed mRNAs in the intestinal mucosa between the normal group and the UC group were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/), Series: GSE87466, GSE65114, GSE9686, GSE10616. The sva and limma packages in R 3.6.1 were used to jointly analyze the multiple chips and to correct for batch effects. Both packages can be obtained from Bioconductor (https://www.bioconductor.org/). Genes with an adjusted P < 0.05 and log2(fold change) > 1 or log2(fold change) < −1 were considered significantly differentially expressed and were taken as UC-related targets. In addition, UC-related disease targets were retrieved from the DrugBank database (https://www.drugbank.ca/) 27, the TTD database (https://db.idrblab.org/ttd/) 28, and the DisGeNET database (https://www.disgenet.org/web/DisGeNET/menu/home) 29, using "Ulcerative Colitis" as the keyword. These were combined with the UniProt database (https://www.uniprot.org/) 30 and the GEO analysis results to eliminate repeated disease targets and establish the disease target database of UC.
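The differential-expression cutoff described above can be illustrated with a minimal sketch. It assumes that a table of per-gene statistics (log2 fold change and adjusted P value) has already been exported from the limma analysis; the `GeneStat` type, the function names, and the gene symbols are hypothetical placeholders, and the numbers are invented for illustration only.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    symbol: str
    log2_fc: float
    adj_p: float


def is_differential(gene, p_cut=0.05, lfc_cut=1.0):
    """Apply the stated rule: adjusted P < 0.05 and |log2(fold change)| > 1."""
    return gene.adj_p < p_cut and abs(gene.log2_fc) > lfc_cut


def split_degs(stats):
    """Return (up-regulated, down-regulated) gene symbols that pass the cutoff."""
    up = [g.symbol for g in stats if is_differential(g) and g.log2_fc > 0]
    down = [g.symbol for g in stats if is_differential(g) and g.log2_fc < 0]
    return up, down


# Invented example rows; these are not values from the GEO series.
rows = [
    GeneStat("GENE_A", 2.3, 0.001),   # passes, up-regulated
    GeneStat("GENE_B", -1.4, 0.02),   # passes, down-regulated
    GeneStat("GENE_C", 0.6, 0.0001),  # fails: fold change too small
    GeneStat("GENE_D", 3.0, 0.20),    # fails: adjusted P too large
]
up_genes, down_genes = split_degs(rows)
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant adjusted P value (GENE_D here) is excluded.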
Construction of the PPI network. Based on the above analyses, the core active ingredient targets of indigo were matched with the disease targets of UC to obtain the compound targets of IN-UC. The Venn diagram was drawn with the Bioinformatics web tool (https://bioinformatics.psb.ugent.be/webtools/Venn/), and the PPI network of the targets was obtained using the STRING online tool (https://string-db.org/) 31.

Construction of an "Herbs-Components-Targets" network of IN. Based on the PPI network obtained above, the "Herbs-Components-Targets" network (H-C-T network) of IN was constructed using Cytoscape 3.6.1 (https://www.cytoscape.org/) 32. According to the topological characteristics of the network, the three most important parameters were selected to screen the core compound targets of indigo: Degree Centrality (DC) 33, Closeness Centrality (CC) 34, and Betweenness Centrality (BC) 35. Degree centrality refers to the number of other nodes associated with a node in the network. The greater the degree centrality, the greater the importance of the node. Betweenness centrality counts the shortest paths that pass through a node. The more shortest paths pass through a node, the higher its betweenness centrality. Closeness centrality is based on the sum of the distances from one node to all other nodes. The smaller the sum, the shorter the paths from this node to all other nodes, which means that the node is closer to all other nodes. These three parameters represent the topological importance of a node and reflect its role and influence in the whole network; the importance of a node is positively correlated with its values. According to relevant literature reports, targets whose DC reached twofold the median value were selected 36, and among them targets whose BC and CC reached the median value were selected 37, to obtain more accurate core targets.
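The topological screening can be sketched in plain Python as follows. This is an illustrative reimplementation under stated assumptions, not the Cytoscape NetworkAnalyzer code used in the study: closeness is taken as the reciprocal of the mean shortest-path length, betweenness is the unnormalized count from Brandes' algorithm, and "reaching" a threshold is read as greater than or equal to it. The toy graph and node names are invented.

```python
from collections import deque
from statistics import median


def build_adjacency(edges):
    """Undirected adjacency sets from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj):
    """Degree centrality as the raw number of neighbours."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def closeness(adj):
    """Reachable nodes divided by the sum of BFS distances to them."""
    out = {}
    for s in adj:
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        out[s] = (len(dist) - 1) / total if total else 0.0
    return out


def betweenness(adj):
    """Unnormalized betweenness for an unweighted graph (Brandes' algorithm)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def screen_core(adj):
    """Keep nodes with DC >= 2 x median and BC, CC >= their medians."""
    dc, cc, bc = degree(adj), closeness(adj), betweenness(adj)
    dc_cut = 2 * median(dc.values())
    cc_cut, bc_cut = median(cc.values()), median(bc.values())
    return sorted(v for v in adj
                  if dc[v] >= dc_cut and cc[v] >= cc_cut and bc[v] >= bc_cut)


# Invented toy network: one hub "H" plus a single A-B edge.
toy = build_adjacency([("H", "A"), ("H", "B"), ("H", "C"),
                       ("H", "D"), ("H", "E"), ("A", "B")])
core = screen_core(toy)
```

In this toy network only the hub passes all three thresholds, which mirrors how the combined DC, BC, and CC filter narrows a large PPI network to a small core.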

Construction of a Compound Targets-Organs location network. The metabolism of IN in vivo is not clear, and multiple tissues and organs may be involved in the intestinal mucosal repair of UC induced by IN. Therefore, the BioGPS database (https://biogps.org) was used to examine the mRNA expression profile of each IN-UC compound target at the organ and tissue level 38. The database provides gene expression data obtained by direct measurement of gene expression by microarray analysis 39. The specific steps were as follows: first, the mRNA expression data of each compound target in 84 organs and tissues were obtained; then, the average value of each mRNA in each tissue and its overall average value across all tissues were calculated. Finally, the organs and tissues in which mRNA expression was higher than the overall average were extracted 40, and the Compound Targets-Organs location network was established with Cytoscape 3.6.1.
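The three steps above can be sketched as follows. The data layout (a dictionary of per-tissue measurement lists for each gene) is an assumption made for illustration and is not the BioGPS export format; gene names, tissue names, and values are invented.

```python
from statistics import mean


def locate_targets(expression):
    """Return (gene, tissue) edges where the tissue mean exceeds the gene's overall mean.

    `expression` maps gene -> tissue -> list of measured values (hypothetical layout).
    """
    edges = []
    for gene, tissues in expression.items():
        # Step 1-2: average within each tissue, then across all tissues.
        tissue_mean = {t: mean(values) for t, values in tissues.items()}
        overall = mean(tissue_mean.values())
        # Step 3: keep tissues expressing the gene above its overall average.
        edges.extend((gene, t) for t, m in sorted(tissue_mean.items())
                     if m > overall)
    return edges


# Invented values, three tissues instead of the 84 used in the study.
toy_expression = {
    "GENE_A": {"colon": [10, 14], "liver": [2, 4], "lymph_node": [9, 9]},
    "GENE_B": {"colon": [1, 1], "liver": [8, 10], "lymph_node": [2, 2]},
}
location_edges = locate_targets(toy_expression)
```

The resulting edge list is the kind of two-column table that can be imported into Cytoscape as a target-organ network.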

GO and KEGG enrichment analysis.
After obtaining the core targets, we used the GO and KEGG enrichment functions of clusterProfiler 41 in R 3.6.1 to analyze the GO and KEGG enrichment of the core targets. The package can be obtained from Bioconductor (https://www.bioconductor.org/). GO enrichment mainly analyzes the biological process, cellular component, and molecular function of the targets, whereas KEGG (www.kegg.jp/kegg/kegg1.html) enrichment analyzes the potential biological pathways and functions associated with the targets.
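Over-representation analysis of this kind rests on a hypergeometric test. The sketch below shows that calculation for a single term, assuming a background of N genes of which K are annotated to the term, and a target list of n genes of which k are annotated. It illustrates the statistic only and does not reproduce the clusterProfiler implementation or its multiple-testing correction; the function name and the example numbers are illustrative.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: annotated genes in the target list
    n: size of the target list
    K: annotated genes in the background
    N: size of the background
    """
    upper = min(n, K)
    # comb() returns 0 when the second argument exceeds the first,
    # so impossible combinations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Invented example: 3 of 4 targets hit a term that covers 5 of 20 background genes.
p = enrichment_p_value(k=3, n=4, K=5, N=20)
```

A small P value indicates that the term is annotated to more targets than expected by chance; in practice the P values of all tested terms are then adjusted for multiple comparisons.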
Active components-targets docking. Three components were selected among the core components of IN and docked with three proteins selected from the core targets to verify the accuracy of the main components and predicted targets. The candidate compound structures and the target crystal structures were downloaded from the TCMSP database and the RCSB Protein Data Bank (https://www.pdb.org/), respectively. For the latter, a ligand-bound structure with a resolution below 3 Å was preferred. The crystal structure was imported into PyMOL 1.7.2.1 (https://pymol.org/2/) for removal of water, addition of hydrogens, and separation of ligands, and then imported into AutoDockTools 1.5.6 to construct the docking grid box 42,43 for each target. Docking was completed with AutoDock Vina 1.1.2 44, and the conformation with the lowest binding energy was selected to examine the binding effect by comparison with the original ligand and by analysis of intermolecular interactions (such as hydrophobic, cation-π, anion-π, π-π stacking, and hydrogen bonding interactions).
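Selecting the conformation with the lowest binding energy can be sketched as below. The sketch assumes that the docking output is a Vina-style PDBQT file in which each pose carries a `REMARK VINA RESULT:` line whose first number is the affinity in kcal/mol; the sample text and its values are synthetic, and the function names are hypothetical.

```python
def vina_affinities(pdbqt_text):
    """Collect the affinity (kcal/mol) of every pose, in file order."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, affinity, rmsd l.b., rmsd u.b.
            scores.append(float(line.split()[3]))
    return scores


def best_mode(pdbqt_text):
    """Return (mode number, affinity) for the lowest-energy pose."""
    scores = vina_affinities(pdbqt_text)
    return min(enumerate(scores, start=1), key=lambda pair: pair[1])


# Synthetic output fragment; coordinates omitted, values invented.
sample_output = """MODEL 1
REMARK VINA RESULT:      -7.0      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -6.4      1.812      2.954
ENDMDL
"""
mode, affinity = best_mode(sample_output)
```

Poses are normally written in order of increasing energy, so the first mode is usually the best one; taking the explicit minimum simply avoids relying on that ordering.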

Results
Target prediction and analysis of IN. A total of 29 active ingredients were obtained from TCMSP, BATMAN-TCM, and TCMID, and nine core active components were selected according to the ADME screening criteria of OB ≥ 30% and DL ≥ 0.18. However, tryptanthrin was the third most effective component of indigo, and the related literature confirmed that it showed protective effects in an experimental UC animal model 21,22. Therefore, it was included although it did not meet the OB and DL criteria. Finally, ten core active components were obtained (Table S1). Joint analysis of four gene chips in the GEO database (GSE87466, GSE65114, GSE10616, GSE9686) identified 165 differentially expressed genes related to UC (Supplementary Table S2), which were used to build a volcano map (Fig. 2). In addition, we integrated

"Herbs-Components-Targets" network analysis of IN. The IN-UC compound targets identified were input into STRING, unconnected targets were removed, and the PPI network was obtained. Cytoscape 3.6.1 showed that the network contained 182 nodes and 2054 edges. The "Herbs-Components-Targets" network of IN was constructed (Fig. 4), including 192 nodes and 748 edges. Then, according to the characteristics of the network topology, targets reaching twofold the median value were selected for DC 36, and targets reaching the median value were selected for BC and CC 37. The median DC of the network was 15, and the median values of BC and CC were 0.0017 and 0.4641, respectively. After network analysis with NetworkAnalyzer, 43 core targets were selected (see Fig. 5 for the process of screening the core target network, and Table 2 and the PPI network, including 43 nodes and 645 edges). At the same time, we constructed the network diagram of core targets and non-core targets (Fig. 6).

(Table S5, S6). Then, the Targets-Organs Location Network (Fig. 7) was constructed, which contained 186 nodes and 997 edges.
Most of the targets were highly expressed in several organs and tissues simultaneously, indicating that these organs are closely related to the targets of indigo. In    45 , also promote the secretion of the anti-inflammatory factors IL-22 and IL-10, and then repair the mucous membrane. At the same time, indigo and indirubin have also been shown to have antioxidant activity 46,47, which is consistent with the conclusions related to the MF and BP categories of GO (Supplementary Table S7, S8, S9). We selected the first 20 functional enrichment terms to draw a bubble diagram and a secondary classification chart, as shown in Fig. 8. In addition, we identified the main signaling pathways involved in the treatment of UC by KEGG enrichment analysis, and screened the first 20 pathways related to UC (Table S10).
Components-targets docking analysis. Based on an analysis of the literature and current research hotspots, the three components with the highest content in IN were selected for molecular docking verification, namely indigo, indirubin, and tryptanthrin. AHR, MAPK1, and EGFR were selected from the core targets, and the crystal structures 5NJ8 (AHR), 5K4I (MAPK1, containing ligands), and 3W2S (EGFR, containing ligands) were downloaded from the RCSB Protein Data Bank. Since no effective crystal structure of AHR with a bound ligand was found in RCSB, and all three compounds are recognized AHR ligands, direct docking was performed with grid center 6.359 30.434 216.222 and NPTS 40 40 40 0.375. The best-mode affinity energies of indigo-AHR, indirubin-AHR, and tryptanthrin-AHR were −6.2 kcal/mol, −6.7 kcal/mol, and −6.7 kcal/mol, respectively (Supplementary Table S11, S12, S13). The results of indigo-AHR molecular docking showed that the two sides of the indole ring form cation-π and anion-π interactions with a lysine residue (LYS-63) and an aspartic acid residue (ASP-65), respectively, and that the right indole ring is located in the hydrophobic cavity formed by two valine residues (VAL-60). Indirubin-AHR and tryptanthrin-AHR showed the same results. In docking with EGFR, the best-mode affinity energies of indigo-EGFR, indirubin-EGFR, and tryptanthrin-EGFR were −8.1 kcal/mol, −8.5 kcal/mol, and −7.9 kcal/mol, respectively (Supplementary Table S14, S15, S16). The center of the grid was 3.88 1.496 10.744. The left indole ring of indigo is located in the hydrophobic cavity formed by two leucine residues (LEU-788 and LEU-777). It forms a hydrogen bond and an anion-π interaction with an aspartic acid residue (ASP-855) and a cation-π interaction with a lysine residue (LYS-745). The right indole ring is located in the hydrophobic cavity formed by an alanine residue (ALA-743), a valine residue (VAL-726), and two leucine residues (LEU-718 and LEU-844).
The left indole ring of indirubin-EGFR and tryptanthrin-EGFR is surrounded by a hydrophobic cavity formed by an alanine residue (ALA-743), a valine residue (VAL-726), two leucine residues (LEU-792 and LEU-1001), and a methionine residue (MET-793). The right indole ring forms a cation-π interaction with a lysine residue (LYS-745) and an anion-π interaction with an aspartic acid residue (ASP-855). Indirubin also forms a 2.7 Å hydrogen bond with LYS-745.
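The reported AHR grid settings can be translated into search-box dimensions with a short calculation. The sketch below assumes that "NPTS 40 40 40 0.375" denotes 40 grid intervals per axis with a spacing of 0.375 Å, as in AutoDockTools grid parameter files, and that the box is passed to Vina through its `center_*` and `size_*` settings; this reading of the reported values is an assumption, and the helper name is hypothetical.

```python
def vina_box(center, npts, spacing):
    """Convert an AutoGrid-style grid (intervals x spacing) to Vina box settings.

    center:  (x, y, z) grid center in angstroms
    npts:    number of grid intervals along each axis
    spacing: grid spacing in angstroms
    """
    size = tuple(n * spacing for n in npts)
    return {
        "center_x": center[0], "center_y": center[1], "center_z": center[2],
        "size_x": size[0], "size_y": size[1], "size_z": size[2],
    }


# Grid center and NPTS as reported for the AHR docking run.
ahr_box = vina_box(center=(6.359, 30.434, 216.222),
                   npts=(40, 40, 40), spacing=0.375)

# Render the settings as "key = value" lines for a configuration file.
config_lines = [f"{key} = {value}" for key, value in ahr_box.items()]
```

Under this reading each edge of the search box is 40 × 0.375 = 15 Å, which is large enough to enclose a small planar ligand such as indigo together with the neighbouring residues.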

Discussion
IN, also known as Qingdai, is a plant-derived product commonly used as a dye and pigment. Because traditional Chinese medicine is based on a complex system of multiple components, multiple targets, and multiple action pathways, the material basis and mechanism of action of its components remain unclear, which complicates modern research on traditional Chinese medicine. In 2007, Professor Hopkins and Professor Shao Li proposed the concept of network pharmacology and the framework of network pharmacology of traditional Chinese medicine 54. Network pharmacology of traditional Chinese medicine is a strategy of biological information network construction and network topology analysis based on high-throughput data analysis, virtual computing, and network database retrieval. The holistic and systematic character of this research strategy is consistent with the principles of diagnosis and treatment of diseases in traditional Chinese medicine, as well as with the synergistic effects of the multiple components, multiple pathways, and multiple targets of traditional Chinese medicine and its prescriptions. With the rise of the network pharmacology of traditional Chinese medicine, in the past 10 years many Chinese medicine scholars have gradually adopted this method to analyze and explore classical traditional Chinese medicine prescriptions, single herbs, and the compounds of traditional Chinese medicine 55,56. At the same time, the introduction of the concept of network biology revealed that a healthy human body is a dynamically balanced biological network formed by genes, proteins, and other components. If the balance of this network is destroyed, the body presents a state of disease. The objective of treating diseases with drugs is to restore the equilibrium of the biological network.

In this study, we identified the active components of IN using three databases: TCMSP, TCMID, and BATMAN-TCM.
Screening by ADME and the related literature led to the identification of ten core active components. Among these ten components, four (indigo, indirubin, tryptanthrin, and β-sitosterol) have been tested in experimental studies using UC models in vivo and in vitro, and these studies showed different degrees of anti-inflammatory effects. For example, in mouse UC models induced by dextran sodium sulfate (DSS) and 2,4,6-trinitrobenzenesulfonic acid (TNBS), indigo significantly decreased the severity of colitis. Treatment with indigo significantly increased the levels of CYP1A1, IL-10, and IL-22 mRNA in lymphocytes in the lamina propria of the colon. In spleen cells treated with indigo, the number of IL-10 producing CD4+ T cells and IL-22 producing CD3-RoRγ+ T cells increased 45. TNF-α/NF-κB p65 and IL-6/STAT3 signaling pathways by inhibiting the degradation of IκBα and the phosphorylation of STAT3 57,58. Tryptanthrin also improves the body weight and pathology of DSS mice and increases the survival rate. It regulates the TNF-α/NF-κB p65 and IL-6/STAT3 signaling pathways by inhibiting the degradation of IκBα and the phosphorylation of STAT3 22. Spleen cells from mice with colitis treated with tryptanthrin produce less IL-2 and IFN-γ after mitogen stimulation than those from untreated mice. An LPS-induced inflammation model in RAW264.7 cells confirmed that indirubin and tryptanthrin have anti-inflammatory effects in vitro; the anti-inflammatory mechanism may involve the downregulation of IL-6/TNF-α, which may be useful for the prevention and treatment of UC 59. A common sterol in Chinese herbal medicine, β-sitosterol, prevents the shortening of the colon in C57BL/6 mice with colitis induced by TNBS, decreases the general score and myeloperoxidase activity, downregulates the proinflammatory cytokines TNF-α, IL-1β, and IL-6 and the inflammatory enzyme cyclooxygenase (COX)-2, and inhibits the activation of NF-κB 60.
It also significantly increases the expression of antimicrobial peptides in intestinal epithelial cells 61. Therefore, β-sitosterol may alleviate colitis by inhibiting the NF-κB pathway. Similar results were obtained with β-sitosterol in C57BL/6J mice with colitis induced by a high-fat western diet plus DSS 62. Taken together, several active components of IN are effective against UC.
Based on the above active components, we performed target fishing using the SEA, STITCH, STP, and PM databases, eliminated duplicates, and obtained a total of 933 indigo targets to construct a "drug-component-target" network. Then, we integrated four gene chips of GEO and UC disease targets from the DrugBank, DisGeNET, and TTD databases, eliminated duplicates, and finally obtained 913 disease targets. These were

To explore the underlying mechanism of IN, we performed GO and KEGG enrichment analyses of the core targets and searched BioGPS to observe the distribution of all compound targets in organs and tissues. The GO enrichment analysis showed that the core targets were enriched in biological process, cellular component, and molecular function terms, with a total of 28 items in cellular component. The targets involved plasma membrane components, cytoplasmic components, and cell junctions, and were enriched in 100 molecular function items, which were mainly related to the regulation of nuclear transcription, protein phosphorylation, and cytokine activity, among others. In biological process, the targets were mainly involved in the modification and metabolic regulation of reactive oxygen species, the positive regulation of epithelial cell proliferation, and apoptosis. In addition, we observed 20 pathways related to UC and constructed a "Targets-Pathways" network, which involved innate immunity, cellular immunity, classical inflammation, cell proliferation, and apoptosis. Intestinal innate immunity has recently received increasing attention. UC intestinal immune inflammatory damage is closely related to the abnormal activation of innate immunity, and pattern recognition receptors (PRRs) play an important role in innate immunity. Several types of PRRs (including Toll-like receptors and NOD-like receptors) maintain the mucosal immune interaction with the intestinal flora by recognizing pathogen-associated molecular patterns.
Overactivation of TLRs (such as TLR2 and TLR4) and NLRs (such as NLRP3) leads to the activation of NF-κB, which promotes the production and release of pro-inflammatory factors such as IL-1β, IL-18, and IL-33, and induces pyroptosis 66,67. The inflammatory necrosis of intestinal epithelial cells and the increase of intestinal mucosal permeability eventually lead to the occurrence and development of UC. In addition, Th17 cell differentiation and the Jak-STAT, T cell receptor, and IL-17 signaling pathways play an important role in maintaining the balance between Th17 and Treg cells. The immunomodulatory mechanism of Th17 and Treg cells has become a hot topic in immunology. Studies show that the pathogenesis of UC is closely related to the imbalance of Th17/Treg 68. Zhang et al. 69 showed that the differentiation of Th17 cells and the expression levels of RORγt, IL-17A, and IL-6 were increased in UC model mice, whereas the differentiation of Treg cells and the expression levels of Foxp3 and IL-10 were decreased. IL-6 is considered a key factor leading to the imbalance of Th17/Treg differentiation. Kimura et al. 70 found that AHR ligands (TCDD or FICZ) alone cannot induce the differentiation of Th17 and Treg cells. The cytokines IL-6 and TGF-β promote the effect of AHR ligands on increasing the differentiation of Th17 cells and the secretion of IL-17, whereas only TGF-β can increase the expression of Foxp3. Studies also show that different AHR ligands have different effects on inducing Th17/Treg differentiation. Innate lymphoid cells (ILCs), the "mirror cells" of CD4+ Th cells, were identified as a new target for UC therapy. They are divided into different groups, including ILC1, ILC2, ILC3, and ILCreg, according to the transcription factors they express and the cytokines they receive and secrete.
These localize to barrier tissues and participate in the maintenance of mucosal dynamic balance and host defenses against infection. Japanese scholars showed that indigo, the main ingredient, activates AHR, which upregulates ILC3/IL22 to achieve anti-inflammatory effects.

Scientific Reports | (2020) 10:15204 | https://doi.org/10.1038/s41598-020-71030-w | www.nature.com/scientificreports/
In addition, studies show that AHR is highly expressed in ILC2, although AHR inhibits ILC2 function 71. Among the different ILC subsets of the intestine, only ILC2 can be induced to produce IL-10 72. An ILC regulatory subgroup (also known as ILCreg) secreting IL-10 has not been identified 73; therefore, how indigo and its main components regulate Th17/Treg cells and ILCs needs to be further investigated. We found that IN targets were distributed in 84 tissues and organs in the BioGPS database (including colon and small intestine). We selected 17 tissues and organs highly related to immunity (such as bone marrow, lymph nodes, and lymphocytes) to construct an IN targets-organs location network, and most of the targets were highly expressed in several organs. This suggests that IN not only plays a role in the local intestinal mucosa, but may also stimulate anti-inflammatory processes in distant immune tissues and organs. Activation of these targets may also provide the biological basis for the adverse effects associated with IN, namely liver damage, right-sided colitis, and pulmonary hypertension. The network pharmacology of traditional Chinese medicine is a research method aimed at elucidating the effective components and action targets of traditional Chinese medicine, but there is still much room for improvement in the discipline itself. For example, more high-quality, comprehensive network pharmacology platforms need to be established. The existing information on traditional Chinese medicines, ingredients, syndromes, diseases, and targets should be fully integrated and continuously improved and supplemented. At the same time, as traditional Chinese medicine may act in the treatment of diseases through the coordination of multiple components, how to predict and evaluate the synergistic effect of multiple compounds remains a challenge.
In addition, the platforms lack information on whether active components activate or inhibit their targets and signaling pathways. If these shortcomings can be addressed, a more reliable theoretical basis can be provided for research on traditional Chinese medicine. With regard to the limitations of this study and the aspects that need further investigation, first, LC/MS technology could be used to verify and supplement the effective compounds of IN, and corresponding pharmacokinetic and metabolomic studies could be carried out. In terms of data collection, other disease databases could be searched to supplement the disease targets, and the differentially expressed genes could be verified in colonic mucosa samples from patients. In addition, recognized animal and cell models are needed to verify the relevant signaling pathways and targets.

Conclusions
Although IN is an effective mucosal repair agent, the optimal dose for inducing remission with low toxicity needs to be determined. In traditional Chinese medicine, IN is an herb that eliminates pathogenic factors from the body; however, whether it is suitable for long-term maintenance treatment of UC remains to be determined in future clinical trials. In addition, basic experiments are necessary to clarify the mechanism of action of this drug.

Data availability
All data can be obtained from the open-access websites listed in this article, and the conclusions can be reproduced by analyzing these data with the software described.