Integrative computational approach identifies drug targets in CD4+ T-cell-mediated immune disorders

CD4+ T cells provide adaptive immunity against pathogens and abnormal cells, and they are also associated with various immune-related diseases. CD4+ T cells’ metabolism is dysregulated in these pathologies and represents an opportunity for drug discovery and development. Genome-scale metabolic modeling offers an opportunity to accelerate drug discovery by providing high-quality information about possible target space in the context of a modeled disease. Here, we develop genome-scale models of naïve, Th1, Th2, and Th17 CD4+ T-cell subtypes to map metabolic perturbations in rheumatoid arthritis, multiple sclerosis, and primary biliary cholangitis. We subjected these models to in silico simulations for drug response analysis of existing FDA-approved drugs and compounds. Integration of disease-specific differentially expressed genes with altered reactions in response to metabolic perturbations identified 68 drug targets for the three autoimmune diseases. In vitro experimental validation, together with literature-based evidence, showed that modulation of fifty percent of identified drug targets suppressed CD4+ T cells, further increasing their potential impact as therapeutic interventions. Our approach can be generalized in the context of other diseases, and the metabolic models can be further used to dissect CD4+ T-cell metabolism.


INTRODUCTION
CD4 + T cells are essential components of the human immune system that fight against pathogenic invaders and abnormal cells by producing cytokines and stimulating other cells, such as B cells, macrophages, and neutrophils 1 . During an immune response, CD4 + T cells are activated and proliferate, and their metabolism adjusts to fulfill increased bioenergetic and biosynthetic demands. For example, activated effector CD4 + T cells are highly glycolytic 2 and use aerobic glycolysis and oxidative phosphorylation (OXPHOS) for proliferation 3 . On the other hand, naïve, resting, and regulatory CD4 + T cells are less glycolytic and use OXPHOS and fatty acid oxidation (FAO) for energy generation. Accordingly, metabolically dysregulated CD4 + T cells were observed in several diseases such as diabetes 4 , atherosclerosis 5 , cancers 6 , and autoimmune diseases such as rheumatoid arthritis (RA) 7,8 , multiple sclerosis (MS) 9 , primary biliary cholangitis (PBC) 10 , and systemic lupus erythematosus (SLE) 11,12 . Furthermore, metabolism of Type 1 T helper (Th1), Type 17 T helper (Th17), and inducible regulatory T cells have been found to be dysregulated in MS 13 . Controlling CD4 + metabolic pathways can be important in fighting against some immune diseases. For example, CD4 + T cells are hyperactive in systemic lupus erythematosus (SLE), and inhibiting glycolysis as well as the mitochondrial metabolism improved the outcome in an animal model 14 . Together, this evidence suggests a significant role of CD4 + T-cell metabolism in immune-mediated diseases.
Repurposing existing drugs for novel indications represents a cost-effective approach for the development of new treatment options 15 . Several studies have recently demonstrated the potential for drug repurposing in CD4 + T-cell-mediated diseases 16,17 . For example, 2-deoxy-D-glucose (anticancer agent) and metformin (antidiabetic drug) were shown to reverse SLE in a mouse model 14 . However, drug repurposing, as well as drug discovery and development efforts for targeting T-cell metabolism, have been limited due to the lack of knowledge about the key molecular targets in this context.
In recent years, analysis of large-scale biological datasets has emerged as a powerful strategy for discovering novel mechanisms, drug targets, and biomarkers in human diseases [18][19][20][21] . Here, we develop a computational modeling approach that integrates multi-omic data with systematic perturbation analyses of newly constructed whole-genome metabolic models of naïve CD4 + T cells and Th1, Th2, and Th17 cells. This led to identifying potential drug targets for CD4 + T-cell-mediated diseases (RA, MS, and PBC).

RESULTS
Identification of genes expressed in the CD4 + T cells We used the computational approach shown in Fig. 1 (see also Supplementary Methods 1) to construct metabolic models of naïve and effector CD4 + T cells. To identify metabolic genes expressed across CD4 + T-cell subtypes (naïve, Th1, Th2, and Th17 cells), we integrated transcriptomics [22][23][24][25][26][27][28][29][30] and proteomics data 31 (see Materials and methods; Supplementary Data 1). The comparison of genes expressed in CD4 + T-cell subtypes identified by different datasets is shown in Supplementary Fig. 1. The analysis showed that between 675 and 836 metabolic genes were expressed depending on the CD4 + T-cell subtype (Supplementary Data 2). Of these, 530 genes were expressed in all subtypes (Fig.  2a). On the other hand, 16,25,7, and 96 genes were specific to naïve, Th1, Th2, and Th17 cells, respectively. Pathway enrichment analysis using expressed metabolic genes suggested 6 enriched KEGG pathways common across all subtypes: carbon metabolism, TCA cycle, oxidative phosphorylation (OXPHOS), amino sugar and nucleotide sugar metabolism, and valine, leucine, and isoleucine degradation (Fig. 2b). Fatty acid degradation and pentose phosphate pathway were enriched in naïve CD4 + T cells only, and fatty acid metabolism was enriched in the naïve, Th2, and Th17 subtypes. No specific KEGG pathways were found enriched solely in Th1 and Th17 cells. Among the enriched pathways shared by all CD4 + T cells, the TCA cycle was enriched more than two-fold in naïve, Th1, and Th2 subtypes. Similarly, OXPHOS was enriched more than two-fold in naïve and Th1 subtypes (Fig. 2c). These results suggest that key metabolic pathways are active across all the subtypes. Importantly, the metabolism of various CD4 + T-cell subtypes can be different with respect to these pathways' levels of activity and the number of reactions active within the pathways.
Development and validation of genome-scale metabolic models of CD4 + T cells To further examine these issues, we developed constraint-based metabolic models specific to naïve CD4 + T cells, Th1, Th2, and Th17 cells. Genome-scale constraint-based metabolic models include a structural representation of a cell's entire biochemical reactions and help in computing biochemically feasible functional states 32 . These models consist of a stoichiometry matrix of biochemical reactions, exchange reactions (connecting cells internal metabolism with the environment), and additional constraints for upper and lower bounds for fluxes to be identified. Using modeling approaches such as Flux Balance Analysis, these models can be solved to predict flux distribution. In addition to identifying feasible flux distribution, the constraint-based models are useful in understanding human disease and identifying drug targets, modeling interspecies interactions, and metabolic analysis of across strains and species 33 . Our genome-scale metabolic models consist of 3956 to 5282 reactions associated with 1055 to 1250 genes (Supplementary Table 1; Supplementary Data 3-6).
The number of internal enzyme-catalyzed reactions was 2501, 1969, 2549, and 2640 for naïve, Th1, Th2, and Th17 models, respectively, distributed across 84 metabolic pathways (Supplementary Fig. 2; note that transport and exchange reactions were excluded). The models include more genes associations than expressed genes identified from the data because the modelbuilding algorithm inserts some reactions that are not supported by data but required for the model to achieve essential metabolic functions for biomass production (see Materials and methods). Furthermore, the algorithm kept the complete gene rules associated with reactions catalyzed by multiple isozymes if one of them was expressed in the data. Thus, the final models comprise~30% more genes than found in the data. Removing the unexpressed genes would not delete the model's reaction because the data support at least one isozyme. We compared our models with an existing model for naïve CD4 + T cell (CD4T1670 34 ) (Supplementary Table 1). Since our models exclude blocked reactions and dead ends, we have removed dead ends from CD4T1670 for fair comparisons. Since we based all models of this study on the more recent human metabolic network Recon3D 35 , they include more reactions and metabolites than CD4T1670, based on Recon2 36 .
We validated the models based on the active pathways and gene essentiality. We first identified pathways that are known to be active in different CD4 + T-cell subtypes (see Materials and methods) and searched for their activity (with non-zero fluxes) in the corresponding models through Flux Balance Analysis (FBA). Several major pathways were in agreement with the literature. These include glycolysis, TCA cycle, glutaminolysis, and pyruvate to lactate conversion (aerobic glycolysis) that showed non-zero flux in all the models. We present an illustrative flux map of the pathways mentioned above for the naïve model in Fig. 3. The figure also indicates flux differences with the Th1 model for critical reactions. Flux maps for the Th1, Th2, and Th17 models are shown in Supplementary Figs. 3-5. Furthermore, we show some specific Integrative approach for the identification of potential metabolic drug targets. The computational approach comprised of five major steps: (1) Construction of metabolic models using integrated transcriptomics and proteomics data, (2) Identification of metabolic genes that are targets for existing drugs/compounds, (3) In silico inhibition of targets of existing drugs to identify affected reactions, (4) Identification of differentially expressed genes (DEGs) in autoimmune diseases and their integration with flux ratios (Flux perturbed /Flux WT ), and (5) Validation with literature and prediction of new targets.
observed behaviors of CD4 + T cells collected from the literature used for validation in Table 1. For comparison, we also performed these validations in CD4T1670, also shown in Table 1. In all the models developed in this study, limiting glucose from the environment resulted in a decreased growth rate (Fig. 4a), which is in agreement with existing knowledge 37,38 but not reproduced by the CD4T1670 model ( Supplementary Fig. 6). All our models produced lactate 39 (Table 1, Supplementary Fig. 7). Literature shows that increasing PDHm (pyruvate dehydrogenase) by inhibiting PDHK (pyruvate dehydrogenase kinase) would redirect flux from pyruvate to TCA cycle and, therefore, will decrease the lactate production 38 . This was reproduced by our models (Fig. 4b).
Our models were also able to reproduce the essentiality of Leucine, Arginine, and ACC1 for T-cell growth (Table 1), in agreement with the literature 38 . It is important to note that the activity of some pathways in the models was not agreeing or partially agreeing with the literature. Specifically, we did not observe a significant effect on growth rate when glutamine 40 was removed from exchange reactions in the effector CD4 + T-cell models (Fig. 4c). However, literature has shown that while transporters of glutamine in activated CD4 + T cells are dispensable, removal of glutamine is critical and affects growth 38 . Our systematic analyses showed that (1) inhibiting glutamine synthase (GLNS) (that converts glutamate to glutamine) in the absence of glutamine uptake in the model decreases growth to zero, and (2) glutamine can have an impact on growth when glucose availability is limited (less than 2 mmol/g.DW/hr) (Fig. 4d). Using CD4T1670, we observed no effect on biomass when varying glutamine in the presence and absence of glucose (Supplementary Fig. 8). Thus, we may hypothesize that glutamine might be conditionally critical for CD4 + T cells specifically when the availability of other nutrients is low.
Next, we predicted essential genes and compared the results against independent data to identify the overlap between genes predicted by our models and identified from experiments. Because large-scale essentiality datasets were not available for CD4 + T cells, we used essentiality datasets available for different human cancer cell lines. Gene deletion analysis predicted 84, 95, 81, and 84 genes as essential in the naïve, Th1, Th2, and Th17 models, respectively (Supplementary Data 7). More than 70% of these predictions agreed with genes experimentally defined as essential and conditionally essential in those different but related cell lines 41 (Supplementary Fig. 9). To assess if higher-ranked predictions compared better with gene essentiality data, we generated precision-recall curves using 84, 95, 81, and 84 genes, respectively, from each model that were identified as essential. The area under the curve for all models was >75% (Supplementary Figs. [10][11][12][13]. Additional validations based on CD4 + T-cell-specific essential functions are presented in Supplementary Methods 3. Overall, the validation confirmed that our constraint-based metabolic models specific to naïve CD4 + T cells, Th1, Th2, and Th17 cells represent relevant and realistic systems to examine drug response and predict drug targets. Mapping existing and identifying potential drug targets in CD4 + T cells We used the validated CD4 + T-cell-specific models to predict potential drug targets and combined it with the publicly available drug repurposing and tool compound data set from the Connectivity Map (cMap) database and mapped the approved drugs, clinical drug candidates, and tool compounds in the dataset with the metabolic genes in the models (Fig. 5a). Next, we performed in silico knock-outs of the associated drug target genes. Due to the presence of isozymes, not all the deleted target genes influenced the reaction(s). We identified 86, 79, 86, and 90 target genes whose deletion blocks at least one associated reaction in naïve, Th1, Th2, and Th17 models, respectively (Fig. 5b). In turn, these disruptions block multiple downstream reactions. Of these, 62 were common among four CD4 + T-cell subtypes (Fig.  5c). Four genes were targeted only in Th1 cells. All modeled gene deletions resulted in altered flux distributions that were quantified using flux ratios. For each drug target deletion, we classified all reactions into three categories (see Materials and methods): (1) reactions with decreased fluxes (down-reactions), (2) reactions with increased fluxes (up-reactions), and (3) reactions without any changes. We used these flux ratios to identify potential drug targets specific for immune diseases by exploring how diseasespecific metabolic functions are affected upon each drug target inhibition.
First, we identified disease-specific metabolic functions for RA, MS, and PBC using differential gene expression analysis of publicly Fig. 2 Construction of metabolic models in CD4 + T cells. a Expressed metabolic genes identified using integrated transcriptomics and proteomics data of CD4 + T-cell subtypes. b KEGG pathway enrichment analysis of expressed genes in each cell type using all 1892 metabolic genes as a background. c Fold enrichment and P-values (larger sizes correspond to lower P-values) of KEGG pathways enriched across CD4 + T-cell subtypes. A pathway was considered significantly enriched with P-value < 0.05 and false discovery rate (FDR) <5%.
available patients' data (Case-Control studies) (see Materials and methods). We identified 986, 472, and 545 differentially expressed genes (DEGs) (Benjamini & Hochberg P adj < 0.05) for RA, MS, and PBC, respectively (Supplementary Data 8). From these DEGs, we selected genes relevant to our metabolic models. For example, 38 metabolic genes were upregulated and 29 genes were downregulated in RA (Fig. 5d). Biological process enrichment analysis (Fisher's exact FDR < 0.05) identified purine metabolism and starch sucrose metabolism as enriched in upregulated genes. On the contrary, lysine degradation, fatty acid elongation, and carbon metabolism were downregulated. Enriched metabolic pathways for all three diseases are shown in Supplementary Data 8.
To identify potential drug targets for the aforementioned diseases, we looked for target genes whose deletion (inhibition) would have the appropriate effect on diseases' DEGs. For each gene inhibition, we specifically investigated the decrease in metabolic flux through reactions controlled by genes upregulated in disease and increase in metabolic flux through reactions controlled by genes downregulated in disease (Fig. 5e). Using flux ratios of metabolic DEGs, we calculated a perturbation effect score For key reactions, flux comparison between naïve and Th1 is shown through bar plots. Both naïve and Th1 models convert pyruvate to lactate (aerobic glycolysis). In glycolysis, the naïve model had the reverse direction flux through PGI reaction, while Th1 cells have forward direction flux. All the models exhibit an uptake of glutamine that ultimately forms α-Ketoglutaric acid (glutaminolysis). GLNtm (glutamine transporter) and GLUNm (convert glutamine to glutamate) reactions are active in the naïve model and not in models that use different routes for glutamine to glutamate conversion. (PES; see Materials and methods) for each drug target gene in each pair model/disease. PES represents the effect of gene inhibition on both upregulated and downregulated genes. A positive PES value for the drug target gene means that its inhibition decreases more fluxes controlled by genes upregulated in disease than it increases, or increases more fluxes controlled by genes downregulated in disease than it decreases. As such, inhibition of that gene target reverses the fluxes controlled by disease DEGs. In contrast, a negative PES means that the inhibition of a target gene increases more fluxes controlled by upregulated genes or decrease the more fluxes controlled by downregulated genes than the opposite. Among the different combinations of cell types and diseases, the PESs range was from −2 to 2 ( Supplementary Fig. 14). Based on these considerations, genes with higher positive PES can serve as potential drug targets for the disease.
Using PES as a measure of target relevance, we identified 62 potential drug targets that were common to our models (Fig. 5c). These genes displayed various PES ranks across models and diseases. To choose drug targets that performed better across different CD4 + T cells, we considered PES ranks of the four subtype-specific models. First, we normalized the PES ranks by transforming them into Z-scores in each model. Since the studied autoimmune diseases typically involve more than one type of CD4 + T-cell subtype, we next summed up the Z-scores of all the models within a disease for each drug target ( Supplementary Fig.  14). A minimum aggregate Z-score represents overall high PES ranks predicted across four cell types. Therefore, a gene with a minimum aggregated Z-score could be a potential high confidence drug target. We used a Z-score cutoff of −1 (1 standard deviation lower than the mean aggregated Z-score) and identified 17, 27, and 24 potential drug targets for RA, MS, and PBC, respectively ( Table 2; see more details in Supplementary Table 2). Ranking based on aggregated Z-scores is provided in Supplementary Data 9. Taken together, our combined use of the diseasematched genome-scale metabolic models of CD4 + T cells and the well target-annotated public dataset of bioactive compounds generated a manageable list of potential drug targets suitable for deeper analysis and follow-up.
43 of the 68 predicted drug targets were found for only one disease. Six drug targets (LSS, ABAT, SQLE, FDFT1, CAT, CS) were common to RA and MS; five drug targets (NAMPT, EPHX2, COMT, HIBCH, GAA) were common to RA and PBC; and two drug targets (ADH5, FASN) were common to MS and PBC. We show the drugs and compounds available for these targets in Table 3. Out of the 55 unique drug targets identified across three diseases, 38 were robust across different cutoffs (Supplementary Methods 9). A few examples of drug targets from purine metabolism, fatty acid synthesis, and the TCA cycle are shown in Fig. 6.

Experimental validation
Next, we experimentally validated our predictions by targeting genes that have not been reported in the literature to suppress CD4 + T-cell proliferation. We selected ten FDA-approved drugs based on their association with reactions belonging to pathways we were interested in, while ensuring they would cover at least one target from each disease. Four of the tested drugs indeed resulted in decreased CD4 + T-cell proliferation. Genes targeted by these drugs include COMT (RA, PBC), CPT2 (MS), DPYD (MS), and ACSL3 (PBC). COMT is associated with five different reactions of the tyrosine metabolism (Fig. 6f). The genes CPT2 and ACSL3 both catalyze the production of long-chain Acyl-CoA in fatty acid biosynthesis from different substrates (Fig. 6c). DPYD catalyzes the production of uracil and thymine from dihydrouracil and dihydrothymine in pyrimidine metabolism. We subjected human CD4 + T cells stimulated with TCR signaling and IL-2 to different doses of drugs for 48 h and 72 h. Their proliferation was assessed by the MTT colorimetric cell proliferation assay (Fig. 7). A reduction of CD4 + T-cell proliferation was observed for every chosen drug. Entacapone, targeting the COMT gene, showed significant activity at 48 h and 72 h at 100 μM and 1000 μM (Fig. 7a). EPA, targeting ACSL3, decreased proliferation at the highest dose (1000 μM) at 72 h, while the other concentrations did not significantly affect proliferation (Fig. 7b). Perhexiline, targeting CPT2, also reduced proliferation at 72 h at 100 μM and 1000 μM (Fig. 7c). Fluorouracil, targeting DPYD, is the only drug that showed an effective impact on CD4 + T-cell proliferation at a lower dose (1 μM) at 72 h (Fig. 7d). Interestingly, Entacapone, Perhexiline exhibit a significant enhancement of proliferation at a low dose (1 μM), indicating that these drugs can present a biphasic effect in T-cell proliferation. Our experimental validations thus indicate that the perturbation of the activity of our predicted targets can impact CD4 + T-cell proliferation in a time and dose-dependent manner, as we have recently envisioned 46 .
Six drugs showed no or opposite effect on the T-cell proliferation, including DFMO, Pemetrexed, N6022, Ethyl-pyruvate, Pyrazinamide, and Quercetin. As an example, Fig. 7e shows that DFMO, targeting ODC1, does not impact T-cell proliferation with the indicating dose. The data for the other five drugs are shown in Supplementary Fig. 15.
Taken together, our analysis has identified 68 possible drug targets of relevance to metabolic regulation of autoimmune diseases. We discuss the implications of our results in more detail below.

DISCUSSION
Our predicted drug targets were classified into two categories: validated and potential. We considered a target to be validated if previously explored as such in the context of RA, MS, and/or PBC. Potential targets, those not previously reported as such for the three diseases we focused on here, were further classified into three subcategories: target genes that are supported by published Fig. 4 Model predictions are consistent with the literature. a Dependency of growth rate to varying rate of glucose uptake. b Production of lactate with increased flux through pyruvate dehydrogenase. c The dependency of growth rate (in all models) on glutamine when glucose was available (>5 mmol/g DW/hr). d The dependency of growth on glutamine when glucose was removed from the environment. Dots in a, c, and d are average flux and error bars are standard deviation (n = 5). We varied the uptake rate of glucose or glutamine five times (n = 5) to obtain the average and standard deviation at each dot. For example, in a, the growth rate at 3.0 mmol/g DW/hr glucose uptake is the average of growth rates obtained under 3.2, 3.1, 3.0, 2.9, 2.8 (mmol/g DW/hr) of glucose uptake. experimental data, predicted target genes for which no data is currently available, and predicted drug targets for which we provide experimental validation. We will discuss select examples of targets in each of these categories to illustrate ways in which our model and analysis can be used to advance future drug repurposing as well as drug discovery efforts.
Our predictions include three genes (DHODH, ACAT1, and DHFR) that code for proteins targeted by approved drugs currently used Fig. 5 Identification of potential drug targets for RA, MS, and PBC. a Distribution of metabolic drug target genes, and inhibitory drugs or compounds in each model. b Number of metabolic genes in the models mapped with inhibitory drugs (blue bars) and number of genes among drugs mapped genes that can block at least one reaction upon inhibition (red bars). c Comparison of metabolic drug targets that affect reactions upon deletion in CD4 + T-cell models. d Number of all differentially expressed genes (DEGs) and metabolic DEGs in three diseases rheumatoid arthritis (RA), multiple sclerosis (MS), and primary biliary cholangitis (PBC) (P adj < 0.05). The DEGs were analyzed using three transcriptomics datasets (one dataset per disease). The data were obtained from peripheral CD4 + T cells of groups of patients and healthy individuals. e Schematic representation of the integration of disease-associated differentially expressed genes and affected reaction on each drug target gene perturbation. For each drug target deletion, we investigated how many of fluxes regulated by upregulated genes are decreased and fluxes regulated by downregulated genes are increased. We used these numbers to calculate PES (perturbation effect score, see Materials and methods). to treat autoimmune diseases [42][43][44][45]47 . A strong example of an already validated drug target predicted by our models is dihydroorotate dehydrogenase (DHODH), a key enzyme in the de novo pyrimidine synthesis pathway, and a target of leflunomide, an approved drug for rheumatoid arthritis 43,48 . DHFR (dihydrofolate reductase) is a well-established oncology drug target, and also an immunosuppressant and anti-inflammatory target 49 . Low doses of an FDA-approved DHFR inhibitor methotrexate have been found effective as a treatment for MS, RA, and Crohn's disease 45 . Mitochondrial acetyl-CoA acetyltransferase (ACAT1) is a target for FDA approved anti-inflammatory drug sulfasalazine in inflammatory bowel syndrome. Furthermore, this drug is indicated for treatment for rheumatoid arthritis and ulcerative colitis 50 . Taken together, our models successfully replicated current clinical practice, further strengthening the value of our approach. Interestingly, a major subcategory of gene targets we identified code for proteins that have not previously been explored for the treatment of RA, MS, and PBC. We can now use these insights to formulate preclinical and clinical hypotheses. For example, ABAT, which encodes the GABA-transaminase enzyme that breaks down γ-aminobutyric acid (GABA; a neurotransmitter), was identified in our analysis as a potential target. While ABAT has not been previously identified as a drug target for RA, we can hypothesize that its inhibition may increase free GABA levels which would, in turn, inhibit CD4 + T-cell activation. The relationship between GABA levels and suppressions of CD4 + T-cell activation has been previously reported, further suggesting a link between neurotransmission and immune response [51][52][53] . There are currently two Fatty acid synthase linked to the pathogenicity of Th17 cells 106 .
Citrate synthase (CS) Increased citrate in MS patients 109   FDA-approved drugs, vigabatrin, and phenelzine, that target ABAT. Although we do not expect that either one of these agents can be repurposed to treat autoimmune disease given that they are an anti-seizure medicine and an antidepressant, respectively, we propose that further analysis of relationships between neurotransmission and immune response offers an interesting targeting opportunity 42 . Another example is glutathione reductase (GSR), an enzyme that reduces oxidized glutathione disulfide to cellular antioxidant GSH 54 . It has been shown that inhibition of the de novo GSH synthesis can reduce the pathological progression of experimental autoimmune encephalomyelitis (EAE) 44 . Here, carmustine, a chemotherapy drug, is an FDA-approved drug that targets GSR, offering a viable starting point for future pre-clinical testing (Table 3). Additional high confidence predictions and target candidates are a group of genes that have been experimentally shown to repress CD4 + T cells upon inhibition. This list includes nicotinamide phosphoribosyltransferase (NAMPT), which we now predict is a drug target for RA. This enzyme is involved in NAD + synthesis 55 and was previously explored as a drug target in EAE for MS 56 , melanoma, T-cell lymphoma, and leukemia 57 . Given that two NAMPT inhibitors, GMX1778 and FK-866, are in phase II clinical trials (Table 3), this enzyme represents a target where pre-clinical testing and followup may lead to drug repurposing opportunities. Another example worth highlighting is epoxide hydrolase 2 (EPHX2), which converts toxic epoxides to non-toxic dihydrodiols 54,55 . Its inhibition was reported to decrease the production of proinflammatory cytokines in preclinical evaluation in inflammatory bowel syndrome 58 . For EPHX2, an inhibitor GSK2256294A is in phase I clinical trial, indicating that developing drugs for this target may be possible. Moreover, our model implicated enzymes such as pyruvate kinase (PKM), which impacts glycolysis, HMG-CoA reductase (HMGCR), which regulates cholesterol biosynthesis, and adenosine deaminase (ADA), which converts harmful deoxyadenosine to not harmful deoxyinosine. Each of these enzymes has been experimentally linked to T-cell proliferation and development [59][60][61] , and all three targets have been the subject of previous drug development campaigns (Table 3). ATP Citrate Lyase (ACLY), Catalase (CAT), Farnesyl diphosphate synthase (FDPS), Lanosterol synthase (LSS), Squalene epoxidase (SQLE), and Superoxide dismutase 2 (SOD2) also represent targets we identified. SQLE is involved in cholesterol biosynthesis, and in general agreement with recent reports that inhibiting cholesterol pathways can suppress T-cell proliferation 60 . The loss of SOD2 can increase superoxides and defective T-cell development 62 . For all these targets, either preclinical, clinical, or approved inhibitors are available, which we consider encouraging for further study and drug repurposing (see Table 3 for details).
Other predicted genes are part of the TCA cycle (Citrate synthase (CS), Isocitrate dehydrogenase 2 (IDH2)), ribonucleotide biosynthetic processes (Phosphoribosyl pyrophosphate amidotransferase (PPAT), Carbamoyl-phosphate synthetase 2, Aspartate transcarbamylase, and Dihydroorotase (CAD)), and lipid biosynthesis (Fatty acid synthase (FASN)) that are also important for T cell development. As with the examples above, many of these potential candidate targets have inhibitors that are in different stages of preclinical and clinical development, and some (like PPAT and FASN inhibitors) have been FDA approved (Table 3).
In addition to gene targets with robust or partial existing experimental evidence, we identified 31 potential gene targets for which no evidence currently exists. These genes are involved in glycolysis, TCA cycle, OXPHOS, and the metabolism of fatty acid, pyruvate, purine, pyrimidine, arginine, proline, and tyrosine, which are all critical for T-cell activation and proliferation 39,63 . We performed in vitro experiments to investigate predicted targets without existing published evidence and we tested ten FDAapproved drugs. In four cases of drug targets (COMT, ACSL3, CPT2, and DPYD), the drugs (entacapone, EPA, perhexiline, and fluorouracil, respectively) inhibited the proliferation of CD4 + T cells in a dose and time-dependent manner. Some drugs displayed a "biphasic effect" where it encouraged proliferation at a low dose while inhibiting T-cell proliferation at a higher dose. This dose-response effect, called hormesis, is usually an adaptive survival response to toxic agents or stressful environments. Hormetic effects are, in general, produced by highly conserved evolutionary systems and characterized by the coordinated activation of molecular signals triggering coherent metabolic responses to maintain cell homeostasis 64 . Indeed, it is well established that the ability of T cells to modulate their metabolism upon environmental changes is key to preserve their survival and fulfill their functions 65 . In the case of ODC1, we did not observe any impact on CD4 + T-cell proliferation under any tested concentration of DFMO at both 48 and 72 h. It is difficult to explain this lack of effect without further experimental investigations. It might be that ODC1 is in large excess in our cell cultures, or that the amount of active enzyme is tightly regulated.
Collectively, our models and approach led to the identification of potential high-value targets for RA, MS, and PBC treatment, and proposed several drugs in current clinical use for drug repurposing.
Data integration enabled us to build, refine, and validate highquality cell-type-specific models. While many of the major pathways important for CD4 + T-cell activation and proliferation are commonly active across different CD4 + T-cell subtypes we tested, the models differ with respect to how these pathways are used for growth. For example, higher activity of the fatty acid Table 3 continued oxidation pathway is more important in naïve but not in effector CD4 + T cells that have elevated glycolysis 63 and fatty acid synthesis pathways. A substantial number of essential genes identified from these models overlapped with gene essentiality data obtained from different cell lines 41 . When compared to gene essentiality from independent cell line data, the effector cell models performed slightly better than the naïve model. One possible reason for this difference might be the difference in nutrient preferences between naive and effector T cells. We are conscious that assessing model performances based on comparison with essential genes defined on different cell lines is not ideal. Properly assessing their accuracy would require CD4 + T-cellspecific essentiality data. Integration of disease-associated DEGs with flux profiles under gene knock-out helped us to select disease-specific drug targets. While computational models of signal transduction in CD4 + T cells 66,67 are available, metabolic models of effector and regulatory CD4 + T cells have not been developed (except for naïve CD4 + T cells 34 ). Similar metabolic models were previously used to predict drug targets against pathogens 18 and complex diseases such as cancers 68 . While our approach can be generalized for human diseases and used with any -omics dataset, the unavailability of reliable data contributes to some limitations. Because of heterogeneity with respect to time after stimulation with cytokines in the available datasets, constructed models represent inclusive metabolic phenotypes during activation and proliferation for each CD4 + Tcell subtype. Thus, time-specific data would be required to study metabolic phenotype at a specific time point in CD4 + T-cell development. Furthermore, by integrating gene expression and proteomics data, we improved the identification of expressed genes compared with the sole use of gene expression or proteomics datasets. This approach presents certain limitations because of post-transcriptional and post-translational modifications. Integrating more functional data such as enzyme activities and measured metabolic fluxes could further improve the selection of active reactions within the context of specific models. In addition, the biomass objective function used in our study is not specific to CD4 + T cells. We used a biomass objective function generated for human alveolar macrophages. Differences in growth conditions and the respective sizes of macrophages and CD4 + T cells might impact the ratios of precursors used for biomass objective functions. Since the precursors for biomass production would be the same across different cell types, in the absence of comprehensive data on precursor ratios from a specific cell type, the objective function of similar cells is useful for obtaining flux distribution. A specific objective function that considers varying utilization of precursor metabolites (such as glycolysis intermediates) by different CD4 + T cells for biomass production might Fig. 7 Analysis of CD4 + T-cell proliferation response upon drug treatment by MTT assay. CD4 + T cells were exposed to various concentrations of drugs (1, 10, 100, and 1000 μM) for 48 h (white bars) and 72 h (orange bars). Drugs' names (Entacapone, EPA, Perhexiline, Fluorouracil, and DFMO) were indicated on the top of each graph bar with their corresponding targeted gene in parentheses. Cell proliferation is expressed as fold change ± SEM relative to untreated control cells and is representative of four independent experiments. Statistic significance was only shown for effective concentration and was evaluated using a paired t-test, one-tailed (*p < 0.05, **p < 0.005).
further improve the models. However, we have shown that changing objective functions (from the Recon3D to macrophage model) had no significant impact on the constructed models (see Materials and methods for details). Similarly, reliable diseasespecific data were unavailable for specific CD4 + T-cell subtypes, therefore, building subtype-specific cell metabolic models under disease conditions was not possible. We mitigated this limitation by integrating disease-specific DEGs from sorted CD4 + T cells with models, which resulted in metabolic fluxes relevant to diseases. In the future, with the availability of more disease and cell-typespecific data, our integrative approach may further improve these results. In particular, we hope to include other cell-types such as T follicular helper (Tfh),T helper 9 (Th9), and T helper 22 (Th22) cells, which also play a role in the development of autoimmune diseases.
Overall, our integrative systems modeling approach has provided a new perspective for the treatment of RA, MS, and PBC. Moreover, the newly constructed models may serve as tools to explore the metabolism of CD4 + T cells. Additionally, our approach is generalizable to other disease areas for which reliable disease-specific data are available, making it a potentially important computational platform for both drug target identification and prioritizing targets for drug repurposing efforts.

MATERIALS AND METHODS High throughput data acquisition and integration
We collected transcriptomics data from the GEO 69 database and proteomics data 31 . A total of 121 transcriptomics [22][23][24][25][26][27][28][29][30] and 20 proteomics 31 samples relevant to the CD4 + T cells were selected (Supplementary Data 1). Transcriptomics data analysis was performed using the affy 70 and limma 71 R packages. Because we aimed to characterize gene activities instead of gene expression levels, the processed transcriptomics data were discretized (active = 1; inactive = 0) and samples for each cell type were combined. Genes expressed in more than 50% of the samples in which the probe was present were considered as active (see Supplementary Methods 2). Similarly, proteins expressed in more than 50% of samples in the proteomics dataset were considered as active. In the proteomics datasets, protein IDs were mapped to gene IDs.
Next, we integrated activities from transcriptomics and proteomics datasets. First, biological entities that overlapped in both types of data were selected as high-confidence. Second, we found that some genes were expressed in the majority of transcriptomics datasets, but expressed in less than 50% samples of proteomics data. Similarly, some proteins were identified within groups of highly abundant proteins in multiple samples in proteomics datasets but expressed in less than 50% samples of transcriptomics datasets. Such non-overlapping genes were selected as moderate-confidence based on consensus in single types of -omics data (Supplementary Methods 2.1.1). Third, moderate-confidence genes exclusively present in the transcriptomics data were added to the overlapping genes if expressed in at least 90% of samples. Fourth, moderateconfidence genes exclusively present in the proteomics dataset were added if their abundance was ranked in the top 25% (fourth quartile) (Supplementary Methods 2; Supplementary Data 10). We used these cutoffs to decrease the false negatives while not selecting false positives by removing genes and proteins that are not expressed in any sample either in transcriptomics or proteomics data.

Model validation
Models were validated based on literature knowledge related to active pathways in proliferating and differentiated CD4 + T cells. CD4 + T-cellspecific metabolic functions were searched in the literature using PubMed 76 . Naïve CD4 + T cells tend to have low energy demands and mainly rely on fatty acid β-oxidation, oxidation of pyruvate, and glutamine via the TCA cycle 77 . On the other hand, the high bioenergetics demand in effector cells is met by shifting OXPHOS to glycolysis and fatty acid oxidation to fatty acid synthesis 39 . Furthermore, similar to cancer cells, proliferating effector CD4 + T cells convert lactate from pyruvate by lactate dehydrogenase enzyme 39 . Thus, we obtained the flux distribution of metabolic pathways under wild-type conditions using Flux Balance Analysis (FBA) and searched the non-zero fluxes through the aforementioned pathways in all the models. Flux maps were created using Escher web application (https://escher.github.io/#/) 78,79 . It has also been observed previously that deficiency in glucose and glutamine impairs CD4 + T-cell activation and proliferation 37,80 . We performed this experiment in silico, whereby we varied the flux through exchange reactions of glucose (EX_glc_D[e]) and glutamine (EX_gln_L[e]) in the models and analyzed the effect on growth rate.

Comparison of essential genes predicted by the models and identified in different cell lines
To predict gene essentiality, we knocked out model genes to predict their effect on the growth rate. This was performed using singleGeneDeletion in the COBRA toolbox using the Minimization of Metabolic Adjustment (MoMA) method 81 . Because of the unavailability of CD4 + T-cell-specific data, predicted essential genes were compared with experimentally identified essential genes in humans from different cell lines. The data for experimentally tested essential and nonessential genes for human were obtained from the OGEE database 41 . In this database, the essentiality data for humans was compiled using 18 experiments across various cell lines that include RNAi-based inhibition, CRISPR, and CRISPR-CAS9 systems. To investigate how many model-predicted essential genes are also essential in other cell lines, predicted essential genes were compared with experimentally observed essential and conditionally essential genes reported in the OGEE database. Essential and conditionally essential genes were merged together. Additional validations of models using CD4 + T-cellspecific essential genes can be found in Supplementary Methods 3.

Mapping drug targets
The developed models were used to predict potential drug targets for autoimmune diseases in which effector subtypes have been found hyperactive 82,83 . Therefore, a reasonable drug target should downregulate effector CD4 + T cells. Among the metabolic genes of selected models, we first identified targets of existing drugs. The drugs and their annotations including target genes were imported from The Drug Repurposing Hub 84 in the ConnectivityMap (CMap) database 85 (see also Supplementary Methods 4). All withdrawn drugs and their annotations were first removed. In this list, the gene symbols of target genes of drugs were converted to Entrez IDs. Next, we searched Entrez IDs from CMap data in the genes of metabolic models. For each mapped gene in the model, the drugs were listed.

Metabolic genes differentially expressed in autoimmune diseases
The lack of reliable data from specific CD4 + T-cell subtypes involved in autoimmune disease conditions led us to utilize patients' data (casecontrol studies) available for autoimmune diseases that were collected from peripheral CD4 + T cells. Datasets GSE56649 86 (rheumatoid arthritis), GSE43591 87 (multiple sclerosis), and GSE93170 88 (primary biliary cholangitis) were obtained from the GEO database (see details in Supplementary Methods 5). Raw data files were processed using the affy and limma packages 70,71 in Bioconductor/R. The limma package was used to identify DEGs between patients and healthy controls. For significant differential expression, selective cutoffs of fold-changes were used with adjusted Pvalues < 0.05. For differentially expressed genes, we used a two-fold cutoff. A cutoff of 1.5-fold was used for datasets where two-fold resulted in a very low number to zero differentially expressed metabolic genes.

Perturbation of metabolism and perturbation effect score (PES)
In metabolic models, the knockout of genes that are targets of existing drugs was performed in the COBRA toolbox using MoMA 81 (Supplementary Methods 6). For each knockout, we investigated the change in fluxes regulated by DEGs in diseases. The change in fluxes was computed using flux ratios of perturbed flux/WT flux, and all fluxes that are affected by each perturbation were calculated. We counted fluxes regulated by upregulated genes that are decreased or increased after perturbation (UpDec and UpInc) as well as fluxes regulated by downregulated genes that are decreased or increased after perturbation (DownDec and DownInc). The total number of fluxes for each perturbation also include upregulated genes that were unchanged after perturbation (UpUnc) as well as downregulated genes that were unchanged after perturbation (DownUnc) (see also Supplementary Methods 7). For each perturbed gene, a perturbation effect score (PES) was calculated as:  Next, for each disease and model combination, the ranks of PES were computed. The gene with the highest PES obtained the top rank and the one with the minimum PES obtained the lowest rank. For each disease, we prioritized drug targets by utilizing their ranks across all models. The PES ranks in each model were first transformed into Z-score as: where x is a PES rank, μ is the mean of PES ranks in a model for one disease, σ is the standard deviation of PES ranks obtained by a model for one disease. For each disease type and each gene, Z-scores across four models were summed up to calculate an aggregated Z-score. Genes were ranked based on minimum to maximum aggregate Z-scores (Supplementary Methods 8).

Pathway enrichment analysis
For biological processes enrichment analysis, we used DAVID V6.8 89 , and STRING database 90 together with Gene Ontology biological processes 91 , KEGG pathways 92 , and Reactome pathways 93 . A cutoff of 5% 94,95 False Discovery Rate (FDR) and P-value < 0.05 were used for significant enrichment. The pathway maps used in Fig. 6 were generated with yED graph editor software. The ChEMBL IDs 96 of drug targets can be found in Supplementary Table 2.

Experimental validation
We used frozen vials of peripheral blood mononuclear (IXCells Biotechnologies) from healthy donors. CD4 + were purified by negative selection using magnetic human CD4 + T cells nanobeads (MojoSort, Biolegend) according to the manufacturer's protocol. For cell activation, anti-CD3 (clone OKT3, Biolegend) was coated on plates at 4 µg/ml overnight in PBS at 4°C. Cells were then cultured in X-Vivo media (Lonza) supplemented with 2 µg/ml of anti-CD28 (clone CD28.2, Biolegend) and with 10 ng/ml of recombinant IL-2 (Peprotech) for 7 days. Half of the media was renewed every 2-3 days by adding fresh media supplemented with 2 µg/ml of anti-CD28 and 10 ng/ml IL-2.
We purchased alpha-difluoromethylornithine (DFMO) and eicosapentaenoic Acid (EPA) from Cayman Chemical. DFMO was resuspended in water at 50 mM while EPA was already resuspended in ethanol at 826 mM. Perhexiline, Entacapone, and Fluorouracil were obtained from Tocris and dissolved in DMSO to make a stock solution at 12 mM, 200 mM, and 198 mM, respectively. For drug treatment, CD4 + T cells were seeded at 50,000 cells in 96 round bottom wells in culture media supplemented with anti-CD3, anti-CD28, and IL-2. Before incubation with CD4 + T cells, drugs were diluted in culture media with concentration ranging from 1 µM to 1000 µM and incubated for 48 h or 72 h.
We assessed T-cell proliferation using the TACS MTT proliferation assay (R&D Systems). Briefly, a tetrazolium salt solution (10 μl) was added to each well, and the plate was incubated at 37°C for 4 h. After incubation, 100 μl of stop solution was added to each well and incubated overnight before absorbance measurement. Cell proliferation was read at 48 h and 72 h post drug treatment. The absorbance was measured at 570 nm using the BioTek B.L. Puniya et al. microplate reader instrument (BioSPX). We corrected cell absorbance readings using cells treated with DMSO or the media controls.
The CD4 + T-cell proliferation upon drug treatment was analyzed statistically with a paired t-test, one-tailed for four independent experiments. All data are presented as mean plus or minus standard error of the mean (SEM) and analyzed using GraphPad Prism software. The fold change of cell proliferation-cultured was calculated using untreated cells as 1.