Introduction

Addiction is defined as a chronically relapsing disorder characterized by the compulsion to seek and take drugs or other rewarding stimuli, and by the loss of control over intake1. Addiction is driven, among other things, by interindividual responses to genetic, epigenetic, and environmental factors that determine vulnerability or resilience to the disease. Therefore, it is critical to understand the neurobiological differences between recreational, controlled use and the loss of control and compulsive intake of rewarding stimuli, all of which are driven by transcriptional reprogramming in the brain reward system.

The neuronal circuits regulating reward and motivational behaviors primarily involve the mesocorticolimbic dopamine system, including the ventral tegmental area, nucleus accumbens (NAc), dorsal striatum, ventral pallidum, hippocampus, prefrontal cortex (PFC), and amygdala2. The PFC comprises different subregions (such as the anterior cingulate, prelimbic, and infralimbic cortex) that regulate cognitive and executive functions, including awareness, decision-making, self-control, and salience attribution. Owing to its dense projections to subcortical regions, the PFC exerts top-down inhibitory control over appetitive and aversive behaviors3,4,5. Notably, neuroimaging studies in addicted subjects showed that impaired self-control is driven by reduced activity in brain networks that include the PFC and striatum6.

Chronic exposure to the reward therefore triggers adaptations in the brain reward system, leading to the development of addiction in vulnerable individuals. Operant conditioning models in rodents have been fundamental in understanding the mechanisms involved in addiction. These self-administration models have been extensively used to measure the positive reinforcing effects of stimuli and reward effectiveness2 and the addiction-like behaviors promoted, e.g., by palatable food and cocaine5, 7,8,9. Neuronal adaptations associated with addiction-like behaviors are driven by reprogramming of gene expression. Thus, numerous gene expression studies have been carried out to reveal the molecular players underlying addiction in self-administration models using either drugs of abuse or natural rewards5, 8,9,10,11,12. However, core gene expression signatures shared across the different types of addiction remain elusive, even though these addictions elicit similar adaptations and behavioral changes13.

Therefore, we performed a computational analysis of publicly available whole-transcriptome datasets of the PFC from two independent self-administration studies in mice using palatable food and cocaine as reinforcers, respectively5, 9. Importantly, in both studies, mice were scored with addiction criteria or an addiction index based on their operant behavior, which allowed them to be classified as addicted (vulnerable) or non-addicted (resilient). Analysis of the datasets showed that 56 core genes were upregulated in both addiction-like conditions. Gene ontology analysis of the common genes revealed biological processes associated with addiction. Protein–protein association analysis identified a hub network of several shared genes at the protein level. Using single-cell RNA-seq data from a publicly available study14, we could allocate the shared molecular players in a cell-type-specific manner.

Results

Gene expression signature in PFC of addiction-like behaviors associated with palatable food and cocaine addiction

We performed a computational analysis of whole-transcriptome PFC data from studies of palatable food addiction-like behavior5 and cocaine addiction9 obtained from NCBI-GEO (Fig. 1a). In the food addiction study, mice were exposed to a self-administration model using chocolate-flavored pellets as reinforcers under a fixed ratio (FR) 1 schedule of reinforcement for six sessions. An FR5 schedule then followed across 112 sessions to mimic the transition to addiction through repeated seeking of palatable food (Fig. 1a). After the extended operant conditioning training, mice were classified as addicted or non-addicted according to three addiction criteria5 (Fig. 1a). In the cocaine study, mice were food trained and then subjected to a cocaine or saline self-administration paradigm under FR1 and FR2 schedules across 10–15 days (Fig. 1a). Notably, an addiction index was assigned to each mouse based on its operant behavior in the self-administration model9 (Fig. 1a). Within the cocaine self-administration group, we focused on the samples with the lowest (n = 10) and highest (n = 10) addiction index, regardless of whether the mice were challenged with cocaine withdrawal or cocaine/context priming before sample collection, thereby resembling the classification criteria of the palatable food addiction study. Thus, we compared the whole-transcriptome data of the mice with the lowest and the highest addiction-like criteria/index in both studies. Clustering analysis revealed the transcriptional variation between addicted and non-addicted mice for palatable food and for cocaine, respectively (Fig. 1b,c), confirming the behavioral characterization based on the addiction-like criteria/index. Principal component analysis (PCA) showed that the transcriptome differed strongly along PC1 and PC2, separating the mice into two clusters and suggesting that differences in gene expression are linked to the behavioral changes.
Next, differential expression analysis identified 111 downregulated and 70 upregulated genes in palatable food addicted mice compared with non-addicted mice (Fig. 1d; Tables 1, 2, Supplementary Table 1), while 29 genes were downregulated and 422 genes were upregulated in cocaine-addicted mice compared with cocaine non-addicted mice (Fig. 1e; Tables 1, 2, Supplementary Table 2). Interestingly, 56 upregulated genes were associated with both palatable food and cocaine addiction-like behaviors, whereas 13 overlapping genes were upregulated in cocaine-addicted mice but downregulated in palatable food addicted mice (Fig. 1f; Tables 1 and 2). Dopamine D2 receptor (Drd2), adenosine 2A receptor (Adora2a), G protein-coupled receptor 88 (Gpr88), dopamine D1 receptor (Drd1), G protein-coupled receptor 6 (Gpr6), glucagon-like peptide 1 receptor (Glp1r), galanin type-1 receptor (Galr1), proenkephalin (Penk), choline O-acetyltransferase (Chat) and regulator of G protein signaling 9 (Rgs9) were upregulated in both studies, indicating a strong link with addiction-like behaviors in the PFC (Fig. 1g; Supplementary Fig. 1a; Table 1). Additionally, some transcription factors, such as forkhead box J1 (Foxj1), Isl LIM/homeobox 1 (Isl1), sine oculis related homeobox 3 (Six3), transcriptional activator Myb (Myb), and PR domain containing 12 (Prdm12), were also differentially expressed between the addicted and non-addicted-like conditions in both studies (Supplementary Fig. 1b; Table 1). Furthermore, our analysis identified a unique gene expression pattern for each of palatable food and cocaine addiction (Fig. 1f, Supplementary Tables 1, 2). Thus, 14 upregulated and 98 downregulated differentially expressed genes were specifically associated with food addiction, whereas 353 upregulated and 29 downregulated genes were found specifically in cocaine-addicted mice as compared to cocaine non-addicted mice (Fig. 1f, Supplementary Tables 1, 2), suggesting that most of the transcriptional reprogramming was exclusively associated with either palatable food or cocaine. Thus, our analysis identified both common and unique gene signatures in the PFC linked to addiction-like behaviors.
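The study-specific gene counts quoted above follow from simple set arithmetic on the reported totals. The short Python sketch below only re-derives those numbers from the figures given in the text; the variable and function names are illustrative and not part of the original analysis.

```python
# Totals reported in the text (Fig. 1d-f).
FOOD_UP, FOOD_DOWN = 70, 111
COCAINE_UP, COCAINE_DOWN = 422, 29
SHARED_UP = 56   # upregulated in both studies
OPPOSITE = 13    # upregulated in cocaine, downregulated in food


def venn_counts():
    """Derive the study-specific counts from the reported totals."""
    return {
        "food_only_up": FOOD_UP - SHARED_UP,
        "food_only_down": FOOD_DOWN - OPPOSITE,
        # Both the 56 shared and the 13 opposite-direction genes
        # are upregulated in the cocaine study.
        "cocaine_only_up": COCAINE_UP - SHARED_UP - OPPOSITE,
        "cocaine_only_down": COCAINE_DOWN,
        "shared_total": SHARED_UP + OPPOSITE,
    }


counts = venn_counts()
print(counts)
```

The derived values (14, 98, 353, 29, and 69 genes in common) agree with the numbers stated in the Results and Discussion.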

Figure 1

Gene expression pattern associated with food and cocaine addiction-like behaviors. (a) Experimental design of palatable food and cocaine self-administration studies. Black arrows indicate tissue collection in each study. (b) PCA plot of the transcriptome of palatable food non-addicted and addicted mice. (c) PCA plot of the transcriptome of cocaine non-addicted and addicted mice. (d) Volcano plot of RNA-seq data representing the gene expression changes of significantly upregulated genes (70) and significantly downregulated genes (111) in mice addicted to palatable food as compared to mice non-addicted to palatable food. (e) Volcano plot of RNA-seq data representing the gene expression changes of significantly upregulated genes (422) and significantly downregulated genes (29) in cocaine-addicted mice as compared to cocaine non-addicted mice. (f) Venn diagram representing the overlap of differentially expressed genes from palatable food and cocaine studies. (g) Expression levels of Adora2a, Drd1, Drd2, Gpr6, and Gpr88 in palatable food and cocaine non-addicted and addicted mice.

Table 1 List of common upregulated genes in palatable food and cocaine addicted mice.
Table 2 List of shared genes downregulated in food addicted mice and upregulated in cocaine addicted mice.

To validate our findings and the classification in the cocaine study, we performed a further analysis comparing cocaine self-administering non-addicted mice with saline self-administering mice (Supplementary Fig. 1c). We found that some of the common genes (32 out of 56) were downregulated in cocaine non-addicted mice (Supplementary Fig. 1c). This analysis was only possible in the cocaine study. These results support the hypothesis of a common gene signature for addiction and suggest that downregulation of key addiction genes represents a protective mechanism underlying resilience to addiction behavior.

Addiction signature is associated with learning and memory, dopaminergic synaptic transmission, cAMP signaling pathway, and histone phosphorylation

Gene ontology (GO) analysis of the shared upregulated genes revealed that Drd2, Drd1, Ppp1r1b, Gpr88, Glp1r, and Ntrk1, among other genes, contribute to behavioral responses, including learning and memory, response to cocaine, feeding behavior, and response to stress (Fig. 2a, Supplementary Table 3). The GO analysis also showed that several genes (Adora2a, Drd2, Drd1, Rgs9, Ntrk1) participate in biological processes related to synaptic plasticity, such as long-term potentiation, prepulse inhibition, and regulation of both glutamatergic and dopaminergic synaptic transmission (Fig. 2a, Supplementary Table 3). Finally, the analysis identified genes (Adora2a, Drd2, Drd1, Ppp1r1b, Pde10a, Rgs9, Cd4, Glp1r) involved in molecular-level functions, including regulation of the cAMP signaling pathway, calcium ion transport, and histone phosphorylation (Fig. 2a, Supplementary Table 3). Notably, genes of the dopamine- and adenosine-mediated cAMP signaling pathway, including Drd2, Drd1, Adora2a, and the downstream target Ppp1r1b (encoding the dopamine and cAMP-regulated neuronal phosphoprotein, DARPP-32), make a strong contribution to the above analysis.

Figure 2

GO and gene association network analysis of shared genes in food and cocaine addiction-like behavior. (a) Bar plot representing the behavioral, biological, and molecular processes. Number of genes and −log10(p-value) are shown in red and blue, respectively. (b) Protein–protein interaction network of shared genes in palatable food and cocaine addiction. The thickness of an edge represents the confidence score for the given interaction.

Drd2, Adora2a, Drd1, Gpr88, and Gpr6, together with downstream targets of cAMP signaling pathway, establish a hub for a network at protein levels

The computational and GO analyses provided the basis for a functional protein–protein association network analysis among the shared genes using the STRING database15. In Fig. 2b, every node represents one gene, and each edge connecting two nodes represents an association at the protein level. The thickness of an edge reflects the confidence score for the given interaction. The STRING analysis revealed a protein–protein interaction network of 35 shared genes. Of these candidates, eight genes form the core of the network: five encode G-protein-coupled receptors (GPCRs), namely the dopamine D2 receptor (D2), dopamine D1 receptor (D1), adenosine 2A receptor (A2A), Gpr88 receptor (GPR88), and Gpr6 receptor (GPR6), and three encode further proteins related to the cAMP signaling pathway, namely DARPP-32, phosphodiesterase 10a (PDE10A, encoded by the Pde10a gene), and regulator of G-protein signaling 9 (RGS9, encoded by the Rgs9 gene) (Fig. 2b). Accordingly, previous studies showed that the formation of D1–D2 heteromers modifies the functional properties of these receptors by coupling to Gq proteins and increasing the sensitivity to amphetamine16. Functional D2–A2A heteromers have also been recently demonstrated17.

Furthermore, a synergistic interaction between the Drd2 and Adora2a genes might play a role in anxiety disorders18. The co-occurrence of anxiety disorders and substance abuse disorders is more prevalent than expected by chance19. Thus, previous data support the protein–protein interaction network analysis of the PFC gene signature in addiction-like behaviors. We propose that the core of this gene network might make the largest contribution to the neuroplasticity and adaptation associated with addiction-like behaviors.

Cell type-specific expression of the shared molecular players

Finally, we investigated the cell-type-selective expression of the shared genes using publicly available single-cell RNA-seq data of the PFC (including anterior cingulate, prelimbic, and infralimbic cortex) in adult male C57BL/6 mice14. In the present study, we focused on the dataset of cocaine self-administration mice during the maintenance phase to study the impact of long-term exposure to cocaine on transcriptional changes at the cell-type-specific level. First, we reproduced the cell subtypes reported by Bhattacherjee et al.14 using a t-SNE approach (Fig. 3a). After visualizing the specific markers for PFC cell clusters (Supplementary Fig. 2), we could identify the expression of 28 shared genes in the different cell type clusters (Fig. 3b; Supplementary Figs. 3, 4). Drd1, Drd2, Gpr88, and Gpr6 were almost exclusively expressed in excitatory neurons. However, Gpr88 and Drd1 were also found at lower levels in inhibitory neurons and non-neuronal cells, such as oligodendrocyte precursors (OPC) and endothelial cells (Fig. 3b). Strikingly, Adora2a was mostly expressed in endothelial cells and at lower levels in microglia, with very sparse expression in excitatory and inhibitory neurons (Fig. 3b). The regulatory genes of the cAMP signaling pathway (Ppp1r1b, Rgs9, and Pde10a) showed broad expression across the different cell clusters, including excitatory neurons, inhibitory neurons, astrocytes, oligodendrocytes, OPC, newly formed oligodendrocytes (NF oligo), and endothelial cells (Supplementary Figs. 3, 4). The transcription factor Foxj1 also showed broad expression and was expressed in excitatory neurons, endothelial cells, oligodendrocytes, and NF oligo. However, other genes showed very selective expression in one of the clusters, such as Cd4, Ido1, and Dmkn in excitatory neurons, Top2a in NF oligo, Slc5a7 in endothelial cells, and Spint1 in microglia (Supplementary Figs. 3, 4).

Figure 3

Cell-type-specific expression of the relevant genes in mouse PFC. (a) t-SNE plot representing the clusters of the different cell subtypes in PFC, based on the transcriptome of cocaine self-administration mice during the maintenance phase from Bhattacherjee et al.14. (b) t-SNE plot representing some of the common upregulated genes found in cocaine and food addiction studies.

Discussion

Addiction is defined by behavioral abnormalities, including a loss of control over reward intake, compulsive reward intake despite aversive consequences, and chronic relapse after long periods of abstinence. Importantly, the same behavioral abnormalities associated with addiction symptoms are driven by diverse rewarding stimuli (drugs of abuse, natural rewards, and other stimulants), suggesting a common pattern of cellular adaptations in the brain reward circuit of vulnerable individuals. Although a large number of studies have tried to determine the molecular basis of addiction, there is still a limited understanding of the common and unique molecular mechanisms underlying addiction disorders. In this study, we performed a computational analysis of publicly available datasets from two independent studies using animal models of palatable food and cocaine addiction5, 9. We uncovered a group of genes in the PFC associated with vulnerability vs. resilience to addiction-like behaviors.

As a part of the brain reward system, PFC is instrumental in the control of reward intake, which is impaired in addiction leading to compulsive drug intake and relapse3, 20. Several human and animal model studies associated a reduced neuronal activity in PFC with compulsive behavior in reward intake3, 5, 20, 21. These alterations at cellular and circuitry levels are driven by transcriptional reprogramming after long-term exposure to reward intake. Accordingly, the principal component analysis of RNA-seq data in both studies showed substantial transcriptomic differences between addicted and non-addicted mice. Our computational analysis revealed that a total of 69 differentially expressed genes were found in common between these two studies. However, 13 of these genes were upregulated in cocaine-addicted mice, while the same genes were downregulated in food addicted mice, which remains to be interpreted.

Interestingly, 56 common genes were upregulated in both food and cocaine-addicted mice. These shared genes include several GPCRs (Drd2, Drd1, Adora2a, Gpr88, Gpr6, Glp1r) and transcription factors (Foxj1, Six3, Prdm12, etc.), among others. GPCRs are very attractive drug targets, as 34% of drugs approved by the FDA target this receptor class22. Accordingly, a previous study demonstrated that D1-expressing neurons in the PFC are activated by food intake, and optogenetic stimulation of these D1 neurons increased feeding23. Moreover, we have recently demonstrated that overexpression of Drd2 in PFC-NAc projection neurons promoted a compulsive-like behavior for chocolate pellet seeking in mice5. In contrast, mice deficient in the Gpr88 gene, which is highly expressed in the striatum and at lower levels in the cortex and thalamus of adult mice24, showed increased alcohol seeking and consumption25. However, the specific function of Gpr88 in the PFC remains unknown. Interestingly, some preclinical studies have examined the use of GLP-1 analogs in alcohol use disorder26, suggesting a potential protective role or, alternatively, a compensatory mechanism of upregulated Glp1r gene expression in the PFC of vulnerable mice. A genome-wide association study identified Six3 and Drd2 loci associated with alcohol dependence27. Thus, the core gene signature of the PFC in addiction identified by our comparative computational analysis is also supported by previous studies. Notably, the computational analysis also identified new players associated with addiction, such as transcription factors (Foxj1, Isl1, Prdm12). Hence, this study opens new avenues to be explored in future studies, for example, the role of Foxj1 and synergistic approaches targeting different GPCRs for the treatment of addiction disorders.

Besides, our analysis also identified a unique gene signature for either addiction condition. Indeed, most of the differentially expressed genes found in both studies were exclusively related to either palatable food or cocaine addiction. Interestingly, there was no convergence in the down-regulated genes associated with palatable food and cocaine addiction-like phenotype, implying that these mechanisms leading to gene expression down-regulation in PFC are distinct for each type of addiction.

Both food and cocaine, like other rewarding stimuli, increase dopamine levels in the NAc, which are responsible at least in part for their reinforcing effects. Dopamine dynamics are directly mediated by the activation of dopamine neurons in the ventral tegmental area, which also sends projections to the PFC, hippocampus, and amygdala in addition to the NAc. GO analysis of the shared genes revealed that several genes were involved in behavioral responses, such as learning and memory, feeding behavior, response to cocaine, and stress responses, as well as in changes in synaptic plasticity, such as long-term potentiation, prepulse inhibition, and regulation of dopaminergic and glutamatergic synaptic transmission. The reward prediction error28 and incentive salience29 hypotheses underline the importance of dopamine dynamics in brain reward areas for learning processes, as suggested by our GO analysis. Indeed, excessive habit learning has been implicated in relapse and in craving responses to reward-related cues previously associated with reward intake. Recent findings have also identified that dopamine release in the mPFC mediates behavioral learning responses to aversive stimuli30.

Likewise, the GO analysis identified a group of common genes involved in synaptic plasticity processes. Synaptic plasticity in the PFC evoked by repeated exposure to the reward plays a pivotal role in changes in neuronal circuits and addictive behaviors, e.g., relapse31. Consequently, relapse is caused by powerful and long-lasting memories of the reward experience related to synaptic plasticity changes associated with repeated reward intake.

Interestingly, histone phosphorylation was also identified by GO analysis at the molecular function level. Epigenetic mechanisms have been revealed as essential mediators of long-lasting gene expression changes linked to addiction, and stable epigenetic changes might confer addiction vulnerability32. Histone phosphorylation generally allows the transcription of genes and seems to play a crucial role in promoting the expression of immediate early genes (IEGs), such as c-Fos and c-Jun33. Several studies reported that cocaine increased phosphorylation of histone H3 in striatal neurons, which may be important in cocaine-induced long-term neuronal plasticity34,35,36. In this context, several compounds that inhibit histone phosphorylation are under investigation as clinical candidates in human cancer37, 38. Further work is needed to clarify the role of histone phosphorylation after chronic exposure to the reward and to evaluate inhibitors of histone phosphorylation as a potential treatment for addiction.

The protein–protein association analysis of the shared genes showed a core network of 8 genes (Drd1, Drd2, Adora2a, Gpr88, Gpr6, Ppp1r1b, Rgs9, Pde10a), predicting protein–protein interactions at the physical and functional levels. The hub of the network includes five GPCRs (Drd1, Drd2, Adora2a, Gpr88, Gpr6) and three proteins associated with the cAMP signaling pathway (Ppp1r1b, Rgs9, Pde10a). As mentioned above, GPCRs have been investigated extensively due to their contribution to physiological and pathological processes. In this context, GPCR heterodimerization was postulated several years ago and puts forward the concept of physical associations between two different GPCRs that might have functional properties different from those of the individual receptors39. Previous evidence described heterodimerization between dopamine D1–D2 receptors40 and D2–A2A receptors17. Recently, a BRET study demonstrated physical interactions of GPR88-Rluc8 with mVenus-tagged D2 and mVenus-tagged A2A receptors in transfected cells41. However, to our knowledge, no evidence of interaction of GPR6 with other GPCRs has been shown until now. Importantly, both D1 and A2A receptors increase cAMP levels by coupling to Gs proteins, while in contrast, the activation of D2 and GPR88 decreases cAMP levels by coupling to Gi proteins. Furthermore, RGS9 can modulate cAMP signaling by interacting with the β-subunit of the G proteins and can functionally interact with D2, as suggested by previous studies42, 43. Moreover, the activation of cAMP signaling induces phosphorylation of DARPP-32, which regulates synaptic plasticity as well as many other biological and behavioral responses driven by drugs of abuse44. Finally, PDE10A selectively regulates cAMP signaling by potentiating A2A- and D1-mediated phosphorylation of DARPP-32, whereas it blunts the D2-induced decrease in DARPP-32 phosphorylation45 and, as a result, increases the phosphorylation of DARPP-32.
In summary, the STRING database analysis revealed a protein–protein association network that disentangles the dopamine-, adenosine- and GPR88-mediated cAMP signaling pathway in PFC as a pivotal signaling pathway in addiction and identifies this signaling pathway as a potential therapeutic target in addictive disorders. Our analysis of the gene association network provides new insights to understand these psychiatric disorders and to potentially develop a new pharmacological target.

Finally, RNA-seq analysis of cell types in mouse PFC has been recently reported14. Single-cell RNA sequencing allows identifying transcriptional changes across different cell populations associated with physiological or pathological processes, including addiction. Bhattacherjee et al. analyzed the transcriptome dynamics in PFC cell types evoked by chronic cocaine exposure. Therefore, to better understand the cellular mechanisms involved in addiction, we asked whether our candidate genes exhibit a cell type-specific expression in PFC. Based on the fact that addiction is a chronic relapsing disorder and the role of the PFC in cognitive and executive function, we assume that the addiction-related core gene reprogramming takes place in the same cell types in PFC and remains stable across the time course of the disease. Nevertheless, we cannot exclude the possibility that palatable food-induced gene reprogramming may occur in other cell types. For this purpose, we performed computational analysis of the shared genes over the publicly available data of single-cell transcriptome in PFC of cocaine self-administration mice. We observed that some of the relevant genes have a specific expression pattern in different PFC cell clusters. Thus, Drd1, Drd2, Gpr88, Gpr6, and Rgs9 were almost exclusively expressed in excitatory neurons, although Drd1, Gpr88, and Rgs9 also showed expression in other cell clusters. Strikingly, Adora2a had a predominant expression in endothelial cells and microglia, with a very sparse expression in inhibitory and excitatory neurons. These data challenge previous evidence about the anatomical, pharmacological, and functional properties of the A2A receptor46, 47, although they do not entirely invalidate them. Finally, Ppp1r1b and Pde10a were broadly expressed in PFC, suggesting a role in more general cellular functions. 
Overall, the cell-type-specific expression of the addiction gene signature, based on the computational analysis of a public single-cell RNA-seq dataset of the PFC, suggests a significant role of excitatory neurons in addiction and supports the protein–protein interactions predicted by the STRING analysis. The transcriptional study of the addiction gene signature at the cell-type-specific level in the PFC might be relevant for designing potential new pharmaceutical approaches to tackle addiction.

In conclusion, addiction disorders share similar behavioral alterations even if they have been evoked by different rewarding stimuli. Thus, we hypothesize that chronic reward-induced neuronal plasticity is triggered by common transcriptional reprogramming to elicit addiction-like behaviors. Nevertheless, we cannot exclude that some of these gene expression changes are related to an interindividual predisposition that confers a particular vulnerability to addiction. This study uncovered the common and unique differentially expressed genes in the PFC in addiction by computational analysis of public RNA-seq datasets from two independent studies using palatable food and cocaine addiction animal models. We identified 56 shared genes upregulated in addicted mice as a gene expression signature of addiction in the PFC. These genes contribute to learning and memory responses, synaptic plasticity processes, and regulation of the cAMP signaling pathway, as suggested by GO analysis. Furthermore, protein–protein association analysis of the candidate genes identified a core network consisting of the dopamine-, adenosine- and orphan GPR88-mediated G protein-coupled cAMP signaling pathway as a key player in neuroplasticity changes in the PFC during addiction. Finally, computational analysis of public single-cell RNA-seq data suggests that transcriptional reprogramming of the relevant genes in the PFC occurred mainly in excitatory neurons. This study unravels common and unique gene expression signatures of the PFC that confer vulnerability and resilience to addiction and disentangles a core network of eight genes that may pave new avenues for developing pharmacological treatments that alleviate the chronic relapse and compulsivity associated with the addiction syndrome.

Methods

Data collection

Transcriptomics data were obtained from NCBI-GEO. For the current study, we considered GSE139482, GSE110344 and GSE124952. The transcriptomics data were retrieved as raw files in fastq format from EBI.

RNA-sequencing data analysis (quality check, alignment, normalization and differential gene expression analysis)

After the raw data were received in fastq format, the quality of each sample was checked using FASTQC version v0.10.5. Samples passing the quality check were aligned to the mouse genome (mm9) using TopHat48 version v2.153 with default parameters. Mapped reads were used to compute read counts per gene with HTSeq49 version 0.954. The HTSeq output (read counts per gene) was normalized, and differential gene expression analysis was performed using the R package DESeq with a false discovery rate (FDR) of 0.1. The “plotPCA” function from the DESeq50 package was used to assess the variability between non-addicted and addicted mice by PCA, using the 500 most variable genes. The “nbinomTest” function from the DESeq package was used to calculate p-values. Genes were considered differentially expressed only if they fulfilled all of the following criteria: at least a 1.5-fold change, a p-value less than 0.05, an FDR less than 0.1, and at least 10 read counts in either condition in both cases of addiction. Volcano plots and boxplots were generated using the ggplot2 package in R.
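The four thresholds listed above can be read as a single filter applied to each gene. The following Python sketch is only an illustration of that logic and is not the DESeq-based R code used in the study. It assumes that the fold change is taken in either direction, that the read-count threshold applies to the larger of the two group means within one study, and that `padj` holds the FDR-adjusted p-value. The class, the function names, and the toy values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    mean_control: float  # mean normalized count, non-addicted group
    mean_case: float     # mean normalized count, addicted group
    pval: float          # raw p-value
    padj: float          # FDR-adjusted p-value


def is_deg(r, min_fold=1.5, max_p=0.05, max_fdr=0.1, min_count=10):
    """Return True if a gene passes all four thresholds from the text."""
    # Assumption: "at least 10 read counts in either condition" is
    # checked against the larger of the two group means.
    if max(r.mean_control, r.mean_case) < min_count:
        return False
    low, high = sorted((r.mean_control, r.mean_case))
    # Direction-agnostic fold change; a zero denominator counts as infinite.
    fold = math.inf if low == 0 else high / low
    return fold >= min_fold and r.pval < max_p and r.padj < max_fdr


def classify(results):
    """Map each passing gene to 'up' or 'down' in the addicted group."""
    return {
        r.gene: "up" if r.mean_case > r.mean_control else "down"
        for r in results
        if is_deg(r)
    }


# Hypothetical toy values, not taken from the datasets.
toy = [
    GeneResult("GeneA", 20, 40, 0.001, 0.02),   # passes, upregulated
    GeneResult("GeneB", 20, 25, 0.001, 0.02),   # fold change too small
    GeneResult("GeneC", 2, 8, 0.001, 0.02),     # too few reads
    GeneResult("GeneD", 100, 50, 0.04, 0.20),   # FDR too high
    GeneResult("GeneE", 90, 30, 0.010, 0.05),   # passes, downregulated
]
calls = classify(toy)
print(calls)
```

Applying the thresholds jointly, rather than one after another on separate tables, makes it explicit that a gene with a large fold change but low counts or a weak FDR is excluded.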

Gene ontology and protein–protein interaction network analysis

Gene ontology analysis of the differentially expressed shared genes was performed using ToppGene51. Protein–protein interactions were predicted using the STRING database15. In this analysis, experimental data, co-expressed genes, neighboring genes, other databases and text mining of the literature were used to predict the PPI network. We used the default confidence score threshold (0.4) provided by the database to retain the most likely interactions. We used Cytoscape52 (version 3.8.0) to visualize the network by importing the protein–protein interactions predicted by the STRING database.
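The role of the confidence threshold and the notion of a network hub can be illustrated with a small Python sketch. It assumes that each predicted interaction is available as a (protein, protein, score) tuple with scores on a 0 to 1 scale, keeps the edges at or above the threshold, and counts connections per node. The placeholder proteins P1 to P4 and their scores are hypothetical and are not STRING output.

```python
from collections import Counter


def filter_edges(edges, min_score=0.4):
    """Keep interactions whose confidence score meets the threshold."""
    return [(a, b, s) for a, b, s in edges if s >= min_score]


def node_degree(edges):
    """Count how many retained interactions each node takes part in."""
    degree = Counter()
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


# Hypothetical toy network with placeholder proteins and scores.
toy_edges = [
    ("P1", "P2", 0.90),
    ("P1", "P3", 0.55),
    ("P2", "P3", 0.41),
    ("P3", "P4", 0.20),  # below threshold, dropped
    ("P1", "P4", 0.40),  # exactly at threshold, kept
]

kept = filter_edges(toy_edges)
degree = node_degree(kept)
print(degree.most_common())
```

Nodes with the highest degree after filtering are the candidates for a hub. In the study itself, the network was generated by STRING and visualized in Cytoscape.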

Single cell RNA-sequencing analysis

The single-cell expression matrix was obtained from the NCBI-GEO portal. The matrix was filtered to retain cells from cocaine self-administration mice in the maintenance phase. The selected matrix was processed with Seurat53 in R. Variable genes were identified using the function “FindVariableFeatures”. Principal components were calculated using the “RunPCA” function. Cluster analysis was then performed using the built-in Seurat function “FindClusters”, and the clusters were visualized using the “RunTSNE” function. Clusters were annotated using the expression of known marker genes (Supplement Fig. 1). Heatmaps were plotted using the package “pheatmap” in R.
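The final annotation step, assigning a cell type to each cluster from marker gene expression, can be sketched in plain Python. This is not the Seurat workflow itself. It assumes that the mean expression per gene is already available for each cluster and picks the cell type whose markers have the highest average. The marker sets shown are common illustrative choices and are not necessarily the markers used in the study. The cluster values are hypothetical.

```python
from statistics import mean

# Illustrative marker sets (assumption, not taken from the study).
MARKERS = {
    "excitatory neuron": ["Slc17a7"],
    "inhibitory neuron": ["Gad1", "Gad2"],
    "endothelial": ["Cldn5"],
}


def annotate_cluster(cluster_means, markers=MARKERS):
    """Return the cell type whose markers have the highest mean expression.

    cluster_means maps gene name -> mean expression within one cluster;
    genes missing from the cluster are treated as not expressed (0.0).
    """
    scores = {
        cell_type: mean(cluster_means.get(g, 0.0) for g in genes)
        for cell_type, genes in markers.items()
    }
    return max(scores, key=scores.get)


# Hypothetical per-cluster mean expression values.
toy_clusters = {
    0: {"Slc17a7": 3.1, "Gad1": 0.1, "Gad2": 0.0, "Cldn5": 0.0},
    1: {"Slc17a7": 0.2, "Gad1": 2.5, "Gad2": 2.0, "Cldn5": 0.1},
    2: {"Cldn5": 4.0},
}

annotations = {cid: annotate_cluster(m) for cid, m in toy_clusters.items()}
print(annotations)
```

In practice, marker expression is usually also inspected visually per cluster, since a simple maximum can mislabel clusters in which no marker set is clearly expressed.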