Integrative multi-omics landscape of fluoxetine action across 27 brain regions reveals global increase in energy metabolism and region-specific chromatin remodelling

Depression and anxiety are major global health burdens. Although SSRIs targeting the serotonergic system are prescribed over 200 million times annually, they have variable therapeutic efficacy and side effects, and mechanisms of action remain incompletely understood. Here, we comprehensively characterise the molecular landscape of gene regulatory changes associated with fluoxetine, a widely-used SSRI. We performed multimodal analysis of SSRI response in 27 mammalian brain regions using 310 bulk RNA-seq and H3K27ac ChIP-seq datasets, followed by in-depth characterisation of two hippocampal regions using single-cell RNA-seq (20 datasets). Remarkably, fluoxetine induced profound region-specific shifts in gene expression and chromatin state, including in the nucleus accumbens shell, locus coeruleus and septal areas, as well as in more well-studied regions such as the raphe and hippocampal dentate gyrus. Expression changes were strongly enriched at GWAS loci for depression and antidepressant drug response, stressing the relevance to human phenotypes. We observed differential expression at dozens of signalling receptors and pathways, many of which are previously unknown. Single-cell analysis revealed stark differences in fluoxetine response between the dorsal and ventral hippocampal dentate gyri, particularly in oligodendrocytes, mossy cells and inhibitory neurons. Across diverse brain regions, integrative omics analysis consistently suggested increased energy metabolism via oxidative phosphorylation and mitochondrial changes, which we corroborated in vitro; this may thus constitute a shared mechanism of action of fluoxetine. Similarly, we observed pervasive chromatin remodelling signatures across the brain. Our study reveals unexpected regional and cell type-specific heterogeneity in SSRI action, highlights under-studied brain regions that may play a major role in antidepressant response, and provides a rich resource of candidate cell types, genes, gene regulatory elements and pathways for mechanistic analysis and identifying new therapeutic targets for depression and anxiety.

TimeSwimming). The time spent immobile in the FST was used as a measure of "behavioural despair" (Figure 1b, Supplementary Table TS1b).
Note: Forty rats per condition were used in the omics-data generation (bulk RNA-seq, ChIP-seq, and scRNA-seq assays), whereas these behavioral assays were performed on an independent cohort of 10 animals per group (Sham vs. FT). This strategy was employed to avoid any confounding effect on the transcriptome induced by the battery of behavioral tests conducted.

Tissue collection
Rat brains were rapidly removed, flash frozen using cold isopentane and stored at -80°C. We did a 200 µm thickness coronal slicing of the frozen brain. We then micro-punched different brain regions from these 200 µm sections, using 1 mm and 0.5 mm diameter tissue punchers, while maintaining -20°C cryostat chamber conditions. We chose the bregma planes of each of the 27 regions following the well-defined illustration found in Paxinos atlas [1], Supplementary Table TS1. Tissue material for any particular region was pooled across 2-3 consecutive sections of the brain. Tissue punches of a region were pooled across 10 animals/brains to form one replicate. There were 4 biological replicates (40 animals per treatment group) for RNA-seq and 2 biological replicates (20 animals per treatment group) for ChIP-seq. In order to avoid introduction of batch effects/technical processing artefacts, corresponding Sham and FT samples of one replicate were processed in parallel.

Bulk RNA-seq data generation
Frozen, pooled brain tissue punches for each region were thawed on ice in 1 ml Trizol. Tissue was then homogenized using a manual glass douncer with 7-15 slow strokes on ice. RNA was extracted from each replicate sample using the Qiagen RNA Mini kit (Qiagen, Cat# 74104) as per the manufacturer's instructions with on-column DNase I treatment. RNA quality was examined using a Bioanalyzer 2100 (Agilent technologies, Santa Clara, USA). RNA Integrity Number (RIN) values ranged from 7.1 to 10, with a median value of 9.5. cDNA libraries were prepared using 300 ng of total RNA, from 27 regions in every replicate, using the Illumina TruSeq Stranded Total RNA LT set (Illumina, Cat# RS-122-2301). Pairedend, 76 bp read length RNA-seq was carried out on the Illumina HiSeq 2500 at a depth of 20M reads per sample.
RNA-seq data analysis 3 a) Mapping: RNA-seq data were aligned to the Rattus norvegicus rn5 genome (Rn5 ENSEMBL) with the STAR aligner. Read counts were concurrently calculated using HTseq. Any dataset that failed the default Fastqc sequencing parameters and < 75% mapping rate was discarded.

b) Visualisation using Principal component analysis (PCA) and t-Distributed Stochastic Neighbor
Embedding (tSNE): Each brain region had 8 datasets (4 Sham and 4 FT replicates). The RNA-seq read counts were quantile normalised, followed by log transformation. Batch correction was performed using limma, adopting default parameters with replicates specified as batches. Following normalisation and batch correction, the top 20% of genes (by read counts) were visualised using tSNE and PCA (Supplementary Figure S1a). (Note that the normalisation and batch correction described here were performed for visualisation purposes only.) c) Outlier removal: Region-wise PCA plots (PC1 and PC2 loadings) of normalised read counts were reviewed for outlier detection. Outliers were defined as data points outside 1.5 standard deviations from the mean of both PC loadings (PC1 & PC2). Two biological replicates of CGC region were identified as outliers and removed from further downstream analysis (Supplementary Figure S1b).

d) Differential analysis (DEG calling):
DESeq2 (https://bioconductor.org/packages/release/bioc/html/DESeq2.html) was used for computing differential expression statistics between Sham and FT samples per region. We provided the original read count matrix as input for DEseq2, and used the recommended batch correction mode, wherein replicate numbers constituted batches. DEG cut-offs were as follows: absolute log2FC ≥ log2 (1.25) and FDR Q-val ≤ 0.1 (Supplementary Figure S2, Supplementary Table   TS4b). All read counts of region-wise DEGs were also visualised by tSNE (Supplementary Table   TS2, Supplementary Figure S1).

Validation of DEGs
High-throughput qPCR The expression levels of 16 top DEGs from six brain regions (8 upregulated and 8 downregulated genes per region; 96 in total) were assayed using high-throughput Fludigm BioMark™ HD System Real-Time PCR. Briefly, diluted cDNA samples from FT and Sham groups were mixed with reaction reagents (primers, RT enzymes) and loaded on microfluidic chips. Chips were read by the Fludigm BioMark™ HD System Real-Time PCR suite. qRT-PCR results were analysed using the formula, 2 -ΔCt (ΔCt = Target gene -Reference gene (B2M)), and fold changes were computed. We plotted qPCR fold changes of the 8 'Up' and 8 'Down' RNA-seq DEGs using box-plots, and calculated the statistical significance between the two groups using Wilcoxon rank sum test (Supplementary Figure S3a). P-val < 0.05 was accepted as statistically significant.

RNAscope, Image Acquisition and Analysis
To validate the regional heterogeneity of DEGs, we used RNAscope on brain tissue sections covering 3-4 regions. Otub1, Trim28 and Sirt2 were the DEGs chosen for single molecule RNA validation.
10 μm-thick sections of frozen fluoxetine-treated and untreated rat brains were sectioned and stored at -   GWAS catalogue genes associated with five clinical phenotypes (SSRI response [2], broad antidepressant response [3], MDD [4], Alzheimer's [5] and alcoholism [6]) were used for this analysis. For every region, the proportion of GWAS catalogue genes within the list of DEGs was calculated. For each phenotype, we simulated a random gene set wherein the gene length was similar to that of the GWAS genes.
For every region analysed, the ratio of the fraction of GWAS genes in the list of DEGs versus the fraction of GWAS genes in the list of the random gene set was calculated, and referred to as GWAS enrichment score. We re-ran the above step 100 times to make sure we obtained a fair distribution of random control genes. Two-proportion z-test was used to calculate the P-val of the difference in fractions (Figure 1g, Supplementary Table TS5a).

PsyGeNET enrichment:
To query the PsyGeNET database, which manually curates genes involved in multiple human cognitive disorders by literature search, we downloaded the PsyGeNET "all gene-disease associations file", available at http://psygenet.org/web/PsyGeNET/menu/downloads. For each disease, we first intersected the total number of disease-associated genes with our list of all rat genes. Enrichment of disease-associated DEGs within each region-specific list of DEGs was then calculated using Fisher's exact test. This process was performed on all brain regions; we corrected for multiple testing (Benjamini-Hochberg) per psychiatric disorder (Supplementary Figure S5a). Figure S6 For each data set, we used DESeq2 to re-analyze the raw counts for the designated comparison as the complete set of analyzed results was not made publicly available. To calculate enrichment in overlap, we used the rank-rank hypergeometric overlap (RRHO2) package (https://github.com/RRHO2/RRHO2), ranking all genes found in both the external dataset and our dataset by the degree of differential expression (-log10(P-val) x effect size direction), and correcting all values by Benjamini-Hochberg. For Bagot et al.,

Rank-Rank hypergeometric analysis (Supplementary
we ran the RRHO2 analysis on all corresponding anatomically equivalent regions: for amygdala, we calculated the RRHO overlap for the basolateral amygdala (BLA) and central medial amygdala (CMA); for the nucleus accumbens, we calculated the RRHO overlap for the nucleus accumbens core (NAcC) and nucleus accumbens shell (NAcSh); for the prefrontal cortex, we calculated the RRHO overlap for the cingulate cortex (CGC), infralimbic cortex (ILC) and prelimbic cortex (PLC).

Cell type enrichment:
To test the enrichment of DEGs for cell type specific markers, we used the BRETIGEA database  Table   TS7).

Co-regulated gene clustering and pathway enrichment:
We constructed the log2FC matrix over the union set of DEGs across 27 brain regions. This matrix was z-transformed and used as input for k-means clustering (no. of clusters=9). We used anRichment to perform gene set enrichment within a larger, comprehensive collection of GO, HDSigDB, MSigDB and several well-curated neurogenomics gene sets. anRichment is available as an R function: (https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/GeneAnnotation/). In addition, we also ran enrichment using GOrilla (http://cbl-gorilla.cs.technion.ac.il/). To test enrichment for each k-means module, DEGs in that module constituted the foreground set, while the union of DEGs in the remaining modules served as the background set. Gene categories with enrichment ratio ≥ 1.5 and FDR Q-val ≤ 0.01 were retained. Additionally, we discarded the enriched gene categories if there were less than 5 DEGs in that category. Redundancy between two functional categories was defined as overlap of ≥ 40% in the number of DEGs. To display the non-redundant significantly enriched gene categories in Figure 2a Table TS8b).
Ingenuity Pathway Analysis (IPA): IPA (www.ingenuity.com) was used to explore and compare the relationship among the regionwise DEGs. The log2FC and FDR matrix of region-wise DEGs, along with the gene identifiers (RefSeq gene names) were uploaded into IPA. The canonical pathway tool was used to identify the top pathways associated with DEGs between compared conditions. The "Expr log Ratio" option and default IPA settings (absolute log2FC > 0.32 and FDR < 25%) were used to compute IPA enrichment. DEGs that positively and negatively correlated with a canonical pathway were assigned a positive and negative activation z-score value respectively. The right-tailed Fisher's exact test was used to calculate a P-val determining the probability that each pathway assigned to the data set was due to chance alone, and P-vals were corrected for multiple comparisons (Benjamini-Hochberg). To display the non-redundant significantly enriched pathways in Figure 2c, we selected for pathway terms that had at least 10 DEGs associated with it, passed the pathway enrichment significance cutoffs: P-val < 0.05 and FDR Q-val < 0.1, and finally was a significant pathway in at least 3 regions (Figure 2c, Supplementary Table TS13).

Bulk ChIP-seq data generation
For each ChIP-seq assay approximately 5-25 mg of frozen brain tissue per replicate per region was aliquoted and thawed on ice in 1 ml PBS buffer. Tissue was homogenized using a manual glass douncer ChIP-seq data analysis a) Read alignment, peak calling and peak height normalization: ChIP-sequencing reads were aligned to the Rn5 rat genome (Ensembl). Each brain region had 4 ChIP-seq libraries (2 replicates per region, 2 treatment groups -FT, Sham). DFilter was used to call peaks from each ChIPseq library. This peak list was further filtered to construct a consensus peak set using the following filtering cut-offs: DFilter P-val < 1e-6 in at least one out of four libraries and a minimum of 20 normalised read counts per 1kb window, in at least two out of the four libraries. This filtering yielded a consensus peak set ~48K peaks across 27 brain regions. 9 Normalisation: The number of reads in each library within 200bp bins were counted and scaled to normalise for sequencing depth. These binned counts were then normalised for GC bias, using the input (no antibody) ChIP-seq library. Following the above normalisations, within each peak region the sum of binned counts were defined as the peak height and blurred within a 1kb-wide mean filter. These computed peak heights for all peaks in the consensus set were collated as one matrix.
b) Visualisation using tSNE and PCA analysis: Read counts of each brain region were quantile normalised, centred and visualised on PC1 and PC2 loadings of PCA. Read counts of Sham and FT samples within each brain region were also visualised by PCA (Supplementary Figure S7a-b).
c) Differential peak calling (DA peaks): We used DESeq2 to call differentially acetylated peaks (DA peaks) from the consensus peak set (~48K peaks across 27 brain regions). DA cut-offs were as follows: absolute log2FC: ≥ log2 (1.25) and FDR Q-val ≤ 0.1. We called a peak in any region as a differential peak under the following conditions: a) the peak featured in the consensus list, b) the peak passed the DESeq2 DA cut-offs.
This resulted in the identification of a 4,511 significant DA peak set, wherein any peak was DA in at least one brain region (Figure 3a, Supplementary Table TS10-11). Next, as a quality metric we i) calculated the correlation between the different replicates per brain region and ii) compared the fraction of region-wise DA peaks to ChIP-seq signal quality in that respective region. ChIP-seq signal quality for every brain region was defined as the ratio of mean read count of top 10,000 peaks in that region to the read count of random genomic bins not lying on peaks (Supplementary Gene ontology categories with enrichment ratio ≥ 1.5 and FDR Q-val ≤ 0.05 were retained. Additionally, we discarded the enriched gene category if less than 5 genes were associated with DA peaks in that category.
Redundancy between two functional categories was defined as overlap of ≥ 40% in the number of genes associated with DA. To display the non-redundant significantly enriched gene categories in Figure 3d, we further selected the top non-redundant category terms from GREAT. The complete list of GREAT results are listed in Supplementary Table TS13a.
IPA enrichment tests were performed on DA peak sets of the top 6 highly ranked regions (as per DA peak ranking). Region-wise upregulated and downregulated peaks were considered separately as foreground sets, while the consensus peak set formed the background set. IPA pathway gene sets that were significantly enriched in the DEG set of these 6 brain regions were selected (IPA analysis of RNA-seq datasets). DA peaks were annotated to a gene that was within 1Mbp of the peak. Fisher's exact test was used to calculate P-vals (Figure 3c, Supplementary Table TS13b).

Cell type enrichment:
To test the enrichment of DA k-means modules for cell type specific markers, we used the BRETIGEA database as described in the RNA-seq methods above. DA peaks were annotated to the nearest gene, whose TSS fell within 1Mbp of the peak. For each cluster, enrichment of annotated genes in that cluster (against a background of all annotated genes) with cell type-specific genes from the BRETIGEA database was assessed through Fisher's exact test. This process was repeated for all cell types and brain regions, and then corrected for multiple testing on a cell type level (Benjamini-Hochberg) (Figure 3d, Supplementary Table TS14).

Motif Analysis:
The eight DA k-mean modules were tested for enrichment of TF-binding motifs. We used the HOMER pipeline's findMotifsGenome.pl script (http://homer.ucsd.edu/homer/ngs/peakMotifs.html). The position weight matrices (PWM) of motifs were drawn from the TRANSFAC vertebrate database (https://genexplain.com/transfac/). We listed the top 3 enriched, non-redundant motifs in Figure 3d (FDR Q-val < 0.01). The complete list is provided in Supplementary Table TS15.

Comparison of fold changes in Histone acetylation and RNA-seq counts
In order to find coherence between changes in histone acetylation and RNA levels (Supplementary Table TS12b), we specifically looked at the differential H3K27ac level at promoters of DEGs. For this purpose, we first estimated the read count of H3K27ac ChIP-seq within 1 kb of transcription start site of every gene. The read count of promoters from replicates in every brain region was normalised using quantile-quantile normalization before calculating fold changes between fluoxetine and sham samples. We then calculated the spearman correlation of this log2 (fold-changes) for DA peaks at promoters and compared it with the corresponding log2 (fold-changes) of DEGs (as called by DEseq2). DEG cut-offs were as follows: absolute log2FC ≥ log2 (1. 5) and FDR Q-val ≤ 0.1, DA promoter cutoffs (ratio of normalised read-counts) for this analysis was: absolute log2FC ≥ log2 (1. 5); no FDR cutoff was applied.
Rank-rank hypergeometric overlap analysis (Figure 4a) Overlaps between the differential expression of the two independent RNA-seq comparisons were visualized and measured with a rank-rank hypergeometric overlap analysis. dorDG and venDG bulk RNAseq datasets were compared by creating a threshold-free list of DEGs that were ranked by increasing log fold-change. We then used the RRHO2 "stratified" method to detect the overlap between genes  RCA's in-built mouse brain panel and assigned each cell to the cell type that showed the highest Pearson correlation. We verified these assignments (i) by considering the expression of canonical marker genes and by (ii) calling differentially expressed genes (DEGs) for each major cell type cluster using Seurat's FindAllMarkers function with default parameters. Using RCA2's cell type specific quality control functionality, we applied stringent cut-offs for all seven major cell types separately to discard cells of low quality while preserving cell type. Libraries with number of detected genes (NODG) < 1000 genes, UMIs > 20,000 (unique molecular identifiers) and proportion of mitochondrial reads > 25%, were filtered from the dataset. The Seurat integration method based on identification of "anchors" between pairs of datasets was used to eliminate biological and technical batch effects of the samples and scRNA-seq libraries.
Subclustering and further downstream analysis was then done using the Seurat pipeline. Dimensional reduction was carried out using the first 30 principal components, with clusters identified using K-nearest neighbours at a resolution of 0.8. Specific cluster markers were identified using the built-in Seurat function FindAllMarkers, with a minimum log2 fold-change of 0.25 and expression in > 25% of cells. Clusters were annotated using common marker genes from the literature and previously published datasets. We identified 13 cell types, out of which cluster 'Endo2' that resembled Hba+ cells and was therefore removed from further analysis. We finally arrived at 12 cell types in dorDG and venDG single cells (Data QC: Supplementary Figure S8a-g, Supplementary Table TS16). The cell type clusters and the cluster-specific markers are listed in Supplementary Table TS17 Supplementary Table TS18.
Single cell DEG calling: To identify DEGs within each cell type, we first created pseudo-bulk expression vectors for each replicate using Seurat, giving us 5 replicates for both Sham and FT groups. To remove outlier replicates, we first discarded genes with an average expression value (across both groups) of less than 0.5 transcripts per 10k (tp10k < 0.5). Outliers were then removed using the following procedure: within each group (Sham or FT), we 1) scaled the pseudo-bulk vectors across genes, 2) calculated the pairwise Euclidean distance between the scaled replicates, 3) for each replicate, took the median of the pairwise distances (MPD), 4) calculated the median absolute deviation (MAD) using the median of the pairwise distances across replicates, and 5) removed replicates that were at least 3 MADs away from the median of the MPDs.

13
Canonical Oxphos signalling module score: We computed a gene module score in single-cell space using Seurat's AddModuleScore implementation with default parameters. A module is a list of genes that defines a phenotype and in our case, we defined the 'oxphos' module as a list of 150 genes from the 'oxidative phosphorylation pathway' curated in the Ingenuity Knowledge base (IPA). The module scores for each cell type and each group (FT and Sham) were computed separately and the differences were assessed using a Wilcoxon rank sum test, and the P-vals were corrected for multiple testing (Figure 4f, Supplementary Table TS21).

SCENIC analysis:
To identify upstream regulators and their downstream targets (regulons), we implemented the SCENIC pipeline [9] with several modifications. For each region, we first removed single cells from the total matrix if they belonged to outlier replicates (see above on single cell DEG calling). This method removed ~5% of cells and did not have a sizeable impact on cell numbers. We then transformed the raw counts matrix into a log2(counts+1) matrix, where 1 denotes the pseudocount, and passed that into SCENIC to calculate correlations. To calculate links between TFs and their downstream targets, we used Grnboost2 from the python implementation of SCENIC, pySCENIC, rather than GENIE3 from the R implementation, as this greatly sped up calculations. The link file that resulted (also known as adjacencies in pySCENIC) was then passed back into SCENIC for construction of regulons and AUC scoring. To construct regulons, we used the database: 'mm10__refseq-r80__500bp_up_and_100bp_down_tss.mc9nr.feather'.
Following regulon identification and AUC scoring, we calculated cell type specific markers and differentially activated regulons (DARs). For each cell type, potential markers needed to have nonzero AUC values in at least 10% of cells in that cell type. We then calculated marker enrichment (or log2 foldchange) by taking the log2 of the mean of the AUC values in that cell type divided by the mean of the AUC values in all other cell types. P-values were calculated using Wilcoxon, again comparing the cell type versus all other cell types, and then adjusted using FDR (Supplementary Table TS22a).  Table TS22b).

NATMI Signalling analysis:
To infer differential signalling interactions between FT and Sham, we first used NATMI's [10] (git commit: f35f677) `ExtractEdges.py` command to compute expression and specificity scores for each interaction edge, defined by a sender cell type, receiver cell type, ligand and receptor. Scores were computed using the counts-per-10k values extracted from the scRNA-seq expression within each sample. Then, for each interaction, we tested the difference in the expression and specificity scores using the Wilcoxon-rank sum test. We reasoned that top differential interactions should be relatively specific, and that the ligands and receptors should be expressed by a non-negligible fraction of cells within the sender and receiver cell types in the condition with the higher score. Hence, to prioritize top differential interactions for inspection, we applied the following filters: absolute difference in specificity score > 0.03; specificity score in condition with higher score > 0.05; average fraction of ligand expressed in sender cell type in samples from the condition with the higher score > 0.1; average fraction of receptor expressed in receiver cell type in samples from the condition with the higher score > 0.1; unadjusted Wilcoxon p-value testing difference in expression or specificity score < 0.05. This resulted in 22 top interactions for dorDG and 6 top interactions for venDG. Then, for each unique ligand-receptor pair, the log2 fold-change in average expression score (calculated using a pseudocount of 1) and the average difference in specificity score between the conditions