Integrated genome-wide methylation and expression analyses reveal functional predictors of response to antidepressants

Major depressive disorder (MDD) is primarily treated with antidepressants, yet many patients fail to respond adequately, and identifying antidepressant response biomarkers is thus of clinical significance. Some hypothesis-driven investigations of epigenetic markers for treatment response have been previously made, but genome-wide approaches remain unexplored. Healthy participants (n = 112) and MDD patients (n = 211) between 18–60 years old were recruited for an 8-week trial of escitalopram treatment. Responders and non-responders were identified using differential Montgomery-Åsberg Depression Rating Scale scores before and after treatment. Genome-wide DNA methylation and gene expression analyses were assessed using the Infinium MethylationEPIC Beadchip and HumanHT-12 v4 Expression Beadchip, respectively, on pre-treatment peripheral blood DNA and RNA samples. Differentially methylated positions (DMPs) located in regions of differentially expressed genes between responders (n = 82) and non-responders (n = 95) were identified, and technically validated using a targeted sequencing approach. Three DMPs located in the genes CHN2 (cg23687322, p = 0.00043 and cg06926818, p = 0.0014) and JAK2 (cg08339825, p = 0.00021) were the most significantly associated with mRNA expression changes and subsequently validated. Replication was then conducted with non-responders (n = 76) and responders (n = 71) in an external cohort that underwent a similar antidepressant trial. One CHN2 site (cg06926818; p = 0.03) was successfully replicated. Our findings indicate that differential methylation at CpG sites upstream of the CHN2 and JAK2 TSS regions are possible peripheral predictors of antidepressant treatment response. Future studies can provide further insight on robustness of our candidate biomarkers, and greater characterization of functional components.


Introduction
Antidepressants are considered an effective treatment option for major depressive disorder (MDD), a severe affective disorder that is currently deemed to be the leading cause of global disability 1 . However, treatment selection is clinically subjective, response is determined by trial and error, and objective patient improvement is difficult to distinguish from the placebo effect 2 . On average, 4 weeks are required for a notable response to treatment, and 6 weeks are required for symptom remission 3 . In addition to the long period of symptom evaluation, the uncomfortable side effects of antidepressants greatly contribute to noncompliance with treatment. Around 60% of patients fail to respond to initial interventions, whereas 20-30% of these patients do not respond despite multiple attempts 4,5 . Thus, a treatment paradigm that reliably matches patients with effective antidepressants as early on as possible would minimize their suffering, and avoid adversities associated with selecting appropriate medications. Predictive biomarkers for antidepressant response could greatly benefit clinical practice by decreasing the duration of evaluating drug efficacy 6 .
MDD is heterogeneous in symptom presentation and treatment response, and environmental factors have been shown to influence the onset, course and duration of illness 7 . Epigenetic modifiers of gene expression are key mediators of environmental effects on the genome. As such, exploring epigenetic mechanisms as possible predictors of treatment response is appealing, as they are better at reflecting the interaction of genetic and environmental factors. The most investigated and best characterized epigenetic modification in clinical studies is DNA methylation 8 . DNA methylation is defined by the addition of a methyl group typically to cytosine bases, and predominantly at those directly followed by a guanine (CpG dinucleotide sites). Differential methylation has been associated with life experiences such as drug abuse 9 , early childhood trauma 10 , and chronic stress 11 , all of which are predisposing factors for MDD.
DNA methylation-based biomarkers have already been successfully utilized for clinical evaluation of neurodevelopmental disorders 12 , multiple types of cancer 13 and cardiovascular disease 14 . To date, only a few studies have investigated differential DNA methylation as a predictor biomarker at specific candidate loci that were previously associated with treatment response [15][16][17] , but no genomewide study has been conducted.
Genome-wide analyses offer a non-biased experimental approach to identify novel candidates. To our knowledge, this is the first genome-wide differential DNA methylation conducted to identify possible predictors of antidepressant response. We compared responders (RES) and non-responders (NRES) to an eight-week escitalopram treatment. In addition, to investigate the possible functional role of identified methylation biomarkers, we also analyzed genome-wide differential gene expression. This information was used to select differentially methylated positions (DMPs) for further analyses.

CAN-BIND-1 discovery cohort characterization
Our discovery cohort consisted of participants recruited for the Canadian Biomaker Integration Network in Depression (CAN-BIND-1), a multisite initiative dedicated to the discovery of treatment response biomarkers, which has been described in detail elsewhere 18 . Briefly, healthy control participants and MDD patients ranging between 18 and 60 years of age were enrolled in a prospective 16-week trial with escitalopram with an option of addition of aripiprazole at week 8. In this study, we are only analyzing baseline and week 8 data and outcomes. Healthy participants were included if they were free of psychiatric psychopathology and with no active medical diagnoses, and were matched for sex and age distribution with MDD patients. Depressed patients were excluded if they had other psychiatric diagnoses in addition to MDD, and if they had psychotic symptoms, high suicidality or concomitant neurological disorders, if they have already failed ≥4 pharmacological treatments for MDD, or previously failed to respond to escitalopram. Research Ethics Boards at all recruitment sites approved of the study design, and consent was obtained from all eligible participants for all procedures prior to the start of the trial. Following screening and recruitment, MDD diagnoses were clinically determined using the Mini International Neuropsychiatric Interview (MINI). All participants were assessed at baseline (W0) for symptom severity using the Montgomery Åsberg Depression Rating Scale (MADRS). MDD patients were administered escitalopram (10-20 mg/d) for 8 weeks. At week 8 (W8), MDD patients were assessed again with the MADRS. Escitalopram response was indicated by a ≥50% decrease in W8 MADRS scores relative to W0, and MDD patients were classified as either a responder (RES) or nonresponder (NRES). Healthy controls underwent the same clinical assessments and evaluations as MDD patients at these time points, but did not receive any type of treatment or placebo. The CAN-BIND-1 clinical trial was registered with the ClinicalTrials.gov identification number: NCT01655706.
Demographic and clinical data were compared between HC, NRES and RES samples included in final analyses.
Genome-wide DNA methylation analysis on the Infinium MethylationEPIC Beadchip DNA was extracted from whole blood samples obtained from healthy controls and MDD patients at baseline prior to treatment, using a modified version of the Qiagen FlexiGene DNA kit. Bisulfite conversion, DNA quality control, genome-wide methylation analysis, and initial methylation signal detection quality control was performed at the McGill University and Genome Quebec Innovation Center (GQ). The Infinium MethylationEPIC Beadchip was used to assess genome-wide DNA methylation (Illumina, US). After accounting for attrition rates, and DNA sample quality control, pre-processing and analysis of raw microarray data for the remaining samples was conducted within R (ver 3.4) predominantly using the Chip Analysis Methylation Pipeline (ChAMP) Bioconductor package 19 , which utilizes many elements of minfi 20 . Sample methylation signal QC was assessed by plotting log median methylated and unmethylated signals.
Samples were removed if they failed to cluster with others or if they exhibited lower median intensities in either signal channel. Probes with low signal detection relative to control probes, probes with <3 beads in >5% of samples, cross reactive probes, non-CpG probes, sex chromosome probes, and probes that hybridize to single nucleotide polymorphism sites were removed. Beta (β) values were calculated as the ratio of methylated signal to the sum of unmethylated and methylated signals at each CpG site, and subsequently normalized. log 2 transformed β values were used for the remainder of pre-processing steps as recommended by Du et al. 21 , but reported as β values. Technical batches and covariates were detected using single value decomposition analysis. Detected and known batch effects were corrected for prior to differential methylation analysis. Differentially methylated positions (DMPs) were identified between NRES and RES using linear regression methods from the limma 22 , with age and sex as covariates. A 2% absolute change in average methylation (Δβ) was set as a cutoff value to decrease the number of significant CpGs and identify sites with more biologically relevant methylation differences. A detailed account of our pre-processing and analysis steps for the MethylationEPIC Beadchip are included in Supplementary Methods.
Genome-wide mRNA gene expression analysis on the HT-12 Beadchip Baseline whole blood samples were obtained from HC and MDD participants in EDTA tubes containing Leu-koLOCK filters (ThermoFisher, USA). Total RNA was extracted from filtered leukocytes using a modified version of the LeukoLOCK Total RNA Isolation System protocol, and treated with DNase to remove genomic DNA. RNA was converted to cRNA, and sent to GQ for further QC and analysis on HT-12 v4 Expression Beadchips (Illumina, USA). Pre-processing steps and differential gene expression analysis were performed in R using the limma Bioconductor package 22 . Only the subset of samples that appeared in our DNA methylation analysis were included. Probe signal detection, normalization, and filtering were conducted prior to differential expression analysis. Probes with a detection p-value <0.01 in at ≥20% of samples were retained. To assess differential expression, linear regression analyses on log 2 transformed values was performed with age and sex accounted for as covariates. A detailed account of preprocessing and differential expression analysis is included in Supplementary Methods. Only probes with ≥±0.1 ΔlogFC values between NRES and RES were included for further investigations. Genes that contained differentially methylated CpGs with average Δβ ≥ 2%, and that appeared in our differential expression analysis were identified.

Targeted bisulfite sequencing for validation of genomewide findings
Differentially methylated CpGs with an Δβ ≥ ±2% methylation and located in differentially expressed gene regions with a logFC ≥ 0.1 were selected for validation with targeted bisulfite sequencing on the Illumina MiSeq platform 23 . NRES and RES DNA samples were bisulfite converted using the Epitect 96 Bisulfite kit (Qiagen, USA) as per manufacturer's guidelines. Primers were designed with the Methyl Primer Express software (ThermoFisher Scientific). All samples were ensured to have an optimal molarity of 2 nM prior to being loaded onto the MiSeq platform with the V3 600 cycle kit (Illumina, US). Methods. Specific details for primer design and amplicon library preparation are included in Supplementary Methods. Upon retrieving raw sequencing data, Trimmomatic (v.0.35) was used to trim adaptor sequences 24 . Reads with phred scores <20 were removed and aligned with Bowtie 2 (v 2.1.0) 25 . Methylated and non-methylated CpG signals were extracted to calculate the level of methylation at our sites of interest. Results were analyzed using one-tailed t-tests. Correlation of microarray and sequencing methylation values was assessed with Pearson correlation coefficients.

Replication within the Douglas biomarker study
Replication was conducted using the Douglas Biomarker Study cohort, which was similarly designed to our discovery cohort. Participants were recruited at the Depressive Disorders Program at the Douglas Mental Health Institute, McGill University (Montreal, QC), and consisted of an 8-week antidepressant treatment for MDD patients randomly selected to receive either densvenlafaxine (serotonin and norepinephrine reuptake inhibitor; SNRI) or escitalopram (selective serotonin reuptake inhibitor; SSRI). Hamilton Depression Rating Scale (HAM-D) scores were used to assess symptom severity at baseline and W8, where a ≥ 50% relative decrease in HAM-D scores at W8 denoted a response. Genome-wide methylation analysis on the Infinium MethylationEPIC Beadchip was conducted at Illumina. Sample descriptive data were statistically analyzed similarly to what was described previously for our discovery cohort. We used the same pipeline described previously to analyze differential methylation, with additional covariate corrections made for antidepressant type. We compared differential methylation at our three probes of interest between MDD patients and psychiatrically healthy controls to ascertain whether our findings were specific to antidepressant response. After identifying our three CpGs of interest, we compared methylation level of healthy controls at those sites specifically to the methylation levels of nonresponders and responders respectively using two-tailed t-tests accounting for equal variance.

ROC curve analysis
Receiver-operating characteristic (ROC) curve analyses were performed to assess the ability of our successfully replicated CpG site cg06926818 to discriminate between non-responders and responders to antidepressant treatment. Discovery and replication cohort methylation levels at cg06926818 for responders and non-responders were utilized within SPSS to calculate sensitivity, specificity and confidence intervals of their respective ROC curves. Analysis of the ROC coordinates determined the area under the curve (AUC), which was used to assess prediction accuracy. AUC significance was determined using a p-value threshold of p < 0.05.

Investigating effects of blood cell heterogeneity
Heterogeneity of white blood cell types has potential confounding effects on DNA methylation measurements based in peripheral blood samples 26 . To address the possibility of confounding effects of blood cell composition, complete blood cell counts were obtained from each patient during the trial. One-way ANOVA tests were used analyze all three comparison groups for any effects of blood cell proportions on our main results.

CAN-BIND cohort characterization
An overview of our research methodology is presented in Fig. 1. In our discovery cohort, 211 depressed patients and 112 healthy controls were initially recruited for the clinical trial. From these, 34 depressed patients and 10 healthy controls had to be excluded from further analyses because of unsuccessful completion of the trial, or poor DNA sample quality. One healthy control was removed due to poor methylation signal detection QC. Downstream analysis proceeded with 101 healthy control and 177 depressed subjects. Using differential MADRS scores, 95 NRES and 82 RES were identified within the MDD group.
For the remaining HC, NRES and RES samples, psychiatric and social demographics (including previous number of major depressive episodes, family history of psychiatric illnesses, age of MDD onset, highest level of education obtained, yearly income, marital status, and ethnicity) are provided in Table 1. No significant differences in age, gender, level of education, yearly income, marital status, ethnicity, number of previous major depressive episodes and age of MDD onset were noted between HC, NRES and RES groups. Significant differences were noted between all three groups for the following categories: MADRS scores at baseline (F = 1362.46, p < 0.05), MADRS scores at Week 8 (F = 466.17, p < 0.05) and family history of psychiatric illness (F = 36.04, p < 0.05). Post-hoc analyses revealed significant differences between HC and drug-treated groups for both baseline and post-treatment MADRS scores (p = 5.10E-9 for both comparisons). However, no significant  differences between NRES and RES groups were noted for baseline MADRS scores (p = 0.181). Expectedly, we noted significant differences between NRES and RES posttreatment MADRS scores (p = 5.10E-9). There was a significant difference between HCs and drug-treated groups for a family history of psychiatric illness (p = 5.13E-9) but not between NRES and RES (p = 0.994).

Differential methylation analysis
Pre-processing of raw data for retained samples was conducted within R using ChAMP and 679,362 CpG probes were retained for downstream analysis. We identified 2571 significantly DMPs (p < 0.05, q = 0.1); however, this included DMPs with very small differences in methylation (i.e. Δβ < 0.5%). Therefore, a Δβ ≥ ±2% cutoff was applied to identify 303 DMPs with methylation changes that are more likely to be biologically relevant (Supplementary Table 1).

Differential mRNA expression analysis
Sixteen thousand three hundred seventy eight mRNA probes were retained and assessed for differential mRNA expression with linear regression analyses. A cutoff of logFC ≥ ±0.1 was used to eliminate gene probes with low levels of differential expression, resulting in 2009 retained probes. The remaining expression probes were overlapped with DMP probes with the intent to identify DMPs that are more likely to affect cis gene expression.

Blood cell heterogeneity
The individual proportion of lymphocytes, monocytes, neutrophils, eosinophils and basophils did not have any specific, significant effects on our primary findings (Supplementary Table 4).

Replication in the Douglas biomarker study cohort
Thirty one HC, 76 NRES, and 71 RES samples from the Douglas Biomarker Study cohort were used for replicating our methylation findings at cg23687322, cg06926818 and cg08584037 between RES and NRES. Detailed sample characterization of the replication cohort is provided in Table 3.

ROC curve analysis
ROC analysis with AUC calculations for cg06926818 between non-responders and responders were used to assess their potential predictive value as biomarkers in both discovery and replication cohorts (Fig. 3). The AUC for the ROC curve of cg06926818 was 0.66 in our discovery cohort (p = 0.0003, C.I. = 0.58-0.74) and 0.59 in our replication cohort (p = 0.05, C.I. = 0.50-0.69). The ROC curves for cg23687322 and cg08584037 are displayed in Supplementary Fig. 1.

Discussion
This study represents the first genome-wide differential methylation analysis of antidepressant response in clinically depressed patients. We identified significantly differentially methylated CpGs associated with antidepressant response through a genome-wide method in peripheral blood samples retrieved prior to receiving 10-20 mg of escitalopram treatment. We applied a stringent Δβ cutoff and incorporated associated differential gene expression data in order select DMPs with a functional component. This allowed us to identify cg23687322, cg06926818 and cg06926818 located in CHN2 and JAK2 gene regions as our candidate predictors of response.
Differential methylation levels at all three of our CpGs of interest were relatively low. This presents the question of whether there was a true distinction between comparison groups. However, small differential values are commonly reported in studies of psychiatric disorders [27][28][29] and these subtle differences are thought to reflect the complex etiology and course of multifactorial illnesses such as MDD. For psychiatric diseases overall, biomarkers will likely be established by collectively considering a panel of multiple differential findings rather than through the standard method of identifying high fold changes of one specific observation. Additionally, many other social and lifestyle factors, such as dietary habits 30 , smoking history 31 and chronic stress 32 , are shown to have specific effects on the methylome landscape, and were not accounted for as covariates in our analysis. We were ultimately able to validate and correlate our microarray-based findings through targeted bisulfite sequencing, and replicated differential methylation at cg06926818. Future genome-wide methylation studies will provide additional information on the robustness of our biomarker candidates.
All three CpGs of interest were located in promoter regions of their respective genes, specifically within 1500 bp upstream of the transcription start site. Increased DNA methylation at CpGs in promoter regions, in particular, is usually associated with a decrease in mRNA Fourteen unique genes overlapped between our differential methylation and differential expression analyses. Fold change (FC) is in reference to RES expression. However, our data showed that responders exhibited relative decreases in both DNA methylation and mRNA expression at all three CpG sites compared to non-responders. Possible explanations for these findings include the action of other regulatory elements, such as enhancers and DNA methyl-transferases, which may be exerting additional secondary effects on gene expression and methylation. In addition, 3D chromatin structure may have regulatory effects on gene expression, particularly through distant, trans-acting mechanisms. For example, insulators can prevent an enhancer from activating a promoter through long-range interactions with other regulator elements 33 . Finally, the combined expression of multiple loci on separate chromosomes, and their subsequent interactions can also activate or repress opposing epigenetic states 34 . Overall, the complex regulation between methylation and expression implied through our findings likely reflects the intricate relationship between predisposing genetic factors and environmental effects on MDD etiology, symptom severity and variation in treatment response. Differential CHN2 methylation has not been directly assessed in the context of treatment response or MDD, but it has is implicated in disorders that often co-occur with MDD or depressive symptoms, such as substance abuse 35 , ADHD 36 and psychosis 37 . CHN2, or β2-chimaerin, maps to chromosome 7p15.3 and encodes for a GTP-ase activating protein predominantly expressed in the pancreas and brain 37 . In the brain, CHN2 observed to have a role in neurodevelopmental hippocampal axon pruning. From animal-based studies, adult hippocampal neurogenesis has been observed to be stimulated by antidepressant administration 38,39 , and additionally shown to be a prerequisite for a behavioural response to all major antidepressants 40 . Thus, it is possible that differential baseline methylation levels at cg23687322 and cg06926818 within CHN2 could therefore reflect the epigenetic regulation of certain molecular processes in the brain, such as hippocampal neurogenesis, that are required for eliciting antidepressant response.
JAK2, or Janus kinase 2, maps to chromosome 9p24.1 and encodes for an intracellular, non-receptor tyrosine kinase. Upon activation of JAK tyrosine kinase activity, a family of transcription factors called STATs (signal transducer and activators of transcription) are further activated to initiate downstream regulatory activity 41 . Similarly, to CHN2, differential methylation at JAK2 may also have functional regulatory effects on hippocampal neurogenesis that could be associated with antidepressant response. Ketamine, an acutely acting antidepressant, reverses stress-induced learning deficits in adult rats and increases Arc levels (a synaptic plasticity consolidating protein) only in the presence of JAK2. Furthermore, phosphorylated JAK2 colocalizes with Arc in dendritic spines, showing evident JAK2/STAT signaling during synaptic plasticity events 42 . However, JAK2 also has important functions in an inflammation, which is important to consider in the context of depression and treatment response. It has a non-redundant role in cytokine receptor signaling pathways, which mediate components of innate and adaptive immunity 43 . Increased peripheral inflammation has been associated with poorer response to antidepressant medication 44 , and inflammatory components associated with JAK/STAT signaling 43 (i.e. interleukin-6; IL-6 and C-reactive protein; CRP) have been as treatment response predictor biomarkers 45 . Differential methylation in JAK2 is possibly a secondary indicator to the underlying variation of patient inflammation levels that predict response. Moreover, the use of SSRI or SNRI antidepressants are also associated with changes in levels of inflammation 46 . Serotonin and norepinephrine moderate differential cytokine production, and a chronic imbalance between the two can modify the ratio of cytokine types 47 . These differential effects of neurotransmitter levels on inflammation may be related to why our JAK2 probe was not indicative of treatment response in the replication cohort given that it involved both SSRI and SNRI treatment. Differential methylation at cg08584037 in JAK2 is potentially only predictive of SSRI response, but not SNRI response due to the effects of neurotransmitter imbalance on cytokine production. Although we did correct for antidepressant type as a covariate, it would be interesting to note whether differential methylation at cg08584037 in JAK2 would be replicated in future studies that specific to SSRI treatment.
There are a number of specific limitations that should be considered in this current study. Firstly, by applying stringent differential methylation and expression cutoffs, this excludes many potentially interesting sites for evaluation. Although differential cutoff values are valid approaches for site selection in genome-wide approaches, they do not take into account the subtle genetic changes that are likely reflected by the heterogeneity of psychiatric disease phenotypes. Secondly, the EPICarray targets specific CpGs and non-CpGs sites, and thus, other genomic methylation sites that could play a role in antidepressant response may not have been investigated. Thirdly, the technical methods used for differential methylation analyses do not distinguish between hydroxymethylated cytosines and methylated cytosines. Hydroxymethlyated cytosines are often found in gene bodies, and all three CpGs of interest were located in gene promoter regions 48 , decreasing the likelihood that our main findings are affected by this limitation. Further, our studies were conducted in peripheral blood samples [49][50][51] , which may not represent methylation processes in the brain, the target organ of depression.
Finally, the strength of our findings demonstrated through replication and ROC curve analysis indicate that it may not be feasible clinically to rely on methylation at one CpG site alone as a predictive biomarker. This is unsurprising, given the complex nature of psychiatric illness, and the multitude of underlying genetic and environmental factors that may contribute to manifestation and the course of disease. Thus, our results promote the concept of multiple biomarkers (or "biosignatures") being used together, although our results also suggest that gene expression may be a more powerful biomarker than methylation.
This study is the first to conduct genome-wide differential DNA methylation analysis associated with antidepressant response from peripheral blood DNA samples of MDD patients. Three DMPs were identified, and technically validated using targeted bisulfite sequencing. One CpG site within CHN2 was further replicated in an independent cohort. Overall, our findings provide initial evidence for the role of epigenetic factors in treatment, and propose new predictors of antidepressant response. Future studies, using larger sample sizes or longitudinal designs with multiple timepoints should be conducted in order to increase power of antidepressant biomarker studies. Robustness is the most important clinical consideration for biomarkers, and as more genome-wide investigations are conducted across independent cohorts, this will provide future opportunities for further replication and clinical consideration (especially when considered alongside other predictive biomarkers) for our proposed predictors of treatment response.