Integrated transcriptome and methylome analysis in youth at high risk for bipolar disorder: a preliminary analysis

First-degree relatives of patients with bipolar disorder (BD), particularly their offspring, have a higher risk of developing BD and other mental illnesses than the general population. However, the biological mechanisms underlying this increased risk are still unknown, particularly because most of the studies so far have been conducted in chronically ill adults and not in unaffected youth at high risk. In this preliminary study we analyzed genome-wide expression and methylation levels in peripheral blood mononuclear cells from children and adolescents from three matched groups: BD patients, unaffected offspring of bipolar parents (high risk) and controls (low risk). By integrating gene expression and DNA methylation and comparing the lists of differentially expressed genes and differentially methylated probes between groups, we were able to identify 43 risk genes that discriminate patients and high-risk youth from controls. Pathway analysis showed an enrichment of the glucocorticoid receptor (GR) pathway with the genes MED1, HSPA1L, GTF2A1 and TAF15, which might underlie the previously reported role of stress response in the risk for BD in vulnerable populations. Cell-based assays indicate a GR hyporesponsiveness in cells from adult BD patients compared to controls and suggest that these GR-related genes can be modulated by DNA methylation, which poses the theoretical possibility of manipulating their expression as a means to counteract the familial risk presented by those subjects. Although preliminary, our results suggest the utility of peripheral measures in the identification of biomarkers of risk in high-risk populations and further emphasize the potential role of stress and DNA methylation in the risk for BD in youth.


INTRODUCTION
Bipolar disorder (BD) is a devastating mental disorder with a prevalence of 1-2% and increased rates of several chronic comorbid medical conditions. 1 Even though BD is highly heritable, most studies on its genetic basis have been conducted with chronically ill adults and therefore it is not known whether abnormalities found in patients precede the onset of illness, emerge during early illness development or follow BD onset. Particularly, the role that genes play in triggering onset of BD is not clear, but is believed to involve the interaction between many susceptibility genes of small effect and a broad range of environmental factors. 2 First-degree relatives of BD patients are at increased risk for BD and other severe mental illnesses, and present a higher polygenic load of risk variants than control individuals. 3 Particularly, offspring of bipolar parents present a fourfold increased risk of developing BD compared to offspring of healthy parents. 4 Therefore, the search for biomarkers of risk in these individuals is urged and would allow an opportunity for prevention or early intervention.
DNA methylation, being modulated by both genetic background 5 and environmental exposure, 6 may be a marker of risk/resilience and a trigger for the development of BD. 7 Further, DNA methylation can change the expression of the key genes, potentially contributing to disease susceptibility. Previous studies have suggested that the combined assessment of gene expression and methylation data outperforms either data modality in identifying disease susceptibility loci, even in relatively small sample sizes. 8 Of note, the study of such markers in peripheral blood cells is warranted, given the easy access to tissue allowing for longitudinal comparisons in high-risk subjects. In addition, peripheral pathways, such as inflammatory and metabolic processes, have been consistently associated with BD and its risk, 9 and blood DNA methylation has been shown to correlate with brain volume 10 and symptoms of major psychiatric disorders, including depression 11 and BD. 12 We hypothesized that DNA methylation and gene expression events in peripheral pathways would discriminate between unaffected subjects at high risk for BD and healthy controls and could be promising biomarkers to monitor early presentation and prodromal symptoms in youth at risk for developing BD, because of the presence of a parent with a diagnosis of BD. Specifically, we aimed to integrate gene expression and DNA methylation data to identify risk genes of biological and functional relevance for the risk of BD in youth. patients diagnosed with pediatric BD (n = 6), unaffected offspring of a BD type I parent (n = 6; high risk) and healthy controls without any family history of psychiatric disorders in a first-degree relative (n = 6; low risk) matched for age, gender, race, ethnicity and pubertal development ( Table 1). Subjects were recruited as part of the Houston Area Pediatric Bipolar Registry. The Kiddie-Sads-Present and Lifetime Version 13 was utilized to confirm or rule out DSM-IV Axis I disorders among the youth, which was confirmed subsequently in a clinical evaluation with a research psychiatrist. The affective state was assessed with the Children's Depression Rating Scale and the Young Mania Rating Scale. 14,15 Functioning was assessed by the Global Assessment of Functioning scale. To assess environment, family functioning ratings were assessed using the parentrated scale Family Environment Scale, 16 which provides information about family strength and problem areas divided into three dimensions (family relationship, personal growth and system maintenance). These are assessed in 10 subscales, of which cohesion and conflict in the family relationship domain have been the most consistently altered factors in family functioning of BD. [17][18][19] For the BD offspring group, at least one biological parent met the DSM-IV criteria for BD type I, as determined by the Structured Clinical Interview for the DSM-IV Axis I Disorders interview. These high-risk subjects did not show any affective or non-affective diagnoses at the time of enrollment, including anxiety or externalizing problems, both of which have been shown to predict the prospective development of an affective disorder in this vulnerable population. 20 Exclusion criteria for all participants included: (a) current major medical problems; (b) previous history of neurologic disorders, including head injury with loss of consciousness; (c) pregnancy; and (d) family history of a hereditary neurological disorder. High-risk offspring were recruited as their BD parent presented for evaluation at the inpatient or outpatient programs of the UT Center of Excellence on Mood Disorders.

Transcriptome profiling
Heparin-anticoagulated blood from each fasting subject was used for the isolation of peripheral blood mononuclear cells (PBMCs) using the Ficoll-Paque (GE Healthcare, Little Chalfont, UK) density gradient centrifugation protocol, followed by isolation of RNA with the RNease Plus Mini kit (Qiagen, Hilden, Germany). After quantification on NanoDrop (Thermo, Waltham, MA, USA), the integrity of RNA samples was assessed by the Agilent RNA 6000 Nano Kit (Agilent Technologies, Santa Clara, CA, USA) on a Bioanalyzer (Agilent Technologies), and samples (RNA integrity number 49.8) were subsequently labelled into biotinylated cRNAs with the TargetAmp-Nano Labeling Kit (Epicentre, Madison, WI, USA). Genome-wide expression levels were measured using the Human HT-12 v4 Expression BeadChip Array (Illumina, San Diego, CA, USA) according to the manufacturer's instructions and scanned on an iScan Microarray Scanner (Illumina). Gene expression data were later quantile-normalized and exported into text files using GenomeStudio software v2011.1 (Illumina), and data analyses were performed using JMP Genomic (version 6.0, SAS Institute, Cary, NC, USA). Student's t-test was used in JMP to analyze values, with a Benjamini-Hochberg false discovery rate correlation of 1%, α = 0.05, and a cutoff of -log 10 (P-value) 41  Median (interquartile range).
Risk markers for bipolar disorder in youth GR Fries et al differences, differentially methylated probes (DMPs) were selected based on a differential P-valueo 0.01 and an absolute delta beta value (magnitude of the difference between groups) ⩾ 0.3. Correlation between the average signal for the gene expression and the beta value for each gene was performed using GenomeStudio v2011.1.

Identification of risk genes and pathway analysis
'Risk genes' were identified as those genes that were concordant in the lists of differentially expressed genes or DMPs between controls versus high-risk offspring and controls versus BD patients. These genes were then uploaded into Ingenuity Pathway Analysis (IPA, Qiagen) for the assessment of enrichments in canonical pathways and networks. This was followed by literature and database mining to check for evidence of associations between the risk genes identified in our analysis and previous studies in BD patients (described in the Supplementary Material).

Statistical analysis
Data analyses were carried out using IBM SPSS Statistics 23 (IBM, Armonk, NY, USA). Descriptive statistics were used to report demographic and clinical characteristics of the sample. Normality of data distribution was assessed with Shapiro-Wilk's test and histogram visualization. Categorical data were compared with chi-square tests or Fisher's exact tests. One-way analysis of variance was performed to compare parametric continuous demographic variables between groups. Independent Student's t-test was used to compare age between the adult patients and controls used in the lymphoblastoid cell line experiments. Factorial analysis of variance was used to analyze gene expression levels after treatment of cells with 5AzadC or dexamethasone. Analysis of covariance was also used to compare gene expression (MED1, GTF2A1, HSPA1L and TAF15) between groups including family cohesion, family conflict and Global Assessment of Functioning as covariates. Correlations between gene expression and methylation levels were assessed using Pearson's correlation test. Significance was set at Po0.05.

Sample
Demographic data from the subjects are presented in Table 1. While groups did not differ for age, gender, ethnicity, race, years of education or pubertal development (P40.05 for all comparisons), patients with pediatric BD presented significantly higher scores for manic and depressive symptoms, as well as impairments in functioning when compared to controls and unaffected offspring.
In addition, five of the patients were on medication (three on atypical antipsychotics and two on antidepressants).
Transcriptome profiling Out of the 34 694 genes interrogated, 128 were found to be differentially expressed between unaffected high-risk offspring and healthy controls (Supplementary Table 1). Fifty-six genes showed a significant difference between healthy controls and BD patients (Supplementary Table 2), whereas 12 genes were found to be differentially expressed between the high-risk offspring and BD patients (Supplementary Table 3). Thirty-three of the genes that were found to be differentially expressed between controls and unaffected high-risk offspring are also in the list of genes that were different between controls and BD patients ( Table 2), suggesting them as 'risk genes' (Figure 1a). Array data were validated by confirming the differences in four selected genes by real-time quantitative PCR (Supplementary Figure S1).
Methylome profiling DNA methylation analysis showed that controls and unaffected high-risk offspring differed for 75 probes (Supplementary Table 4), while healthy controls and BD patients showed a difference in 64 probes (Supplementary Table 5). Eighteen probes were concordant between the lists of DMPs in offspring versus controls and BD patients versus controls (Figure 1b). These probes are located within the following 10 annotated genes (Table 2): PQLC2L, PCNX, MAGI2, HOOK2, SLC45A4, GLUL, PGCP, LCE2D, NLK and ZNF195, suggesting these loci as markers of vulnerability (risk genes). Finally, unaffected offspring and BD patients differed in 47 probes (Supplementary Table 6).

Identification of risk genes and pathway analysis
We performed pathway analysis on the 43 'risk genes' using the IPA software to identify gene networks of potential relevance for BD (Table 3 and Supplementary Table 7). Two genes were not mapped in IPA and therefore were not included in the analysis (LOC100131360 and LOC728649). The top significant canonical pathway enriched in our analysis was the GR signaling pathway (P = 0.00194), followed by the glutamine biosynthesis I (P = 0.00198) and the estrogen receptor signaling pathways (P = 0.00207). Using the IPA Knowledge Base, most of the risk genes were shown to connect directly or indirectly with molecules and networks previously reported to be associated with BD ( Figure 2). These gene networks include genes involved in circadian rhythm, immune system and synaptic scaffolding. Evidence of previous reports of association of the risk genes identified in our analyses with BD was further determined by database mining from linkage, genome-wide association study and genome-wide expression study studies (Table 2 and Supplementary Material).
Differences between patients and high-risk youth While our study was designed to specifically identify risk markers that would discriminate between controls and patients/high-risk youth (risk genes), we also performed an exploratory analysis with the genes that differentiated patients from high-risk offspring (Supplementary Tables 3 and 6). Specifically, we performed pathway analysis with the 42 genes obtained after combining the lists of genes from the gene expression and DNA methylation analyses. Five of the genes were not mapped on the IPA software (FLJ40113, LOC100130138, LOC23117, LOC648226 and LOC729978), and therefore the analysis was run with the remaining 37 genes. Interestingly, top-ranked canonical pathways include Fcγ receptormediated phagocytosis in macrophages and monocytes, PTEN signaling, Cdc42 signaling, Tec kinase signaling and interleukin-15 production, which have all been previously directly or indirectly associated with BD pathophysiology or treatment. [23][24][25] Correlation between expression and DNA methylation On the basis of the suggested effects of DNA methylation on the modulation of gene expression, we compared the lists of differentially expressed genes and DMPs for each comparison performed. The only concordance identified was for the gene LSM5, which was simultaneously found to be differentially expressed and methylated between controls and BD patients. To further identify which genes were being directly modulated by DNA methylation, we performed an unbiased integrated analysis of gene expression and DNA methylation by correlating all of the genes from both lists, and then filtered the list to identify those correlations with an R Pearson4Abs 0.8. By doing so, we were able to identify genes showing a strong correlation between methylation and expression that had not survived the stringent parameters used for the previous individual analyses (especially considering the fold-change cutoff used for the differential methylation analysis). This analysis identified 135 genes (Supplementary Table 8 Modulation of the expression of risk genes in lymphoblastoid cells On the basis of the results of the pathway analysis, we sought to functionally investigate the modulation of the four genes assigned to the top-enriched canonical pathway (GR signaling): MED1, GTF2A1, HSPA1L and TAF15. Expression of these four genes was reduced in PBMCs from patients with BD and unaffected offspring compared to controls (Figure 1d). To assess the potential role of Nuclear casein kinase and cyclin-dependent kinase substrate 1   Abbreviation: BD, bipolar disorder.

Risk markers for bipolar disorder in youth GR Fries et al
DNA methylation in modulating the expression of these genes, we measured RNA levels after treating lymphoblastoid cells from adult patients and controls with 5AzadC, a DNA methyltransferase inhibitor. As expected, treatment with 5AzadC significantly reduced global DNA methylation (Figure 3b), whereas no alteration in cell viability was detected (Figure 3a and Supplementary Figure 2). Treatment with 5AzadC significantly reduced the expression levels of MED1 and TAF15 and increased the expression of HSPA1L in both patients and controls, whereas no statistical differences were found in the expression of GTF2A1 (Figure 3). Importantly, HSPA1L levels were significantly decreased in patients compared to controls irrespective of 5AzadC treatment.

GR responsiveness in lymphoblastoid cells
Taking all regulators into account, we sought to investigate parameters related to the GR activity and signaling in the lymphoblastoid cells from adult BD patients and controls. In order to do that, we treated cells in vitro with dexamethasone (a GR agonist) for 4 or 48 h and checked for the expression of known GR-responsive genes. Of note, treatment with dexamethasone did not significantly reduce cell viability (Supplementary Figure 3). The use of dexamethasone-induced expression of GR-responsive genes has been shown to successfully predict GR activity in the past. 26,27 Specifically, we measured the expression of FKB5, [26][27][28]30) and PER1, 31 all of which were responsive to dexamethasone after 4 h of treatment, as expected ( Figure 4). No difference between groups was found for the 4-h time point. However, cells from patients showed a significant reduction in the dexamethasone-induced expression of TSC22D3 (P = 0.043) and PER1 (P = 0.02) after 48 h of treatment compared to controls, while no difference between groups was found for FKB5 ( Figure 4). No difference between patients and controls was seen at baseline for any of the conditions tested. Altogether, these results suggest a subtle yet detectable GR inhibition in adult BD patients compared to controls, which is evident after stimulation with dexamethasone for 48 h.
Correlation with clinical parameters Finally, in order to identify the potential impact of the risk genes in clinical measures, we correlated the expression levels of MED1, GTF2A1, HSPA1L and TAF15 with variables related to family   PER1 and FKBP5). The levels of these genes were also measured by real-time quantitative PCR in PBMCs from children and adolescents that are healthy controls, unaffected offspring of parents with bipolar disorder and patients with pediatric bipolar disorder. *P o0.05 compared to vehicle in the same group. DEX, dexamethasone; PBMC, peripheral blood mononuclear cell; PCR, polymerase chain reaction. environment and functioning. Family cohesion scores were positively correlated with the expression levels of HSPA1L (P = 0.049), GTF2A1 (P = 0.042) and MED1 (P = 0.011), but not significantly with TAF15 (P = 0.066; Figure 1e). In contrast, family conflict scores were negatively correlated with the expression of the four genes (HSPA1L, P = 0.019; GTF2A1, P = 0.006; MED1, P = 0.001; TAF15, P = 0.013; Figure 1f). Of note, the correlations for each gene clearly discriminate controls from BD patients and high-risk offspring. In addition to cohesion and conflict, we also found sparse

DISCUSSION
To the best of our knowledge, this is the first study to integrate peripheral genome-wide expression and methylation to identify markers of risk in a population at high risk for BD. Our results show that youth at high risk for BD present common alterations with pediatric BD patients in peripheral gene expression and DNA methylation that differentiate them from healthy controls. A combined analysis of the alterations in expression and methylation led to the identification of 43 'risk genes' that are especially enriched for the GR signaling canonical pathway. Our results also show that the expression of the genes assigned to this pathway can be modulated by DNA methylation, which poses the theoretical possibility of manipulating their expression as a means to counteract the familial risk presented by those subjects. Although preliminary, our results suggest the utility of peripheral measures in the identification of biomarkers of risk in high-risk populations and further emphasize the potential role of stress in the risk of BD in youth.
As initially hypothesized, we found that offspring of bipolar parents are much more similar in terms of peripheral expression and methylation events to patients than controls, even when no psychiatric symptoms are yet manifested. Our database and literature mining showed that differential gene expression in peripheral blood cells from BD patients had been previously reported for some of the risk genes identified, including ZNF641 and ZNF234, members of the zinc-finger family of genes, of which ZNF804A has been associated with BD and psychosis in genomewide association study. 32,33 Although most of the risk genes had not been previously shown to directly confer risk for BD, many of them are within pathways previously implicated in BD (Figure 2). Our identification of novel genes may be because of the fact that we combined findings from gene expression and DNA methylation. Such a multi-omics approach is crucial not only because of the well-described interplay between DNA methylation and gene expression, but especially when considering the non-canonical roles of DNA methylation (not necessarily altering the expression of the gene at which it is located). 7 To our surprise, high-risk youth and controls showed a higher number of differentially expressed genes than controls and BD patients. Accordingly, one could hypothesize that most of the gene expression alterations found in the unaffected offspring may account for a compensatory mechanism for the high risk presented by them, ultimately characterizing a resilience factor (of note, all offspring assessed in this study were unaffected for any affective or non-affective psychiatric condition). It is possible that the shift to a full-blown diagnosis might be accompanied by a suppression of these identified expression alterations and the establishment of new illness-specific ones (Supplementary Table  S3). Longitudinal studies will be able to clarify this issue.
Interestingly, pathway analysis performed with the risk genes suggested that GR signaling is the top-ranked pathway associated with BD risk in the periphery. This result is in accordance with several studies that suggest that the hypothalamus-pituitaryadrenal (HPA) axis may have a key role in BD and its risk in firstdegree relatives. 28 In fact, HPA dysfunction, along with a dysfunction in circadian rhythm and the immune system, has been proposed as one of the main biological factors underlying the risk for BD in offspring of bipolar parents. 34,35 Moreover, highrisk offspring are more likely to have experienced episodic and chronic interpersonal stress, 36 and they have also been shown to present higher daytime cortisol levels than low-risk offspring. [37][38][39] Prospective studies have also shown that abnormalities in the HPA axis predict the onset of an affective disorder in different samples of high-risk youth, including the offspring of BD parents. [40][41][42][43] Of note, the alterations found in the expression of the four risk genes assigned to the GR signaling pathway (HSPA1L, TAF15, GTF2A1 and MED1) might be contributing to this purported HPA axis dysfunction. HSPA1L (heat shock 70-kDa protein 1-like) is a member of the heat shock protein 70 family and has been shown to inactivate GR through partial unfolding. 44 Likewise, TAF15 (TAF15 RNA polymerase II, TATA box binding protein-associated factor, 68 kDa) and GTF2A1 (general transcription factor IIA, 1, 19/37 kDa) have also been indirectly related to GR inhibition. [45][46][47] On the contrary, MED1 (mediator complex subunit 1) has been shown to increase the activation of the glucocorticoid-GR dimer in nuclei. 48 The interplay between these four genes, along with other known stress-responsive genes, might be leading to a GR dysfunction and an increased vulnerability to the long-lasting negative effects of stress.
In this context, we used a cell-based assay to further explore predictors of GR activity and the means by which expression of these GR-related genes can be modulated in BD patients and controls. As hypothesized, taking all regulators into account, our results suggest that GR is slightly hyporesponsive in the cells from adult patients compared to healthy controls, which is in accordance with previous studies. 28 Moreover, with the exception of GTF2A1, our results show that inhibiting DNA methylation can alter the expression of the GR-related risk genes. Specifically, as opposed to what is traditionally thought as the repressive effect of DNA methylation, the expression of MED1 and TAF15 was significantly decreased after inhibition of DNA methylation, whereas HSPA1L expression was increased after treatment. This indicates that differential methylation patterns might be contributing to the alterations seen in the high-risk youth, and suggests the possibility of targeting this process to prevent illness onset. Of note, HSPA1L expression was decreased in the adult bipolar patients compared to controls, as was seen in the pediatric BD and high-risk subjects, further validating a role for this gene in risk for BD.
It is hypothesized that the risk of BD results from the interaction between genetic alterations and environmental factors. 2,7 Accordingly, high-risk offspring who experienced high interpersonal chronic stress display a larger cortisol rise following awakening than those reporting low interpersonal chronic stress, 49 and low levels of structure provided by parents have been predictive of an elevated cortisol response following awakening and during a laboratory psychosocial stressor. 50 Altogether, these studies suggest an important role of family environment in modulating HPA axis activity in youth at high risk. Accordingly, the four GRrelated risk genes identified in our analysis showed significant correlations with family environment scores, most notably family cohesion and conflict, and these correlations clearly discriminate controls from BD patients and high-risk offspring. Therefore, it is possible that the familial risk presented by BD offspring can be counteracted by targeting the environment and thereby possibly the expression of risk genes. In fact, existing family-focused stressmanagement interventions for children at risk for BD 51 might have important relevance for future longitudinal studies examining gene expression and methylation in high-risk youth. Noteworthy, we need to consider that other clinical factors might also differentiate the patients and high-risk sample from the controls, including more serious adverse events such as early trauma and maltreatment (which were not available for our particular sample). Future studies should include such measures in the search for risk biomarkers in vulnerable youth.
The results of our study need to be discussed in light of some limitations. First, this is a preliminary cross-sectional analysis with a small sample size, and we cannot rule out the possibility of type I error (false-positives) in our findings. Accordingly, our results must be seen as exploratory and one needs to consider that statistically empowered studies are now required for replication and validation. In this sense, only longitudinal studies with larger sample sizes will be able to determine which of these identified alterations are of significant relevance for illness onset and/or resilience. Moreover, larger studies will also be able to account for the heterogeneity of BD patients when assessing differences in the risk transmitted to their offspring, which is particularly warranted. Of note, in the context of finding relevant markers to be assessed at an individual subject level (personalized medicine), one could argue that these markers should be detectable also in very small sample sizes. This is not to discredit the real limitation of our analysis, but rather to emphasize the potential clinical utility of our results. In this sense, an analysis such as ours is much more likely to identify broad pathways and mechanisms to be followed up in high-risk populations than to really pinpoint definitive specific genes (which would need to be identified at the genomewide level with much larger sample sizes). Accordingly, our further validation of the GR-related genes with the cell-based assays suggests the relevance and importance of our preliminary results, regardless of the potential limitations inherent to the sample size. Second, given the lack of genotype information, analysis of expression and methylation quantitative trait loci was not possible. Both of these analyses would be of importance in future studies in light of the proposed multifactorial model for the risk of complex psychiatric disorders. Third, our search for risk genes was performed in PBMCs, which do not necessarily represent a proxy of brain expression and methylation. Nevertheless, there is evidence showing concordances between both tissues, [52][53][54] and measures in blood have already been correlated with brain volume 10 and psychiatric symptoms. 12,55 Further, peripheral pathways such as inflammatory and metabolic processes have been consistently associated with BD and its risk, and are highly susceptible to exposure to environmental stress. Alterations in the function of genes within these peripheral pathways, including the GR signaling pathway, may lead to an inability to respond appropriately to a given environmental insult, causing behavioral alterations that may lead to the manifestation of symptoms seen in BD. Fourth, as most patients were not drug-free, it is possible that medication use and its duration (which we could not control for) might be inducing changes in some of the gene expression and methylation markers identified. Finally, as our study does not account for cellular heterogeneity in PBMCs, the methylation results may vary. 56 Moreover, future studies should include nextgeneration sequencing technologies for the assessment of gene expression, as opposed to array technology, as this would lead to identification of non-coding sequences or splicing-and allelespecific transcripts that might be of special relevance to the risk for BD.
In summary, our preliminary study provides evidence that peripheral gene expression and DNA methylation can discriminate between youth at high risk, patients with BD and healthy controls. Our results require replication and validation in larger cohorts because of our small sample size and the possibility of type I error.
With that in mind, this study suggests that such markers might underlie the familial risk of BD shown by high-risk populations. Specifically, alterations related to the GR signaling were observed, which may help explain the HPA axis alterations previously reported in youth at high risk for BD. The strong correlation between genomic and family environment measures suggests that targeting these parameters might be beneficial in preventing illness onset in this population, or may be targets for early intervention.