Introduction

Adverse life experiences alter the epigenetic profile1, 2, 3 in a manner that is salient for the pathophysiology of post-traumatic stress disorder (PTSD).4, 5, 6 Changes in the methylation status of the glucocorticoid receptor gene have been reported previously in combat veterans with PTSD.7 Methylation changes in these same genes were also observed in association with parental trauma, suggesting that such effects may be related to heritable risk profiles.8 In vivo studies have reported consistent findings.9, 10 Together, these discoveries provide a strong rationale for screening the epigenetic profiles of patients’ blood to identify next-generation strategies for PTSD risk factors, diagnostics and experimental therapeutics. A growing body of cohort-based studies has linked epigenetic changes with PTSD development,11, 12, 13 mostly focusing on pre-determined targets such as immunity14, 15, 16 and neuroendocrinology.7, 8, 17, 18

For the present study, strict inclusion–exclusion criteria were used19, 20 to identify a training set comprising 48 male veterans with PTSD (PTSD+) and 51 age-/ethnicity-/gender-matched controls (PTSD−). Control veterans experienced war trauma but were negative for current and past PTSD (Supplementary Table S1). An independent test set comprising 31 PTSD+/29 PTSD− veterans was recruited using the same screening protocol.

Enriched by the differentially methylated genes (DMGs), the epigenetically altered networks are linked to nervous system development and function, PTSD-associated somatic complications and endocrine signaling. All of these networks mined from the training set were validated by the test set (Table 1). Subsequently, we consolidated the test and training sets into a union set and re-evaluated the methylation profile with the larger sample size. The result confirmed 65% of the pathways mined from the test and training sets. Going forward, we will consider the methylation profile from this union set as the discovery set to be confirmed in a new validation set, for which subjects are currently being recruited.

Table 1 The pathways of interest and their status of validation

Materials and methods

Ethical statement

The Institutional Review Boards of the US Army Medical Research and Materiel Command, the New York University Langone Medical Center (New York, NY, USA), the Icahn School of Medicine at Mt Sinai (New York, NY, USA) and the James J Peters Veterans Administration Medical Center (Bronx, NY, USA) approved this study. Study participants gave written and informed consent to participate. The study was conducted in accordance with the provisions of the Helsinki Declaration.

Cohort recruitment and analysis

The recruitment process involved several steps detailed in Supplementary Table S1 and in previous communications.19, 20 The training set of 48 PTSD+/51 PTSD− veterans and the test set of 31 PTSD+/29 PTSD− veterans were probed by whole-genome arrays (Agilent, Santa Clara, CA, USA) containing ~27k CpGIs. The outcome was normalized to minimize the confounding factors attributed to batch processing.21 Functional analysis was performed using those DMGs that encoded CpGIs meeting the cutoff of false discovery rate <0.1.
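
The false discovery rate cutoff described above can be illustrated with a short sketch. The text does not name the procedure used, so the Benjamini-Hochberg adjustment below is an assumption, and the function names and P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order.

    Assumption: the FDR in the text is of this kind; the source does not say.
    """
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_by_fdr(pvalues, cutoff=0.1):
    """Return indices of tests whose adjusted P-value is below the cutoff."""
    adjusted = benjamini_hochberg(pvalues)
    return [i for i, q in enumerate(adjusted) if q < cutoff]


# Hypothetical per-CpGI P-values, for illustration only.
example_pvalues = [0.001, 0.02, 0.03, 0.5]
selected = select_by_fdr(example_pvalues, cutoff=0.1)
```

In this toy input the first three CpGIs pass the 0.1 cutoff and the fourth does not.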

Next, we merged the training and test sets to develop a union set comprising 79 PTSD+/80 PTSD− veterans, which was probed by whole-genome arrays (Illumina, San Diego, CA, USA) containing 450k probes. The outcome was corrected to minimize the effects of cell-population heterogeneity22 and age, and was screened at P<0.05 to find DMGs. The data are available in the GEO database under accession numbers GSE76401 and GSE85399. ClueGo v2.1.2 and Ingenuity pathway analysis were used for network construction, and the pathways that we report met the cutoff of P<0.05.

Results

The primary purpose of the present communication was to identify the functional networks associated with combat-related PTSD, and thereby to provide a better understanding of PTSD pathophysiology. To meet this goal, we recruited 48 PTSD+/51 PTSD− veterans as a training set and 31 PTSD+/29 PTSD− veterans as a test set. To increase the statistical power and to minimize any bias of the Agilent high-throughput array platform, we took two measures. First, we constructed a union set by consolidating the training and test sets, following a recently published strategy.19, 20 Second, we retested the methylation profile, probing the union set using a different array platform manufactured by Illumina. Furthermore, this union set retains sufficient statistical power. Taking a moderate estimate of 50% s.d.'s in probe signals and a relatively conservative estimate for the mean difference (that is, top 1%), 76 people per group should give 95% power to detect an individual probe with a (Bonferroni-adjusted) genome-wide significance of P<1.162931e−07.
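
The genome-wide threshold quoted above is a Bonferroni correction, that is, a family-wise error rate divided by the number of probes tested. A minimal sketch of that arithmetic follows; the family-wise α of 0.05 is an assumption (the text does not state it), and the function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Number of tests implied by a reported Bonferroni threshold."""
    return alpha / threshold


# Threshold reported in the text; alpha = 0.05 is an assumed value.
reported_threshold = 1.162931e-07
n_tests = implied_number_of_tests(0.05, reported_threshold)
```

Under that assumption, the reported threshold corresponds to roughly 430,000 tests.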

Functional analysis of the training set found a host of PTSD-related networks

In the investigation of the 48 PTSD+/51 PTSD− training set, we identified 5578 differentially methylated CpGIs annotated to 3662 genes. We collectively defined the 1698 promoter-bound CpGIs and 157 additional divergent promoter regions as the promoter regions (Supplementary Figure S4A). Altogether, 4721 CpGIs, annotated to 2401 DMGs, displayed a log2 ratio >0.1 and were defined as hypermethylated. Conversely, 857 CpGIs (672 DMGs) displaying a log2 ratio <0.1 were defined as hypomethylated. The remaining DMGs, which were co-enriched by both hyper- and hypomethylated CpGIs, were excluded from the subsequent functional analysis. For the functional analysis, we used those DMGs that encoded promoter-bound CpGIs, estimated as nearly 60% of total DMGs. Significantly enriched networks with similar functional purposes were grouped together, resulting in four network clusters (Figure 1): nervous system functions (Figure 2a), PTSD-associated somatic complications (Figure 2b), PTSD-relevant endocrine signaling networks (Supplementary Figure S6A) and nervous system development (Supplementary Figure S6B).
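
The gene-level grouping described above can be sketched as follows. This is an illustration only, not the authors' pipeline: it applies the log2-ratio cutoffs exactly as worded in the text, and the function name, record layout and gene labels are hypothetical.

```python
from collections import defaultdict


def classify_genes(cpgi_records, threshold=0.1):
    """Group genes by the methylation direction of their CpGIs.

    cpgi_records: iterable of (gene, log2_ratio) pairs, one per CpGI.
    Returns (hyper, hypo, mixed) as sorted lists of gene names; genes in
    'mixed' carry CpGIs of both directions and would be excluded.
    """
    directions = defaultdict(set)
    for gene, log2_ratio in cpgi_records:
        if log2_ratio > threshold:
            directions[gene].add("hyper")
        elif log2_ratio < threshold:
            directions[gene].add("hypo")
    hyper = sorted(g for g, d in directions.items() if d == {"hyper"})
    hypo = sorted(g for g, d in directions.items() if d == {"hypo"})
    mixed = sorted(g for g, d in directions.items() if len(d) == 2)
    return hyper, hypo, mixed


# Hypothetical CpGI records, for illustration only.
records = [
    ("GENE_A", 0.4), ("GENE_A", 0.2),   # both CpGIs above the cutoff
    ("GENE_B", -0.3),                   # below the cutoff
    ("GENE_C", 0.5), ("GENE_C", -0.2),  # one CpGI in each direction
]
hyper, hypo, mixed = classify_genes(records)
```

If each DMG falls into exactly one of the three groups, the counts reported above would leave 3662 − 2401 − 672 = 589 genes in the excluded group.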

Figure 1

Functional enrichment analysis. In all, 352 DMGs encoding promoter-bound differentially methylated CpGIs curated from the training set were enriched for four functional clusters: PTSD-associated somatic complications, PTSD-relevant endocrine signaling, nervous system development and nervous system functions. These clusters were designed to group networks with overlapping functionality. All of these networks were validated by the test set. CRH, corticotrophin-releasing hormone; DMG, differentially methylated gene; GC, glucocorticoid; HPA, hypothalamus–pituitary–adrenal; PTSD, post-traumatic stress disorder; REM, rapid eye movement.

Figure 2

(a) Network cluster annotated to nervous system functions significantly enriched by DMGs in the training set. (b) Network cluster annotated to PTSD-associated somatic complications significantly enriched by DMGs in the training set. (c) Network cluster annotated to PTSD-relevant endocrine networks significantly enriched by DMGs in the training set. (d) Network cluster annotated to nervous system development networks significantly enriched by DMGs in the training set. In all panels, red and green circles are hypermethylated and hypomethylated genes, respectively. Sizes of the circles labeled by the annotation terms are correlated with their significance of enrichment. DMG, differentially methylated gene; PTSD, post-traumatic stress disorder.

Test set validated all the networks identified by the training set

There was a significant (P<0.001) overlap at the DMG level between the 48 PTSD+/51 PTSD− training set and the 31 PTSD+/29 PTSD− test set with 779 DMGs in common between the two sets assayed by the Agilent whole-genome array. Furthermore, a significant agreement was noted at the functional level as all of the networks mined from the training set emerged significantly enriched by DMGs identified from the test set (Table 1).
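
The significance of an overlap between two gene lists is commonly assessed with a hypergeometric tail probability. The text does not state which test produced the reported P-value, so the sketch below is an assumed approach; the function name and the toy numbers are hypothetical.

```python
from math import comb


def overlap_pvalue(universe, size_a, size_b, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(universe, size_a, size_b).

    universe: number of genes assayed on the array (background)
    size_a:   number of DMGs in the first set
    size_b:   number of DMGs in the second set
    overlap:  number of DMGs shared by both sets
    """
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, two lists of 5 genes sharing all 5.
p_toy = overlap_pvalue(universe=10, size_a=5, size_b=5, overlap=5)
```

Applying this to the study would require the number of genes covered by the array and the DMG count of the test set, neither of which is given in this section.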

Union set probed by a different array platform validated a majority of networks identified by the training and test sets

The union set probed by the Illumina array resulted in 3339 DMGs, 74.4% of which encoded hypermethylated CpGIs (Supplementary Figures S4B and C). One hundred ninety-one DMGs were in common between the training set and union set, and 107 DMGs were in common between the test set and union set (Supplementary Figure S5). There were 852 DMGs encoding promoter-bound CpGIs enriched in networks linked to addiction, long-term impact on cerebral functions, social withdrawal, diabetes, aging, inflammation, circadian rhythm, dopamine-serotonin signaling, neurogenesis, cannabinoid signaling, nerve impulse and synaptic plasticity. In addition, 407 DMGs in shelf and shore regions were enriched in networks associated with REM sleep, circadian rhythm, inflammation, hypothalamic–pituitary–adrenal axis and axon guidance. Altogether, the union set confirmed 15 out of 23 networks mined from the training set and validated by the test set. All of the networks clustered under PTSD-associated somatic complications and nervous system development were confirmed by the training, test and union sets.

Methylation status of selected DMGs validated by targeted bisulfite sequencing

Forty-two DMGs were selected from the training set based on their methylation status and their relevance to PTSD. Their methylation status was verified by targeted bisulfite sequencing (Zymo Research, Irvine, CA, USA; Table 2).23, 24 For 20 of the 42 DMGs, the sequencing results agreed with the Agilent array data. Table 2 lists these genes and their relevance to PTSD and associated comorbidities.

Table 2 Differentially methylated genes validated by targeted sequencing

Discussion

Clinical measures were in agreement with the epigenetically altered networks and DMGs

Self-reported clinical measures indicated that veterans with PTSD were concurrently experiencing higher levels of fear, social withdrawal, anxiety, hostility, depression and anger than were controls. Epigenetic investigation of DNA extracted from whole blood revealed networks relevant to these PTSD-associated negative emotions. Greater waist size, waist-to-hip ratio and body mass index19 were found in PTSD cases as compared with controls and are consistent with the observed pathways associated with cardiac diseases, diabetes and metabolic syndrome. PTSD-associated immune dysregulation has been previously reported in epigenetic studies.14, 15, 19 Consistent with previous findings,14 our results revealed a host of innate immunity-associated genes, constituting 60% of the entire set of DMGs found altered in PTSD patients. Extending this knowledge, we functionally linked a majority of these genes to mobilization of phagocytic macrophages and leukocytes.

In addition, we identified epigenetically altered networks linked to learning and memory that are relevant for PTSD-associated neurocognitive impairment. Previous epidemiology studies suggested that there was an increased risk of premature aging in PTSD.34, 35, 36 We identified two epigenetically altered networks relevant to aging. The first network is telomere management and its interaction with the pathways of two mediators, wnt/β-catenin37 and p53.38 The epigenetic profile of these aging markers35 was altered in PTSD. The second network is mitochondrial dysfunction, also epigenetically altered in PTSD veterans. Consistent with these markers of premature aging, we recently found evidence for decreased mitochondrial DNA copy numbers in PTSD veterans from this cohort, suggesting a role for energy deprivation in PTSD that escalates the aging process.39

Premature aging40, 41 and other PTSD-associated somatic complications, such as dysregulation of immunity,42 are known to be associated with circadian rhythm. Veterans with PTSD showed epigenetic regulation of some of the key molecular nodes responsible for setting the circadian clock. We identified DMGs encoding CREB3 and GRIN2A, which control photoreception43 and are involved in the signaling that entrains circadian clock regulation by the CLOCK and PER1 genes.44

Epigenetic changes in neurogenic functional pathways were captured by the differential methylation of members of the neural helix–loop–helix family, including NEUROG1 and HES1 and their regulators ATOH-1, Pax6 and NKX2-2.45, 46 Epigenetic perturbations of networks related to the hypothalamic–pituitary–adrenal axis functions and the synthesis of key feedback regulators, such as corticotrophin and glucocorticoid, as well as epigenetic changes in the serotonergic and dopaminergic networks, may serve as targets for novel therapeutics for PTSD.47

Strengths, limitations and future work

The Diagnostic and Statistical Manual of Mental Disorders-IV diagnostic criteria48 were used to determine the PTSD status, an approach to clinical phenotyping that has limitations. We attempted to maximize signal detection by employing stringent selection criteria including a requirement of Clinician-Administered PTSD scale scores of 40 or greater for PTSD cases and scores less than 20 for controls.19, 20 Our array-based approach selected two platforms that ensured extensive coverage of the genome and instilled higher confidence in the outcome. We also focused primarily on the promoter regions, as the methylation shifts near the transcription start site are most likely to be associated with long-term gene silencing.49

Given the biological heterogeneity of PTSD, our findings are limited by the sizes of our discovery, test and union sets.50 The selection of the Illumina platform was driven by the following three factors: (i) this platform offered nearly twice the number of CpGIs to test in comparison to the Agilent platform; (ii) the significantly lower amount of input DNA required for the Illumina assay (500 ng DNA versus 5 μg for the Agilent assay) satisfied our need to conserve gradually decreasing DNA stocks; and (iii) the growing preference for the Illumina assay in the epigenetics literature11, 51 supported its selection. The present study recruited the largest cohort used to date to study PTSD pathophysiology. The statistical analysis has moderate statistical power attributed to the sample size, which was further enhanced by the strict regulations applied by the pathway enrichment analysis. Many of the genes discovered here have previously been linked to PTSD via transcriptomic variations. In addition, many novel epigenetic markers linked to PTSD were presented here. Together, this study revealed some of the key aspects of PTSD, such as its long-term health implications, which could be best explained by the epigenetic model. However, it is challenging to draw robust mechanistic conclusions due to the non-longitudinal nature of the study; hence, there is limited scope for making inferences about whether these epigenetic alterations are causes or consequences of PTSD. This study also lacks a prospective design, gender balance and systems-wide integration. The findings are further limited by the fact that the array platforms are potentially unable to provide the extensive coverage typical of deep sequencing.

On the basis of these findings, future work should focus on those epigenetically altered networks presented herein that showed clinical relevance to PTSD pathophysiology. Our study presented a knowledge-driven data-mining architecture particularly useful for identifying potential biomarkers for a multifactorial disease such as PTSD. In particular, we demonstrated how to use the clinical and physical dimensions as guiding cues to mine the molecular markers linked to disease pathophysiology. This data-mining approach will be practised further in our future study, which will recruit a new validation set to confirm the results obtained from the union set serving as the better-powered discovery set. We will also recruit a cohort of female veterans to minimize gender bias. Additional data from blood counts and magnetic resonance imaging will be included. System-wide knowledge integration will be performed to identify PTSD biomarkers with the highest efficacy.52, 53, 54, 55, 56