DNA methylation signatures of aggression and closely related constructs: A meta-analysis of epigenome-wide studies across the lifespan

DNA methylation profiles of aggressive behavior may capture lifetime cumulative effects of genetic, stochastic, and environmental influences associated with aggression. Here, we report the first large meta-analysis of epigenome-wide association studies (EWAS) of aggressive behavior (N = 15,324 participants). In peripheral blood samples of 14,434 participants from 18 cohorts with mean ages ranging from 7 to 68 years, 13 methylation sites were significantly associated with aggression (alpha = 1.2 × 10−7; Bonferroni correction). In cord blood samples of 2425 children from five cohorts with aggression assessed at mean ages ranging from 4 to 7 years, 83% of these sites showed the same direction of association with childhood aggression (r = 0.74, p = 0.006) but no epigenome-wide significant sites were found. Top-sites (48 at a false discovery rate of 5% in the peripheral blood meta-analysis or in a combined meta-analysis of peripheral blood and cord blood) have been associated with chemical exposures, smoking, cognition, metabolic traits, and genetic variation (mQTLs). Three genes whose expression levels were associated with top-sites were previously linked to schizophrenia and general risk tolerance. At six CpGs, DNA methylation variation in blood mirrors variation in the brain. On average 44% (range = 3–82%) of the aggression–methylation association was explained by current and former smoking and BMI. These findings point at loci that are sensitive to chemical exposures with potential implications for neuronal functions. We hope these results to be a starting point for studies leading to applications as peripheral biomarkers and to reveal causal relationships with aggression and related traits.


Introduction
Aggression encompasses a range of behaviors, such as bullying, verbal abuse, fighting, and destroying objects. Early life social conditions, including low parental income, separation from a parent, family dysfunction, and maternal smoking during pregnancy are risk factors for childhood aggression [1][2][3]. High levels of aggression are a characteristic of several psychiatric disorders and may also be caused by traumatic brain injury [3], neurodegenerative diseases [4] and alcohol and substance abuse [5,6].
DNA methylation mediates effects of genetic variants in regulatory regions on gene expression [7] and is modifiable by early life social environment, as demonstrated by animal studies [8,9], and by chemical exposures including (prenatal) exposure to cigarette smoke, as illustrated by numerous human studies [10]. Despite the large tissue-specificity of DNA methylation, effects of genetic variants on nearby DNA methylation (cis mQTLs) correlate strongly between blood and brain cells [11]. DNA methylation signatures of chemical exposures [12] and maternal rearinging [9] show a certain (but less understood) degree of conservation across tissues.
Large-scale epigenome-wide association studies (EWASs) have become feasible through DNA methylation microarrays applied to blood samples from large cohorts, identifying thousands of loci where methylation in cord blood is associated with maternal smoking [13]. Methylation in blood is associated with depressive symptoms [14] and brain morphology [15], with some evidence for blood DNA methylation signatures being a marker for methylation levels [15] or gene expression [14] in the brain. For several traits, DNA methylation scores based on multiple CpGs from EWAS show better predictive value than currently available polygenic scores [16,17].
Small-scale studies (maximum sample size = 260) have provided some evidence that DNA methylation differences in blood, cord blood, and buccal cells are associated with severe forms of aggressive behavior and related problems in children and adults, including (chronic) physical aggression and early onset conduct problems [18][19][20], but studies on violent aggression in schizophrenia patients (N = 134) [21] and a population-based study of continuous aggression symptoms in adults (N = 2029) [22] did not detect epigenome-wide significant sites.
We performed an EWAS meta-analysis of aggressive behavior and closely related constructs. We chose to metaanalyze multiple measures of aggression across ages and sex to maximize sample size. The contribution of genetic influences to aggression is largely stable, at least throughout childhood [23], whereas epigenetic signatures may be dynamic and may differ across cell types and age. Therefore, we performed separate meta-analyses of peripheral blood collected after birth (N = 14,434) and cord blood (N = 2425), followed by a combined meta-analysis (N = 15,324) including an examination of heterogeneity of effects. Next, we tested the relationship between aggressive behavior and epigenetic clocks, as associations of lifetime stress [24], exposure to violence [25], and psychiatric disorders [26,27] with accelerated epigenetic ageing have been reported. We performed extensive functional followup by integrating our findings with data on gene expression, mQTLs and DNA methylation in brain samples.

Cohorts
Demographic information for the cohorts is provided in Table 1. Detailed cohort information is provided in eAppendix 1. Informed consent was obtained from all participants. The protocol for each study was approved by the ethical review board of each institution.

DNA methylation BeadChips
DNA methylation was assessed with Illumina BeadChips: the llumina Infinium HumanMethylation450 BeadChip (450k array; majority of cohorts), or the Illumina Methy-lationEPIC BeadChip (EPIC array). Most cohorts analyzed DNA methylation β-values, which range from 0 to 1, indicating the proportion of DNA that is methylated at a CpG in a sample. Cohort-specific details about DNA methylation profiling, quality control, and normalization are described in eAppendix 1 and summarized in eTable 2.
Epigenome-wide association analysis EWAS analyses were performed according to a standard operating procedure (http://www.action-euproject.eu/ content/data-protocols). In each cohort, the association between DNA methylation level and aggressive behavior was specified under a linear model with DNA methylation as outcome, and correction for relatedness of individuals where applicable. Two models were tested. Model 1 included aggressive behavior, sex, age at blood sampling (not in cohorts with invariable age), white blood cell percentages (measured or imputed), and technical covariates. Model 2 included the same predictors plus body-mass-index (BMI) and smoking status in adolescents and adults (current smoker, former smoker or never smoked). Cohort-specific details and R-code are provided in eAppendix 1 and eTable 3, respectively. The relationship between aggressive behavior and covariates is provided in eTable 4 based on data from the Netherlands Twin Register (N = 2059). Quality control and filtering of cohort-level EWAS summary statistics is described in eAppendix 2. The following probes were removed: on sex chromosomes, methylation sites with more than 5% missing data in a cohort, probes overlapping SNPs affecting the CpG or single base extension site with a minor allele frequency (MAF) > 0.01 in the 1000 G EU or GONL population [7], and ambiguous mapping probes reported with an overlap of at least 47 bases per probe [36]. The R package Bacon was used to compute the Bayesian inflation factor and to obtain bias-and inflation-corrected test statistics (eFig. 2) prior to meta-analysis [37]. Further data can be found in the supplementary material for this paper, eFigs. 1-18

Meta-analysis
Fixed-effects meta-analyses were performed in METAL [38]. We used the p-value-based (sample size-weighted) method because the measurement scale of aggressive behavior differs across studies. First, results based on peripheral blood and cord blood data were meta-analyzed separately. Second, a combined meta-analysis was performed of all data. The following cohorts had data available for both cord blood and peripheral blood (from the same children): INMA (which is part of HELIX) and ALSPAC. In the combined meta-analysis, the cord blood data from ALSPAC and INMA were excluded to avoid sample overlap. Statistical significance was assessed considering Bonferroni correction for the number of sites tested (alpha = 1.2 × 10 −7 ). Methylation sites that were associated with aggression at the less conservative false discovery rate (FDR) threshold (5%) were included in follow-up analyses. The I 2 statistic from METAL was used to describe heterogeneity.

Follow-up analyses
DNA methylation score analyses and epigenetic clock analyses are described in eAppendix 3 and eAppendix 4.
Follow-up analyses (eAppendix 5-eAppendix 10) were performed on meta-analysis top-sites (FDR < 0.05), including a comparison of top-sites with all previously reported associations in the EWAS atlas [39], follow-up analysis of top-sites in two clinical cohorts with blood methylation data (Table 2), a cross-tissue analysis (blood, buccal, brain), and association with gene expression level and mQTLs. Analyses of differentially methylated regions (DMRs) are described in eAppendix 8. Finally, we performed replication analysis of a previously reported DMR associated with aggression [20] (eAppendix 9).

Cord blood meta-analysis
The meta-analysis of cord blood (five cohorts; N = 2425) detected no significant CpGs (eTable 7). Examining top-sites from the peripheral blood meta-analysis, 12 of the significant, and 33 of the FDR top-sites were assessed in cord blood; 10 (83%), and 25 (71%), respectively, showed the same direction of association (Fig. 1b). Effect sizes in cord blood correlated significantly with effect sizes in peripheral blood (r = 0.74, p = 0.006 for epigenome-wide significant and r = 0.51, p = 0.003 for FDR top-sites).

Combined meta-analysis
In the combined meta-analysis of peripheral and cord blood data (total sample size = 15,324, eTable 6), methylation at 13 CpGs was associated with aggression after Bonferroni correction, including ten CpGs from the peripheral blood meta-analysis, and 43 passed a less conservative threshold (FDR 5%, Table 3). Among FDR top-sites from both analyses, 13 CpGs were only found in the combined metaanalysis but not in the peripheral blood meta-analysis, while five CpGs from the peripheral blood meta-analysis were no longer significant in the combined meta-analysis (Fig. 1c).

Overlap with CpGs detected in previous EWASs
We performed enrichment analyses against all previously reported associations with diseases and environmental exposures recorded in the EWAS Atlas [39]. The top ten most strongly enriched traits are shown in Fig. 1e. CpGs associated with aggressive behavior showed large overlap with CpGs previously associated with smoking (37 CpGs; corresponding to 77% of aggression-associated CpGs and 0.3% of CpGs that have been previously associated with smoking), and smaller overlap with other smoking traits (e.g. maternal smoking), other chemical exposures (e.g. perinatal exposure to polychlorinated biphenyls (PCBs) and polychlorinated dibenzofurans (PCDFs)). Further overlap includes CpGs associated with alcohol consumption, cognitive function, educational attainment, ageing, and metabolic traits (eTable 8).

Controlling for smoking and BMI
Model 2 was fitted to test whether the association between DNA methylation and aggressive behavior attenuated after adjusting for the most important postnatal lifestyle factors that influence DNA methylation (smoking and BMI). Examining 17,457 CpGs associated with smoking [40], previously reported effect sizes for smoking correlated significantly with effect sizes for aggression from our metaanalysis (r = 0.55, p < 1 × 10 −16 , eFig. 5a). Examining the 35 CpGs associated with aggression at FDR 5% in peripheral blood, all CpGs showed the same direction of association with aggression after adjusting for smoking and BMI (eTable 6, Fig. 1d). Effect sizes were attenuated to varying degrees (mean reduction = 44%, range = 3-83%). Changes in effect sizes are likely primarily driven by the correction for smoking, since only one top-site has been associated previously with BMI. Some CpGs showed little attenuation, in particular CpGs that have not been previously associated with smoking (e.g.; cg02895948; PLXNA2, cg00891184; KIF1B, cg1215892; intergenic, and cg05432213; ACT1; eFig. 5b). In model 2, between-study heterogeneity at top-sites was greatly reduced (adjusted: mean I 2 = 28%, range = 0-77%). No CpGs were

DNA methylation scores
We computed weighted sumscores in NTR (peripheral blood, mean age = 36.4, SD = 12, N = 2,059) based on summary statistics from the peripheral blood meta-analysis without NTR (Fig. 2). The best score, based on CpGs with p < 1 × 10 −3 in model 2 (745 CpGs), explained 0.29% of the variance in aggression (p = 0.02, not significant after multiple testing correction). This effect was attenuated when age and sex were added to the prediction equation.

Epigenetic clocks
Horvath and Hannum epigenetic age acceleration were not associated with aggression (eTable 9) in a meta-analysis of Effects sizes of top-sites from the meta-analysis of aggression in peripheral blood (x-axis) versus effects sizes from the meta-analysis of aggression in cord blood (y-axis). c Venn diagram showing the numbers and overlap of CpGs detected at FDR 5% in the meta-analysis of peripheral blood and the combined meta-analysis and cord blood and peripheral blood. d Effects sizes of top-sites from the meta-analysis of aggression in peripheral blood model 1 (x-axis) versus effects sizes from the meta-analysis of aggression in peripheral blood model 2; adjusted for smoking and BMI (y-axis). e Top enriched traits based on enrichment analysis with all 48 top-sites. The third column shows how many of the 48 CpGs have been previously associated with the trait in the first column. The last column shows the overlap as a percentage of the total number of CpGs previously associated with the trait in column 1 (e.g. 0.3% of all CpGs previously associated with smoking are also associated with aggression in the current meta-analysis). d In b and d, CpGs that have not been previously associated with smoking in the meta-analysis by Joehanes et al. [40] are plotted in red.

Follow-up in clinical cohorts
To assess the translation of our observations to aggressionrelated problem behavior in psychiatric disorders that show comorbidity with aggression, we performed follow-up analyses of top-sites in two clinical cohorts (

Cross-tissue analysis
To assess the generalizability of our observations in blood to other tissues, we examined the association with CBCL aggression in buccal DNA methylation data (EPIC array), available for 38 top-sites, in a twin cohort (N = 1237) and a child clinical cohort (N = 172; Table 2, eTable 12) [43]. We also tested associations with maternal smoking and with child nervous system medication (as indexed by the Anatomical Therapeutic Chemical classification system (ATC N-class)) Correlations between DNA methylation levels in blood and buccal cells, based on 450k data from matched samples (N = 22, age = 18 years) [44] were available for 36 of these CpGs. The average correlation was weak (r = 0.25, range = −0.40-0.76). Five CpGs showed a strong correlation between blood and buccal cells (r > 0.5, eTable 13), of which three have been previously associated with (maternal) smoking.
In line with the weak correlation between blood and buccal cell methylation for most top-sites, none of the top-sites was associated with aggression in buccal samples (alpha = 0.001, eTable 14). Regression coefficients based on analyses in buccal cells and blood overall showed no directional consistency (twin cohort: r = 0.03, p = 0.86; concordant direction: 47%, p = 0.87, binomial test, clinical cohort: r = 0.27, p = 0.10; concordant direction: 61%, p = 0.26). Exclusion of ancestry outliers did not change these results (eTable 14). Of the five CpGs with a large blood-buccal correlation, three showed the same direction of association with aggression in buccal cells from twins, four in clinical cases, and one CpG was nominally associated with aggression in buccal samples from twins; cg11554391 (AHRR), r blood-buccal = 0.69, β aggression = −0.0002, p = 0.007. The bars indicate how much of the variance in ASEBA adult self-report (ASR) aggression scores were explained by DNA methylation scores in NTR (N = 2059, peripheral blood, 450k array). Scores were created based on weights from the peripheral blood meta-analysis with NTR excluded (N = 12,375). The y-axis shows percentage of variance explained. Different colors denote DNA methylation scores created with different numbers of CpGs that were selected on their p value in the meta-analysis (see legend). From left to right, the first three plots show DNA methylation scores created based on weights obtained from the meta-analysis of EWAS model 1, and plots 4 till 6 show DNA methylation scores created based on weights obtained from the metaanalysis of EWAS model 2. Each DNA methylation score was tested for association with aggression in three model: the simplest model (first plot) included aggression as outcome variable, and DNA methylation score as predictor plus technical covariates and cell counts. The second model additionally included sex and age as predictors. The third model additionally included sex, age, and smoking as predictors. Stars denote nominal p values < 0.05 (not corrected for multiple testing).
We examined the correlation between DNA methylation levels in blood and brain (N = 122) [45] in published DNA methylation data from matched blood samples and four brain regions. Six aggression top-sites (13%) showed significantly correlated DNA methylation levels between blood and one or multiple brain regions: mean r = 0.52; range = 0.45-0.63, alpha = 2.6 × 10 −4 , eTable 15, eFig. 8), two of which have not been previously associated with smoking or BMI: cg14560430(TRIM71), and cg20673321(ZNF541).
DMRs DMR analysis showed that 14 DMPs from our combined meta-analysis reside in regions where multiple correlated methylation sites showed evidence for association with aggressive behavior. DMR analysis also detected additional regions that were not significant in DMP analysis (eTable 16-eTable 21). These analyses are described in detail in eAppendix 8.

Replication analysis
A previous EWAS based on Illumina array data detected a significant DMR in DRD4 in buccal cells associated with engagement in physical fights [20]. This locus did not replicate in our meta-analyses or in the two cohorts with buccal methylation data (eTable 22, eAppendix 9).

Gene expression
Based on peripheral blood RNA-seq and DNA methylation data (N = 2101) [7], 17 significant DNA methylation-gene expression associations were identified among 15 CpGs and ten transcripts (Table 3, eTable 23). For most transcripts, a higher methylation level at a CpG site in cis correlated with lower expression (82.4%): cg03935116 and FAM60A, cg00310412 and SEMA7A, cg03707168 and PPP1R15A, cg03636183 and F2RL3, two intergenic CpGs on chromosome 6, where methylation level correlated negatively with expression levels of FLOT1, TUBB, and LINC00243, and six CpGs annotated to AHRR were negatively associated with EXOC3 expression level. Positive correlations were observed between methylation levels at 2 CpGs on chromosome 7 and levels of RP4-647J21.1 (novel transcript, overlapping MYO1G) and between cg02895948 and PLXNA2.

Discussion
We identified 13 epigenome-wide significant sites (Bonferroni corrected) in the meta-analysis of blood and 13 in the combined meta-analysis of blood and cord blood (16 unique sites). We prioritized 48 top-sites (FDR 5%) for follow-up analyses. Methylation level at three top-sites was associated with expression levels of genes that have been previously linked to psychiatric or behavioral traits in GWASs: FLOT1 (schizophrenia [46]), TUBB (schizophrenia) [46], and PLXNA2 (general risk tolerance) [47]. Several other loci have functions in the brain and six CpGs showed correlated methylation levels between blood and brain.
The majority of top-sites (77%) were associated with smoking, 46% were associated with maternal smoking, 25% were associated with alcohol consumption, and 15% were associated with perinatal PCB and PCDF exposure. This overlap of aggression top-sites with smoking and other chemical exposures is noteworthy. Methylation levels of top-sites in the Aryl-Hydrocarbon Receptor Repressor gene AHRR and several other genes are known to be strongly associated with exposure to cigarette smoke [13,40] and persistent organic pollutants [48]. The best characterized exogenous ligands of the widely expressed Aryl-Hydrocarbon Receptor are environmental contaminants such as benzo[a]pyrene (B[a]P), and TCDD (dioxin), whose neurotoxic and neuroendocrine effects, including disruption of neuronal proliferation, differentiation, and survival, have been well characterized [49]. Human prenatal exposure to B [a]P is associated with delayed mental development, lower IQ, anxiety and attention problems [50]. Research on B[a]P neurotoxicity in adults is scarce but a study on coke oven workers found that occupational B[a]P exposure correlates with reduced monoamine, amino acid and choline neurotransmitter levels and with impaired learning and memory [51].
On average 44% (range = 3-82%) of the aggressionmethylation association was explained by current and former smoking and BMI. Our findings do not merely reflect effects of own smoking: 71% of the top-sites showed the same direction for the prospective association of cord blood methylation at birth and aggression in childhood, and 46% have been associated with maternal prenatal smoking. There is a weak observational association between maternal smoking and child aggression [52]. A limitation of our study is that the EWAS analyses did not adjust for prenatal and postnatal second-hand smoking, and did not adjust for smoking intensity and duration or other substance use. Future studies can examine if the link between prenatal maternal smoking and aggression is mediated by DNA methylation.
We found that DNA methylation scores for aggression explained less variation compared to DNA methylation scores for traits such as BMI, smoking, and educational attainment. For these traits, EWASs tended to identify more epigenome-wide significant hits [16,17]. The variance in aggression explained by DNA methylation scores was in the same order of magnitude as the variance in height explained by DNA methylation scores (based on EWASs of height in smaller samples), i.e. <1% [16]. More research is needed in particular to delineate a causal link between these methylation sites and aggressive behaviour, since our results may also reflect (residual) confounding by (exposure to secondhand) smoking. One approach to address this could be Mendelian Randomization, in which genetic information (SNPs) is used for causal inference of the effect of an exposure (e.g. DNA methylation) on an outcome (e.g., aggression). This approach previously supported a causal effect of maternal smoking-associated methylation sites in blood on various traits and diseases for which well-powered GWASs have been performed, including schizophrenia [53,54]. For aggressive behavior, the currently available [55] largest GWASs of aggressive behavior included 16,000 [56] and~75,000 participants [57], respectively. The GWAS by Ip et al. detected three significant genes in gene-based analysis, but both GWASs did not detect genome-wide significant SNPs and are likely still underpowered. In the future, larger GWASs of aggressive behavior and larger mQTL analyses will allow for powerful Mendelian Randomization for aggression-associated methylation sites.

Strengths and limitations
This is the largest EWAS of aggressive behavior to date. The large sample size was achieved by applying a broad phenotype definition, including participants from multiple countries and all ages in a meta-analysis, and analyzing DNA methylation data from blood. A limitation of this approach is that it reduces power to detect age-, sex-, and symptom-specific effects, and that genetic and environmental backgrounds of different populations, as well as non-identical processing methods of methylation data play a role. A limitation of population-based cohorts and even clinical populations is that individuals with extreme levels of aggressive behavior who cause most societal problems are likely underrepresented. Moreover, some studies used measures that tapped features that overlap with but are not necessarily indicative of aggression (e.g., personality traits, anger, oppositional defiant disorder). Future EWASs that specifically focus on more homogeneous aggression measures are therefore warranted. Our meta-analysis approach may identify a common epigenomic signature of aggression-related problems.
Follow-up analysis in independent datasets indicated that these findings do not generalize strongly to buccal cells, and results did not replicate in two clinical cohorts. These were small, used different aggression measures, and one used a different technology (sequencing) in females only.

Conclusions
We identified associations between aggressive behavior and DNA methylation in blood at CpGs whose methylation level is also associated with exposure to smoking, alcohol consumption, other chemical exposures, and genetic variation. Methylation levels at three top-sites were associated with expression levels of genes that have been previously linked to psychiatric or behavioral traits in GWAS. Our study illustrates both the merit of EWASs based on peripheral tissues to identify environmentallydriven molecular variation associated with behavioral traits and their challenges to tease-out confounders and mediators of the association, and causality. To have full insight into, and to control for confounders in behavioral EWAS meta-analyses (which, in addition to smokingexposure across the life course likely include other substance-use and socioeconomic conditions throughout life and other, perhaps less obvious ones) is challenging. Future studies, including those that integrate EWAS results for multiple traits and exposures, DNA methylation in multiple tissues, and GWASs of multiple traits are warranted to unravel the utility of our results as peripheral biomarkers for pathological mechanisms in other tissues (such as neurotoxicity) and to unravel possible causal relationships with aggression and related traits. We consider this study to be the starting point for such follow-up studies.

Code availability
The EWAS R-code is provided in eTable 3.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons. org/licenses/by/4.0/.