Influence of Severity of Type and Timing of Retrospectively Reported Childhood Maltreatment on Female Amygdala and Hippocampal Volume

Deleterious effects of adverse childhood experiences (ACE) on human brain volume are widely reported. Initial evidence points to differential effects of ACE on brain volume depending on the timing of ACE. Emerging studies additionally point towards the impact of different types of ACE (i.e., neglect and abuse) depending on timing. The current study aimed to investigate the association of retrospectively reported severity of type (i.e., the extent to which subjects were exposed to abuse and/or neglect, respectively) and timing of ACE with female brain volume in a sample of subjects with prolonged traumatization. A female sample with ACE (N = 68) underwent structural magnetic resonance imaging and a structured interview exploring the severity of ACE from age 3 up to 17 using the “Maltreatment and Abuse Chronology of Exposure” (MACE). Random forest regression with conditional inference trees was applied to assess the impact of ACE severity as well as the severity of ACE type (i.e., to what extent individuals were exposed to neglect and/or abuse) at certain ages on the volume of pre-defined regions of interest: the amygdala, hippocampus, and anterior cingulate cortex (ACC). Analyses revealed differential type- and timing-specific effects of ACE on stress-sensitive brain structures: amygdala and hippocampal volume were affected by ACE severity during a period covering preadolescence and early adolescence. Crucially, this effect was driven by the severity of neglect.


1.1.1 Diagnostic and consenting procedures

Clinical diagnoses were assessed by trained diagnosticians using the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I; 1), the Clinician-Administered PTSD Scale (CAPS; 2), and the borderline personality disorder (BPD) section of the International Personality Disorder Examination (IPDE; 3). Self-report measures included retrospective questionnaires on childhood trauma (Childhood Trauma Questionnaire; CTQ; 4), PTSD symptomatology (Davidson Trauma Scale; DTS; 5), and severity of depressive mood (Beck Depression Inventory; BDI-II; 6). Details on demographic data and clinical characteristics of the sample are reported in Table S1. The study was approved by the Ethical Board II of Heidelberg University, Germany. It was conducted according to the Declaration of Helsinki at the Central Institute of Mental Health in Mannheim. Written informed consent was obtained from the participants after the procedure had been fully explained. All participants received monetary remuneration for participation in the study.

1.1.2 Inclusion and exclusion criteria

Participants with posttraumatic stress disorder (PTSD) were recruited from a larger randomized controlled trial evaluating dialectical behavioral therapy for PTSD (DRKS00010827). Trauma-exposed healthy control participants (TC) were recruited via advertisements in local newspapers, flyers, and the internet. Exclusion criteria for all participants were metal implants, pregnancy, left-handedness, and claustrophobia. Exclusion criteria for PTSD participants were current or lifetime schizophrenia or bipolar-I disorder, mental retardation, severe psychopathology, traumatic brain injuries or somatic illness requiring immediate treatment in another setting (e.g., BMI < 16), medical conditions making exposure-based treatment impossible, a suicide attempt within the last two months, and substance dependency with no abstinence within the two months prior to the study. Exclusion criteria for the TC sample were any current or previous mental disorder, any psychotherapeutic experience, or any intake of psychotropic medication (for a more detailed description of the TC sample, see 7).

Maltreatment exposure
The time course and severity of exposure to traumatic events were assessed using an adapted version of the Maltreatment and Abuse Chronology of Exposure interview (MACE; 8,9). The inventory evaluates ten types of adverse childhood experiences (emotional neglect, physical neglect, parental physical abuse, sibling physical abuse, parental emotional abuse, sibling emotional abuse, sexual abuse, peer abuse, witnessing interparental violence, and witnessing violence to siblings) for each year of life from age 3 up to age 17. Scores can be calculated for each ACE type, as well as a total score based on the sum of all categories. Moreover, the duration of exposure and the number of ACE types experienced during childhood and adolescence can be calculated. With respect to the MACE severity score, test-retest reliability over a period of 6 months was found to be very high in a US population (severity: r = .91 [95% CI 0.86-0.94]; p < .001) 8. Convergent validity was found to be good, as the MACE severity score correlated 0.74 (95% CI = 0.69-0.78, p < 10^-16) with the Childhood Trauma Questionnaire (CTQ) scores and 0.71 (95% CI = 0.68-0.73, p < .001) in a US population 8. The German version has also been tested, and its convergent validity was found to be comparable (CTQ, r = 0.75, p < .001) 9.

Within the present investigation, ACE was quantified in two ways. a) MACE severity scores were averaged across childhood and adolescence (i.e. global ACE severity) and computed for each year of life (i.e. time-specific ACE severity) 8. These scores range from 0 to 100. b) To address the conceptual framework of active and passive maltreatment 10,11, we created two dimensions: active maltreatment is represented by collapsing the subscales physical and sexual abuse (= abuse), while passive maltreatment is represented by collapsing the subscales physical and emotional neglect (= neglect). These scores were averaged across childhood and adolescence (i.e. global abuse severity and global neglect severity) and computed for each year of life (i.e. time-specific abuse severity and time-specific neglect severity). The neglect and abuse scores each range from 0 to 20.
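
As an illustration of how the abuse and neglect dimensions relate to their global and time-specific scores, the following Python sketch may help. It is not the MACE scoring software. It assumes that "collapsing" two subscales means summing them at each age (with each subscale scored 0 to 10, this would give the stated range of 0 to 20); the dictionary layout, the subscale key names, the function names, and the example participant are all hypothetical.

```python
from statistics import mean

AGES = range(3, 18)  # ages 3 through 17, the span covered by the interview

# Hypothetical subscale keys; which subscales form each dimension follows the text.
ABUSE = ("physical_abuse", "sexual_abuse")
NEGLECT = ("physical_neglect", "emotional_neglect")


def time_specific_severity(subscales, members):
    """Sum the listed subscales at each age (assumed meaning of 'collapsing')."""
    return {age: sum(subscales[name][age] for name in members) for age in AGES}


def global_severity(by_age):
    """Average the time-specific scores across ages 3 to 17."""
    return mean(by_age[age] for age in AGES)


# Made-up participant: constant abuse scores, neglect only from age 10 onwards.
participant = {
    "physical_abuse": {age: 2 for age in AGES},
    "sexual_abuse": {age: 1 for age in AGES},
    "physical_neglect": {age: (4 if age >= 10 else 0) for age in AGES},
    "emotional_neglect": {age: (6 if age >= 10 else 0) for age in AGES},
}
abuse_by_age = time_specific_severity(participant, ABUSE)
neglect_by_age = time_specific_severity(participant, NEGLECT)
```

In this layout, `abuse_by_age` and `neglect_by_age` correspond to the time-specific predictors, and `global_severity(...)` to the global predictors entered into the later analyses.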

1.2.2 Magnetic resonance imaging

Data were collected using a Siemens 3 Tesla TRIO scanner (Siemens Medical Solutions, Erlangen, Germany) with a 12-channel head coil. A high-resolution anatomical scan was acquired for each participant using a three-dimensional magnetisation-prepared rapid-acquisition gradient echo sequence (MPRAGE; T1-weighted contrast, TE: 30 ms; TR: 2000 ms; FA = 80°; FOV: 192 x 192 mm; number of slices: 176; voxel size: 1 x 1 x 1 mm³). Head movement artefacts and scanning noise were reduced using head cushions and headphones.

1.2.3 Image processing

Preprocessing of the anatomical T1 images was conducted in Statistical Parametric Mapping (SPM12; http://www.fil.ion.ucl.ac.uk/spm/), and images were segmented into grey matter (GM), white matter (WM), and cerebrospinal fluid (CSF). The total volume of each compartment was determined by integrating all voxel values of the GM, WM, and CSF images. Subsequently, the individual images were normalized to an IXI550 template (McConnell Brain Imaging Centre). The voxel values were modulated with the Jacobian determinant to preserve the amount of change during normalization. Additionally, regions of interest (ROI), i.e. the bilateral amygdala, hippocampus, and anterior cingulate cortex, were defined using the WFU Pickatlas (http://fmri.wfubmc.edu/software/pickatlas). The volume of each ROI was estimated by integrating all voxel values within the ROI of the GM image. This was done for each subject, and the estimated size of each ROI was related to the individual's total intracranial volume (TIV = GM + WM + CSF). Regional volumes corrected for TIV, as well as GM and WM volume, were extracted and exported into SPSS (version 23; SPSS Inc., USA), R (version 3.3.3), and Matlab (Matlab R2016b, Simulink) for statistical analyses. Brain volume estimates were further corrected for current age, i.e. current age was regressed out, and the residuals were z-transformed and used for further analyses.
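
The last correction steps (TIV correction, removal of age, z-transformation) can be sketched in a few lines of Python. This is an illustrative sketch rather than the SPM/SPSS pipeline used in the study: it assumes that "related to TIV" means dividing the ROI volume by TIV and that age was removed with a simple linear regression. The function names and all example numbers are made up.

```python
from statistics import mean, stdev


def tiv_corrected(roi_volume, gm, wm, csf):
    """ROI volume as a fraction of total intracranial volume (GM + WM + CSF)."""
    return roi_volume / (gm + wm + csf)


def regress_out(values, covariate):
    """Residuals of an ordinary least-squares regression of values on one covariate."""
    m_cov, m_val = mean(covariate), mean(values)
    slope = sum((c - m_cov) * (v - m_val) for c, v in zip(covariate, values)) / sum(
        (c - m_cov) ** 2 for c in covariate
    )
    intercept = m_val - slope * m_cov
    return [v - (intercept + slope * c) for c, v in zip(covariate, values)]


def z_transform(values):
    """Standardise to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Made-up example: four participants (volumes in ml, age in years).
ages = [22, 31, 40, 52]
roi = [1.70, 1.62, 1.66, 1.50]
gm = [700, 680, 690, 650]
wm = [500, 510, 495, 505]
csf = [250, 260, 270, 300]

ratios = [tiv_corrected(r, g, w, c) for r, g, w, c in zip(roi, gm, wm, csf)]
adjusted = z_transform(regress_out(ratios, ages))
```

After these steps, `adjusted` is uncorrelated with age by construction, so any remaining association with ACE severity cannot be attributed to a linear age effect.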

1.3.2 Between Group Analyses

To explore whether the presence of a PTSD diagnosis had an impact on the observed results, PTSD participants were contrasted with TC participants. Sample characteristics, i.e. sociodemographic variables (age, years of education) and clinical characteristics (CTQ, DTS, MACE), were compared with t-statistics (Table S5). To test whether the groups differed with respect to ACE severity across the reported life span, i.e. from 3 up to 17 years of age, a repeated-measures ANOVA (rmANOVA) was applied with the between-subject factor 'group' and the within-subject factor 'age' (age 3 up to 17). To investigate whether the amount of traumatization in relation to type differed across the life span between the groups, an rmANOVA with the between-subject factor 'group' and the within-subject factors 'age' (3-17) and 'type' (abuse, neglect) was applied. Neuroimaging measures for each TIV-adjusted ROI were analysed in separate rmANOVAs with 'group' as between-subject factor, 'hemisphere' as within-subject factor, and 'age' as covariate.

1.3.3 General information

For further description of statistical effects in the ANOVA designs, post-hoc pairwise comparisons (Bonferroni-adjusted for multiple testing) were calculated where appropriate. Statistical significance was set to p < .05. All analyses were performed using SPSS (version 23; SPSS Inc., USA).

1.3.4 Sensitive period analysis

To test for the presence of sensitive periods in which exposure to ACE might be related to alterations in ROI brain volume, we applied random forest regression with conditional inference trees ('cforest' in the R package 'party' 12,13). This is a machine learning approach in which an ensemble of unpruned regression trees (a forest) is generated. Compared to conventional linear modelling, this method is advantageous for identifying important predictors, as conditional forest regression accounts for multicollinearity between predictor variables, does not require specific distributional assumptions or a definition of the relationship between predictor and response, and can handle a large number of predictors modelling the outcome 13-15. Tree building is based on the principle of recursive partitioning, meaning that the feature space (= the space spanned by all predictor variables) is recursively partitioned such that observations with similar response values are grouped 12,13. Thus, smaller groups are generated that are more homogeneous with respect to the outcome. A single decision tree provides a good fit to the data but typically generalizes poorly; prediction in random forest regression is therefore improved by aggregating trees. Importantly, each tree in the forest is unique, as each tree is generated from a subset of the entire dataset (bagging), while the number of predictor variables available at each decision point is also restricted. Predictive performance of the model is estimated on the sample that is left out (out-of-bag sample), and thus random forest regression provides an internal estimate, which has been found to be highly correlated with either cross-validation or test set estimates 14,15.
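
The roles of bagging, restricted predictor sets, and the out-of-bag estimate can be illustrated with a deliberately small Python sketch. It is not the 'cforest' implementation: single-split trees (stumps) chosen by squared error stand in for unpruned conditional inference trees, bootstrap resampling is assumed for the bagging step, and all function names are hypothetical. The defaults of 500 trees and 4 candidate predictors per split mirror the settings reported for the present analyses.

```python
import random


def fit_stump(X, y, features):
    """Pick the single split (feature, threshold) with the lowest squared error."""
    mean_y = sum(y) / len(y)
    best = (float("inf"), None, None, mean_y, mean_y)  # fallback: predict the mean
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if sse < best[0]:
                best = (sse, f, thr, lm, rm)
    return best[1:]


def predict_stump(stump, row):
    f, thr, lm, rm = stump
    if f is None:
        return lm
    return lm if row[f] <= thr else rm


def forest_oob_mse(X, y, n_trees=500, mtry=4, seed=0):
    """Out-of-bag mean squared error of a bagged ensemble of stumps."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    oob_sum, oob_count = [0.0] * n, [0] * n
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]   # bootstrap sample (bagging)
        features = rng.sample(range(p), min(mtry, p))   # restricted predictor set
        stump = fit_stump([X[i] for i in in_bag], [y[i] for i in in_bag], features)
        for i in set(range(n)) - set(in_bag):           # observations left out
            oob_sum[i] += predict_stump(stump, X[i])
            oob_count[i] += 1
    errors = [(oob_sum[i] / oob_count[i] - y[i]) ** 2 for i in range(n) if oob_count[i]]
    return sum(errors) / len(errors)
```

Because every observation is predicted only by trees that never saw it, the returned error behaves like a held-out estimate without requiring a separate test set.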
The importance of a given predictor is identified by the variable importance score (VI) 12,13: the score refers to the decrease in model accuracy following the permutation of a given predictor variable. If the permutation of a predictor variable causes model accuracy to decrease, the predictor is considered "important", i.e. it has a higher VI score, whereas if permuting has little or no impact on model accuracy, it is not considered "important". Each random forest model consisted of 500 trees with 4 variables randomly selected for decision making at each node 16. To identify whether the magnitude of VI could have occurred by chance, we applied permutation tests in which the outcome measure (ROI volume) was permuted 1000 times and VI scores for each predictor were assessed 17. P-values were determined from the empirical distribution (as the fraction of permutation-based VI scores greater than the unpermuted score) 17,18. It has to be noted that random forest regression does not provide information on the nature of the relationship, as it is a machine learning algorithm aiming at the detection of relevant predictors with no a priori assumption about the type of relationship, and it thus also considers complex relationships (linear, nonlinear, interactions between predictor variables). To illustrate the relationship, we therefore examined whether the identified predictor variables and brain volumes were significantly linearly or quadratically related, while keeping in mind that the relationship may be more complex. To this end, we investigated whether the relationship between ACE severity at the identified ages and ROI volume could be better described by a linear or a quadratic model. We set up two general linear models (GLM), one containing a single linear predictor variable and the second containing an additional quadratic term.
To test whether the quadratic term significantly added to the description of the relationship between brain volume and ACE severity at the identified ages, we tested whether the additional variance explained by the quadratic term (second model) was significant based on the F-distribution 19.
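
Two computations from this section lend themselves to a short Python sketch: the empirical p-value of a VI score and the comparison of the linear against the quadratic model. The sketch makes several assumptions that are not taken from the study: the function names are hypothetical, the models are fitted by plain least squares on a single predictor, and only the F-statistic with its degrees of freedom is returned (the p-value would then be read from the F-distribution, which the Python standard library does not provide).

```python
def empirical_p_value(observed_vi, permuted_vi):
    """Fraction of permutation-based VI scores greater than the unpermuted score."""
    return sum(1 for v in permuted_vi if v > observed_vi) / len(permuted_vi)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def poly_rss(x, y, degree):
    """Residual sum of squares of a least-squares polynomial fit of given degree."""
    powers = range(degree + 1)
    xtx = [[sum(xi ** (p + q) for xi in x) for q in powers] for p in powers]
    xty = [sum((xi ** p) * yi for xi, yi in zip(x, y)) for p in powers]
    beta = solve(xtx, xty)
    fitted = [sum(b * xi ** p for p, b in enumerate(beta)) for xi in x]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def quadratic_f_statistic(x, y):
    """F-statistic for adding a quadratic term to a linear model (nested models)."""
    rss_linear, rss_quadratic = poly_rss(x, y, 1), poly_rss(x, y, 2)
    df_num, df_den = 1, len(x) - 3  # one extra parameter; n minus three coefficients
    f = ((rss_linear - rss_quadratic) / df_num) / (rss_quadratic / df_den)
    return f, df_num, df_den
```

A large F-statistic indicates that the quadratic term explains variance beyond the linear term; with 1000 permutations, the smallest non-zero empirical p-value that `empirical_p_value` can return is 0.001.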

2.1 Maltreatment exposure history

Traumatized subjects reported a history of prolonged traumatization during childhood and adolescence (number of years: mean = 12.81, SD = 3.42) and were exposed to a variety of ACE types (mean = 6.01, SD = 2.34) (Table S4; for differences between PTSD and TC participants, please see SI 2.2 and Table S5). A detailed characterization of ACE severity revealed that the amount of traumatization differed across years of age (F(14,938) = 32.19, p < .001, Figure S1A): ACE severity at the beginning (age 3-6) and at the end (age 15-17) of the recollected time span was lower than during most of the remaining ages (p-values < .045). A significant interaction between severity of type and timing (type x age: F(14,938) = 7.84, p < .001, Figure S1B) revealed that participants reported higher neglect than abuse severity at age 3 (p < .01) and between 12 and 17 years of age (p-values < .035). Abuse severity at the beginning (age 3-5) and at the end (age 15-17) of the recollected life span was lower than for most of the remaining years of age (p-values < .022). Neglect severity at the beginning (age 3-5) and at the end (age 16 and 17) of the recollected life span was lower than for most of the remaining years of age (p-values < .039). For detailed comparisons, please see Table S1 (global ACE severity), Table S2 (global abuse severity), and Table S3 (global neglect severity).

Figure S1
Chronology of ACE regarding ACE severity (A.), and severity of ACE type, i.e. abuse (blue), and neglect (red) (B).

2.2 Maltreatment exposure history and clinical characteristics: Group Comparison

For differences in socio-demographic and clinical characteristics, please see Table S5. In general, both groups reported exposure to various trauma types (PTSD: AM = 6.83, SD = 2.24; TC: AM = 4.69, SD = 1.85), as well as exposure to maltreatment over a long time period (PTSD: AM = 13.79, SD = 2.76; TC: AM = 11.23, SD = 3.83). Contrasting both groups revealed that PTSD participants reported more trauma types, as well as a longer period of traumatization, than trauma controls (Table S5). Contrasting both groups with respect to global ACE severity across the recollected life span revealed that PTSD individuals reported more ACE than TC individuals, and that this effect was influenced by years of age (group: F(1,66) = 28.03, p < .001; group x age: F(14,924) = 3.00, p < .001). When the severity of ACE type (global abuse severity vs global neglect severity) was taken into account, the groups differed with respect to global ACE severity in general (group: F(1,66) = 24.78, p < .001), with a further trend towards an influence of type (group x type: F(1,66) = 3.67, p = .060): PTSD participants reported both more abuse and more neglect than TC participants (p < .001). While PTSD participants reported more neglect than abuse (p = .002), TCs did not differ regarding the recollected amount of abuse compared to neglect (p = .963).

[...] F(1,66) = .06, p = .816, Figure S2 B).

2.4.1 Anterior Cingulate Cortex

Sensitive period analyses revealed that time-specific ACE severity at 10 years of age was important in predicting left ACC volume, while time-specific ACE severity at 3 years of age was important in predicting right ACC volume (for p-values of VI scores and trends, see Table S6). With respect to global predictors, global ACE severity was found to be an important predictor of left ACC volume, while the predictor group was found to be important in predicting right ACC volume at trend level (Table S6).

Anterior Cingulate Cortex
Sensitive period analyses revealed that time-specific abuse severity at 7 years of age and time-specific neglect severity at 3 and 4 years of age were important in predicting left ACC volume, while time-specific neglect severity at 3 and 4 years of age was important in predicting right ACC volume (for p-values of VI scores and trends, see Table S6). With respect to global predictors, global abuse severity was found to be an important predictor at a marginally significant level for left ACC volume, and global neglect severity for right ACC volume, while the predictor group was found to be important in predicting right ACC volume only (Table S6).

2.6 Importance of the Severity of a Specific ACE Type in Combination with Timing in Predicting Brain Volume

To explore whether the observed importance of neglect in predicting amygdala and hippocampal volume during specific time periods was mainly related to the inclusion of the severity of abuse in the type x timing model, we additionally ran separate random forest regression analyses including either a) the severity of neglect during specific time periods or b) the severity of abuse during specific time periods as predictor variables in predicting amygdala or hippocampal volume.

2.6.1 Abuse

2.6.1.1 Amygdala Volume

Sensitive period analyses revealed that no time-specific ACE severity was important in predicting left or right amygdala volume (for p-values of VI scores and trends, see Table S9). With respect to global predictors, the predictor group was found to be important in predicting both left and right amygdala volume (Table S9, Figure S3 A).

2.6.1.2 Hippocampus Volume

Sensitive period analyses revealed that time-specific abuse severity at age 16 was important in predicting left hippocampal volume (for p-values of VI scores and trends, see Table S9). With respect to global predictors, the predictor group was found to be important in predicting right hippocampus volume (Table S9).

[...] With respect to global predictors, none of the latter were found to be important in predicting left or right amygdala volume (Table S9, Figure S4 A).

Hippocampus Volume
Sensitive period analyses revealed that time-specific neglect severity at age 10 was important in predicting right hippocampus volume (for p-values of VI scores and trends, see Table S9). With respect to global predictors, none of the latter were found to be important in predicting left or right hippocampal volume (Table S9, Figure S4 B).

Table S9. Sensitive period analysis of the severity of ACE type, i.e. time-specific neglect severity and time-specific abuse severity, respectively, on ROI volume using random forest regression with conditional inference trees, indicating the significance of identified predictors based on randomized resampling.