Cortical and subcortical brain structure in generalized anxiety disorder: findings from 28 research sites in the ENIGMA-Anxiety Working Group

The goal of this study was to compare brain structure between individuals with generalized anxiety disorder (GAD) and healthy controls. Previous studies have generated inconsistent findings, possibly due to small sample sizes, or clinical/analytic heterogeneity. To address these concerns, we combined data from 28 research sites worldwide through the ENIGMA-Anxiety Working Group, using a single, pre-registered mega-analysis. Structural magnetic resonance imaging data from children and adults (5–90 years) were processed using FreeSurfer. The main analysis included regional and vertex-wise cortical thickness, cortical surface area, and subcortical volume as dependent variables, and GAD, age, age-squared, sex, and their interactions as independent variables. Nuisance variables included IQ, years of education, medication use, comorbidities, and global brain measures. The main analysis (1020 individuals with GAD and 2999 healthy controls) included random slopes per site and random intercepts per scanner. A secondary analysis (1112 individuals with GAD and 3282 healthy controls) included fixed slopes and random intercepts per scanner with the same variables. The main analysis showed no effect of GAD on brain structure, nor interactions involving GAD, age, or sex. The secondary analysis showed increased volume in the right ventral diencephalon in male individuals with GAD compared to male healthy controls, whereas female individuals with GAD did not differ from female healthy controls. This mega-analysis combining worldwide data showed that differences in brain structure related to GAD are small, possibly reflecting heterogeneity or indicating that structural alterations are not a major component of its pathophysiology.


Translational Psychiatry, www.nature.com/tp

INTRODUCTION
Research on brain structure in generalized anxiety disorder (GAD) has generated inconsistent findings, possibly due to small sample sizes as well as clinical and analytic heterogeneity. The Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) collaboration addresses these challenges in a range of disorders by pooling neuroimaging data across research sites worldwide [1][2][3][4][5]. Here, we employed the ENIGMA approach to investigate differences between individuals with GAD and healthy controls in indices of brain structure in a report from the ENIGMA-Anxiety Working Group [6]. We conducted a structural magnetic resonance imaging (MRI) mega-analysis (see footnote 1) using data from 28 research sites worldwide. The current study compared regional and vertex-wise cortical thickness, cortical surface area, and subcortical volume in individuals with GAD and healthy controls, using methods that accommodate data heterogeneity across the research sites.
GAD is a highly prevalent and impairing anxiety disorder notable for its relationship to multiple forms of psychopathology [7,8]. Whereas most anxiety disorders develop in late childhood, the median age of onset of GAD is in adulthood [9]. Like other diagnoses, GAD is characterized by clinical heterogeneity, as individuals with GAD can display many different symptom profiles. Moreover, most individuals with GAD suffer from at least one other mental disorder, particularly other anxiety disorders, major depressive disorder (MDD), and substance use [7,8,10]. Longitudinal and family studies show that genetic risks of GAD overlap in part with those of MDD and other anxiety disorders [11][12][13][14]. Most prior structural MRI studies relied on voxel-based morphometry (VBM) and reported altered gray matter volume in a wide variety of brain regions in individuals with GAD compared to healthy controls [15][16][17]. Some studies, but not others, showed increased gray matter volume in the amygdala and prefrontal cortex (PFC) as well as decreased gray matter volume in the hippocampus [15][16][17]. Findings on cortical thickness and surface area appeared similarly inconsistent [17][18][19][20]. One potential explanation for this inconsistency is clinical heterogeneity.
Small sample sizes and analytical heterogeneity across individual studies may also generate inconsistent findings. The ENIGMA collaboration provides a solution to these problems, by facilitating the pooling of neuroimaging data across multiple research sites [5,21]. This is typically done using meta-analyses, where each participating research site first processes and analyzes their local data through a previously agreed common pipeline [1][2][3]. While this approach addresses concerns with small sample sizes and analytic heterogeneity, it uses pooled data for each site. This prevents the modeling of covariates (such as comorbid disorders) at the individual subject level. The current study addressed the latter problem through a mega-analysis, which can be more powerful than meta-analyses [22], and which allows modeling of covariates at the subject rather than site-averaged level. The mega-analytic approach is used less frequently, as working with individual participant data creates methodological challenges (e.g., study planning and implementation, international transfer of data, quality control of large amounts of data) and requires more computational resources than site-averaged data [23].
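The contrast between the two pooling strategies can be illustrated with a toy simulation. The sketch below is not the study's pipeline: the site sizes, offsets, effect size, and all function names are hypothetical, and it uses a plain group difference with one intercept per site rather than the full model. The meta-analytic estimate combines per-site summary statistics, whereas the mega-analytic estimate is fitted to individual-level rows, which is what would allow subject-level covariates to be added.

```python
import random
from statistics import mean, variance

def simulate_site(n, site_offset, effect, rng):
    """Return (group, value) rows for one hypothetical site; group 1 = patient."""
    rows = []
    for i in range(n):
        group = i % 2  # alternate controls and patients so groups are balanced
        value = site_offset + effect * group + rng.gauss(0.0, 1.0)
        rows.append((group, value))
    return rows

def meta_estimate(sites):
    """Inverse-variance weighted average of per-site group differences."""
    num = den = 0.0
    for rows in sites:
        cases = [v for g, v in rows if g == 1]
        controls = [v for g, v in rows if g == 0]
        diff = mean(cases) - mean(controls)
        var_diff = variance(cases) / len(cases) + variance(controls) / len(controls)
        num += diff / var_diff
        den += 1.0 / var_diff
    return num / den

def mega_estimate(sites):
    """One model on individual-level data with a separate intercept per site.

    Centering group and value within each site is equivalent to fitting
    site intercepts by ordinary least squares.
    """
    sxy = sxx = 0.0
    for rows in sites:
        g_bar = mean([g for g, _ in rows])
        v_bar = mean([v for _, v in rows])
        for g, v in rows:
            sxy += (g - g_bar) * (v - v_bar)
            sxx += (g - g_bar) ** 2
    return sxy / sxx

rng = random.Random(0)
# Three hypothetical sites of different size and baseline; true effect = 0.5.
sites = [simulate_site(n, offset, 0.5, rng)
         for n, offset in [(200, -1.0), (400, 0.0), (1000, 2.0)]]
meta = meta_estimate(sites)
mega = mega_estimate(sites)
print(f"meta-analytic estimate: {meta:.3f}")
print(f"mega-analytic estimate: {mega:.3f}")
```

In this balanced toy setting the two estimates nearly coincide; the practical difference arises once covariates vary between individuals within a site, which only the individual-level model can represent.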
The current study assembled raw structural MRI data from 28 research sites, and conducted a pre-registered data analysis [24]. We compared regional and vertex-wise cortical thickness, cortical surface area, and subcortical volume between individuals with GAD and healthy controls while examining interactions with age and sex. Based on prior studies [15][16][17], we hypothesized that individuals with GAD would show differences in subcortical volume in the amygdala and hippocampus and in cortical thickness and surface area in the PFC compared to healthy controls. We also expected the association between GAD and structural measures to differ as a function of participant age, but we had no specific hypotheses on the direction of this interaction as previous studies examined the effect of GAD within and not across age groups [19,20]. The analysis in the present work used a whole-brain approach, while accounting for the multiple tests defined in the pre-registration [24].

MATERIALS AND METHODS
Participants
The current study is a pre-registered mega-analysis of structural MRI data that had been collected at 28 research sites and repositories from Brazil, Europe, and the USA [24]. As ENIGMA-GAD is an ongoing collaboration, new research groups are encouraged to join. Some site-specific results have been reported before, including from the National Institute of Mental Health (NIMH) team leading the current project [16,18][25][26][27][28][29]. However, no reports have examined results across these and additional samples using a pre-registered plan. Twenty-five ENIGMA-GAD sites sent raw individual participant MRI data. Additionally, raw structural MRI data were downloaded from three publicly available imaging repositories to increase sample size and thus allow more stable estimates of any effects: Adolescent Brain Cognitive Development Study (ABCD) [30,31], Child Mind Institute Healthy Brain Network (CMI-HBN) [32], and Duke Preschool Anxiety Study [33]. All 25 research sites signed an individual data use agreement with the NIMH that included regulations about data use, subject identification, data transfer methods, data ownership, and confidentiality and security practices [23]. Data use guidelines of the repositories were followed. All adult participants and parents of child participants provided written informed consent at their local research site, and the individual research protocols were approved by local institutional review boards and ethics committees.
Data were included if individuals were diagnosed with current or past GAD (see footnote 2), not necessarily as the primary diagnosis. Exclusion criteria for individuals with GAD were current or past autism spectrum disorders, bipolar disorder, psychosis, or schizophrenia. These decisions regarding inclusion and exclusion reflected past results from ENIGMA, where robust differences in morphometry were found in studies of the excluded conditions [1,4]. Comparison subjects were excluded if they had any current or past mental disorder. Diagnoses were based on standardized interviews with a clinician at each research site (see Bas-Hoogendam et al. (2020) for an overview).
We received data from 5523 participants before pre-registration [24] (Table 1 shows the number of participants in each step of the analysis). There were some small changes to this number after pre-registration and before pre-processing of the data (see Supplementary Information for the exact numbers and reasons per site for the differences). Table 2 shows the reasons for excluding data. The main pre-registered analysis with random slopes for all independent variables per site and random intercepts per scanner included 1020 individuals with GAD (685 females, mean age = 23.65 years, SD = 13.15) and 2999 healthy controls (1617 females, mean age = 14.76 years, SD = 10.01), ranging from 5 to 90 years (Fig. 1). Table 3 shows descriptive statistics, Table 4 comorbid diagnoses for individuals with GAD, and Table 5 medication status for participants included in this main analysis. More sites and participants could be included in the secondary analysis with fixed slopes for all independent variables and random intercepts per scanner: 1112 individuals with GAD (753 females) and 3282 healthy controls (1805 females), ranging from 5 to 90 years (M = 18.47, SD = 12.72). These additional participants were from sites that had sample sizes that were too small to allow modeling random slopes (see Statistical analysis); these sites could only be included with fixed slopes. Supplementary Table 1 shows the descriptive statistics, Supplementary Table 2 the comorbid diagnoses for individuals with GAD, and Supplementary Table 3 medication status for the participants who were included in this secondary analysis. The three imaging repositories consisted of data from multiple scanners: ABCD (29 scanners), CMI-HBN (4 scanners), and Duke (2 scanners). In addition, two sites also contributed data from multiple scanners: Brazilian High Risk Cohort Study (BHRCS; 2 scanners) and Section on Development and Affective Neuroscience (SDAN; 4 scanners).

Non-imaging data
All research sites were asked to provide information with respect to several variables of possible interest, such as demographic information (age, sex, IQ, education in years), diagnoses, and information from a clinical interview concerning anxiety (GAD, social anxiety disorder [ ... Baylor and CMI-HBN). If the information on medication was missing for all participants within a site, medication was not included as an independent variable in the analyses for that site.

Table footnotes:
a These participants were classified as patients in the "number of images of high quality" in Table 1, but did not have a GAD diagnosis.
b For 12 participants in the initial number of images there were no behavioral data available; 5 of these participants had images of high quality.
c These participants were classified as HC in the "number of images of high quality" in Table 1, but had a diagnosis other than GAD.

Image processing
All raw structural MRI images that were received were organized according to the Brain Imaging Data Structure (BIDS) specification, and MRI Quality Control (MRIQC) [34] was used for quality checking. All images were subsequently processed with FreeSurfer version 6.0.0 [35] to compute regional measures of cortical thickness, cortical surface area, and subcortical volume. For participants with multiple images available, we selected the image with the highest quality based on the Euler number [36], which is calculated separately for left and right hemispheres. To compare a single value across multiple images, we first selected the worst (farthest from zero, lowest quality) Euler number per image. Then, we selected the image with the best (closest to zero, highest quality) Euler number. All data were visually inspected for gross over- or underestimation of the white/pial surfaces (largely due to motion artifacts). We also performed a semi-automated quality check of the data using the ratio between the Euler characteristic and the number of vertices in the surfaces before topology correction, defining site-specific thresholds using a ROC curve constructed from the results of the visual inspection [23]. We resampled the cortical measurements of thickness and area to an icosahedron recursively subdivided four times (fsaverage4), which was used as a common grid for interpolation [37]. Table 2 shows the number of participants excluded based on visual and automatic quality checking.
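The two-step selection rule for participants with several images can be sketched as follows. This is a minimal illustration, assuming the per-hemisphere Euler numbers have already been extracted from the FreeSurfer output; the function name, the image identifiers, and the values are hypothetical.

```python
def select_best_image(euler_by_image):
    """Pick the image whose worst hemisphere is closest to zero.

    euler_by_image maps an image identifier to a (left, right) pair of
    Euler numbers. Distance from zero is used as the quality measure,
    so larger absolute values mean lower quality.
    """
    def worst_hemisphere(pair):
        # Step 1: per image, keep the value farthest from zero.
        return max(pair, key=abs)

    # Step 2: across images, keep the one whose worst value is closest to zero.
    return min(euler_by_image,
               key=lambda image_id: abs(worst_hemisphere(euler_by_image[image_id])))

# Hypothetical Euler numbers for three scans of one participant.
scans = {
    "run-01": (-40, -120),
    "run-02": (-60, -70),
    "run-03": (-10, -200),
}
best = select_best_image(scans)
print(best)
```

Taking the worst hemisphere first means that an image is not selected on the strength of one clean hemisphere when the other is badly reconstructed, as with the hypothetical run-03 above.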

Statistical analysis
We compared cortical thickness, cortical surface area, and subcortical volume between individuals with GAD and healthy controls, and examined interactions with age and sex. The dependent variables in the main analysis were cortical thickness and surface area of the 68 regions of the Desikan-Killiany parcellation [38], as well as subcortical volumes for 16 subcortical regions [35]. Two sets of independent variables were considered, each in its own model. The first set consisted of GAD, sex, age, age-squared, and their interactions, with covariates comprising IQ, years of education, medication use at the time of the scan, each of the comorbid disorders (SAD, PD, AG, SPH, MDD, OCD, PTSD, SUD), and scanner. The second model was the same as the first, but further included global brain measures (i.e., total surface area, mean cortical thickness, and total intracranial volume) as nuisance variables. Both models in this main analysis used random slopes (per site) and random intercepts (per scanner); see Supplementary Table 4 for an overview of the analysis. Variance groups, one per scanner, were used to accommodate the possibility of different variances across scanners (heteroscedasticity; smallest variance group had two observations). Together with permutation testing, this obviates the need for explicit data harmonization. We tested six contrasts per model: main effect of GAD (positive and negative), two-way interaction between GAD and sex (positive and negative), two-way interaction between GAD and age, and the three-way interaction between GAD, age, and sex. The linear and quadratic effects of age were combined using an F-test. All analyses were performed using the software Permutation Analysis of Linear Models (PALM; see footnote 3) with 500 permutations. The p-values were computed after fitting a generalized Pareto distribution to the tail of the permutation distribution [39], thus avoiding the need to perform a computationally prohibitive number of permutations.
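The idea behind the tail approximation can be sketched in a few lines. The example below is an illustration only and is not PALM's implementation: it assumes a method-of-moments fit of the generalized Pareto distribution to the largest 10% of the permutation statistics, and the function name, the tail fraction, and the simulated null distribution are all hypothetical choices.

```python
import math
import random

def gpd_tail_pvalue(observed, null_stats, tail_fraction=0.1):
    """Approximate P(null >= observed) from a limited number of permutations."""
    ordered = sorted(null_stats)
    n = len(ordered)
    n_tail = max(2, int(n * tail_fraction))
    threshold = ordered[n - n_tail - 1]
    if observed <= threshold:
        # Not in the tail: the ordinary permutation p-value is adequate.
        return (1 + sum(s >= observed for s in ordered)) / (n + 1)
    # Exceedances of the threshold are modeled with a generalized Pareto law.
    excess = [s - threshold for s in ordered[n - n_tail:]]
    m = sum(excess) / n_tail
    v = sum((e - m) ** 2 for e in excess) / (n_tail - 1)
    ratio = m * m / v
    shape = 0.5 * (1.0 - ratio)        # method-of-moments shape estimate
    scale = 0.5 * m * (ratio + 1.0)    # method-of-moments scale estimate
    z = observed - threshold
    if abs(shape) < 1e-12:
        survival = math.exp(-z / scale)
    else:
        base = 1.0 + shape * z / scale
        survival = base ** (-1.0 / shape) if base > 0 else 0.0
    # Scale by the fraction of permutations that fall in the tail.
    return (n_tail / n) * survival

rng = random.Random(0)
null = [rng.gauss(0.0, 1.0) for _ in range(500)]  # stand-in for 500 permutations
p_mild = gpd_tail_pvalue(0.0, null)
p_extreme = gpd_tail_pvalue(4.0, null)
print(p_mild, p_extreme)
```

With 500 permutations the ordinary permutation p-value cannot fall below roughly 1/500, whereas the fitted tail can resolve smaller values for statistics beyond the largest permuted value.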
We repeated this main analysis with vertex-wise cortical surface area and thickness as dependent variables (2562 vertices). Independent variables, variance groups, contrasts, and the number of permutations remained the same. We used family-wise error rate (FWER) correction to address multiple testing. Correction considered all tests within each modality (i.e., 68 cortical regions each for cortical thickness and surface area, and 16 subcortical volumes), all three sets of modalities, and all 12 contrasts. As the correction considers all sets of modalities (or dependent variables) and all contrasts, it is termed MC-FWER (family-wise error rate across modalities and contrasts) [40]. Results at lower levels of correction for multiple testing (e.g., only within a modality, or only across contrasts) are reported in the Supplementary Information (Supplementary Figs. 1 and 2).
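Correction across modalities and contrasts can be pictured as a max-statistic procedure in which the maximum is taken over every test in the family. The sketch below is a simplified illustration, not the PALM procedure itself: it assumes the statistics are on a comparable scale across modalities and contrasts, and the function name, keys, and values are hypothetical.

```python
def fwer_corrected_pvalues(observed, permuted):
    """Max-statistic FWER correction over one family of tests.

    observed: dict mapping (modality, contrast, region) to the observed statistic.
    permuted: list of dicts with the same keys, one dict per permutation.
    """
    # For each permutation, keep the largest statistic across ALL tests,
    # i.e., across regions, modalities, and contrasts at once.
    max_null = [max(stats.values()) for stats in permuted]
    n = len(max_null)
    return {
        key: (1 + sum(m >= value for m in max_null)) / (n + 1)
        for key, value in observed.items()
    }

# Hypothetical statistics for two tests and four permutations.
observed = {
    ("volume", "GAD x sex", "region A"): 3.0,
    ("thickness", "GAD", "region B"): 0.5,
}
permuted = [
    {("volume", "GAD x sex", "region A"): 1.0, ("thickness", "GAD", "region B"): 0.2},
    {("volume", "GAD x sex", "region A"): 0.4, ("thickness", "GAD", "region B"): 2.0},
    {("volume", "GAD x sex", "region A"): 0.1, ("thickness", "GAD", "region B"): 0.3},
    {("volume", "GAD x sex", "region A"): 2.5, ("thickness", "GAD", "region B"): 0.9},
]
corrected = fwer_corrected_pvalues(observed, permuted)
print(corrected)
```

Restricting the maximum to the tests of a single modality or a single contrast would give the lower levels of correction mentioned above.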
All sites provided information on GAD, age, and sex, but the inclusion of the other independent variables varied across sites according to data availability (see Supplementary Table 5 for an overview of the exact independent variables included per site). Participants with missing values in the independent variables (exact variables differed per site) were excluded. Ultimately, the main analysis included 1020 individuals with GAD and 2999 healthy controls. Because 192 participants had to be excluded from the main analysis due to missing IQ and/or education in years, we repeated the main analysis with these two variables removed for all sites; the respective results for the regional and vertex-wise data are reported in the Supplementary Information.
In addition, we ran a secondary analysis with the same dependent and independent variables, but this time using fixed slopes across sites, while keeping the random intercepts per scanner. This analysis allowed the inclusion of more sites and participants, but it assumes that effects are the same (fixed) across all sites. This secondary analysis included 1112 individuals with GAD and 3282 healthy controls.

A. Harrewijn et al.

Table 5. Descriptive statistics for medication at scan for all sites that were included in the main analysis.

RESULTS
Main analysis
This study compared regional and vertex-wise cortical thickness, cortical surface area, and subcortical volume between individuals with GAD (n = 1020) and healthy controls (n = 2999) while also examining interactions between GAD, age, and sex. The analysis modeled random slopes for all independent variables per site and random intercepts per scanner. No effects of GAD, nor interactions between GAD, age, or sex on the regional and vertex-wise cortical surface area, cortical thickness, and subcortical volume were significant (see Figs. 2, 3 and Supplementary Figs. 3, 4 for vertex-wise effect sizes). The results remained non-significant when analyses were performed (a) with only the basic independent variables (GAD, age, sex, and their interactions) and (b) with the interaction between GAD and medication added. The results for the main effects of medication and comorbid disorders and the interaction between GAD and medication were also non-significant.

Secondary analysis
The secondary analysis included more participants (1112 individuals with GAD and 3282 healthy controls) and implemented approaches more similar to those in other reports from the ENIGMA collaboration. These analyses included fixed slopes for all independent variables and random intercepts per scanner. For the regional data, a significant negative interaction was found between GAD and sex in the volume of the right ventral diencephalon (R² = 0.006, MC-FWER-corrected p = 0.0496 for the whole model fit), in the model without global brain measures as nuisance variables. Male individuals with GAD showed greater volume in the right ventral diencephalon compared to male healthy controls, whereas there was no difference between the groups for females (Fig. 4). The same secondary analysis with fixed slopes for all independent variables and random intercepts per scanner was performed for vertex-wise cortical surface area and thickness data. There were no significant effects of GAD, nor interactions between GAD, age, or sex.

DISCUSSION
The current study compared regional and vertex-wise cortical thickness, cortical surface area, and subcortical volume between individuals with GAD and healthy controls. Data from 28 sites were combined in a pre-registered analysis that used random slopes and random intercepts to model cross-site heterogeneity. The main analysis showed no effect of GAD on indices of brain structure, nor interactions among GAD, age, or sex. We also conducted a secondary analysis with fixed slopes and random intercepts. This secondary analysis included more sites and thus more participants. This secondary analysis indicated that males with GAD have greater volume, on average, in the right ventral diencephalon compared to healthy males, whereas female individuals with GAD and healthy females did not differ.

Fig. 2 Effect sizes from the main analysis for vertex-wise cortical surface area with the design that included global brain measures as nuisance variables. None of these was statistically significant.

Regional and vertex-wise indices of brain structure did not differ between individuals with GAD and healthy controls in the main analysis after multiple comparison corrections. When we did not fully correct for multiple testing, by ignoring the multiplicity of contrasts, there was an interaction between GAD and sex in the left lateral orbitofrontal cortex surface area (Supplementary Results). Prior studies have shown mixed results on the effect of GAD on cortical thickness and surface area [18][19][20], whereas some studies using VBM have revealed altered gray matter volume in the PFC, amygdala, and hippocampus [15][16][17]. Small sample sizes and analytical and clinical heterogeneity may account for differences across studies. Here, we leveraged a mega-analysis to mitigate these challenges and found no effect of GAD when accounting for comorbid disorders. The null finding in this study might indicate that these indices of brain structure do not differentiate individuals with GAD from healthy controls. In contrast, ENIGMA studies on MDD have shown significant differences between individuals with MDD and healthy comparisons in hippocampal volume and cortical thickness in bilateral medial OFC, cingulate cortex, insula, and temporal lobes. This could indicate that MDD is more related to structural brain differences than GAD, but this should be confirmed in future studies combining data. Future mega-analyses in GAD could focus on other imaging modalities (e.g., resting-state fMRI, task-based fMRI) or finer imaging phenotypes (e.g., subfields, shape analysis), combine data across imaging and other data types, or use structural covariance analysis or other higher-order constructs for better group differentiation. Some of these analyses have already been started within the ENIGMA-Anxiety Working Group [6,41].
The secondary analysis with fixed slopes and random intercepts indicated that male individuals with GAD had, on average, greater volume in the right ventral diencephalon compared to male healthy controls, whereas there was no difference between groups for females. The effect size was relatively small, which may explain why this effect arose only in the secondary analysis with more participants and fewer variables in the model. The ventral diencephalon includes the hypothalamus (see footnote 4), which plays an important role in the neuroendocrine stress response [42]. Previous studies of GAD found lower hypothalamic volumes [43], and these volumes were negatively associated with anxiety severity in healthy adults [44]. However, these findings focus on the hypothalamus specifically rather than the broader ventral diencephalon region [43], and these samples included mostly females with GAD. Our finding could represent an example of the "gender paradox" hypothesis. This hypothesis posits that across psychiatric disorders, the less frequently affected sex is the one that manifests more severe features of the disorder [45]. This finding is also in line with studies showing differences in structural connectivity in boys but not girls with anxiety disorders [46].
The age range in this sample was large. Some of the largest sites (ABCD, BHRCS) contributed mainly data from young healthy controls, even though the median onset of GAD is in adulthood. We accounted for this by including age and interactions with age in the analysis. Additionally, quadratic effects of age were added because age effects might not be linear [47]. However, nonlinearities that were not modeled could have influenced the results. There might be differences between childhood-onset and adult-onset GAD, but not enough data on the age of onset were available to investigate this further. Hence, the results should be interpreted in light of the composition of the sample, which is not a random draw from any specific population, and of the data available for analysis. Future mega-analyses could try to collect more detailed clinical data to further investigate childhood-onset and adult-onset GAD.
A few limitations should be noted. First, individuals with current and lifetime GAD were grouped together in the analysis, which could have increased heterogeneity in the GAD group. In addition, the distinction between current and lifetime GAD might be particularly difficult in the 9-10-year-old children from the ABCD data set. Only 12.9% of the individuals had a diagnosis of lifetime (and not current) GAD at the time of the scan, and the results of the main analysis did not change when only individuals with current GAD were included. Second, methods for collecting imaging and non-imaging data differed across research sites. Even though we have accounted for this in the analysis by including random intercepts and slopes per site, it is possible that residual site-specific non-linear effects may still have been present in the data. Third, the results of the secondary analysis with fixed slopes and random intercepts are mostly influenced by the larger samples, such as the ABCD data set (n = 1451). However, the main analysis with random slopes and random intercepts is robust to this type of bias. Fourth, variance groups were not taken into account when estimating effect sizes, so the effect sizes could be diminished. Fifth, data quality might be different between sites, which could influence the results, despite the fact that we took certain aspects of heterogeneity into account in the analyses.
To summarize, there was no effect of GAD on regional or vertex-wise cortical thickness, cortical surface area, or subcortical volume, nor interactions among GAD, age, or sex. This is in line with inconsistent findings from prior studies and the clinical heterogeneity of GAD. The secondary analysis showed an interaction between GAD and sex in the ventral diencephalon. Male individuals with GAD showed greater volume in the right ventral diencephalon compared to male healthy controls, whereas there was no detectable difference between female individuals with GAD and healthy controls. Together, these findings show that associations between indices of brain structure and GAD are small, underscoring the subtlety of its effects and perhaps also the clinical heterogeneity of GAD as a phenotype. Showing these null results in a large mega-analysis is important to inform future studies on GAD to focus on other neuroimaging modalities and/or other phenotyping approaches that favor dimensionality.

1. A meta-analysis involves the computation of a statistic from several cohorts, prior to merging the statistics into an overall estimate of effect size for a variable of interest. A mega-analysis involves a centralized analysis of individual-level data across a range of cohorts, modeling the effect of each cohort and using all the available data to estimate an overall effect size.
2. We repeated the analyses with only individuals with current GAD (n = 881; one participant from the ABCD data set had to be excluded, because they were the only participant from one scanner, resulting in a variance group of 1 observation). Similar to the results from individuals with both current and lifetime GAD, the main analysis showed no significant effects of GAD, nor significant interactions between GAD, age, or sex on the regional and vertex-wise cortical surface area, cortical thickness, and subcortical volume. In addition, the secondary analysis with 982 individuals with current GAD also revealed an interaction between GAD and sex in the volume of the right ventral diencephalon, R² = 0.007, MC-FWER-corrected p = 0.038 (for the whole model fit). However, in contrast to the analysis with individuals with both current and lifetime GAD, the vertex-wise secondary analysis revealed an interaction between GAD and age in cortical surface area in one vertex in the superior frontal gyrus in the model with global brain measures (−21.23, 30.13, 48.52; coordinates from FreeSurfer's FreeView).
3. https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/PALM.
4. Other ventral diencephalon structures include the mammillary bodies, subthalamic nuclei, substantia nigra, red nucleus, lateral and medial geniculate nuclei. Some white matter structures such as the zona incerta, crus cerebri, lenticular fasciculus, and the medial lemniscus are also included in this region, as well as segments of the optic tract.

Fig. 4 An interaction between generalized anxiety disorder (GAD) and sex in volume in the right ventral diencephalon was observed; male individuals with GAD showed greater volume (mm³) compared to male healthy controls, whereas there was no difference between the groups for females. Figure shows data after nuisance variables have been considered (residuals). Average volume of the right ventral diencephalon across individuals with GAD and HC: 3988.6 mm³. Note: Error bars reflect standard error.

CODE AVAILABILITY
Code for data cleaning and analysis will be made available upon request.