Effects of eight neuropsychiatric copy number variants on human brain structure

Many copy number variants (CNVs) confer risk for the same range of neurodevelopmental symptoms and psychiatric conditions including autism and schizophrenia. Yet, to date neuroimaging studies have typically been carried out one mutation at a time, showing that CNVs have large effects on brain anatomy. Here, we aimed to characterize and quantify the distinct brain morphometry effects and latent dimensions across 8 neuropsychiatric CNVs. We analyzed T1-weighted MRI data from clinically and non-clinically ascertained CNV carriers (deletion/duplication) at the 1q21.1 (n = 39/28), 16p11.2 (n = 87/78), 22q11.2 (n = 75/30), and 15q11.2 (n = 72/76) loci as well as 1296 non-carriers (controls). Case-control contrasts of all examined genomic loci demonstrated effects on brain anatomy, with deletions and duplications showing mirror effects at the global and regional levels. Although CNVs mainly showed distinct brain patterns, principal component analysis (PCA) loaded subsets of CNVs on two latent brain dimensions, which explained 32 and 29% of the variance of the 8 Cohen’s d maps. The cingulate gyrus, insula, supplementary motor cortex, and cerebellum were identified by PCA and multi-view pattern learning as top regions contributing to latent dimensions shared across subsets of CNVs. The large proportion of distinct CNV effects on brain morphology may explain the small neuroimaging effect sizes reported in polygenic psychiatric conditions. Nevertheless, latent gene brain morphology dimensions will help subgroup the rapidly expanding landscape of neuropsychiatric variants and dissect the heterogeneity of idiopathic conditions.


INTRODUCTION
Genomic copy number variants (CNVs) are deletions or duplications of DNA segments of more than 1000 base pairs. Rare CNVs with large effects have been associated with a range of often overlapping developmental psychiatric phenotypes and conditions, including autism spectrum disorder (ASD) and schizophrenia (SZ) [1-4]. A looming question in psychiatric genetics pertains to the underlying basis of polygenicity: How do different variants lead to risk for the same psychiatric condition? Some of the most frequent risk factors for neuropsychiatric disorders identified in pediatric clinics include CNVs at the 22q11.2, 16p11.2, 1q21.1, and 15q11.2 genomic loci [5,6]. They affect the dosage of 60, 29, 12, and 4 genes, respectively [7-9]. The largest increases in risk for SZ have been documented for the 22q11.2 deletion (30- to 40-fold) followed by 16p11.2 duplication (10-fold), 1q21.1 deletion and 15q11.2 deletion (1.5-fold) [2]. ASD risk is highest for 16p11.2 deletions and duplications (10-fold) followed by 1q21.1 duplications and 22q11.2 duplications (3- to 4-fold) [1,2,10-13]. The nature and specificity of CNV effects on cognitive and behavioral traits is an area of intense investigation. All CNVs studied to date affect cognition to varying degrees and a broad range of cognitive functions [14,15]. A recent study found that the range of affected traits was broadly similar for 13 CNVs at 8 loci and specific genotypes accounted for a low proportion of phenotypic variance [3]. These variants are therefore opportunities to investigate brain phenotypes conferring high risk for mental illness.
Neuroanatomical alterations associated with 16p11.2 and 22q11.2 show overlap with those observed in idiopathic ASD and SZ [17-19, 21, 25]. Finally, most of the effects are observed irrespective of psychiatric diagnoses and symptoms [12], suggesting that the final clinical outcome may result from the effect of CNVs and additional factors.
Neuroimaging studies across genomic variants are scarce. An investigation of 49 unaffected carriers of SZ-associated CNVs across five genomic loci in the UK Biobank showed smaller volumes of the thalamus, hippocampus, and nucleus accumbens [26]. Functional connectivity similarities have also been demonstrated between 16p11.2 and 22q11.2 deletions as well as with idiopathic ASD and SZ [27]. In contrast, a recent study suggests a relatively distinct association between neuroimaging alterations and six different CNVs [28].
In this study, we aimed to characterize shared and distinct neuroanatomical alterations associated with eight CNVs at four genomic loci. We analyzed high-resolution structural brain scans from the largest multi-site dataset of CNV carriers (n = 484, of which 87 have not yet been published) and controls (n = 1296) to date. Different approaches were implemented, from simple case-control contrasts to one-view and multi-view multivariate pattern learning [29,30]. First, we compared brain morphometry features associated with each deletion and duplication using univariate linear models. Second, we quantified the shared variation of brain morphometry associated with eight CNVs using principal component analysis (PCA). To complement this single-view approach, a multi-view pattern-learning algorithm (canonical correlation analysis, CCA) was carried out for the joint analysis of genetic and morphometry brain data, to identify latent 'gene-morphometry dimensions'. Primary analyses were performed using voxel-based morphometry (VBM) for consistency with previous studies [19]. In addition, we carried out the same multivariate analyses using FreeSurfer-derived cortical surface area (SA) and cortical thickness (CT) to ensure that shared variation was not limited to one neuroimaging modality or analytical pipeline.

METHODS
Participants
Neuroimaging data from deletion and duplication carriers included in the study were selected on the following breakpoints ( Clinically ascertained CNV carriers were recruited either as probands referred for genetic testing or as relatives. Controls were either non-carriers within the same families or individuals from the general population. We pooled data from five cohorts. CNVs from non-clinical populations were identified in the UK Biobank [31,32].

MRI data
Details of methods and analyses are provided in Supplementary material and Supplementary Methods 1-8. The data sample included T1-weighted (T1w) images at 0.8-1 mm isotropic resolution across all sites. The population description is available in Table 1 and Supplementary Table 1.

Data quality check
All data included in the analysis were quality checked by the same researcher (CM). A total of 107 structural brain scans from carriers and controls were excluded from further analysis based on visual inspection that identified significant artifacts compromising accurate tissue classification and boundary detection (Supplementary materials).

Statistical analysis for global brain measures
Global brain aggregate measures (total intracranial volume (TIV), total gray matter (GM) volume, total SA, and mean CT) were adjusted for age, age², and sex as fixed effects and scanning site as a random factor. Non-clinically ascertained subjects from the UK Biobank (UKBB) are on average 30 years older than the clinically ascertained subjects. Because of this age difference, we used age-matched control groups for univariate analysis. Global measure z-scores for each CNV were calculated using 331 controls for clinically ascertained carriers and 965 controls for non-clinically ascertained carriers. All statistical analyses were performed in R, version 3.4.4 (https://www.r-project.org/), or in MATLAB R2018b.
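As an illustration of the adjustment described above, the following is a minimal sketch, not the pipeline used in the study. It assumes an ordinary least-squares model fitted on controls only, and it omits the random effect of scanning site for brevity. The function name `adjusted_z_scores` and its arguments are hypothetical.

```python
import numpy as np

def adjusted_z_scores(y, age, sex, is_control):
    """Z-score a global measure after regressing out age, age^2 and sex.

    Assumption: the nuisance model is fitted on controls only and then
    applied to all subjects; scanning site (a random factor in the text)
    is not modeled in this simplified sketch.
    """
    y = np.asarray(y, dtype=float)
    age = np.asarray(age, dtype=float)
    sex = np.asarray(sex, dtype=float)
    is_control = np.asarray(is_control, dtype=bool)

    # Center age on the control mean so age and age^2 are less collinear.
    age_c = age - age[is_control].mean()
    X = np.column_stack([np.ones_like(age_c), age_c, age_c ** 2, sex])

    # Fit the nuisance model on controls, then compute residuals for everyone.
    beta, *_ = np.linalg.lstsq(X[is_control], y[is_control], rcond=None)
    resid = y - X @ beta

    # Scale by the residual standard deviation of the controls.
    sd = resid[is_control].std(ddof=X.shape[1])
    return resid / sd
```

With this convention, controls have a mean z-score of zero by construction, and a carrier group's mean z-score expresses its deviation in units of control standard deviations.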

Voxel-based measures and statistical analyses
We performed whole-brain voxel-based analysis testing for voxel-wise volume differences within the mass-univariate analysis framework implemented in SPM (Supplementary Method 4). Cohen's d (i.e., effect size) [38] maps were obtained by converting SPM T-maps using the CAT12 toolbox for SPM (http://www.neuro.uni-jena.de/cat/).
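The conversion from a T value to Cohen's d can be illustrated with the textbook relation for a two-group contrast, d = t · sqrt(1/n1 + 1/n2). This is a sketch under that assumption only: the exact formula applied by the toolbox is not stated in the text and may differ, for example by using the degrees of freedom of the full model with covariates. The function name is hypothetical.

```python
import math

def cohens_d_from_t(t, n1, n2):
    """Approximate Cohen's d from a two-sample t statistic.

    Assumption: independent groups of sizes n1 and n2 with pooled variance;
    covariates in the linear model are ignored in this approximation.
    """
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

# Hypothetical example values, not taken from the study.
d = cohens_d_from_t(2.0, 50, 50)
```

Under this formula, a t value of 2.0 obtained from two groups of 50 subjects corresponds to d = 0.4.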

Surface-based measures and statistical analyses
In parallel to VBM, we used surface-based general linear models (GLMs) to test for differences in CT and SA (SurfStat toolbox [39]).

Comparison of ranked Cohen's d maps across CNVs
To adjust for the unequal power to detect alterations across different CNV groups, which have different sample and effect sizes, we ranked the Cohen's d values of all voxels (or vertices) for each statistical map (CNV versus controls contrast). We then tested for spatial overlap between maps across CNVs after thresholding the tails of the distribution at the 15th and 85th quantiles. The Dice index was calculated using publicly available Matlab scripts (https://github.com/rordenlab/spmScripts).
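The ranking and overlap steps can be sketched as follows. This is an illustrative NumPy version, not the Matlab scripts cited above. The function names `tail_masks` and `dice_index` are hypothetical, and the maps are assumed to be flattened to one value per voxel or vertex.

```python
import numpy as np

def tail_masks(d_map, lower_q=0.15, upper_q=0.85):
    """Rank a Cohen's d map and return boolean masks of its two tails."""
    d_map = np.asarray(d_map, dtype=float)
    ranks = d_map.argsort().argsort()      # 0 = most negative value
    frac = (ranks + 0.5) / d_map.size      # rank expressed as a quantile
    return frac < lower_q, frac > upper_q

def dice_index(mask_a, mask_b):
    """Dice index: 2 * |A and B| / (|A| + |B|) for two boolean masks."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    total = mask_a.sum() + mask_b.sum()
    if total == 0:
        return 0.0
    return 2.0 * np.logical_and(mask_a, mask_b).sum() / total
```

Because the masks depend only on ranks, rescaling a map (for example, a CNV with uniformly larger effect sizes) leaves its tails unchanged, which is the point of ranking before testing for overlap.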

Null hypothesis testing using spin permutations and label shuffling
We used spin permutation and label shuffling [40,41] to calculate empirical p values for (1) the deletion and duplication convergence pattern and (2) the correlation or Dice index between two maps.
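The logic of an empirical p value from label shuffling can be sketched on a simple statistic. The example below is an assumption-laden simplification: it shuffles carrier/control labels and uses the absolute difference in group means, whereas the study applies the null models to convergence maps and map similarity. Spin permutation, which rotates maps on the cortical sphere, is not shown. The function name is hypothetical.

```python
import numpy as np

def label_shuffle_p(values, is_carrier, n_perm=2000, seed=0):
    """Empirical p value for a group difference via label shuffling."""
    values = np.asarray(values, dtype=float)
    is_carrier = np.asarray(is_carrier, dtype=bool)
    rng = np.random.default_rng(seed)

    def stat(labels):
        # Absolute difference between the two group means.
        return abs(values[labels].mean() - values[~labels].mean())

    observed = stat(is_carrier)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling keeps the group sizes but breaks the label-value link.
        if stat(rng.permutation(is_carrier)) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p value is never zero.
    return (exceed + 1) / (n_perm + 1)
```

With this estimator the smallest attainable p value is 1 / (n_perm + 1), so the number of permutations bounds the reported significance level.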

Quantifying shared variation across CNVs using principal components (PC)
PCs were derived to quantify shared morphometry variation across CNVs. We used Cohen's d values of 130 neuroanatomical GM regions (Neuromorphometrics atlas) for the eight CNVs as input variables (z-scored Cohen's d contrasts adjusted for total GM and nuisance variables; FactoMineR package in R). The variance explained (coefficient of determination, R-squared) for each CNV-associated Cohen's d map by the PCs was obtained by running a linear model (lm) in R, with PC1 and PC2 as independent explanatory variables and the CNV Cohen's d map as the dependent variable.
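The two steps above (PCA on z-scored Cohen's d profiles, then R-squared of each profile on PC1 and PC2) can be sketched with NumPy. This stands in for the FactoMineR and lm calls named in the text and is not the authors' code. It assumes a matrix with regions as rows and CNVs as columns, and the function name is hypothetical.

```python
import numpy as np

def pca_two_components(d_maps):
    """PCA on a (n_regions, n_cnvs) matrix of Cohen's d values.

    Returns the variance fraction of PC1 and PC2, the CNV loadings on
    both components, and the R-squared of each CNV profile regressed
    on the PC1 and PC2 region scores.
    """
    D = np.asarray(d_maps, dtype=float)
    Z = (D - D.mean(axis=0)) / D.std(axis=0, ddof=1)   # z-score each CNV

    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    scores = U[:, :2] * s[:2]                           # region scores
    loadings = Vt[:2].T                                 # (n_cnvs, 2)

    # Linear model per CNV: profile ~ intercept + PC1 + PC2.
    X = np.column_stack([np.ones(len(Z)), scores])
    beta, *_ = np.linalg.lstsq(X, Z, rcond=None)
    resid = Z - X @ beta
    r2 = 1.0 - (resid ** 2).sum(axis=0) / (Z ** 2).sum(axis=0)
    return explained[:2], loadings, r2
```

A CNV whose profile lies mostly along the first two components has an R-squared close to 1, while a CNV with a largely distinct profile has a low value.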

Joint modeling of gene-morphology dimensions using CCA
We re-purposed CCA to simultaneously model the shared and distinct impact of the CNVs in causing distributed alterations in brain morphometry (130 grey matter regions) [29,30]. This principled doubly multivariate approach, widely used in neuroimaging studies [29,30], was performed to identify modes of coherent co-variation that jointly characterize how CNVs and patterns of regional volumes systematically co-occur across subjects. Henceforth, we refer to the ensuing modes of covariation as 'CCA dimensions' or 'gene-morphology dimensions'.
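A minimal sketch of CCA between a gene-dosage matrix and regional volumes is given below. It uses the standard QR-plus-SVD formulation in NumPy and is not the implementation used in the study. The dosage coding in the usage note (−1 for a deletion, +1 for a duplication, one column per locus) is an assumption made for illustration, and the function name is hypothetical.

```python
import numpy as np

def cca(X, Y, n_components=2):
    """Canonical correlation analysis via QR decomposition and SVD.

    X: (n_subjects, n_loci) gene-dosage matrix.
    Y: (n_subjects, n_regions) regional volumes.
    Assumes both centered matrices have full column rank.
    Returns canonical correlations and the paired subject scores.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Orthonormal bases for the column spaces of each view.
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)

    # Singular values of Qx' Qy are the canonical correlations.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    k = n_components
    A = np.linalg.solve(Rx, U[:, :k])      # weights on loci
    B = np.linalg.solve(Ry, Vt[:k].T)      # weights on regions
    return s[:k], Xc @ A, Yc @ B
```

In this sketch each carrier would have a single non-zero entry in `X`, so that deletions and duplications at the same locus are forced onto opposite ends of the same dosage axis.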

RESULTS
CNV effects on global brain morphometry
Deletions and duplications at each genomic locus showed opposing effects on one or more global metrics: TIV, total GM volume, total SA, or mean CT (Fig. 1, Supplementary Table 2). The directionality of global effects differed across loci (Fig. 1a-c). Effects on GM and SA were less pronounced once adjusted for TIV (Supplementary Fig. 1).
Overlapping deletion effects on regional morphometry
Whole-brain VBM analyses contrasting each deletion and duplication group with controls showed mostly distinct brain patterns across CNVs (Fig. 2a, c, e, Supplementary Table 3). To investigate potential overlap across the four genomic regions, we ranked Cohen's d maps and overlapped voxels with similar rankings. Using a threshold for voxels with Cohen's d <15th and >85th percentiles separately (Fig. 3c, e, g, i), we observed significant overlap between deletions (p value SHUFFLE < 10e−4, Fig. 3a). Volumes of the middle and anterior cingulate extending to the supplementary motor cortex and of the cerebellum were decreased in all deletions while volume was increased in the thalamus (Fig. 3a). Sensitivity analyses tested the effect of ascertainment and control groups: (1) We recomputed the deletion convergence map using 1q21.1 deletion carriers from UK Biobank instead of those clinically ascertained (Table 1). The new deletion convergence map was similar to the initial one presented above with a Dice index of 39.4% (p value SPIN < 10e−4); (2) We excluded all subjects with autism, SZ, or other psychiatric diagnoses. Again, this did not change the overlap (Supplementary Fig. 2); (3) We tested the effects of the control group by recomputing contrasts only using controls from the same site (instead of the initial ANOVA pooling all controls together and controlling for site). This again did not alter the convergence maps (Supplementary Fig. 3). Finally, we performed the same analysis using FreeSurfer-derived SA and CT measures. We also identified spatial overlaps, but the regions identified were different, especially for CT (Supplementary Table 4).
Overlapping duplication effects on regional morphometry
Contrasts computed for duplications (Fig. 2b, d, f) showed smaller effect sizes compared to deletions. The same analysis using Cohen's d values <15th and >85th percentiles (Fig. 3d, f, h, j) demonstrated spatial overlap across all four duplications (p value SHUFFLE < 10e−4, Fig. 3b). The resulting pattern was mainly distinct from the one observed in deletions and was characterized by smaller volumes in anterior insula and frontal operculum, and larger volumes in the middle cingulate gyrus and supplementary motor cortex compared to controls.
Sensitivity analyses testing the effect of clinical ascertainment, psychiatric diagnoses, control groups, and volume versus FreeSurfer-derived measures demonstrated that results were robust (Supplementary Figs. 2-8).
The deletion/duplication ratio of Cohen's d distributions ranged from 1.24- to 2-fold across the four genomic loci (F-test, p < 10e−16, Fig. 3c-j, Supplementary Table 7). Similar effect-size ratios were also observed for SA alterations (Supplementary Table 7), except for the 15q11.2 locus.
We tested opposing (mirror) effects on VBM contrast maps between deletions and duplications. The strongest anticorrelation of Cohen's d values was observed for 16p11.2 (p value SPIN < 10e−4) followed by 15q11.2 (p value SPIN < 10e−4), 1q21.1 (p value SPIN < 0.033) and 22q11.2 (p value SPIN < 0.038) (Supplementary Fig. 9 and Tables 8-10). Mirror effects were observed in clinically and non-clinically ascertained CNV carriers, as well as for SA at all four genomic loci but not for CT (Supplementary Tables 8-10). Hence mirror effects were observed in global metrics and, independently, in regional alterations.
Fig. 1 1q21.1, 16p11.2, 22q11.2, and 15q11.2 exert rich effects on global brain measures. Total intracranial volume (a), total surface area (b), total grey matter volume (c) and mean cortical thickness (d) for clinically and non-clinically ascertained CNVs. Z scores for clinically and non-clinically ascertained CNVs were calculated using 331 and 965 controls, respectively, adjusting for age, age², sex and site as a random factor. Y axis values are z scores. X axis values are CNV groups. A significant difference between a CNV group and the corresponding control group is indicated with a star. Horizontal bars with stars show significant differences between deletions and duplications within the same locus. TIV total intracranial volume, SA surface area, GM grey matter, CT cortical thickness.
Quantifying distinct and shared effects on brain morphometry associated with eight CNVs
We performed a multivariate PCA based on Cohen's d profiles obtained from contrasts between the eight CNV groups and controls (using 130 neuromorphometric regional volumes, Supplementary Table 11). The first two components explained 31.8 and 28.7% of the variance of Cohen's d maps, respectively. The third component dropped to 13.8% and was therefore not investigated further.
Deletions and duplications at each genomic locus showed opposite loadings on PC1 or PC2 (Fig. 4c). Regions with the highest loadings on PC1 and PC2 were also those identified in the convergence maps presented above: in particular the middle cingulate gyrus and the supplementary motor cortex. Anterior and posterior insula, cerebellum, fusiform gyrus and thalamus were also top regions altered across subsets of CNVs (Fig. 4a, b and Supplementary Table 12). The variance explained by both components for each CNV's Cohen's d map ranged from 27% to 82% (Fig. 4d). Finally, we performed the same analysis using FreeSurfer-derived SA and CT measures, which also provided latent dimensions with comparable variance explained and opposing loadings for deletions and duplications at each genomic locus (Supplementary Fig. 10). However, CNV loadings differed across brain morphometry metrics.

Gene-morphology dimensions across eight CNVs
As a next step, we performed a multi-view pattern-learning analysis, jointly analyzing the genetic and morphometry brain data. This doubly multivariate method allowed testing whether shared dimensions could be identified in a data-driven approach, without performing any individual contrast. We interrogated two hypotheses: (1) CNVs show levels of shared brain effects at the morphometry level and (2) deletions and duplications show opposing effects. We investigated the same 130 regional volumes in 484 carriers of CNVs at four genomic loci. To test hypothesis (2), deletions and duplications were coded as opposing gene dosage. CCA confirmed both hypotheses by identifying two significant 'gene-morphometry dimensions' (r = 0.84, 0.79, p value < 0.05, Fig. 4e, f). Regional brain contributions to canonical dimensions 1 and 2 were well correlated with those of PC2 and PC1, respectively (r = 0.83, r = −0.81).
Fig. 2 Cohen's d maps of VBM regional brain differences in deletion and duplication carriers at the 1q21.1, 16p11.2, and 22q11.2 loci compared to controls. Regional brain differences are adjusted for total grey matter volume. Left and right columns show results for deletion (a, c, e) and duplication (b, d, f) carriers, respectively. Color maps show the significant effects of each CNV, thresholded at q < 0.05 FWE. The color scale represents the positive and negative Cohen's d effect sizes that were estimated. The linear model was adjusted for sex, linear and quadratic expansion of age, and total grey matter volume. 15q11.2 is not displayed because only a few voxels survived family-wise error (FWE) correction. Corresponding maps for surface area and cortical thickness are reported in Supplementary Figs. 4 and 5.
Top-ranking brain regions contributing to either of the two CCA dimensions of morphological variation included supplementary motor cortex, posterior and anterior insula, middle cingulate gyrus, calcarine cortex, cuneus and accumbens (Supplementary Fig. 11 and Supplementary Table 13). 16p11.2 and 22q11.2 preferentially contributed to dimensions 1 and 2, respectively, and 1q21.1 loaded similarly on both dimensions. 15q11.2 CNVs showed the smallest loadings on both dimensions (Fig. 4e).

DISCUSSION
Here, in the largest cross-CNV neuroimaging study to date, we tested potentially shared effects of eight neuropsychiatric CNVs on brain morphometry. CNVs showed a combination of distinct and shared profiles of brain alterations, as demonstrated by the spatial overlap of Cohen's d maps across deletions and duplications. A multivariate approach (PCA) quantified distinct and shared alterations across subsets of CNVs and identified two latent dimensions explaining 31.8 and 28.7% of the variance of the Cohen's d maps. A second multivariate approach (CCA), jointly analyzing genetic and morphometry data, confirmed the latent CNV-brain dimensions identified by PCA. Genomic loci contributed to the latent CCA dimensions in proportion to their effect sizes. Even for small effect-size deletions at the 1q21.1 and 15q11.2 loci, the PCA components explained between 43 and 65% of their Cohen's d profile. All three approaches (spatial overlap, CCA, and PCA) identified a similar set of regions altered by CNVs including the cingulate gyrus and supplementary motor cortex.

Distinct and shared effects of CNVs
Our results show that two-thirds of the average CNV effects on brain morphometry are distinct. This is consistent with a recent study showing relative specificity of association between brain patterns of gene expression and patterns of cortical anatomy changes across six CNVs and chromosomal aneuploidies [28]. One-third of the effects on brain morphometry is shared, as demonstrated by latent gene-morphology dimensions identified across subsets of CNVs. There is no single dimension explaining CNV effects. Instead, subsets of CNVs load on either dimension, which may suggest similar brain mechanisms within subgroups of CNVs. Yet CNVs within subgroups were not characterized by the same risk for ASD or SZ.
These results have implications for our conceptualization of polygenic psychiatric conditions. Indeed, studies estimate that 70-100% of any 1-MB window in the human genome encompasses variants (including CNVs) contributing to increased risk for SZ and autism [4,42]. Gene-morphology dimensions alone cannot explain the fact that subgroups of CNVs are associated with a similar range of behavioral symptoms [43] and psychiatric disorders [1,2,4,44]. In fact, the large proportion of distinct CNV-neuroimaging effects suggests that a broad diversity of brain mechanisms increases the risk for autism and SZ. Extreme examples include CNVs associated with opposing loadings on the same latent gene-morphology dimension while increasing risk for the same psychiatric condition (i.e., 16p11.2 deletions, duplications, and autism). The presence of such genomic variants in studies of ASD and SZ may explain heterogeneity and small neuroimaging effect sizes [45,46]. Why opposing effects on the same latent brain dimension increase risk for the same psychiatric condition is an unsolved question. Further observations on a broad variety of genomic variants are required to address this question.
Brain hubs vulnerable to altered gene dosage
Insula, cingulate, fusiform gyrus, and hippocampus are regions showing alterations across SZ, bipolar disorders, major depression, and obsessive-compulsive disorders [45,47]. The cingulate, insula, and fusiform gyrus were also among regions markedly altered across eight CNVs. CNVs have either negative or positive effects on these brain regions; however, the number of CNVs included in this study did not allow us to associate the directionality of these effects with phenotypic traits. Alterations of the cingulate cortex have been associated with genetic and environmental risk for SZ [48]. The supplementary motor cortex has been shown to play a critical role in 16p11.2 and 22q11.2 CNVs as well as autism and SZ by functional connectivity studies, but not by cross-diagnostic structural neuroimaging studies [49,50]. Several cerebellar regions (vermis lobule VIII-X and cerebellar cortex) are highly sensitive to CNVs, which may be due to the cerebellum's protracted development [51]. The cerebellum has either been excluded or not reported by cross-disorder structural neuroimaging studies, but volume alterations have been associated with autism and SZ separately [52,53]. Multiple genetic mouse models of autism, as well as Down Syndrome, also show abnormal cerebellar development [54]. The same level of spatial overlap was observed for SA and CT but implicated mostly distinct sets of brain regions. This is in line with the distinct genetic contributions previously demonstrated for these cortical metrics [55].
Dissociation between global and regional effects
Results suggest that global and local effects may be mechanistically unrelated. 1q21.1 deletions and duplications highlight the contrast between very large effects on global measures and small regional effects once adjusted for total GM. Dissociation is also observed between the directionalities of global and regional effects: all deletions are associated with a smaller cingulate and supplementary motor cortex volume irrespective of their effect on TIV and GM. Animal studies have proposed mechanisms for global [8,56], but not regional effects of CNVs.
Fig. 4 Principal component analysis and canonical correlation analysis of brain alterations due to eight CNVs. a PCA dimension 1 and 2 regional relevances projected on axial brain slices. The darker the red or blue color, the stronger the positive or negative association with the PCA dimensions. PCA was run on z-scored Cohen's d values, with the eight CNVs as variables and 130 neuroanatomical GM regions as observations. GM region volumes were adjusted for total grey matter, age, age², sex, and site. The first two components explained 31.77 and 28.66% of the variance, respectively. b Loading of eight CNVs on the two PCA dimensions. Values are PC loading magnitudes and represent the contribution of a CNV to the PC. c Variance explained (coefficient of determination, R-squared) of each CNV Cohen's d profile by PC1 and PC2. Values and color scale represent the "percent of variance". d Loadings of the first and second CCA dimension on four CNV genomic loci, showing the contribution of a CNV locus to the canonical dimension. e Loading of Neuromorphometrics Regions of Interest (ROIs) on the two PCA dimensions. ROIs are averaged across the left and right hemisphere for visualization. The font size is correlated to the region's contribution to dimensions. ROI names are color coded as being part of the deletion (red), duplication (blue) and both deletion and duplication (magenta) convergence patterns. f Scatterplot showing the participant-specific expressions of each of the 484 carriers of eight different CNVs along two dominant gene-morphometry canonical correlation (CC) dimensions established using 130 neuroanatomical GM regions of CNV carriers. GM region volumes were adjusted for total grey matter, age, age², sex, and site. The empty and full symbols represent deletions and duplications, respectively. The grey hexagonal bin plot represents the frequency of controls (n = 1296). Controls were not used to calculate the CCA and were projected post hoc on the two dimensions using CCA prediction. CCA ROI loadings are reported in Supplementary Fig. 10. Results for surface area and cortical thickness are reported in Supplementary Fig. 9 (PCA), 14-15 (CCA).

Limitations
The multiple sites included in the study may have introduced noise, but previous studies have shown that site effects do not influence the neuroanatomical patterns associated with CNVs at the 16p11.2, 22q11.2, and 15q11.2 loci [12,19,23]. While shared variation could have been influenced by clinical ascertainment or psychiatric diagnoses, our sensitivity analyses showed that this is not the case. The effect of medication on CNV-associated brain alterations could not be investigated in the current study as medication information was not available for the whole dataset. We were underpowered to properly investigate potential sex-related effects of 1q21.1 and 15q11.2 on brain morphometry. Of note, previous neuroimaging studies of large 22q11.2 and 16p11.2 samples were unable to identify any sex-related effects [19,25].
15q11.2 deletions and duplications have small effect sizes and larger samples would improve the accuracy of the brain morphometry signature. Systematic analysis through the two most widespread computational neuroanatomy frameworks (voxel-based and surface-based) shows that effects could not be attributed to the processing pipeline. Extending our approach to the rapidly expanding number of rare genomic variants associated with psychiatric disorders is required to draw a robust conclusion on the distinct and shared effects of CNVs on brain structure.

CONCLUSIONS
The simultaneous analyses and comparisons of several genomic variants demonstrate distinct CNV-associated alteration profiles as well as shared latent gene-morphology dimensions relevant to subsets of CNVs. Large proportions of distinct effects may provide some answers to the small neuroimaging effect sizes reported in idiopathic psychiatric conditions. The mechanisms underlying the identified latent dimensions remain unknown and pathway convergence may occur early on at the transcriptome and protein level, or at later stages (i.e., brain architecture or behavior). The hotly debated omnigenic model postulates that convergence may occur at early stages due to highly interconnected cell regulatory networks [57]. These approaches may help subgroup genomic variants based on their morphometry signature and dissect the heterogeneity of psychiatric conditions.