Dose response of the 16p11.2 distal copy number variant on intracranial volume and basal ganglia

Carriers of large recurrent copy number variants (CNVs) have a higher risk of developing neurodevelopmental disorders. The 16p11.2 distal CNV predisposes carriers to, for example, autism spectrum disorder and schizophrenia. We compared subcortical brain volumes of 12 16p11.2 distal deletion and 12 duplication carriers to 6882 non-carriers from the large-scale brain Magnetic Resonance Imaging collaboration, ENIGMA-CNV. After stringent CNV calling procedures and standardized FreeSurfer image analysis, we found negative dose-response associations with copy number on intracranial volume (ICV) and on regional caudate, pallidum and putamen volumes (β = −0.71 to −1.37; P < 0.0005). In an independent sample, consistent results were obtained, with significant effects in the pallidum (β = −0.95, P = 0.0042). The two data sets combined showed significant negative dose-response for the accumbens, caudate, pallidum, putamen and ICV (P = 0.0032, 8.9 × 10⁻⁶, 1.7 × 10⁻⁹, 3.5 × 10⁻¹² and 1.0 × 10⁻⁴, respectively). Full scale IQ was lower in both deletion and duplication carriers compared to non-carriers. This is the first brain MRI study of the impact of the 16p11.2 distal CNV, and we demonstrate a specific effect on subcortical brain structures, suggesting a neuropathological pattern underlying the neurodevelopmental syndromes.

Large effect-size CNVs conferring risk for neurodevelopmental disorders including major psychiatric disorders [35] are rare (<0.25% in frequency). Assembling sufficiently powered MRI samples to detect effects of rare CNVs on brain morphometry is challenging. For instance, in the Icelandic population [2] and the UK Biobank [36], the frequencies of the 16p11.2 distal deletion are 0.019% and 0.012%, respectively. Likewise, the reciprocal duplication is found at a frequency of 0.038% and 0.030%, respectively. Hence, studying rare pathogenic CNVs like 16p11.2 distal calls for collaborative efforts. The ENIGMA-CNV consortium has collected a sample that currently includes 16,046 subjects with CNV and brain MRI data.
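To make the sample size problem concrete, the following minimal sketch multiplies the population frequencies quoted above by the current ENIGMA-CNV sample size. It assumes, purely for illustration, that the sample behaves like an unselected population sample; the consortium also contains clinical cohorts, so actual carrier counts will differ. The function name and dictionary are hypothetical.

```python
# Population frequencies (in percent) quoted in the text; the sample size is
# the ENIGMA-CNV total. Treating that sample as population-based is an
# assumption made only for this illustration.
FREQUENCIES_PERCENT = {
    "deletion, Iceland": 0.019,
    "deletion, UK Biobank": 0.012,
    "duplication, Iceland": 0.038,
    "duplication, UK Biobank": 0.030,
}
SAMPLE_SIZE = 16046


def expected_carriers(sample_size, frequency_percent):
    """Expected carrier count for a sample size and a frequency in percent."""
    return sample_size * frequency_percent / 100


for label, freq in FREQUENCIES_PERCENT.items():
    count = expected_carriers(SAMPLE_SIZE, freq)
    print(f"{label}: {count:.1f} expected carriers")
```

Under these assumptions every expected count stays in the single digits, which illustrates why pooling data across many cohorts is needed.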
In recent studies, 'mirrored phenotypes' were described in 16p11.2 distal CNV allele carriers for both weight [17,31] and head circumference [17]. On average, deletion carriers had increased BMI and head circumference, whereas duplication carriers had lower weight and smaller head circumference. Here we investigated gene dose response effects (i.e., effects dependent on the number of genomic copies at the 16p11.2 distal locus) on brain structural measures, including subcortical brain volumes, total surface area, mean cortical thickness and intracranial volume (ICV) in n = 6906 participants from primarily nonfamilial population samples, in addition to clinical cohorts, to resolve CNV effects relative to a general population average.

Discovery sample description
Supplementary

CNV calls and validation
See Supplementary Note 2 for details on CNV calling and quality control. In short, carriers in the 16p11.2 European consortium cohort were identified from the cytogeneticist's report. All other cohorts had CNVs called in a unified manner using PennCNV [37]. Appropriate population frequency of B allele (PFB) files (Human Genome Build NCBI36/hg18) and GC-content model files for each data set were selected from the PennCNV homepage (Supplementary Table 1).
Samples were filtered based on standardized quality control metrics, and CNVs overlapping the region of interest (16p11.2 distal BP2-BP3 and BP1-BP3) were identified and visualized with the R package iPsychCNV. The minimally affected 16p11.2 distal region was covered well by all the arrays in the study (Supplementary Figure 1).

Statistical analysis
Imaging data processing and CNV calling were performed locally whereupon downstream analysis was performed centrally in a mega-analysis with de-identified data.
The primary analysis for this paper focused on the full set of subjects, including family members and data sets with patients, to maximize the power to detect effects. Only one of a pair of duplicates was kept. Individuals whose CNVs overlapped regions of known pathogenic CNVs (Supplementary Table 3) by a fraction of at least 0.4 (R package iPsychCNV) were excluded from the analysis regardless of copy number status. Only scanner sites with individuals carrying a 16p11.2 distal deletion or duplication were included. See Supplementary Note 4 for a description of control analyses that (a) excluded individuals with an established neurodevelopmental diagnosis, (b) excluded children below age 18, (c) excluded first- and second-degree relatives, (d) excluded carriers of the 1.7 Mb 16p11.2 distal-proximal (BP1-BP5) CNV, (e) matched each carrier with four controls or (f) tested the effect of ancestry.
Brain measures were normalized in R 3.2.3 by an inverse normal transformation of the residual of a linear regression on the phenotype correcting for covariates. The final covariance-corrected values (covariates = age, age squared, sex, scanner site and ICV) were used in downstream analysis and are reported for each measure. ICV was not included as a covariate in the analysis of ICV. For analytic purposes, total cortical surface area and total average thickness were normalized in the same way as subcortical volumes. We also performed analysis excluding ICV from the covariates.
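The normalization step can be illustrated with a minimal sketch of a rank-based inverse normal transformation. It takes already computed regression residuals as input (the residual values below are hypothetical), and the rank offset `c = 0.5` is an assumption, since the text does not state which offset was used in the R analysis.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=0.5):
    """Rank-based inverse normal transform; tied values share their average rank.

    The offset c is an assumption (0.5 here); other conventions exist.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Map each rank to a quantile in (0, 1), then to a standard normal score.
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# Hypothetical residuals of a brain measure after regressing out covariates.
residuals = [0.3, -1.2, 2.5, 0.0, -0.4]
normalized = inverse_normal_transform(residuals)
```

The transform preserves the ordering of the residuals while forcing an approximately standard normal distribution, which is why group differences in the normalized measures can be read in standard deviation units.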
For the dose response analysis (i.e., the effect on brain structure of 16p11.2 distal copy number variation), a linear regression on the copy number state of the individuals was performed using the following model: covariance-corrected brain measure ~ copy number (deletion = 1, non-carrier = 2, duplication = 3).
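The dose response model is an ordinary least squares regression with a single predictor. The sketch below, which uses hypothetical toy values rather than study data, shows how the slope (β) and its standard error follow from the copy number coding; the function name is illustrative and not part of the original analysis code.

```python
def dose_response_slope(copy_number, measure):
    """Ordinary least squares slope and standard error for measure ~ copy number."""
    n = len(copy_number)
    mean_x = sum(copy_number) / n
    mean_y = sum(measure) / n
    sxx = sum((x - mean_x) ** 2 for x in copy_number)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copy_number, measure))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual sum of squares with n - 2 degrees of freedom (slope and intercept).
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(copy_number, measure))
    standard_error = (rss / (n - 2) / sxx) ** 0.5
    return beta, standard_error


# Hypothetical normalized brain measures: deletion = 1, non-carrier = 2, duplication = 3.
copies = [1, 1, 2, 2, 2, 2, 3, 3]
values = [1.2, 0.8, 0.1, -0.1, 0.2, -0.2, -0.9, -1.1]
beta, se = dose_response_slope(copies, values)
```

A negative β in this coding means the measure decreases with each additional copy, which is the direction reported for ICV and the basal ganglia volumes.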
For comparison between groups, a two-sample, two-sided t-test assuming equal variance in all carrier/non-carrier groups was employed (R 3.3.2), in which deletion or duplication carriers were compared either to each other or to non-carriers. Results were considered statistically significant if the P-value fell below the Bonferroni-corrected threshold (P = 0.05/10 regions = 0.005). We report the uncorrected P-values throughout the manuscript.
Effect size is reported as the absolute effect size for the group comparisons (the difference in means between the two copy number groups in the t-test, which in this case equals Cohen's d because the standard deviation of the normalized brain measures is one) and as the estimate of beta for the linear regression. Plots were generated using the R library ggplot2 v2.2.1 [39].
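As an illustration of the group comparison, the sketch below computes the difference in means and the equal-variance (pooled) t statistic for two hypothetical groups, together with the Bonferroni threshold stated above. The P-value itself is omitted because it requires the t distribution, which the Python standard library does not provide; the input values are invented toy numbers.

```python
from statistics import mean, variance

# Bonferroni-corrected threshold from the text: 0.05 divided by 10 brain measures.
BONFERRONI_THRESHOLD = 0.05 / 10


def pooled_t_statistic(group_a, group_b):
    """Mean difference, equal-variance t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    difference = mean(group_a) - mean(group_b)
    degrees_of_freedom = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / degrees_of_freedom
    t_statistic = difference / (pooled_variance * (1 / n_a + 1 / n_b)) ** 0.5
    return difference, t_statistic, degrees_of_freedom


# Hypothetical normalized measures for carriers and non-carriers.
carriers = [1.0, 1.4, 0.6]
non_carriers = [-0.2, 0.0, 0.2, 0.4, -0.4]
difference, t_statistic, degrees_of_freedom = pooled_t_statistic(carriers, non_carriers)
```

Because the normalized measures have unit standard deviation in the full sample, the returned mean difference can be read directly as the absolute effect size described above.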

deCODE replication sample
An independent sample of three 16p11.2 distal deletion and six duplication carriers, as well as 832 non-carriers, was obtained from deCODE Genetics, Iceland. CNVs were called with PennCNV as described previously and visually inspected. All 16p11.2 distal carriers were of the minimal 16p11.2 distal (BP2-BP3) CNV type. The individuals were scanned at one scanner site as previously described [7]. The statistical analysis was performed as for the primary discovery sample.

Meta-analysis
A fixed effects model was used to generate summary effect size estimates using a restricted maximum likelihood estimator in the R package metafor [40] (version 1.9-9), with the effect size and calculated SD (for comparison between groups) or standard error (for dose response) from the discovery and replication samples as input. More details can be found in Supplementary Note 5.
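A common way to pool two estimates under a fixed effects model is inverse-variance weighting, sketched below with hypothetical effect sizes and standard errors. Whether this corresponds exactly to the metafor settings used for the published analysis is an assumption; the function names are illustrative.

```python
from statistics import NormalDist


def fixed_effect_pool(estimates, standard_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = (1 / total_weight) ** 0.5
    return pooled, pooled_se


def two_sided_p(estimate, standard_error):
    """Two-sided P-value from a normal approximation to estimate / standard error."""
    z = abs(estimate / standard_error)
    return 2 * NormalDist().cdf(-z)


# Hypothetical discovery and replication estimates with their standard errors.
pooled_beta, pooled_se = fixed_effect_pool([-1.0, -0.5], [0.2, 0.4])
p_value = two_sided_p(pooled_beta, pooled_se)
```

The more precise sample receives the larger weight, so the pooled estimate lies closer to the discovery estimate than to the replication estimate in this toy example.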

IQ, BMI and gene expression analysis
Individuals aged 18-65 years were recruited for cognitive phenotyping based on a large genotyped sample from deCODE. The Icelandic version of the Wechsler Abbreviated Scale of Intelligence (WASIIS) [41,42] was administered to 1693 non-carriers and all CNV carriers except one deletion carrier. Another 455 controls and one deletion carrier were tested with two subtests, Vocabulary and Matrix Reasoning, from the Wechsler Adult Intelligence Scale (WAIS-III) [43]. More details on the tests are available in Supplementary Note 6. Carriers of known pathogenic CNVs (Supplementary Table 3) other than 16p11.2 distal, as well as individuals with neurodevelopmental or psychiatric diagnoses, were excluded from the analysis. IQ data were not normally distributed and, consequently, the nonparametric Kruskal-Wallis test (R, v3.2.3) was used to test differences in IQ between carrier groups. To test pairwise differences (deletion carriers versus non-carrier controls, duplication carriers versus non-carrier controls, deletion versus duplication carriers), we used the Wilcoxon rank-sum test in R. We applied a significance threshold of 0.05 without correction for multiple testing, since these were secondary analyses. For a description of the BMI and gene expression analyses, see Supplementary Note 7.
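The Kruskal-Wallis test compares rank sums across groups. The following minimal sketch computes the H statistic for three hypothetical groups of scores; it omits the correction for ties and uses the fact that, for exactly three groups, the chi-square approximation has two degrees of freedom and its upper tail equals exp(−H/2). The scores are invented and do not correspond to study participants.

```python
from math import exp


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using average ranks (no tie correction)."""
    pooled = sorted((value, index) for index, values in enumerate(groups) for value in values)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        # Tied values share the average of the ranks they span.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += avg_rank
        i = j + 1
    weighted = sum(rs ** 2 / len(g) for rs, g in zip(rank_sums, groups))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


def p_value_three_groups(h_statistic):
    """Chi-square upper tail with 2 degrees of freedom (valid for three groups only)."""
    return exp(-h_statistic / 2)


# Hypothetical scores for three groups (for example deletion, duplication, non-carrier).
scores = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h_statistic = kruskal_wallis_h(scores)
p_value = p_value_three_groups(h_statistic)
```

With very small groups the chi-square approximation is rough, so this sketch illustrates the logic of the test rather than reproducing the R output.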

Study participants
In the ENIGMA-CNV discovery data set, we identified 12 16p11.2 distal deletion carriers and 12 duplication carriers scanned at 14 MRI scanners, and 6882 non-carriers investigated at the same MRI scanners. Demographic data are shown in Table 1. Most CNV carriers exhibited the minimal 16p11.2 distal CNV type (BP2-BP3) (Fig. 1); four CNVs were of the extended type (BP1-BP3) and six CNVs extended into the 16p11.2 proximal region (BP1-BP5) (Supplementary Table 2, Supplementary Figure 1). None of the participants carried additional known pathogenic CNVs (Supplementary Table 3).
Of 24 CNV carriers, 10 had an established neurodevelopmental diagnosis (Supplementary Table 2). The remaining carriers either did not have one or were recruited in studies from which diagnostic information was not available (Supplementary Table 2, Table 1).
There was a significant age difference between the groups (ANOVA, P = 0.003); the non-carriers were older (mean age 43.5 years) in comparison to the deletion (27.8 years) and duplication carriers (31.2 years). In addition, an established neurodevelopmental diagnosis was found in a significantly smaller proportion of non-carriers (4.9%) in comparison to deletion (58%) and duplication carriers (25%) (Table 1).

Brain imaging results in the discovery sample
After correction for age, age squared, sex and scanner site, we found a significant negative correlation between the number of 16p11.2 distal copies (deletion = 1, non-carrier = 2, duplication = 3) and ICV (β = −0.71, P = 5.1 × 10⁻⁴) (Table 2, Fig. 2a) after correction for multiple testing (significance threshold P < 0.005 = 0.05/10 brain structures analysed), showing smaller ICV in duplication carriers compared to deletion carriers. Uncorrected ICV plotted against age, stratified by scanner site, is shown in Fig. 2b.
We evaluated whether the 16p11.2 distal CNV affected seven subcortical (accumbens, caudate, putamen, pallidum, amygdala, hippocampus and thalamus) and two cortical (total surface area and mean cortical thickness) phenotypes. After adjusting subcortical and cortical volumes for age, age squared, sex, scanner site and ICV, the volumes of caudate, pallidum and putamen were negatively associated with the number of 16p11.2 distal copies with significance at the multiple testing threshold (β = −0.87, P = 2.0 × 10⁻⁵; β = −1.06, P = 2.2 × 10⁻⁷ and β = −1.37, P = 1.8 × 10⁻¹¹, respectively) (Table 2, Fig. 2a). Plotting the unadjusted volumes of caudate, pallidum and putamen against the age of participants revealed a consistent pattern (deletion carriers with larger and duplication carriers with smaller subcortical volumes in comparison to non-carriers) at all scanner sites for putamen and pallidum volume and at almost all sites for caudate (Supplementary Figure 3a-c). This shows that our findings are robust and not dependent upon a few scanner sites.
To test for the presence of nonlinear differences between deletion and duplication carriers, and between deletion or duplication carriers and non-carriers, we conducted individual t-tests between these groups. We confirmed a negative dose response with increasing copy number for the volumes of the caudate, pallidum and putamen and for ICV, but no additional structures revealed significant associations at the significance threshold of P < 0.005 (Table 3, Supplementary Table 6). To confirm the validity of the results, we carefully checked for the impact of removing subjects known to carry a neurodevelopmental diagnosis, children below age 18, first- and second-degree relatives or CNV carriers whose CNVs extended into the 16p11.2 proximal region (Supplementary Tables 5 and 6). Likewise, we redid the analysis matching each carrier with four non-carriers for sex, age, diagnosis status (with/without neurodevelopmental diagnosis) and scanner site, and finally, in a separate analysis, we controlled for population stratification in cohorts with available ancestry (Supplementary Table 7). None of these analyses changed the main results.
Table legend: The effect size (β of the linear regression) is presented. A linear regression based on the copy number state of the individuals (deletion = 1, non-carrier = 2, duplication = 3) was performed on normalized brain measures correcting for age, age squared, sex and scanner site (and ICV) in the ENIGMA-CNV (discovery) and deCODE (replication) cohorts. Results were considered statistically significant if they were below a Bonferroni-corrected P-value of 0.005 (0.05/10 regions). A final effect size estimate of the combined sample was obtained using a fixed effects meta-analysis framework. CI, confidence interval; Q, statistic for the test for heterogeneity; p(Q), p-value for the test for heterogeneity; I², heterogeneity level. *P < 0.005, **P < 0.0005.

Replication in an independent cohort
We performed replication of the subcortical findings in an Icelandic MRI sample (deCODE) comprising 841 individuals (3 deletion and 6 duplication carriers, 832 non-carriers) (Table 1). The negative correlation between the number of 16p11.2 distal copies and the volume of pallidum was confirmed (β = −0.95, P = 0.0042) (Fig. 3, Table 2) at a significance threshold of <0.005. For volumes of the caudate, putamen and for ICV, effects were in the same direction as in the discovery sample, albeit not significant (β = −0.46, P = 0.17; β = −0.70, P = 0.034; β = −0.54, P = 0.10, respectively) (Fig. 3, Table 2). Apart from cortical surface area, we observed the same direction of effect in the replication sample as in the discovery sample (Table 2, Fig. 3). For nonlinear differences, all directions of effect were the same for subcortical volumes in the discovery and replication data sets (Table 3, Supplementary Figure 4a-c). The combined analysis of the ENIGMA and the deCODE samples is shown in Table 2 and Fig. 3. Volumes of caudate, pallidum, putamen and ICV decreased with increasing number of 16p11.2 distal copies (β = −0.76, P = 8.9 × 10⁻⁶; β = −1.03, P = 1.7 × 10⁻⁹; β = −1.19, P = 3.5 × 10⁻¹²; β = −0.66, P = 1.0 × 10⁻⁴, respectively). In the combined analysis, the volume of the accumbens also revealed a significant association with the 16p11.2 distal copy number (β = −0.54, P = 0.0032) (Tables 2 and 3, Fig. 3, Supplementary Figure 4a-c).
Fig. 2 Measures of caudate, pallidum, putamen and ICV show a dose response to differences in copy number in the 16p11.2 distal region. All analyses were corrected for age, age squared, sex, scanner site and ICV (except for ICV). Deletion carriers (del) in red, non-carriers (con) in grey and duplication carriers (dup) in blue, respectively. a Boxplots of subcortical volumes, surface area and thickness and ICV. The normalized brain values are presented. Boxplots represent the mean. Significant differences after Bonferroni correction between groups are noted as *P < 0.005, **P < 0.0005. Centre line represents median, box limits are the upper and lower 25% quartiles, whiskers the 1.5 interquartile range and the points are the outliers. b Bivariate plot of age versus uncorrected ICV

Intelligence quotient (IQ)
Full scale IQ data were available for four 16p11.2 distal deletion and twelve duplication carriers and 2148 non-carriers from the Icelandic sample. None of these individuals had an established neurodevelopmental diagnosis, or other known pathogenic CNVs (as defined by Supplementary Table 3). Analysis showed a significant difference in IQ between groups (P = 0.0042). Both deletion (median IQ = 68.5) and duplication carriers (median IQ = 93) presented a significantly lower IQ (P = 0.011, P = 0.035) than non-carriers (median IQ = 101.5) (Supplementary Table 9

Body mass index (BMI)
BMI data for mega-analysis were available for six cohorts from ENIGMA-CNV counting seven deletion and seven duplication carriers in addition to 1880 individuals without a 16p11.2 distal CNV (Supplementary Table 9). BMI z-scores were different between the carrier groups (Kruskal-Wallis, P = 0.009, Supplementary Figure 6, Supplementary Table 10). Duplication carriers had

Discussion
This is the first study to determine the brain structure underpinnings of the 16p11.2 distal recurrent CNVs. We found a common denominator for 16p11.2 distal carriers across different clinical phenotypes in a dose response effect of copy number on the volumes of basal ganglia (caudate, putamen and pallidum) (Table 2, Figs. 2 and 3). The observed associations were independent of the presence of neurodevelopmental diagnosis and the ancestry of participants (Supplementary Tables 5 and 7). These effects were consistent in the independent replication sample (Fig. 3, Tables 2 and 3). Together with the result of lower IQ in carriers, these findings provide new insight into genetic mechanisms of brain structures and pathobiological processes involved in neurodevelopmental disorders. There are nine genes in the core 16p11.2 distal (BP2-BP3) region. We tested the expression level of these in blood in available transcript data from two of our deletion carriers (BP1-BP3) and compared them with 234 non-carriers (Supplementary Figure 7, Supplementary Table 11). Several transcripts of the nine genes were relatively decreased in blood. Due to the low numbers, only trends can be suggested from these data, but we find the down-regulation of LAT gene expression to ~65% in 16p11.2 distal carriers (Supplementary Figure 7, Supplementary Table 11) particularly interesting in light of recent results in zebrafish, which showed that, of the nine genes in 16p11.2 distal, only overexpression of LAT induced a decrease in cell proliferation in the brain with a concomitant microcephaly phenotype [44]. In parallel, LAT knockout mice showed brain anatomy changes [44]. According to the Allen Brain Atlas, LAT shows the highest expression in cerebellum and structures of the basal ganglia (data not shown). Thus, high expression of LAT overlaps with the position of brain structural changes identified in the present study. This further implicates LAT, an immune signaling adaptor, as a possible dosage-dependent driver of the CNV-associated brain phenotypes including the basal ganglia.
Fig. 3 Forest plots on the dose response of copy number on subcortical volumes, surface area, thickness and ICV. The effect size (β of the linear regression) at each site for each measure is shown by the position on the x-axis. Standard error is shown by the horizontal line. A summary polygon shows the results when fitting a random-effects model to the two groups: ENIGMA-CNV discovery and deCODE replication samples. del, con and dup denote the number of individuals in each analysis. *P < 0.005, **P < 0.0005. Effect size and confidence intervals are to the right
By comparing the effect of a range of CNVs on the brain, it is possible to identify patterns of effects related to the genes involved, and thus learn about biological mechanisms. All three CNVs previously shown to have an effect on ICV, 16p11.2 proximal [6,8], 22q11 [9] and Williams Syndrome [45], have also been shown to affect cortical surface area and/or cortical thickness. We observed no effect on cortical surface area or cortical thickness in 16p11.2 distal carriers (Table 2, Figs. 2 and 3). This does not rule out the presence of smaller effects on individual cortical areas. However, it may suggest a different impact on brain development mechanisms between these three CNVs [6,8,9,45] and 16p11.2 distal.
CNVs in the two neighboring regions 16p11.2 distal and 16p11.2 proximal show overlapping phenotypes: they both predispose to various neurodevelopmental diseases and both show a positive dose response for head circumference and weight [17]. In addition, we found negative dose response effects for the 16p11.2 distal CNV on ICV, putamen and caudate volumes with effect size estimates comparable to those previously reported for the 16p11.2 proximal CNV [8]. Recently, a lymphoblastoid cell line study of chromosomal interaction in the 16p11.2 region suggested that the two adjacent 16p11.2 distal and 16p11.2 proximal regions (Fig. 1) interact [17]. The identified brain commonalities could further support a mechanism in which the similar phenotypic patterns are caused by disruption of the chromatin structure surrounding the entire 16p11.2 region [17,32].
In this study, the CNVs of six carriers (3 deletions and 3 duplications) extend into the adjacent 16p11.2 proximal region. Redoing the analysis with exclusively 16p11.2 distal carriers did not result in a change in effect size (Supplementary Tables 5 and 6), suggesting that carriers of the 1.7 Mb distal-proximal (BP1-BP5) CNV are not the main cause of the signal. A previous analysis from Loviglio et al. [44] suggests an additive effect of two 16p11.2 regions (distal + proximal) on human head circumference and weight. Together with our data, this indicates both separate and overlapping effects for the two CNVs and underlines the importance of studying specific CNVs independently despite overlapping phenotypes.
Interestingly, deletions and duplications of both 16p11.2 proximal and distal CNVs are associated with ASD [17,19]. However, only the proximal duplication and the distal deletion are associated with schizophrenia [20,22]. This difference in phenotype association between these bordering CNVs may indicate specific differences in the pathological mechanisms of ASD and schizophrenia.
We observed a decrease in absolute effect sizes for putamen and pallidum after removing individuals with a neurodevelopmental diagnosis (Supplementary Table 5). This is consistent with the enlargements of putamen and pallidum associated with duration of illness in schizophrenia [46], which may partly reflect the cumulative effect of antipsychotic medication on basal ganglia volumes [47]. The observed dose response effect on ICV is in agreement with previous findings on head circumference [17], as is the dose response effect on BMI (Supplementary Figure 6, Supplementary Table 10) [17,31]. One of the strengths of this study is the inclusion of non-clinical samples, allowing for estimates closer to the actual carrier population. Unfortunately, the small number of CNV cases does not provide enough power to investigate preferential alterations in deletion or duplication carriers.
To conclude, the present findings of negative dose-response effects of copy number on ICV and volumes of caudate, pallidum and putamen, with no effect on cortical measures, suggest a specific effect on basal ganglia structures of the 16p11.2 distal CNV. These results provide novel insight into genetic factors determining basal ganglia volumes and suggest specific pathobiological mechanisms involved in the development of neurodevelopmental disorders.
We wish to acknowledge IDIVAL Neuroimaging Unit for imaging acquisition and analysis. We want to particularly acknowledge the patients and the BioBankValdecilla (PT13/0010/0024) integrated in the Spanish National Biobanks Network for its collaboration. QTIM: The QTIM study was supported by grants from the US National Institute of Child Health and Human Development (R01 HD050735) and the Australian National Health and Medical Research Council (NHMRC) (486682, 1009064). Genotyping was supported by NHMRC (389875). Lachlan Strike is supported by an Australian Postgraduate Award (APA). AFM is supported by NHMRC CDF 1083656. We thank the twins and siblings for their participation, the many research assistants, as well as the radiographers, for their contribution to data collection and processing of the samples. We acknowledge the technical support and service from the Genomics Core Facility at the Department of Clinical Science, the University of Bergen for the 16p11.2 European Consortium; for the ENIGMA-CNV working group Ida Elken Sønderby (NORMENT, K.G. Jebsen Centre for
The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.