Carriers of large recurrent copy number variants (CNVs) are at increased risk for developing autism spectrum disorders (ASD), schizophrenia or intellectual disability [1]. While the same CNV may confer risk for each of these neurodevelopmental disorders, carriers show remarkable phenotypic variability [2,3,4,5]. Magnetic resonance imaging (MRI) can help unravel possible underlying brain consequences associated with carrying these CNVs and provide novel insight into neuropathological mechanisms [2, 6,7,8,9,10].

The neural correlates of a recurrent CNV at the distal 16p11.2 locus have remained unexplored, despite being potentially very informative. Low copy repeats at the 16p11.2 locus drive the formation of recurrent CNVs (Fig. 1) [5, 11,12,13,14,15] whose carriers experience increased risk for various neurodevelopmental disorders [3, 16,17,18,19,20,21,22,23,24,25] or somatic traits and diseases [17, 26,27,28,29,30]. Within the 16p11.2 region, the segment with breakpoints (BP) at 28.3 and 28.9 Mb (hg18, BP1–BP3) is referred to as the distal region. Within this region, there is a minimal core segment from 28.7 to 28.9 Mb (hg18, BP2–BP3) (Fig. 1) that contains nine genes. The deletion is associated with obesity [26, 27], intellectual disability [26] and schizophrenia [20, 22], and the duplication has been associated with lower body mass index (BMI) [17, 31]. Both 16p11.2 distal CNVs are associated with autism spectrum disorder [17] and have been found in individuals with epilepsy [23]. Several studies have been published about the microstructural effect on the brain [6, 8, 32] and cognition [30, 33, 34] of the 16p11.2 proximal CNV (29.5–30.1 Mb) (Fig. 1). In contrast, the biological basis of the 16p11.2 distal phenotypes, including a possible effect on brain structure and cognition, remains unknown.

Fig. 1

Recurrent CNVs in the 16p11.2 region. CNVs are indicated with reddish lines. All coordinates (in Mb) are from the human genome build hg18/NCBI 36. This study includes CNVs overlapping the core 16p11.2 distal region (BP2–BP3) of 220 kb (blue box). These CNVs include the 16p11.2 distal BP2–BP3 (~220 kb), the 16p11.2 distal BP1–BP3 (~550 kb), the 16p11.2 distal BP1–BP4 (~800 kb) and the 16p11.2 distal-proximal BP1–BP5 (~1.7 Mb) CNVs.

Large effect-size CNVs conferring risk for neurodevelopmental disorders including major psychiatric disorders [35] are rare (<0.25% in frequency). Assembling sufficiently powered MRI samples to detect effects of rare CNVs on brain morphometry is challenging. For instance, in the Icelandic population [2] and the UK Biobank [36], the frequencies of the 16p11.2 distal deletion are 0.019% and 0.012%, respectively. Likewise, the reciprocal duplication is found at a frequency of 0.038% and 0.030%, respectively. Hence, studying rare pathogenic CNVs like 16p11.2 distal calls for collaborative efforts. The ENIGMA-CNV consortium has collected a sample that currently includes 16,046 subjects with CNV and brain MRI data.
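
To make the sample-size problem concrete, the frequencies quoted above can be turned into expected carrier counts. The sketch below is a back-of-the-envelope illustration and not part of the study's analysis. It assumes that carriers occur at the population frequency, which will not hold for clinically ascertained cohorts; the function name and the 1,000-person cohort are hypothetical.

```python
# Frequencies (in percent) quoted in the text for the 16p11.2 distal CNVs.
FREQ_PERCENT = {
    "deletion": {"Iceland": 0.019, "UK Biobank": 0.012},
    "duplication": {"Iceland": 0.038, "UK Biobank": 0.030},
}


def expected_carriers(sample_size, freq_percent):
    """Expected carrier count if carriers occur at the population frequency."""
    return sample_size * freq_percent / 100.0


if __name__ == "__main__":
    # 1,000 is a hypothetical single-site cohort; 16,046 is the ENIGMA-CNV total.
    for n in (1_000, 16_046):
        for cnv, by_population in FREQ_PERCENT.items():
            for population, freq in by_population.items():
                count = expected_carriers(n, freq)
                print(f"n={n:>6} {cnv:<11} {population:<10} ~{count:.2f} carriers")
```

Under this assumption, a cohort of a thousand individuals would be expected to contain fewer than one carrier of either CNV type, which is why pooling across sites is needed.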

In recent studies, ‘mirrored phenotypes’ were described in 16p11.2 distal CNV allele carriers for both weight [17, 31] and head circumference [17]. On average, deletion carriers had increased BMI and head circumference, whereas duplication carriers had lower weight and smaller head circumference. Here we investigated gene dose response effects (i.e., effects dependent on the number of genomic copies at the 16p11.2 distal locus) on brain structural measures, including subcortical brain volumes, total surface area, mean cortical thickness and intracranial volume (ICV) in n = 6906 participants from primarily non-familial population samples, in addition to clinical cohorts, to resolve CNV effects relative to a general population average.

Material and methods

Discovery sample description

Supplementary Table 1 contains information on study design, participants, genotyping array, PFB-file and reference for previous description on individual inclusion and exclusion parameters for all 34 world-wide data sets in ENIGMA-CNV; altogether 16,046 individuals with CNV and MRI imaging data from the ENIGMA consortium. The 16p11.2 distal sample consisted of a subset of these individuals with twelve deletion carriers, twelve duplication carriers and 6882 non-carriers from eleven different cohorts and 14 scanner sites collected up until 1 July 2017. More demographic details are supplied in Supplementary Note 1 and Supplementary Table 2 (on CNV carriers).

CNV calls and validation

See Supplementary Note 2 for details on CNV calling and quality control. In short, carriers in the 16p11.2 European consortium cohort were identified based on the cytogeneticist's report. All other cohorts had CNVs called in a unified manner using PennCNV [37]. Appropriate population frequency (PFB)-files (Human Genome Build NCBI36/hg18) and GC (content)-model files for each data set were selected from the PennCNV homepage (Supplementary Table 1).

Samples were filtered based on standardized quality control metrics and CNVs overlapping the region of interest (16p11.2 distal BP2–BP3 and BP1–BP3) were identified and visualized with the R package iPsychCNV. The minimally affected 16p11.2 distal region was covered well by all the arrays in the study (Supplementary Figure 1). No carriers carried other genomic imbalances (as defined by Supplementary Table 3) except six 16p11.2 distal-proximal CNV (BP1–BP5) carriers from the 16p11.2 European consortium sample (Supplementary Table 2).

Image acquisition and processing

Supplementary Table 4 outlines specific technical details concerning scanners and acquisition parameters. The brain measures examined were obtained from structural MRI data collected at participating sites and processed locally following the ENIGMA protocol. The analysis was based on standardized image analysis, FreeSurfer, quality assurance and statistical methods as per the harmonized neuroimaging protocols developed within ENIGMA2 [38] and ENIGMA3. More details are supplied in Supplementary Note 3.

Statistical analysis

Imaging data processing and CNV calling were performed locally, whereupon downstream analysis was performed centrally in a mega-analysis with de-identified data.

The primary analysis for this paper focused on the full set of subjects, including family members and data sets with patients, to maximize the power to detect effects. Only one of a pair of duplicates was kept. Individuals with a minimum overlap of 0.4 to regions (R package iPsychCNV) with known pathogenic CNVs (Supplementary Table 3) were excluded from the analysis regardless of copy number status. Only scanner sites with individuals carrying a 16p11.2 distal deletion or duplication were included. See Supplementary Note 4 for a description of the control analyses, which (a) excluded individuals with an established neurodevelopmental diagnosis, (b) excluded children below age 18, (c) excluded first and second-degree relatives, (d) excluded carriers of the 1.7 Mb 16p11.2 distal-proximal (BP1–BP5) CNV, (e) matched each carrier with four controls or (f) tested the effect of ancestry.
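
The overlap-based exclusion can be illustrated with a small sketch. The text does not state how iPsychCNV defines overlap, so the code below assumes it is the fraction of the reference region covered by the called CNV; the function names are hypothetical, and the coordinates are the hg18 positions of the core distal segment quoted in the introduction, used here only as an example region.

```python
MIN_OVERLAP = 0.4  # threshold quoted in the text


def overlap_fraction(cnv_start, cnv_end, region_start, region_end):
    """Fraction of the reference region covered by the called CNV.

    Assumption: overlap is measured relative to the reference region length.
    """
    shared = min(cnv_end, region_end) - max(cnv_start, region_start)
    return max(0, shared) / (region_end - region_start)


def overlaps_region(cnv_start, cnv_end, region_start, region_end):
    """True if the call reaches the minimum overlap with the region."""
    return overlap_fraction(cnv_start, cnv_end, region_start, region_end) >= MIN_OVERLAP


if __name__ == "__main__":
    core = (28_700_000, 28_900_000)  # BP2-BP3 core segment, hg18
    # A BP1-BP3 sized call covers the whole core segment.
    print(overlap_fraction(28_300_000, 28_900_000, *core))
    # A hypothetical call that only clips the end of the segment.
    print(overlap_fraction(28_850_000, 29_000_000, *core))
```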

Brain measures were normalized in R 3.2.3 by an inverse normal transformation of the residual of a linear regression on the phenotype correcting for covariates. The final covariance-corrected values (covariates = age, age squared, sex, scanner site and ICV) were used in downstream analysis and are reported for each measure. ICV was not included as a covariate in the analysis of ICV. For analytic purposes, total cortical surface area and total average thickness were normalized in the same way as subcortical volumes. We also performed analysis excluding ICV from the covariates.
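
A rank-based inverse normal transformation of this kind can be sketched as follows. This is an illustration in Python rather than the authors' R code. It assumes the residuals from the covariate regression are already available as a list, and it uses the offset (rank − 0.5)/n, which is one common convention that the text does not specify.

```python
from statistics import NormalDist


def inverse_normal_transform(residuals):
    """Map values to standard normal quantiles via their ranks.

    Ties receive the average of the ranks they span.
    """
    n = len(residuals)
    order = sorted(range(n), key=lambda i: residuals[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and residuals[order[j + 1]] == residuals[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    standard_normal = NormalDist()
    # (rank - 0.5) / n stays strictly inside (0, 1), so inv_cdf is defined.
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]


if __name__ == "__main__":
    # Hypothetical residuals with one extreme value.
    print(inverse_normal_transform([0.3, -1.2, 0.1, 5.0, -0.4]))
```

Because only the ranks enter the transformation, the extreme value in the example is pulled in to the largest normal quantile instead of dominating the scale.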

For the dose response analysis (i.e., the effect on brain structure of 16p11.2 distal copy number variation), a linear regression on the copy number state of the individuals was performed using the following model: covariance-corrected brain measure ~ copy number (deletion = 1, non-carrier = 2, duplication = 3).
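
The dose response model amounts to a simple linear regression with copy number as the only predictor. The sketch below fits it by ordinary least squares and returns β with its standard error. The toy values are hypothetical and chosen only to show a negative slope; the function name is an assumption and this is not the authors' code.

```python
from math import sqrt


def dose_response(copy_numbers, values):
    """Ordinary least squares fit of values ~ copy number.

    Returns (beta, standard_error); beta is the change per additional copy.
    """
    n = len(copy_numbers)
    if n < 3:
        raise ValueError("need at least three observations")
    mean_x = sum(copy_numbers) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in copy_numbers)
    if sxx == 0:
        raise ValueError("copy number must vary across individuals")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copy_numbers, values))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    sse = sum((y - (intercept + beta * x)) ** 2 for x, y in zip(copy_numbers, values))
    standard_error = sqrt(sse / (n - 2) / sxx)
    return beta, standard_error


if __name__ == "__main__":
    # Hypothetical normalized measures: deletion = 1, non-carrier = 2, duplication = 3.
    copies = [1, 1, 2, 2, 2, 2, 3, 3]
    measure = [1.1, 0.8, 0.2, -0.1, 0.0, 0.3, -0.9, -1.2]
    beta, se = dose_response(copies, measure)
    print(f"beta = {beta:.2f}, SE = {se:.2f}")
```

A negative β means the measure decreases with each additional copy, which is the direction reported for ICV and the basal ganglia volumes.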

For comparison between groups, a two sample two-sided t-test assuming equal variance in all carrier/non-carrier groups was employed (R 3.3.2) where deletion or duplication carriers were compared either to each other or to non-carriers. Results were considered statistically significant if they exceeded a Bonferroni-corrected P-value (P = 0.05/10 regions = 0.005). We report the uncorrected p-values throughout the manuscript.

Effect size is calculated as the absolute effect size (the difference in mean between the two copy number groups in the t-test—which, in this case, equals Cohen’s d as the standard deviation of the normalized brain measures is one) and the estimate of beta in the linear regression. Plots were generated using R library ggplot2 v2.2.1 [39].
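
The group comparison and effect size can be sketched as below. The code computes the equal-variance (pooled) two-sample t statistic and Cohen's d from two lists of normalized measures, together with the Bonferroni threshold from the text. It does not compute a P-value, because the Python standard library has no t distribution. The function name and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

BONFERRONI_ALPHA = 0.05 / 10  # ten brain measures tested, as in the text


def pooled_t_and_d(group_a, group_b):
    """Equal-variance two-sample t statistic, Cohen's d and degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    pooled_sd = sqrt(pooled_var)
    t_stat = diff / (pooled_sd * sqrt(1 / na + 1 / nb))
    cohens_d = diff / pooled_sd
    return t_stat, cohens_d, na + nb - 2


if __name__ == "__main__":
    # Hypothetical normalized volumes for deletion carriers and non-carriers.
    deletion = [1.2, 0.7, 1.5, 0.9]
    non_carrier = [0.1, -0.3, 0.4, -0.2, 0.0, 0.2]
    t_stat, d, df = pooled_t_and_d(deletion, non_carrier)
    print(f"t = {t_stat:.2f} on {df} df, d = {d:.2f}, alpha = {BONFERRONI_ALPHA}")
```

When the pooled standard deviation is one, as the text states for the normalized measures, d reduces to the plain difference in group means.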

deCODE replication sample

An independent sample of three 16p11.2 distal deletion and six duplication carriers, as well as 832 non-carriers, was obtained from deCODE Genetics, Iceland. CNVs were called with PennCNV as described previously and visually inspected. All 16p11.2 distal carriers were of the minimal 16p11.2 distal (BP2–BP3) CNV type. The individuals were scanned at one scanner site as previously described [7]. The statistical analysis was performed as for the primary discovery sample.


A fixed effects model was used to generate summary effect size estimates using a restricted maximum likelihood estimator in the R package metafor [40] (version 1.9-9), with the effect size and calculated SD (for comparison between groups) or standard error (for dose response) from the discovery and replication samples as input. More details can be found in Supplementary Note 5.
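
Combining two estimates under a fixed effects model reduces to inverse-variance weighting, which the sketch below shows. It is a plain Python illustration and does not reproduce metafor or its restricted maximum likelihood option. The two β values are the pallidum estimates quoted in the Results; the standard errors are hypothetical placeholders, since the text does not report them, so the output should not be read as the study's result.

```python
from math import sqrt


def fixed_effect_pool(estimates, standard_errors):
    """Inverse-variance weighted summary estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


if __name__ == "__main__":
    betas = [-1.06, -0.95]   # discovery and replication pallidum estimates from the text
    ses = [0.20, 0.33]       # hypothetical standard errors, for illustration only
    pooled, pooled_se = fixed_effect_pool(betas, ses)
    print(f"pooled beta = {pooled:.2f}, SE = {pooled_se:.2f}")
```

The sample with the smaller standard error receives the larger weight, so the pooled estimate lies closer to the discovery estimate in this example.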

IQ, BMI and gene expression analysis

Individuals aged 18–65 years were recruited for cognitive phenotyping based on a large genotyped sample from deCODE. The Icelandic version of the Wechsler Abbreviated Scale of Intelligence (WASIIS) [41, 42] was administered to 1693 non-carriers and all CNV carriers except one deletion carrier. Another 455 controls and one deletion carrier were tested with two subtests, Vocabulary and Matrix Reasoning, from the Wechsler Adult Intelligence Scale (WAIS-III) [43]. More details on tests are available in Supplementary Note 6. Carriers of known pathogenic CNVs (Supplementary Table 3) besides 16p11.2 distal, as well as individuals with neurodevelopmental or psychiatric diagnoses, were excluded from the analysis. IQ data were not normally distributed and, consequently, the non-parametric Kruskal–Wallis test (R, v3.2.3) was used to test differences in IQ between carrier groups. To test pairwise differences (deletion carriers versus non-carrier controls, duplication carriers versus non-carrier controls, deletion versus duplication carriers), we used the Wilcoxon rank-sum test in R. We applied a significance threshold of 0.05 without correction for multiple testing, since these were secondary analyses. For a description of the BMI and gene expression analyses, see Supplementary Note 7.


Results

Study participants

In the ENIGMA-CNV discovery data set, we identified twelve 16p11.2 distal deletion carriers and twelve duplication carriers scanned at 14 MRI scanners, and 6882 non-carriers investigated at the same MRI scanners. Demographic data are shown in Table 1. Most CNV carriers exhibited the minimal 16p11.2 distal CNV type (BP2–BP3) (Fig. 1); four CNVs were of the extended type (BP1–BP3) and six CNVs extended into the 16p11.2 proximal region (BP1–BP5) (Supplementary Table 2, Supplementary Figure 1). None of the participants carried additional known pathogenic CNVs (Supplementary Table 3).

Table 1 Demographic data, discovery and replication data

Of 24 CNV carriers, 10 had an established neurodevelopmental diagnosis (Supplementary Table 2). The remaining carriers either did not have one or were recruited in studies from which diagnostic information was not available (Supplementary Table 2, Table 1).

There was a significant age difference between the groups (ANOVA, P = 0.003); the non-carriers were older (mean age 43.5 years) in comparison to the deletion (27.8 years) and duplication carriers (31.2 years). In addition, an established neurodevelopmental diagnosis was found in a significantly smaller proportion of non-carriers (4.9%) in comparison to deletion (58%) and duplication carriers (25%) (Table 1).

Brain imaging results in the discovery sample

After correction for age, age squared, sex and scanner site, we found a significant negative correlation between the number of 16p11.2 distal copies (deletion = 1, non-carrier = 2, duplication = 3) and ICV (β = −0.71, P = 5.1 × 10⁻⁴) (Table 2, Fig. 2a) after correction for multiple testing (significance threshold P < 0.005, i.e., 0.05/10 brain structures analysed), showing smaller ICV in duplication carriers compared to deletion carriers. The uncorrected ICV plotted against age, stratified by scanner site, is shown in Fig. 2b.

Table 2 Dose response of 16p11.2 distal copy number on subcortical volumes
Fig. 2

Measures of caudate, pallidum, putamen and ICV show a dose response to differences in copy number in the 16p11.2 distal region. All analyses were corrected for age, age squared, sex, scanner site and ICV (except for ICV). Deletion carriers (del) in red, non-carriers (con) in grey and duplication carriers (dup) in blue, respectively. a Boxplots of subcortical volumes, surface area and thickness and ICV. The normalized brain values are presented. Boxplots represent the mean. Significant differences after Bonferroni correction between groups are noted as *P < 0.005, **P < 0.0005. Centre line represents median, box limits are the upper and lower 25% quartiles, whiskers the 1.5 interquartile range and the points are the outliers. b Bivariate plot of age versus uncorrected ICV

We evaluated whether the 16p11.2 distal CNV affected seven subcortical (accumbens, caudate, putamen, pallidum, amygdala, hippocampus and thalamus) and two cortical (total surface area and mean cortical thickness) phenotypes. After adjusting subcortical and cortical volumes for age, age squared, sex, scanner site and ICV, the volumes of caudate, pallidum and putamen were negatively associated with the number of 16p11.2 distal copies with significance at the multiple testing threshold (β = −0.87, P = 2.0 × 10⁻⁵; β = −1.06, P = 2.2 × 10⁻⁷ and β = −1.37, P = 1.8 × 10⁻¹¹, respectively) (Table 2, Fig. 2a). Plotting the unadjusted volumes of caudate, pallidum and putamen against the age of participants revealed a consistent pattern (deletion carriers with larger and duplication carriers with smaller subcortical volumes in comparison to non-carriers) at all scanner sites for putamen and pallidum volume and at almost all sites for caudate (Supplementary Figure 3a-c). This shows that our findings are robust and not dependent upon a few scanner sites.

To assess non-specific associations, we re-analyzed subcortical volumes without correcting for ICV. As expected, the absolute effect sizes of copy number on the volumes of caudate, pallidum and putamen increased (β = −1.18, P = 6.8 × 10⁻⁹; β = −1.27, P = 4.7 × 10⁻¹⁰ and β = −1.57, P = 1.4 × 10⁻¹⁴, respectively) and the association with the volumes of the remaining subcortical structures, except amygdala and hippocampus, became significant (Supplementary Table 5).

To test for the presence of nonlinear differences between deletion and duplication carriers, and deletion or duplication carriers and non-carriers, we conducted individual t-tests between these groups. We confirmed a negative dose response with increasing copy number for the volumes of the caudate, pallidum, putamen and for ICV but no additional structures revealed significant associations at significance threshold of <0.005 (Table 3, Supplementary Table 6, Supplementary Figure 3a-c).

Table 3 T-test on subcortical volumes between different 16p11.2 distal copy number groups

To confirm the validity of the results, we carefully checked for the impact of removing subjects known to carry a neurodevelopmental diagnosis, children below age 18, first and second-degree relatives or CNV carriers whose CNVs extended into the 16p11.2 proximal region (Supplementary Tables 5 and 6). Likewise, we redid the analysis matching each carrier with four non-carriers for sex, age, diagnosis status (with/without neurodevelopmental diagnosis) and scanner site and finally in a separate analysis we controlled for population stratification in cohorts with available ancestry (Supplementary Table 7). None of these analyses changed the main results.

Replication in an independent cohort

We performed replication of the subcortical findings in an Icelandic MRI sample (deCODE) comprising 841 individuals (3 deletion and 6 duplication carriers, 832 non-carriers) (Table 1). The negative correlation between the number of 16p11.2 distal copies and the volume of pallidum was confirmed (β = −0.95, P = 0.0042) (Fig. 3, Table 2) at a significance threshold of <0.005. For volumes of the caudate, putamen and for ICV, effects were in the same direction as in the discovery sample, albeit not significant (β = −0.46, P = 0.17; β = −0.70, P = 0.034; β = −0.54, P = 0.10, respectively) (Fig. 3, Table 2). Apart from cortical surface area, we observed the same direction of effect in the replication sample as in the discovery sample (Table 2, Fig. 3). For nonlinear differences, all directions of effect were the same for subcortical volumes in the discovery and replication data sets (Table 3, Supplementary Figure 4a-c).

Fig. 3

Forest plots on the dose response of copy number on subcortical volumes, surface area, thickness and ICV. The effect size (β of the linear regression) at each site for each measure is shown by the position on the x-axis. Standard error is shown by the horizontal line. A summary polygon shows the results when fitting a random-effects model to the two groups: ENIGMA-CNV discovery and deCODE replication samples. del, con and dup denote the number of individuals in each analysis. *P < 0.005, **P < 0.0005. Effect size and confidence intervals are to the right

The combined analysis of the ENIGMA and the deCODE samples is shown in Table 2 and Fig. 3. Volumes of caudate, pallidum, putamen and ICV decreased with increasing number of 16p11.2 distal copies (β = −0.76, P = 8.9 × 10⁻⁶; β = −1.03, P = 1.7 × 10⁻⁹; β = −1.19, P = 3.5 × 10⁻¹²; β = −0.66, P = 1.0 × 10⁻⁴, respectively). In the combined analysis, the volume of the accumbens also revealed a significant association with the 16p11.2 distal copy number (β = −0.54, P = 0.0032) (Tables 2 and 3, Fig. 3, Supplementary Figure 4a-c).

Intelligence quotient (IQ)

Full scale IQ data were available for four 16p11.2 distal deletion and twelve duplication carriers and 2148 non-carriers from the Icelandic sample. None of these individuals had an established neurodevelopmental diagnosis, or other known pathogenic CNVs (as defined by Supplementary Table 3). Analysis showed a significant difference in IQ between groups (P = 0.0042). Both deletion (median IQ = 68.5) and duplication carriers (median IQ = 93) presented a significantly lower IQ (P = 0.011, P = 0.035) than non-carriers (median IQ = 101.5) (Supplementary Table 9, Supplementary Figure 5) at a significance threshold of P < 0.05.

Body mass index (BMI)

BMI data for mega-analysis were available for six cohorts from ENIGMA-CNV counting seven deletion and seven duplication carriers in addition to 1880 individuals without a 16p11.2 distal CNV (Supplementary Table 9). BMI z-scores were different between the carrier groups (Kruskal–Wallis, P = 0.009, Supplementary Figure 6, Supplementary Table 10). Duplication carriers had significantly lower BMI z-scores (median; SD = −0.65; 1.61) than the non-carriers (0.43; 1.19; P = 0.048). Also, the duplication carriers tended to have lower BMI z-scores than the deletions (−1.56; 1.1, P = 0.052; Supplementary Figure 6, Supplementary Table 10) and deletion carriers tended to have higher BMI z-scores than non-carriers (P = 0.18) (Supplementary Table 10). For detailed information on individual carriers, see Supplementary Table 2.


Discussion

This is the first study to determine the brain structure underpinnings of the 16p11.2 distal recurrent CNVs. We found a common denominator for 16p11.2 distal carriers across different clinical phenotypes in a dose response effect of copy number on the volumes of basal ganglia (caudate, putamen and pallidum) (Table 2, Figs. 2 and 3). The observed associations were independent of the presence of neurodevelopmental diagnosis and the ancestry of participants (Supplementary Tables 5 and 7). These effects were consistent in the independent replication sample (Fig. 3, Tables 2 and 3). Together with the result of lower IQ in carriers, these findings provide new insight into genetic mechanisms of brain structures and pathobiological processes involved in neurodevelopmental disorders.

There are nine genes in the core 16p11.2 distal (BP2–BP3) region. We tested the expression level of these in blood in available transcript data from two of our deletion carriers (BP1–BP3) and compared with 234 non-carriers (Supplementary Figure 7, Supplementary Table 11). Several transcripts of the nine genes were relatively decreased in blood. Due to the low numbers, only trends can be suggested from these data, but we find the down-regulation of LAT gene expression to ~65% in 16p11.2 distal carriers (Supplementary Figure 7, Supplementary Table 11) particularly interesting in light of recent results in zebrafish. These showed that, of the nine genes in 16p11.2 distal, only over-expression of LAT induced a decrease in cell proliferation in the brain with a concomitant microcephaly phenotype [44]. In parallel, LAT knockout mice showed brain anatomy changes [44]. According to the Allen Brain Atlas, LAT shows the highest expression in cerebellum and structures of the basal ganglia (data not shown). Thus, high expression of LAT overlaps with the position of brain structural changes identified in the present study. This further implicates LAT, an immune signaling adaptor, as a possible dosage-dependent driver of the CNV-associated brain phenotypes including the basal ganglia.

By comparing the effect of a range of CNVs on the brain, it is possible to identify patterns of effects related to the genes involved, and thus learn about biological mechanisms. For all three CNVs previously shown to have an effect on ICV, 16p11.2 proximal [6, 8], 22q11 [9] and Williams Syndrome [45], a concomitant effect on cortical surface area and/or cortical thickness has been identified. We observed no effect on cortical surface area or cortical thickness in 16p11.2 distal carriers (Table 2, Figs. 2 and 3). This does not rule out the presence of smaller effects on individual cortical areas. However, it may suggest a different impact on brain development mechanisms between these three CNVs [6, 8, 9, 45] and 16p11.2 distal.

CNVs in the two neighboring regions 16p11.2 distal and 16p11.2 proximal show overlapping phenotypes: they both predispose to various neurodevelopmental diseases and both show a negative dose response for head circumference and weight [17]. In addition, we found negative dose response effects for the 16p11.2 distal CNV on ICV, putamen and caudate volumes with effect size estimates comparable to those previously reported for the 16p11.2 proximal CNV [8]. Recently, a lymphoblastoid cell line study of chromosomal interaction in the 16p11.2 region suggested that the two adjacent 16p11.2 distal and 16p11.2 proximal regions (Fig. 1) interact [17]. The identified brain commonalities could further support a mechanism in which the similar phenotypic patterns are caused by disruption of the chromatin structure surrounding the entire 16p11.2 region [17, 32].

In this study, the CNVs (3 deletions and 3 duplications) of six carriers extend into the adjacent 16p11.2 proximal region. Redoing the analysis with exclusively 16p11.2 distal carriers did not result in a change in effect size (Supplementary Tables 5 and 6), suggesting that 1.7 Mb distal-proximal (BP1–BP5) CNV carriers are not the main cause of the signal. A previous analysis from Loviglio et al. [44] suggests an additive effect of two 16p11.2 regions (distal + proximal) on human head circumference and weight. Together with our data, this indicates both separate and overlapping effects for the two CNVs and underlines the importance of studying specific CNVs independently despite overlapping phenotypes.

Interestingly, deletions and duplications of both 16p11.2 proximal and distal CNVs are associated with ASD [17, 19]. However, only the proximal duplication and the distal deletion are associated with schizophrenia [20, 22]. This difference in phenotype association between these bordering CNVs may indicate specific differences in the pathological mechanisms of ASD and schizophrenia.

We observed a decrease in absolute effect sizes for putamen and pallidum after removing individuals with a neurodevelopmental diagnosis (Supplementary Table 5). This is consistent with the enlargements of putamen and pallidum associated with duration of illness in schizophrenia [46], which may partly reflect the cumulative effect of antipsychotic medication on basal ganglia volumes [47]. The observed dose response effect on ICV is in agreement with previous findings on head circumference [17], as is the dose response effect on BMI (Supplementary Figure 6, Supplementary Table 10) [17, 31]. One of the strengths of this study is the inclusion of non-clinical samples, allowing for estimates closer to the actual carrier population. Unfortunately, the small number of CNV cases does not provide enough power to investigate preferential alterations in deletion or duplication carriers.

To conclude, the present findings of negative dose-response effects of copy number on ICV and volumes of caudate, pallidum and putamen, with no effect on cortical measures, suggest a specific effect on basal ganglia structures of the 16p11.2 distal CNV. These results provide novel insight into genetic factors determining basal ganglia volumes and suggest specific pathobiological mechanisms involved in the development of neurodevelopmental disorders.