1q21.1 distal copy number variants are associated with cerebral and cognitive alterations in humans

Low-frequency 1q21.1 distal deletion and duplication copy number variant (CNV) carriers are predisposed to multiple neurodevelopmental disorders, including schizophrenia, autism and intellectual disability. Deletion and duplication carriers display a high prevalence of micro- and macrocephaly, respectively. The underlying brain structural diversity remains largely unknown. We systematically called CNVs in 38 cohorts from the large-scale ENIGMA-CNV collaboration and the UK Biobank and identified 28 1q21.1 distal deletion and 22 duplication carriers and 37,088 non-carriers (48% male) derived from 15 distinct magnetic resonance imaging scanner sites. With standardized methods, we compared subcortical and cortical brain measures (all cohorts) and cognitive performance (UK Biobank only) between carrier groups, also testing for mediation of brain structure on cognition. We identified positive dosage effects of copy number on intracranial volume (ICV) and total cortical surface area, with the largest effects in frontal and cingulate cortices, and negative dosage effects on caudate and hippocampal volumes. The carriers displayed distinct cognitive deficit profiles in cognitive tasks from the UK Biobank, with intermediate decreases in duplication carriers and somewhat larger decreases in deletion carriers, the latter potentially mediated by ICV or cortical surface area. These results shed light on pathobiological mechanisms of neurodevelopmental disorders by demonstrating gene dosage effects on specific brain structures and effects on cognitive function.


Introduction
Inter-individual differences in brain structure are highly heritable 1, but identifying the genes that contribute to brain development is challenging. Genome-wide association studies (GWAS) of brain anatomical structures indicate the influence of many single-nucleotide polymorphisms (SNPs) with small effect sizes 2,3, but the links to brain function remain weak. Evidence is emerging that some rare copy number variants (CNVs), that is, regions of the genome that are either deleted or duplicated, are associated with substantial differences in both brain size and shape; examples include the 7q11.23 4,5, 22q11.2 6,7, 15q11.2 [8][9][10][11] and 16p11.2 proximal [12][13][14] and distal CNVs 15. Many of these CNVs also have a wide-ranging phenotypic impact, including poorer cognitive abilities 8,[16][17][18] and increased risk of neurological or neurodevelopmental disorders. The strong impact of these CNVs on brain structure and behaviour makes them valuable for studies of the molecular mechanisms contributing to aberrant human neurodevelopment.
The 1q21.1 distal CNV has a known large effect on head circumference, as evident from a high prevalence of micro- and macrocephaly in deletion and duplication carriers, respectively [19][20][21]. This, along with its position in a region that is rich in genes unique to the human lineage (i.e. absent in other primates) 22,23, makes the 1q21.1 distal CNV particularly interesting for the study of aberrations in human brain structure. However, its relatively low frequency, ~1 in 3400 (deletions) and 1 in 2100 (duplications) 8,16, has hampered the study of its effects on brain structure.
1q21.1 distal deletion and duplication carriers are both at higher risk for several neurodevelopmental disorders including schizophrenia 24,25 , intellectual disability (ID), developmental delay, speech problems, autism spectrum disorders, motor impairment 19,[26][27][28] and epilepsy 26,29 , in addition to the separate risk for the duplication carriers for ADHD 30 , bipolar disorder and major depression 31,32 . Further, general cognitive ability (IQ) was lower in carriers in a small clinical study 19 and in the UK Biobank 33 . In addition, 1q21.1 distal CNVs display a positive dose response on head circumference [19][20][21] , height and weight 34,35 and are associated with various somatic diseases and traits, including bone and muscle deviations 34 and cataract 36 (deletion only), diabetes 36 (duplication only) and heart disease 36-39 (both). Conversely, several studies report carriers without any clinically evident phenotypes 19,38 and considerable heterogeneity 40,41 , suggesting incomplete penetrance and variable expressivity. The Df(h1q21)+/− mouse, deleted in the syntenic 1q21.1 distal region, displays some phenotypes similar to human CNV carriers, including reduced head-to-tail length and altered dopamine transmission in response to psychostimulants, as seen in people with schizophrenia 42 .
The 1q21 region in humans is rich in low copy number repeats 20,43 and contains several recurrent CNVs with differing breakpoints 21,37. Thus, gene estimates vary, but the core interval encompasses at least 12 protein-coding genes including several human-specific genes such as HYDIN2 21,37, NOTCH2NLs 22,23 and the DUF1220/Olduvai domain-containing NBPF-encoding genes 44-46; the two latter were recently shown to have evolved as a two-gene unit 47. Particularly interesting in the context of brain development are the recently characterized NOTCH2NL genes, absent in humans' closest living relatives and shown to prolong cortical neurogenesis 22,23.
Despite the strong effects on neurodevelopmental traits and disorders, the impact of the 1q21.1 CNVs on human brain structure is largely unknown. Here, we present the first large-scale systematic neuroimaging study of 1q21.1 distal CNV carriers, investigating brain structure in >37,000 individuals including 28 deletion and 22 duplication carriers. We mapped the effect of the 1q21.1 distal CNV on subcortical volumes, intracranial volume (ICV) and global and regional measures of mean cortical thickness and surface area. We investigated variation in cognitive task performance and supplemented with exploratory mediation analysis of the brain on cognition in the UK Biobank. Given prior findings [19][20][21]48 , we explored a dose-dependent effect of copy number on brain structures and decreased cognitive performance for both 1q21.1 distal deletion and duplication carriers in comparison to non-carriers.

Sample description
The brain structural sample comprises a total of 39 cohorts with genotyping and magnetic resonance imaging (MRI) data: 38 from the ENIGMA-CNV consortium in addition to a subsample of the UK Biobank 49 (project ID #27412). Demographic characteristics for each cohort are described in Supplementary Table 1 with a reference to participants' collection and datasets including individual inclusion and exclusion parameters. Extended information on diagnosis and family information can be found in Supplementary Note 1 and age distribution of the cohorts in Supplementary Fig. 1 [...] Table 2). In the meta-analysis, an independent Icelandic sample from deCODE Genetics consisting of two deletion carriers and five duplication carriers in addition to 1150 non-carriers was added.

Genotyping and QC
The genotypes were obtained by genotyping with commercially available platforms, performed at participating sites for each cohort (Supplementary Table 1). Individuals were excluded exclusively based on quality control (QC) parameters from the CNV calling. No exclusion was done due to ancestry in the primary analysis, but the effect of ancestry was evaluated in a separate analysis (see below).

CNV calls and validation in the core ENIGMA-CNV sample
Almost all cohorts had CNVs called and identified in a unified manner as described previously 15. In brief, CNVs were called using PennCNV 50 with appropriate population frequency of B allele (PFB) files and GC content model files (Supplementary Table 3 [...] Fig. 2). CNVs overlapping the region of interest (1q21.1 distal, and 1q21.1 distal and proximal) were identified with the R package iPsychCNV, visualized and manually inspected.

Image acquisition and processing
All brain measures were obtained from structural T1-weighted MRI data collected at participating sites around the world and analysed with standardized image analysis (FreeSurfer), quality assurance and statistical methods as per the harmonized neuroimaging protocols developed within ENIGMA2 3 and ENIGMA3 (http://enigma.ini.usc.edu/protocols/imaging-protocols/). Further detail on data processing is provided in Supplementary Note 4. Details on study, scanner, vendor, field strength, sequence, acquisition parameters and FreeSurfer versions used are outlined in Supplementary Table 4.

Statistical analysis
Imaging data processing and CNV calling were performed locally and de-identified CNV and imaging data were provided for a central mega-analysis. One of a pair of duplicates was kept. Relatives were removed from the sample used for the main analysis. In addition, we conducted a number of sensitivity analyses to test the robustness of the results (Supplementary Note 5 and  Supplementary Tables 5-8). Individuals with a minimum overlap of 0.4 to regions with known pathogenic CNVs (Supplementary Table 2) were excluded from the analysis regardless of copy number status as were individuals from scanner sites without 1q21.1 distal CNV carriers.
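The overlap criterion used for exclusion can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not settle: the overlap is taken here as the fraction of the known region covered by a CNV call (the denominator is not specified above), and the function names and the example calls are hypothetical. The example region uses the 1q21.1 distal coordinates reported for the main dataset (146.5-147.4 Mb, hg19) purely for illustration.

```python
def overlap_fraction(cnv_start, cnv_end, region_start, region_end):
    """Fraction of a known region covered by a CNV call.

    Assumption: overlap is expressed relative to the length of the
    known region; the paper does not state which denominator was used.
    """
    shared = min(cnv_end, region_end) - max(cnv_start, region_start)
    return max(shared, 0) / (region_end - region_start)


def meets_overlap(cnv, region, minimum=0.4):
    """True if the call covers at least `minimum` of the region."""
    return overlap_fraction(cnv[0], cnv[1], region[0], region[1]) >= minimum


# Illustrative region: 1q21.1 distal, 146.5-147.4 Mb (hg19).
REGION = (146_500_000, 147_400_000)

# Hypothetical CNV calls (start, end) in base pairs.
call_large = (146_900_000, 147_600_000)   # covers 0.5 Mb of the 0.9 Mb region
call_small = (147_300_000, 148_000_000)   # covers 0.1 Mb of the 0.9 Mb region

flag_large = meets_overlap(call_large, REGION)
flag_small = meets_overlap(call_small, REGION)
```

Under these assumptions the first call passes the 0.4 threshold and the second does not.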
Brain measures were normalized in R v3.3.2 by an inverse normal transformation of the residual of a linear regression on the phenotype correcting for covariates as done previously 15. For the primary analysis, covariates were age, age², sex, scanner site and ICV. In the analysis of ICV, ICV was not included as a covariate. These final covariance-corrected values were used in downstream analysis and are reported for each measure. For comparison between groups, normalization was carried out including only the groups addressed (deletion and non-carriers, duplications and non-carriers) except for the deletion versus duplication comparison, where values from normalization of the entire dataset were used due to the low numbers.
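The normalization step can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code: the rank offset in the inverse normal transformation (0.5) is an assumption, scanner site and ICV are omitted from the toy covariates, age is centred before squaring only for numerical stability (this leaves the residuals unchanged), and all data values are invented.

```python
from statistics import NormalDist, mean


def ols_residuals(y, covariates):
    """Residuals of y regressed on covariate rows (intercept added)."""
    rows = [[1.0] + list(r) for r in covariates]
    n, p = len(rows), len(rows[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(rows[k][i] * rows[k][j] for k in range(n)) for j in range(p)]
         + [sum(rows[k][i] * y[k] for k in range(n))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return [y[k] - sum(b * x for b, x in zip(beta, rows[k])) for k in range(n)]


def inverse_normal_transform(values, offset=0.5):
    """Rank-based transform: rank r maps to Phi^-1((r - offset) / (n - 2*offset + 1)).

    Assumption: offset 0.5; ties receive their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1   # average 1-based rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


# Invented toy data: age (years), sex (0/1) and one brain measure.
ages = [23, 35, 41, 48, 52, 60, 66, 71]
sex = [0, 1, 0, 1, 1, 0, 1, 0]
volume = [7.9, 8.4, 7.6, 8.1, 8.0, 7.2, 7.7, 7.0]

age_c = [a - mean(ages) for a in ages]            # centred age
covs = [[a, a * a, s] for a, s in zip(age_c, sex)]
residuals = ols_residuals(volume, covs)
normalized = inverse_normal_transform(residuals)
```

The resulting `normalized` values keep the rank order of the residuals and follow a standard normal distribution by construction, which is what allows group mean differences to be read on a standard deviation scale.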
For the copy number dosage effect analysis (i.e. the effect on brain structure of 1q21.1 distal copy number variation), a linear regression on the copy number status of the individuals was performed using the following model: covariance-corrected, normalized brain measure ~ copy number (deletion = 1, non-carrier = 2, duplication = 3). For comparison between groups, a two-sample, two-sided t test assuming equal variance in all carrier/non-carrier groups was employed (R v3.3.2), where deletion or duplication carriers were compared either to each other or to non-carriers. To correct for multiple comparisons, we calculated the number of independent outcome measures through the spectral decomposition of a correlation matrix of the three global, seven subcortical and 68 regional cortical measures using matSpDlite (https://neurogenetics.qimrberghofer.edu.au/matSpDlite/). Based on the ratio of observed eigenvalue variance to its theoretical maximum, the estimated equivalent number of independent measures was 36. Thus, we set the significance threshold at α = 0.05/36 = 0.0014. We report the uncorrected P values throughout the manuscript. Effect size is reported as the absolute effect size for the t tests (the difference in mean between the two copy number groups, which in this case equals Cohen's d because the standard deviation of the normalized brain measures is one) and as the estimate of beta for the linear regression. Plots were generated using the R library ggplot2 v2.2.1 51. Regional cortical visualization was done with the R package ggseg v1.5.1.
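The three statistical ingredients of this paragraph can be sketched in Python. This is a hedged illustration with invented function names, not the authors' R code. The effective-number-of-tests function assumes the variance-of-eigenvalues formula that the description above appears to match (commonly attributed to Nyholt); it uses the fact that, for a correlation matrix, the eigenvalues sum to M and their squares sum to the squared Frobenius norm, so no eigendecomposition is needed.

```python
from math import sqrt
from statistics import mean


def dosage_beta(measure, copies):
    """OLS slope of measure ~ copy number (1 = del, 2 = non-carrier, 3 = dup)."""
    mx, my = mean(copies), mean(measure)
    sxy = sum((x - mx) * (y - my) for x, y in zip(copies, measure))
    sxx = sum((x - mx) ** 2 for x in copies)
    return sxy / sxx


def pooled_t_and_d(a, b):
    """Two-sample t statistic (equal variances assumed) and Cohen's d."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    pooled_sd = sqrt(ss / (na + nb - 2))
    d = (ma - mb) / pooled_sd
    t = d / sqrt(1 / na + 1 / nb)
    return t, d


def effective_tests(corr):
    """Effective number of independent measures from a correlation matrix.

    Assumed formula: Meff = 1 + (M - 1) * (1 - Var(eigenvalues) / M),
    with Var(eigenvalues) = (||R||_F^2 - M) / (M - 1).
    """
    m = len(corr)
    frob2 = sum(v * v for row in corr for v in row)
    var_lambda = (frob2 - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_lambda / m)


# Invented toy values on the normalized scale.
beta = dosage_beta([-1.0, 0.0, 0.0, 0.0, 1.0], [1, 2, 2, 2, 3])
t_stat, cohen_d = pooled_t_and_d([1.0, 2.0, 3.0], [3.0, 4.0, 5.0])

# Threshold reported in the text for 36 effective measures.
alpha = 0.05 / 36
```

Two limiting cases make the last function easy to check: fully uncorrelated measures (identity matrix) give Meff = M, and perfectly correlated measures (all-ones matrix) give Meff = 1.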
In a novel analysis, the independent Icelandic data were processed and analysed in the same way as the main dataset. We meta-analysed the results using the R package metafor v2.0.0, as done previously 15.
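How two sets of estimates can be combined is sketched below in Python. The text does not state which meta-analytic model was fitted in metafor, so the fixed-effect inverse-variance pooling shown here is an assumption chosen for simplicity; the function name and the input numbers are invented.

```python
from math import sqrt


def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error.

    Assumption: a fixed-effect model; the model used in the paper
    is not specified in the text.
    """
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled_se


# Invented example: effect estimates from a main and a replication sample.
pooled_beta, pooled_se = fixed_effect_meta([0.5, 0.3], [0.1, 0.2])
```

The more precise estimate (smaller standard error) dominates the pooled value, which is why a small replication sample shifts the combined result only modestly.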

Cognitive task performance data
We downloaded behavioural performance measures on seven cognitive tests (the pairs matching task, the reaction time task, reasoning and problem-solving tests, the digit span test, the symbol digit substitution test and the trail making A and B tests) from the UK Biobank repository, performed by at least 10% of the participants. The results were processed following the general approach by Kendall et al. 16 . For more details, see Supplementary Note 6. For the analysis of the seven cognitive measures, we set the significance threshold to α = 0.05/7 = 0.007.

Mediation analysis
Mediation analyses were done with the R package mediation v4.4.7. Brain measures were normalized as described above and cognitive tasks were corrected for age, age² and sex prior to input into the analysis. We report the proportion of the total effect of the CNV on cognitive task performance mediated by the brain measures ('path ab'/'path c'), with P values calculated through quasi-Bayesian approximation using 5000 simulations. We set the significance threshold at α = 0.05/((2 + 4) × 6) = 1.4 × 10⁻³ given the test of two structures for deletion and four for duplication carriers on six cognitive tests. The digit span test was excluded since no 1q21.1 CNV carriers had results from both this cognitive test and brain structural data.
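The reported quantity, 'path ab'/'path c', can be illustrated with a point-estimate sketch in Python. It is not the quasi-Bayesian procedure of the mediation package: it only computes the product-of-coefficients decomposition for linear models, where a is the effect of carrier status on the brain measure, b the effect of the brain measure on the cognitive score adjusted for carrier status, and c the total effect. All names and data are invented, and the toy outcome is noise-free so the coefficients are recovered exactly.

```python
from statistics import mean


def _sums(u, v):
    """Centred cross-product sum of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def proportion_mediated(x, m, y):
    """Return (a, b, c_total, c_direct, ab / c_total) for linear models.

    a: slope of m ~ x; c_total: slope of y ~ x;
    b and c_direct: coefficients of m and x in y ~ x + m.
    """
    sxx, smm = _sums(x, x), _sums(m, m)
    sxm, sxy, smy = _sums(x, m), _sums(x, y), _sums(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx
    c_total = sxy / sxx
    b = (sxx * smy - sxm * sxy) / det
    c_direct = (smm * sxy - sxm * smy) / det
    return a, b, c_total, c_direct, a * b / c_total


# Invented toy data: carrier status, normalized brain measure, cognitive score.
carrier = [0, 0, 0, 0, 1, 1, 1, 1]
brain = [0.1, -0.2, 0.3, 0.0, -1.0, -0.8, -1.3, -0.9]
score = [0.3 * mi - 0.6 * xi for mi, xi in zip(brain, carrier)]  # noise-free

a, b, c_total, c_direct, prop = proportion_mediated(carrier, brain, score)
```

In linear models the total effect splits exactly into direct and mediated parts (c = c' + ab), which the sketch reproduces; the significance threshold quoted above corresponds to 0.05 divided by (2 + 4) × 6 = 36 tests.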

Sample characteristics
The main 1q21.1 distal (146.5-147.4 Mb, hg19) brain structural dataset consisted of 28 deletion and 22 duplication carriers and 37,088 non-carriers (derived from the same scanner sites as the CNV carriers) from ENIGMA-CNV and UK Biobank (Table 1, separate demographics in Supplementary Table 9). CNV carriers were younger (41.7 ± 19.0 years for deletion carriers and 55.4 ± 12.7 years for duplication carriers) than non-carriers (61.1 ± 12.1 years) (Table 1). Eleven deletion carriers and seven duplication carriers had a known neurological, neurodevelopmental or psychiatric diagnosis or had been recruited in a clinical CNV study. The remaining carriers either did not have an established diagnosis or were recruited in studies from which diagnostic information was unavailable (Table 1 and Supplementary Table 10). Of the 37,088 non-carriers, 6.5% (2425) had an established neurological, neurodevelopmental or psychiatric disorder.
A number of sensitivity analyses were run on the main dataset, namely: (a) matching each carrier with one non-carrier for age, sex, scanner site and ICV or for age, sex and scanner site; (b) including only: (i) non-affected individuals (i.e. excluding individuals with a known neurodevelopmental or neurological disorder [...]

Figure caption: Boxplots of subcortical volumes, cortical surface area and mean cortical thickness and ICV are shown. Deletion carriers (del) are shown in red, non-carriers (nc) in grey and duplication carriers (dup) in blue. The normalized brain values are presented. Boxplots represent the mean. Copy number dosage effect is noted at the bottom of each panel. Significant differences after correction between groups are noted as *P < 0.0014, **P < 0.00014, ***P < 0.000014. Centre line represents the median, box limits are the upper and lower quartiles, whiskers extend to 1.5 times the interquartile range and the points are outliers. All analyses were corrected for age, age squared, sex, scanner site and ICV (except for ICV).

The 1q21.1 distal CNV is associated with regional brain structures
The largest dosage effects for the regional cortical surface area were found in the frontal lobes followed by the cingulate cortex, with additional significant effects in three regions of the parietal and temporal lobes (Fig. 2 and Supplementary Table 7). Likewise, through t tests, the largest effects in both deletion and duplication carriers in comparison to non-carriers were observed in the frontal and cingulate cortices (Fig. 2 and Supplementary Table 8).
For regional cortical mean thickness, we identified significant negative dosage effects in the superior temporal region and significant positive dosage effects for the pericalcarine region (Fig. 2 and Supplementary Tables 7 and 8). Similarly, significant increases in mean cortical thickness were observed in deletion carriers versus non-carriers in the pars triangularis and superior temporal regions and a significant decrease in the pericalcarine region (Fig. 2 and Supplementary Table 8). All regional results were corrected for age, age², sex, scanner site and ICV. Sensitivity analyses similar to those performed for subcortical regions confirmed the robustness of the results (Supplementary Tables 7 and 8).

1q21.1 distal CNV associated with cognitive performance and mediation by brain structures
Deletion and duplication carriers had different cognitive profiles in comparison to non-carriers when testing for association in seven different neuropsychological tests available from the full UK Biobank sample: deletion carriers had significantly poorer performance in three tests (symbol digit substitution, trail making B and pairs matching), while duplication carriers had significantly poorer performance in two tests (reaction time and the reasoning and problem-solving task) (Table 2).
Testing the effect of brain structures on cognitive tests in UK Biobank participants, larger ICV and total surface area were associated with better performance on almost all tests (Table 3 and see Supplementary Table 13 for sample size details). A larger hippocampus was associated with better performance for symbol digit substitution, trail making A and B (Table 3) and a larger caudate was associated with higher performance on the trail making A (Table 3).

Figure caption: Results from the t tests and linear regression of 1q21.1 copy number variation on regional cortical surface area and cortical thickness. First and third rows: Effect sizes (Cohen's d for the t tests, beta coefficient for the dosage/linear regression). Second and fourth rows: Statistical significance in −log₁₀ of the P value. Significant areas in rows 1 and 3 are marked with black lines with increasing thickness for increasing significance (P < 0.0014). The column names indicate the comparisons with del = deletion carriers, nc = non-carriers, dup = duplication carriers. All measures were corrected for age, age², sex, scanner site and ICV.

Next, we tested whether the brain structures significantly associated with 1q21.1 distal CNV carriers might mediate the effect of the CNV on cognition. For two of the three tests associated with deletion carrier status, there were significant mediation effects (significance threshold 1.4 × 10⁻³): cortical surface area and ICV accounted for 5 and 10%, respectively, of the poorer performance of deletion carriers on symbol digit substitution, and 7 and 17%, respectively, of their poorer performance on the trail making B test (Table 3).

Discussion
Our main finding was a significant positive dosage effect in humans of 1q21.1 distal copy number on ICV and cortical surface area, with the largest differences in frontal and cingulate cortical surface area. We also identified a significant negative dosage effect on caudate and hippocampal volumes. A number of sensitivity analyses confirmed the robustness of the results. Both 1q21.1 distal deletion and duplication carriers showed poorer cognitive performance, although on different tests, with an indication that decreased ICV/cortical surface area might mediate the effect in deletion carriers.

The 1q21.1 distal CNV causes copy dosage effect on brain structures
We found a strong effect of the 1q21.1 distal CNV on the total cortical surface area, while no overall effect on mean cortical thickness was observed. A specific increase in the size of the cortical surface area with little effect on cortical thickness is observed throughout mammalian evolution including the primate lineage leading to humans 52 . This possibly reflects that cortical thickness and surface area appear to be driven by distinct genetic processes 53 . This pattern may be the result of an increased number of symmetric or self-renewing cell-division cycles, leading to an expansion of the neural progenitor pool and subsequently to an increase in the number of cortical neurons-in line with the radial unit hypothesis 52 . Interestingly, although not significant, mean cortical thickness tended to decrease in deletion carriers in the frontal cortical surface areas with the highest effect sizes, resembling a pattern found in lissencephaly 54 . This could suggest that large regional decreases in cortical surface area correlate inversely with mean cortical thickness.
The biomechanical forces of brain growth are thought to form the expansion of the cranium so that the skull grows in harmony with the expanding brain 55 . Thus, the positive copy number dosage effect on cortical surface area may directly trigger the effect on head circumference [19][20][21] and ICV of 1q21.1 distal carriers due to modifications in pressure. Altered mechanical pressure might also cause the negative copy number dosage effect on the hippocampus and caudate volumes, effects on subcortical volumes also observed in a UK Biobank exploratory study on six individuals with a 1q21.1 distal duplication 56 .

Human-specific genes may affect the cortical surface area and cross-species effects
The positive copy number dosage effect on brain structure with the same direction as for weight and height 34,35 likely results from altered gene expression as observed in 1q21.1 distal CNV cell lines 48 . In an independent experiment on fetal tissue, we also observed dynamic expression patterns of the genes in the 1q21.1 interval consistent with potential roles in cortical neurogenesis and development (Supplementary Note 7 and Supplementary Figs. 5 and 6).
GWAS based on the hg19 genome assembly have not identified hits in the 1q21.1 genomic region for ICV 57 , total cortical or regional surface area 53,58 . Assembly of the 1q21.1 region 59 and thus gene discovery is complicated due to the presence of numerous low copy number repeats 20,43 and has been faulty until the GRCh38 genome assembly. This may explain the lack of GWAS hits in the region.
Candidates for a dosage-dependent amplifier of the CNV-associated brain phenotypes are the recently identified human-specific NOTCH2NL genes that confer delayed neuronal differentiation and increased progenitor self-renewal 22,23, in line with the radial unit hypothesis 52. The areas with the highest regional effect sizes overlap with the areas of the highest expression of NOTCH2NLA and C in utero 22 in concordance with an early [...] Fig. 7 and Supplementary Notes 8 and 9) and recent findings of decreased total brain volume focused on the temporo-parietal and subcortical areas in the deletion mouse 60 suggest that genes overlapping between humans and mice (nine of ten mouse genes are syntenic to the human region 42) and not specific to humans are also involved in the altered skull and brain morphology. However, although diameter and volume are not directly comparable, the 17% decrease in ICV in human 1q21.1 deletion carriers would still point towards a substantial role of human-specific genes or genes with altered functions in comparison to mice. This underlines the need for additional data to disentangle which specific genes are involved in the skull and brain structural phenotypes. Of note, we also observed shorter bones overall in the 1q21.1 deletion mice (Supplementary Fig. 8 and Supplementary Note 9), expanding on previous head-to-tail length data 42, and lower bone mineral density in female mice (Supplementary Fig. 9 and Supplementary Note 9), which mirror bone characteristics from human deletion carriers 34, increasing the number of observed cross-species effects between the 1q21.1 mice and human 1q21.1 deletion carriers.

1q21.1 distal CNV deletion and duplication carriers show deficits in different cognitive functions
Our findings of widespread lower performance across several tests in different domains for both carrier groups in the volunteer-based UK Biobank sample are in line with cognitive results from a recent study 33 and support that cognitive function in CNV carriers largely without a neurodevelopmental diagnosis may still be compromised 8,16 . Interestingly, the frontal and cingulate regions 61 , with the greatest cortical effect sizes for distal 1q21.1, correlate particularly with cognitive function and have gone through the greatest expansion during human development and evolution 62 . Our analyses indicated that the decreases in cognitive task performance are partially mediated by the observed differences in ICV and cortical surface area, reflecting the positive correlation between brain volume and intellectual function in line with previous findings 63 . The decrease in performance for several cognitive tasks in duplication carriers despite a larger ICV and cortical surface area suggests that the positive correlations may only be applicable within a certain narrower range. Interestingly, recent genetic analysis of NOTCH2NL in archaic and modern humans revealed ongoing adaptive evolution towards a lower dosage of the protein 64 , suggesting negative effects of excessive NOTCH2NL protein.
Our brain structural findings in 1q21.1 distal CNV carriers overlap with brain alterations in associated disorders: for example, ADHD 65 , autism spectrum disorders 66 , schizophrenia 67 , bipolar disorder 68 , major depressive disorder 69 and subtypes of epilepsy 70 , but the exact overlaps differ between carrier groups. Of note, 1q21.1 distal deletion and duplication carriers display direct, opposite effects on several brain structures, while at risk for the same neurodevelopmental diseases. Other pathogenic CNVs also display overlapping disease risk and similar opposite copy number effects 6,[8][9][10][11][12][13][14][15] including effects on the cortical surface area in 22q11 and 16p11.2 proximal CNV carriers 6,[12][13][14] . These CNVs impact different genes, but may converge on the same downstream pathways altering cortical surface area formation, similar to what has been reported for behavioural and neurocognitive phenotypes 28 .
This also suggests that other risk factors interplay to cause disease. It also supports that subgroups within neurodevelopmental disorders can be defined based on genetic profile and brain structural differences.
We demonstrate large effects of 1q21.1 distal CNVs on brain structure and cognition in humans including a mediation effect. These findings provide insight into molecular mechanisms involved in critical stages of human brain development and mapping of gene dosages to brain structural fingerprints.

Data availability
The authors declare that the data supporting the findings of this study are available within the paper and its Supplementary information files. The data were gathered from various resources, and material requests will need to be placed with individual PIs. I.E.S. can provide additional detail upon correspondence. Data from PING are available at NIMH Data Archive: https://ndar.nih.gov/edit_collection.html?id=2607

Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.