Introduction

Inter-individual differences in brain structure are highly heritable1, but identifying the genes that contribute to brain development is challenging. Genome-wide association studies (GWAS) of brain anatomical structures indicate the influence of many single-nucleotide polymorphisms (SNPs) with small effect sizes2,3, but the links to brain function remain weak. Evidence is emerging that some rare copy number variants (CNVs), that is, regions of the genome that are either deleted or duplicated, are associated with substantial differences in both brain size and shape; examples include the 7q11.23 (refs. 4,5), 22q11.2 (refs. 6,7), 15q11.2 (refs. 8,9,10,11) and 16p11.2 proximal12,13,14 and distal CNVs15. Many of these CNVs also have a wide-ranging phenotypic impact, including poorer cognitive abilities8,16,17,18 and increased risk of neurological or neurodevelopmental disorders. The strong impact of these CNVs on brain structure and behaviour makes them valuable for studies of the molecular mechanisms contributing to aberrant human neurodevelopment.

The 1q21.1 distal CNV has a known large effect on head circumference, as evident from a high prevalence of micro- and macrocephaly in deletion and duplication carriers, respectively19,20,21. This, along with its position in a region that is rich in genes unique to the human lineage (i.e. absent in other primates)22,23, makes the 1q21.1 distal CNV particularly interesting for the study of aberrations in human brain structure. However, its relatively low frequency, 1 in ~3400 (deletions) and 1 in 2100 (duplications)8,16, has hampered the study of its effects on brain structure.

1q21.1 distal deletion and duplication carriers are both at higher risk for several neurodevelopmental disorders including schizophrenia24,25, intellectual disability (ID), developmental delay, speech problems, autism spectrum disorders, motor impairment19,26,27,28 and epilepsy26,29; duplication carriers are additionally at risk for ADHD30, bipolar disorder and major depression31,32. Further, general cognitive ability (IQ) was lower in carriers in a small clinical study19 and in the UK Biobank33. In addition, 1q21.1 distal CNVs display a positive dose response on head circumference19,20,21, height and weight34,35 and are associated with various somatic diseases and traits, including bone and muscle deviations34 and cataract36 (deletion only), diabetes36 (duplication only) and heart disease36,37,38,39 (both). Conversely, several studies report carriers without any clinically evident phenotypes19,38 and considerable heterogeneity40,41, suggesting incomplete penetrance and variable expressivity. The Df(h1q21)+/− mouse, which carries a deletion of the syntenic 1q21.1 distal region, displays some phenotypes similar to those of human CNV carriers, including reduced head-to-tail length and altered dopamine transmission in response to psychostimulants, as seen in people with schizophrenia42.

The 1q21 region in humans is rich in low copy number repeats20,43 and contains several recurrent CNVs with differing breakpoints21,37. Thus, gene estimates vary, but the core interval encompasses at least 12 protein-coding genes, including several human-specific genes such as HYDIN2 (refs. 21,37), NOTCH2NLs22,23 and the DUF1220/Olduvai domain-containing NBPF-encoding genes44,45,46; the latter two were recently shown to have evolved as a two-gene unit47. Particularly interesting in the context of brain development are the recently characterized NOTCH2NL genes, which are absent in humans’ closest living relatives and have been shown to prolong cortical neurogenesis22,23.

Despite the strong effects on neurodevelopmental traits and disorders, the impact of the 1q21.1 CNVs on human brain structure is largely unknown. Here, we present the first large-scale systematic neuroimaging study of 1q21.1 distal CNV carriers, investigating brain structure in >37,000 individuals including 28 deletion and 22 duplication carriers. We mapped the effect of the 1q21.1 distal CNV on subcortical volumes, intracranial volume (ICV) and global and regional measures of mean cortical thickness and surface area. We also investigated variation in cognitive task performance in the UK Biobank and supplemented this with an exploratory analysis of whether brain structure mediates the effect of the CNV on cognition. Given prior findings19,20,21,48, we hypothesized a dose-dependent effect of copy number on brain structures and decreased cognitive performance for both 1q21.1 distal deletion and duplication carriers in comparison to non-carriers.

Materials and methods

Sample description

The brain structural sample comprises a total of 39 cohorts with genotyping and magnetic resonance imaging (MRI) data—38 from the ENIGMA-CNV consortium in addition to a subsample of the UK Biobank49 (project ID #27412). Demographic characteristics for each cohort are described in Supplementary Table 1 with a reference to participants’ collection and datasets including individual inclusion and exclusion parameters. Extended information on diagnosis and family information can be found in Supplementary Note 1 and age distribution of the cohorts in Supplementary Fig. 1. All participants gave written informed consent and sites involved obtained ethical approvals. The main 1q21.1 distal sample consisted of 28 deletion carriers, 22 duplication carriers and 37,088 non-carriers (Table 1) from 13 different datasets and 15 scanner sites with various ascertainments (family, clinical and population studies, case–control study for psychiatric disease) collected up until 30 September 2019. Non-carriers were defined as having no CNVs known to cause neurodevelopmental diseases (as defined in Supplementary Table 2). In the meta-analysis, an independent Icelandic sample from deCODE Genetics consisting of two deletion carriers and five duplication carriers in addition to 1150 non-carriers was added.

Table 1 Demographic data.

Genotyping and QC

Genotypes were obtained using commercially available platforms, with genotyping performed at participating sites for each cohort (Supplementary Table 1). Individuals were excluded solely on the basis of quality control (QC) parameters from the CNV calling. No individuals were excluded due to ancestry in the primary analysis, but the effect of ancestry was evaluated in a separate analysis (see below).

CNV calls and validation in the core ENIGMA-CNV sample

Almost all cohorts had CNVs called and identified in a unified manner as described previously15. In brief, CNVs were called using PennCNV50 with appropriate population frequency of B allele (PFB) files and GC content model files (Supplementary Table 3 and Supplementary Notes 2 and 3). Samples were filtered and CNVs identified based on standardized QC metrics15 (Supplementary Notes 2 and 3). The 1q21.1 distal region was well covered by all arrays (Supplementary Fig. 2). CNVs overlapping the region of interest (1q21.1 distal and 1q21.1 distal and proximal) were identified with the R package iPsychCNV, visualized and manually inspected.

Image acquisition and processing

All brain measures were obtained from structural T1-weighted MRI data collected at participating sites around the world and analysed with standardized image analysis (FreeSurfer), quality assurance and statistical methods as per the harmonized neuroimaging protocols developed within ENIGMA2 (ref. 3) and ENIGMA3 (http://enigma.ini.usc.edu/protocols/imaging-protocols/). Further detail on data processing is provided in Supplementary Note 4. Details on study, scanner, vendor, field strength, sequence, acquisition parameters and FreeSurfer versions used are outlined in Supplementary Table 4.

Statistical analysis

Imaging data processing and CNV calling were performed locally, and de-identified CNV and imaging data were provided for a central mega-analysis. Only one of each pair of duplicates was kept. Relatives were removed from the sample used for the main analysis. In addition, we conducted a number of sensitivity analyses to test the robustness of the results (Supplementary Note 5 and Supplementary Tables 5–8). Individuals with a minimum overlap of 0.4 with regions of known pathogenic CNVs (Supplementary Table 2) were excluded from the analysis regardless of copy number status, as were individuals from scanner sites without 1q21.1 distal CNV carriers.
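The overlap criterion can be illustrated with a minimal Python sketch. It assumes that "overlap" means the fraction of the reference region covered by a CNV call, which the text does not spell out; the function names and the two example calls are hypothetical, and the 1q21.1 distal coordinates (146.5–147.4 Mb, hg19) are used only as a convenient example region. The original analyses were carried out in R.

```python
def overlap_fraction(call, region):
    """Fraction of `region` covered by `call`; both are (start, end) in base pairs."""
    call_start, call_end = call
    region_start, region_end = region
    shared = min(call_end, region_end) - max(call_start, region_start)
    if shared <= 0:
        return 0.0  # the call and the region do not intersect
    return shared / (region_end - region_start)


# Example region: 1q21.1 distal interval (146.5-147.4 Mb, hg19), illustration only.
EXAMPLE_REGION = (146_500_000, 147_400_000)
MIN_OVERLAP = 0.4  # threshold quoted in the text


def meets_overlap_threshold(call, region=EXAMPLE_REGION, min_overlap=MIN_OVERLAP):
    """True if the call covers at least `min_overlap` of the region."""
    return overlap_fraction(call, region) >= min_overlap


# Hypothetical CNV calls (not real data).
full_call = (146_400_000, 147_500_000)     # spans the whole region
partial_call = (147_200_000, 147_900_000)  # covers only the distal end

print(meets_overlap_threshold(full_call))
print(meets_overlap_threshold(partial_call))
```

Defining the fraction relative to the reference region rather than to the call keeps a very large call that swallows the region from being scored as a weak overlap; other definitions, such as reciprocal overlap, are equally plausible readings of the text.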

Brain measures were normalized in R v3.3.2 by an inverse normal transformation of the residuals of a linear regression of the phenotype on the covariates, as done previously15. For the primary analysis, covariates were age, age², sex, scanner site and ICV. In the analysis of ICV, ICV was not included as a covariate. These final covariate-corrected values were used in downstream analysis and are reported for each measure. For comparison between groups, normalization was carried out including only the groups addressed (deletion carriers and non-carriers, duplication carriers and non-carriers) except for the deletion versus duplication comparison, where values from normalization of the entire dataset were used due to the low numbers.
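As an illustration of the normalization step, the following Python sketch applies a rank-based inverse normal transformation to a list of regression residuals. It assumes the residuals have already been computed from the covariate regression, and it uses a rank offset of 0.5, which is one common convention and not necessarily the one used in the original R code; the function name and the example values are hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform; tied values receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    # Map each rank to a quantile of the standard normal distribution.
    return [standard_normal.inv_cdf((rank - offset) / n) for rank in ranks]


# Hypothetical residuals from a regression on the covariates.
residuals = [0.3, -1.2, 2.5]
normalized = inverse_normal_transform(residuals)
print(normalized)
```

The transformed values keep the ordering of the residuals but follow an approximately standard normal distribution, which is why the text can later treat a difference in group means as a standardized effect size.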

For the copy number dosage effect analysis (i.e. the effect on brain structure of 1q21.1 distal copy number variation), a linear regression of the covariate-corrected, normalized brain measure on the copy number status of the individuals was performed using the following model: brain measure ~ copy number (deletion = 1, non-carrier = 2, duplication = 3). For comparison between groups, a two-sample, two-sided t test assuming equal variance in all carrier/non-carrier groups was employed (R v3.3.2), in which deletion or duplication carriers were compared either to each other or to non-carriers. To correct for multiple comparisons, we calculated the number of independent outcome measures through spectral decomposition of the correlation matrix of the three global, seven subcortical and 68 regional cortical measures using MatSpDlite (https://neurogenetics.qimrberghofer.edu.au/matSpDlite/). Based on the ratio of observed eigenvalue variance to its theoretical maximum, the estimated equivalent number of independent measures was 36. Thus, we set the significance threshold at α = 0.05/36 = 0.0014. We report the uncorrected P values throughout the manuscript.
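The multiple-comparison correction can be sketched in Python under one stated assumption: that the estimate follows the widely used eigenvalue-variance formulation M_eff = 1 + (M − 1)(1 − Var(λ)/M), where M is the number of measures and λ are the eigenvalues of their correlation matrix. The text describes the ratio of eigenvalue variance to its maximum but does not give the formula, so this is an illustrative reading rather than a reproduction of the tool's output. The function name is hypothetical. The sketch avoids an explicit eigendecomposition by using the facts that the eigenvalues of a correlation matrix sum to M and their squares sum to the sum of squared matrix entries.

```python
def effective_number_of_tests(corr):
    """Effective number of independent measures from a correlation matrix (list of lists)."""
    m = len(corr)
    if m == 1:
        return 1.0
    # Sum of squared eigenvalues equals the squared Frobenius norm of a symmetric matrix.
    sum_squared_eigenvalues = sum(x * x for row in corr for x in row)
    # Sample variance of the eigenvalues; their mean is 1 for a correlation matrix.
    eigenvalue_variance = (sum_squared_eigenvalues - m) / (m - 1)
    return 1 + (m - 1) * (1 - eigenvalue_variance / m)


# Two limiting cases with three measures.
uncorrelated = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fully_correlated = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
print(effective_number_of_tests(uncorrelated))
print(effective_number_of_tests(fully_correlated))

# Bonferroni-style threshold for the 36 effective measures reported in the text.
alpha = 0.05 / 36
print(round(alpha, 4))
```

The two limiting cases show the intended behaviour: fully independent measures count in full, while perfectly correlated measures collapse to a single test.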

Effect size is reported as the absolute effect size (the difference in mean between the two copy number groups in the t test, which in this case equals Cohen’s d because the standard deviation of the normalized brain measures is one) and as the estimate of beta in the linear regression. Plots were generated using the R library ggplot2 v2.2.1 (ref. 51). Regional cortical visualization was done with the R package ggseg v1.5.1.
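The two effect-size measures described above can be sketched as follows in Python. The data are synthetic toy values, not study data, and the function names are hypothetical; the copy number coding (deletion = 1, non-carrier = 2, duplication = 3) follows the text. The mean difference is only interpretable as Cohen’s d under the stated condition that the normalized measure has a standard deviation of one.

```python
from statistics import mean


def dosage_beta(copy_number, measure):
    """Ordinary least-squares slope of the brain measure on copy number."""
    x_bar = mean(copy_number)
    y_bar = mean(measure)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(copy_number, measure))
    sxx = sum((x - x_bar) ** 2 for x in copy_number)
    return sxy / sxx


def mean_difference(group, reference):
    """Difference in group means (group minus reference)."""
    return mean(group) - mean(reference)


# Synthetic normalized brain measure for 2 deletion carriers, 4 non-carriers
# and 2 duplication carriers (toy values for illustration only).
copy_number = [1, 1, 2, 2, 2, 2, 3, 3]
measure = [-1.5, -1.0, -0.25, 0.25, 0.0, 0.0, 1.0, 1.5]

deletion = [y for x, y in zip(copy_number, measure) if x == 1]
non_carrier = [y for x, y in zip(copy_number, measure) if x == 2]
duplication = [y for x, y in zip(copy_number, measure) if x == 3]

beta = dosage_beta(copy_number, measure)
del_vs_nc = mean_difference(deletion, non_carrier)
dup_vs_nc = mean_difference(duplication, non_carrier)
print(beta, del_vs_nc, dup_vs_nc)
```

In this toy example the slope and the two group differences have the same magnitude because the deletion and duplication groups sit symmetrically around the non-carriers; in real data the dosage effect can be driven mainly by one carrier group, as the Results describe.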

In a separate analysis, the independent Icelandic data were processed and analysed in the same way as the main dataset. We meta-analysed the results using the R package metafor v2.0.0, as done previously15.
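For readers unfamiliar with meta-analysis, the basic idea of combining two effect estimates can be sketched with inverse-variance weighting. The text does not state which model was fitted in metafor, so the common-effect (fixed-effect) pooling shown here is an assumption chosen for simplicity, and the estimates and standard errors are hypothetical numbers rather than results from the study.

```python
from math import sqrt


def inverse_variance_pool(estimates, standard_errors):
    """Common-effect pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    return pooled, pooled_se


# Hypothetical effect estimates from a large main sample and a small replication sample.
estimates = [1.0, 3.0]
standard_errors = [1.0, 0.5]

pooled, pooled_se = inverse_variance_pool(estimates, standard_errors)
print(pooled, pooled_se)
```

Each estimate is weighted by the inverse of its variance, so the more precise sample dominates the pooled value. A random-effects model would additionally estimate between-sample heterogeneity, which this sketch leaves out.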

Cognitive task performance data

We downloaded behavioural performance measures from the UK Biobank repository for seven cognitive tests, each performed by at least 10% of the participants (the pairs matching task, the reaction time task, reasoning and problem-solving tests, the digit span test, the symbol digit substitution test and the trail making A and B tests). The results were processed following the general approach by Kendall et al.16. For more details, see Supplementary Note 6. For the analysis of the seven cognitive measures, we set the significance threshold to α = 0.05/7 = 0.007.

Mediation analysis

Mediation analyses were done with the R package mediation v4.4.7. Brain measures were normalized as described above and cognitive tasks were corrected for age, age² and sex prior to input into the analysis. We report the proportion of the total effect of the CNV on cognitive task performance mediated by the brain measures (‘path ab’/‘path c’), with P values calculated through quasi-Bayesian approximation using 5000 simulations. We set the significance threshold at α = 0.05/((2 + 4) × 6) = 1.4 × 10⁻³ given the test of two structures for deletion and four for duplication carriers on six cognitive tests. The digit span test was excluded since no 1q21.1 CNV carriers had results from both this cognitive test and brain structural data.
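The reported quantity, the proportion of the total effect that is mediated (‘path ab’/‘path c’), can be illustrated with a product-of-coefficients sketch for linear models. This is a simplified stand-in for the model-based approach of the R package: it yields only a point estimate and does not reproduce the quasi-Bayesian simulation used for the P values. The data are synthetic and the function names are hypothetical.

```python
from statistics import mean


def _centered(values):
    centre = mean(values)
    return [v - centre for v in values]


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    xc, yc = _centered(x), _centered(y)
    return sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)


def two_predictor_slopes(x1, x2, y):
    """OLS coefficients of y on x1 and x2 (with intercept), via the normal equations."""
    a, b, c = _centered(x1), _centered(x2), _centered(y)
    s11 = sum(u * u for u in a)
    s22 = sum(u * u for u in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * v for u, v in zip(a, c))
    s2y = sum(u * v for u, v in zip(b, c))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def proportion_mediated(carrier, brain, cognition):
    """Point estimate of (path a * path b) / path c."""
    path_a = slope(carrier, brain)                               # CNV -> brain measure
    _direct, path_b = two_predictor_slopes(carrier, brain, cognition)  # brain -> cognition, given CNV
    path_c = slope(carrier, cognition)                           # total effect of CNV on cognition
    return path_a * path_b / path_c


# Synthetic data: 4 non-carriers (0) and 4 carriers (1); cognition = brain - carrier.
carrier = [0, 0, 0, 0, 1, 1, 1, 1]
brain = [1.0, 2.0, 3.0, 2.0, 0.0, 1.0, 0.0, -1.0]
cognition = [b - c for b, c in zip(brain, carrier)]

print(proportion_mediated(carrier, brain, cognition))

# Significance threshold quoted in the text: (2 + 4) structures x 6 tests.
alpha_mediation = 0.05 / ((2 + 4) * 6)
print(round(alpha_mediation, 4))
```

In the synthetic data, carriers score three units lower on the cognitive measure, of which two units run through the lower brain measure, so about two thirds of the total effect is mediated.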

Results

Sample characteristics

The main 1q21.1 distal (146.5–147.4 Mb, hg19) brain structural dataset consisted of 28 deletion and 22 duplication carriers and 37,088 non-carriers (derived from the same scanner sites as the CNV carriers) from ENIGMA-CNV and UK Biobank (Table 1, separate demographics in Supplementary Table 9). The age of CNV carriers was lower (41.7 ± 19.0 (deletions), 55.4 ± 12.7 (duplications), respectively) than that of non-carriers (61.1 ± 12.1) (Table 1). Eleven deletion carriers and seven duplication carriers had a known neurological, neurodevelopmental or psychiatric diagnosis or had been recruited in a clinical CNV study. The remaining carriers either did not have an established diagnosis or were recruited in studies from which diagnostic information was unavailable (Table 1 and Supplementary Table 10). Of the 37,088 non-carriers, 6.5% (2425) had an established neurological, neurodevelopmental or psychiatric disorder.

1q21.1 distal CNV associated with global cortical surface structures

For our main dataset, there was a significant positive association between the number of 1q21.1 distal copies and ICV (β = 1.47, P = 2.8 × 10⁻²⁵) as well as cortical surface area (β = 0.81, P = 1.1 × 10⁻⁸) (Fig. 1 and Supplementary Table 5) at a significance threshold of P < 0.0014 after correction for age, age², sex, scanner site and ICV. In contrast, a significant negative copy number dosage effect was identified for the caudate (β = −0.49, P = 6.9 × 10⁻⁴) and hippocampal volumes (β = −0.56, P = 1.3 × 10⁻⁴). T tests indicated a decrease in ICV for deletion carriers (Cohen’s d = −1.84 (−17%), P = 1.6 × 10⁻²²) and an increase for duplication carriers (Cohen’s d = 0.90 (+10%), P = 2.3 × 10⁻⁵) compared to non-carriers (Supplementary Table 6). For a raw value plot of ICV, see Supplementary Fig. 3. The cortical surface area dosage effect was primarily driven by the deletion carriers, who had a significantly lower total cortical surface area (Cohen’s d = −1.13 (−23%), P = 2.1 × 10⁻⁹), and the dosage effect on caudate and hippocampus was primarily driven by duplication carriers, who had significantly smaller caudate (Cohen’s d = −0.71 (−16%), P = 0.0012) and hippocampal (Cohen’s d = −0.92 (−15%), P = 4.1 × 10⁻⁵) volumes than non-carriers (Fig. 1 and Supplementary Table 6). Adding an independent Icelandic dataset with two deletions, five duplications and 1150 non-carriers (Table 1) in a meta-analysis strengthened the majority of the dosage results (Supplementary Fig. 4 and Supplementary Tables 11 and 12) and revealed additional significant between-group differences in nucleus accumbens, caudate and putamen (Supplementary Table 12).

Fig. 1: Cortical surface area and ICV show a positive dosage effect and caudate and hippocampus a negative dosage effect to copy number in the 1q21.1 distal region in our main sample (ENIGMA-CNV and UK Biobank).

Boxplots of subcortical volumes, cortical surface area, mean cortical thickness and ICV are shown. Deletion carriers (del) are shown in red, non-carriers (nc) in grey and duplication carriers (dup) in blue. The normalized brain values are presented. Copy number dosage effect is noted at the bottom of each panel. Significant differences between groups after correction are noted as *P < 0.0014, **P < 0.00014, ***P < 0.000014. The centre line represents the median, box limits the upper and lower quartiles, whiskers 1.5 times the interquartile range and points the outliers. All analyses were corrected for age, age squared, sex, scanner site and ICV (except for the ICV analysis).

A number of sensitivity analyses were run on the main dataset, namely:

  (a) matching each carrier with one non-carrier for age, sex, scanner site and ICV, or for age, sex and scanner site;

  (b) including only: (i) non-affected individuals (i.e. excluding individuals with a known neurodevelopmental or neurological disorder diagnosis); (ii) adults (age ≥ 18); (iii) non-affected adults; (iv) children (age < 18); (v) ENIGMA-CNV or (vi) UK Biobank;

  (c) controlling for ancestry;

  (d) excluding ICV as a covariate; or

  (e) including first- and second-degree relatives (see Supplementary Note 5 for methods).

These analyses validated the overall effects (Supplementary Tables 5 and 6).

The 1q21.1 distal CNV is associated with regional brain structures

The largest dosage effects for the regional cortical surface area were found in the frontal lobes followed by the cingulate cortex—with additional significant effects in three regions of the parietal and temporal lobes (Fig. 2 and Supplementary Table 7). Likewise, through t tests, the largest effects in both deletion and duplication carriers in comparison to non-carriers were observed in the frontal and cingulate cortices (Fig. 2 and Supplementary Table 8).

Fig. 2: Results from the t tests and linear regression of 1q21.1 copy number variation on regional cortical surface area and cortical thickness.

First and third rows: Effect sizes (Cohen’s d for the t tests, beta coefficient for the dosage/linear regression). Second and fourth rows: Statistical significance as −log10 of the P value. Significant areas in rows 1 and 3 are marked with black lines with increasing thickness for increasing significance (P < 0.0014). The column names indicate the comparisons with del = deletion carriers, nc = non-carriers, dup = duplication carriers. All measures were corrected for age, age², sex, scanner site and ICV.

For regional cortical mean thickness, we identified significant negative dosage effects in the superior temporal region and significant positive dosage effects for the pericalcarine region (Fig. 2 and Supplementary Tables 7 and 8). Similarly, significant increases in mean cortical thickness were observed in deletion carriers versus non-carriers in the pars triangularis and superior temporal regions and a significant decrease in the pericalcarine region (Fig. 2 and Supplementary Table 8). All regional results were corrected for age, age², sex, scanner site and ICV. Sensitivity analyses similar to those performed for subcortical regions confirmed the robustness of the results (Supplementary Tables 7 and 8).

1q21.1 distal CNV associated with cognitive performance and mediation by brain structures

Deletion and duplication carriers had different cognitive profiles in comparison to non-carriers when testing for association in seven different neuropsychological tests available from the full UK Biobank sample: deletion carriers had significantly poorer performance in three tests: symbol digit substitution, trail making B and pairs matching, while duplication carriers had significantly poorer performance in two tests: reaction time and the reasoning and problem-solving task (Table 2).

Table 2 1q21.1 CNV deletion and duplication carriers show deficits in specific cognitive functions.

Testing the effect of brain structures on cognitive tests in UK Biobank participants, we found that larger ICV and total surface area were associated with better performance on almost all tests (Table 3; see Supplementary Table 13 for sample size details). A larger hippocampus was associated with better performance on the symbol digit substitution and trail making A and B tests (Table 3), and a larger caudate was associated with better performance on the trail making A test (Table 3).

Table 3 Mediation analysis of brain structures over the association between 1q21.1 distal CNV carrier status and performance in the cognitive tasks in the UK Biobank.

Next, we tested whether the brain structures significantly associated with 1q21.1 distal CNV carriers might mediate the effect of the CNV on cognition. For two of the three tests associated with deletion carrier status, there were significant mediation effects (significance threshold 1.4 × 10⁻³): cortical surface area and ICV accounted for 5 and 10%, respectively, of the poorer performance of deletion carriers on symbol digit substitution, and 7 and 17%, respectively, of their poorer performance on the trail making B test (Table 3).

Discussion

Our main finding was a significant positive dosage effect in humans of 1q21.1 distal copy number on ICV and cortical surface area, with the largest differences in frontal and cingulate cortical surface area. We also identified a significant negative dosage effect on caudate and hippocampal volumes. A number of sensitivity analyses confirmed the robustness of the results. Both 1q21.1 distal deletion and duplication carriers showed poorer cognitive performance, although on different tests, with an indication that decreased ICV/cortical surface area might mediate the effect in deletion carriers.

The 1q21.1 distal CNV causes copy dosage effect on brain structures

We found a strong effect of the 1q21.1 distal CNV on the total cortical surface area, while no overall effect on mean cortical thickness was observed. A specific increase in the size of the cortical surface area with little effect on cortical thickness is observed throughout mammalian evolution including the primate lineage leading to humans52. This possibly reflects that cortical thickness and surface area appear to be driven by distinct genetic processes53. This pattern may be the result of an increased number of symmetric or self-renewing cell-division cycles, leading to an expansion of the neural progenitor pool and subsequently to an increase in the number of cortical neurons, in line with the radial unit hypothesis52. Interestingly, although not significant, mean cortical thickness tended to increase in deletion carriers in the frontal cortical surface areas with the highest effect sizes, resembling a pattern found in lissencephaly54. This could suggest that large regional decreases in cortical surface area correlate inversely with mean cortical thickness.

The biomechanical forces of brain growth are thought to drive the expansion of the cranium so that the skull grows in harmony with the expanding brain55. Thus, the positive copy number dosage effect on cortical surface area may directly trigger the effect on head circumference19,20,21 and ICV of 1q21.1 distal carriers due to modifications in pressure. Altered mechanical pressure might also cause the negative copy number dosage effect on the hippocampus and caudate volumes; such effects on subcortical volumes were also observed in a UK Biobank exploratory study on six individuals with a 1q21.1 distal duplication56.

Human-specific genes may affect the cortical surface area and cross-species effects

The positive copy number dosage effect on brain structure with the same direction as for weight and height34,35 likely results from altered gene expression as observed in 1q21.1 distal CNV cell lines48. In an independent experiment on fetal tissue, we also observed dynamic expression patterns of the genes in the 1q21.1 interval consistent with potential roles in cortical neurogenesis and development (Supplementary Note 7 and Supplementary Figs. 5 and 6).

GWAS based on the hg19 genome assembly have not identified hits in the 1q21.1 genomic region for ICV57, total cortical or regional surface area53,58. Assembly of the 1q21.1 region59, and thus gene discovery, is complicated by the presence of numerous low copy number repeats20,43, and the assembly was faulty until the GRCh38 genome assembly. This may explain the lack of GWAS hits in the region.

Candidates for a dosage-dependent amplifier of the CNV-associated brain phenotypes are the recently identified human-specific NOTCH2NL genes that confer delayed neuronal differentiation and increased progenitor self-renewal22,23, in line with the radial unit hypothesis52. The areas with the highest regional effect sizes overlap with the areas of the highest expression of NOTCH2NLA and C in utero22, in concordance with an early developmental effect such as the macrocephaly observed in utero in a 1q21.1 distal duplication carrier38. Our observations of a 2% reduced skull diameter in the 1q21.1 deletion mouse (Supplementary Fig. 7 and Supplementary Notes 8 and 9) and recent findings of decreased total brain volume focused on the temporo-parietal and subcortical areas in the deletion mouse60 suggest that genes shared between humans and mice (nine of ten mouse genes are syntenic to the human region42), and not specific to humans, are also involved in the altered skull and brain morphology. However, although diameter and volume are not directly comparable, the 17% decrease in ICV in human 1q21.1 deletion carriers would still point towards a substantial role of human-specific genes or genes with altered functions in comparison to mice. This underlines the need for additional data to disentangle which specific genes are involved in the skull and brain structural phenotypes. Of note, we also observed shorter bones overall in the 1q21.1 deletion mice (Supplementary Fig. 8 and Supplementary Note 9), expanding on previous head-to-tail length data42, and lower bone mineral density in female mice (Supplementary Fig. 9 and Supplementary Note 9). These findings mirror bone characteristics of human deletion carriers34 and increase the number of observed cross-species effects between the 1q21.1 mice and human 1q21.1 deletion carriers.

1q21.1 distal CNV deletion and duplication carriers show deficits in different cognitive functions

Our findings of widespread lower performance across several tests in different domains for both carrier groups in the volunteer-based UK Biobank sample are in line with cognitive results from a recent study33 and support that cognitive function in CNV carriers largely without a neurodevelopmental diagnosis may still be compromised8,16. Interestingly, the frontal and cingulate regions61, with the greatest cortical effect sizes for distal 1q21.1, correlate particularly with cognitive function and have gone through the greatest expansion during human development and evolution62. Our analyses indicated that the decreases in cognitive task performance are partially mediated by the observed differences in ICV and cortical surface area, reflecting the positive correlation between brain volume and intellectual function in line with previous findings63. The decrease in performance for several cognitive tasks in duplication carriers despite a larger ICV and cortical surface area suggests that the positive correlations may only be applicable within a certain narrower range. Interestingly, recent genetic analysis of NOTCH2NL in archaic and modern humans revealed ongoing adaptive evolution towards a lower dosage of the protein64, suggesting negative effects of excessive NOTCH2NL protein.

Our brain structural findings in 1q21.1 distal CNV carriers overlap with brain alterations in associated disorders, for example ADHD65, autism spectrum disorders66, schizophrenia67, bipolar disorder68, major depressive disorder69 and subtypes of epilepsy70, but the exact overlaps differ between carrier groups. Of note, 1q21.1 distal deletion and duplication carriers display directly opposite effects on several brain structures, while being at risk for the same neurodevelopmental diseases. Other pathogenic CNVs also display overlapping disease risk and similar opposite copy number effects6,8,9,10,11,12,13,14,15, including effects on the cortical surface area in 22q11.2 and 16p11.2 proximal CNV carriers6,12,13,14. These CNVs impact different genes, but may converge on the same downstream pathways altering cortical surface area formation, similar to what has been reported for behavioural and neurocognitive phenotypes28.

This also suggests that other risk factors interplay to cause disease. It also supports that subgroups within neurodevelopmental disorders can be defined based on genetic profile and brain structural differences.

We demonstrate large effects of 1q21.1 distal CNVs on brain structure and cognition in humans including a mediation effect. These findings provide insight into molecular mechanisms involved in critical stages of human brain development and mapping of gene dosages to brain structural fingerprints.