Introduction

Genetic variants are associated with neurodevelopmental disorders across conventional diagnostic classifications. This is evidenced both by common variants with low penetrance and by rarer but highly penetrant copy number variants (CNVs)1,2. The mechanisms through which these genetic variants affect brain and behaviour are poorly understood, but genetic imaging studies in people selected for polygenic risk3 or CNV carriers4 can elucidate changes in brain development, structure and function that are not confounded by secondary disease effects. Because of their high penetrance, CNVs are particularly suited to translational studies elucidating disease mechanisms. However, rarity of CNVs makes it difficult to investigate their biological effects in humans. For most of the pathogenic CNVs identified for schizophrenia and developmental delay, there is no information about neuroimaging correlates beyond small case series. Most neuroimaging studies of these CNVs have assessed effects of the 22q11.2 deletion (reviewed in5,6,7,8). Neuroimaging studies in carriers of other CNVs have revealed highly heterogeneous changes in brain structure across CNVs (Table 1).

Table 1 Summary of neuroimaging findings in targeted CNVs

Most findings relate to brain morphology (measured using regional volumes, cortical thickness, surface area etc.), with few studies reporting microstructural difference, usually measured with diffusion tensor metrics such as fractional anisotropy (FA) and mean diffusivity (MD). Although some of these findings corroborate those for non-CNV neuropsychiatric patients (e.g. structural correlates in 15q11.2 deletion carriers are similar to those found in dyslexia9), in most cases it is difficult to corroborate findings in CNV patients with particular clinical features.

Our survey of the neuroimaging literature in CNV carriers (Table 1) attests to the difficulty of collecting substantial quantitative data and drawing consistent conclusions about morphological and microstructural brain alterations in carriers from the literature, particularly of the rarer CNVs. We therefore conducted an analysis across a cohort of carriers of different CNVs, looking for features that correlate with the propensity of these CNVs to contribute to neuropsychiatric illness. We adopt a novel approach to characterising brain features in high-risk CNV carriers using penetrance scores previously calculated from a large cohort of neuropsychiatric CNV patients10. This approach allows us to take account of the degree of pathogenicity of a genetic variant, testing the hypothesis that clinically more penetrant variants will also be associated with more salient neurodevelopmental changes as detected on neuroimaging. This method has the advantage over single-CNV studies that it enables the detection of convergent pathways common to several genetic variants, which are putatively most directly related to the pathophysiology of the associated diseases.

We explore structural brain phenotypes derived from neuroimaging data. These relate to macroscopic structure of cortex, including volume, surface area and cortical thickness, and the size of subcortical regions using T1-weighted structural MRI. We also quantify indices of white-matter microstructure and morphology using diffusion MRI. We quantify tract volume and tract shape using a novel approach that extracts principal modes of feature variation of streamline shape11. We also quantify various microstructural parameters within the principal white-matter pathways using metrics derived from diffusion tensor imaging (DTI), measures of axon density and dispersion using the neurite orientation dispersion and density imaging (NODDI)12 method and T1 relaxometry which provides a putative index of myelination13.

In addition to examining each of these features separately, we also use principal component analysis (PCA) to identify any components across all these neuroanatomical features that show a strong effect of disease penetrance. The advantage of this approach is that it can highlight any prevalent components across correlated features that are not necessarily apparent when examining individual features separately.

Methods

Participants

The study was approved by the South Wales Research Ethics Committee and the Cardiff University School of Psychology Ethics Committee. All participants provided written informed consent.

MRI data were obtained from 21 CNV carriers and 15 controls. CNVs (Table 2) were targeted for their high association with the development of schizophrenia and developmental disorder. Patients were recruited from NHS genetics clinics within the UK and through information disseminated by relevant support groups to their members. Summary information on clinical features and intelligence for this cohort are provided in the Supplementary Information (SI, section 1). Exclusion criteria included any contraindication to MRI.

Table 2 Demographic data for CNV and control participants

Controls were recruited via a local panel of genotyped volunteers. Control participants were chosen to match the age and gender of the CNV patients where possible. Criteria for inclusion were having no history of neurological or psychiatric disorders in addition to the general screening for MRI contraindications and exclusion of any of the pathogenic CNVs.

Genotyping

To confirm CNV status, all patients and controls were genotyped using the Illumina HumanCoreExome whole genome SNP array that contains an additional 27,000 genetic variants at loci that had been previously implicated in neurological and psychiatric disease, which included CNVs. After processing the raw intensity data using Illumina Genome Studio software (version 2011.1), log R ratios and B allele frequencies were used to call CNVs using PennCNV (version 1.0.3)14. CNV coordinates are according to genome version hg19. CNVs were called if they spanned at least 10 informative SNPs and were joined if the distance between them was less than 50% of their combined length. CNVs were excluded if they were less than 10 kb in size, overlapped low copy repeats by more than 50% of their length, or had a probe density of less than 1 probe/20 kb. The Log R ratio and B allele frequency plots at each of the genomic regions of interest (chr1:146,527,987–147,394,444, chr2:50145643–51259674, chr3:195,720,167–197,354,826, chr15:22,805,313–23,094,530, chr15:31,080,645–32,462,776, chr16:28,823,196–29,046,783, chr16:29,650,840–30,200,773, chr17:29,107,491–30,265,075, chr22:19,037,332–21,466,726) were also manually inspected in order to confirm the presence of the CNV.

MRI acquisition

The collection and analysis of all MRI measures used in this study are summarised in Fig. 1. All MRI data were acquired on a 3 T General Electric HDx MRI system (GE Medical Systems, Milwaukee, WI) using an eight-channel receive-only head RF coil.

Fig. 1: Flowchart of MRI data processing pipeline.
figure 1

Red boxes show MRI acquisition steps. Green boxes show image registration steps. Purple boxes show main data processing steps. Blue boxes represent final derived imaging variables

Structural

T1-weighted structural images were acquired with a 3D fast spoiled gradient echo (FSPGR) sequence (TR = 7.8 ms, TE = 3.0 ms, voxel size = 1 mm³ isomorphic).

Diffusion

A cardiac-gated diffusion-weighted spin-echo echo-planar imaging sequence was used to acquire high angular resolution diffusion-weighted images (HARDI)15. Sixty gradient orientations at b = 2000 s/mm2, and 30 directions at b = 1200 s/mm2, and 6 unweighted (b = 0 s/mm2) images were acquired with the following parameters: TE = 87 ms, 60 slices, slice thickness = 2.4 mm, FoV = 230 × 230 mm, Acquisition matrix = 96 × 96, resulting in data acquired with a 2.4 × 2.4 × 2.4 mm isotropic resolution. This was followed by zero-filling to a 128 × 128, in-plane matrix for the fast Fourier transform. The final image resolution was therefore 1.8 × 1.8 × 2.4 mm.

Relaxometry

A series of spoiled gradient echo (SPGR) images was acquired with 8 flip angles plus an additional inversion-recovery (IR) SPGR image. All images had TE = 2.11 ms and TR = 4.7 ms. SPGR images were acquired with flip angles of 3°, 4°, 5°, 6°, 7°, 9°, 13° and 18°. For the IR-SPGR acquisition, Inversion time = 450 ms and flip angle = 5°.

Grey-matter morphometry

Cortical reconstruction and volumetric segmentation were obtained from the T1-weighted structural images using Freesurfer (http://surfer.nmr.mgh.harvard.edu/)16. The technical details of these procedures are described in prior publications17,18,19,20,21,22,23. Grey matter was registered and parcelated to the Desikan–Killiany Atlas24 (40 regions) and measures of volume, surface area, thickness and curvature were generated.

Relaxometry pre-processing and analysis

Relaxometry data were pre-processed using FSL v5.025. All SPGR/IR-SPGR images were coregistered to each other using a rigid affine transform and skull-stripped26. Relaxation rate (R1 = 1/T1) maps were derived using Driven Equilibrium Single Pulse Observation of T1 with High-Speed Incorporation of RF Field Inhomogeneities (DESPOT1-HIFI)13, which incorporates correction for B1 field inhomogeneities with in-house code. A synthetic T1-weighted image was computed from the quantitative T1 map with contrast matching that of the FSPGR image. This was used as a reference for transforming the R1 (maps, which were then warped to the T1-weighted space.

Diffusion MRI pre-processing

HARDI data were pre-processed in ExploreDTI v4.8.327. Data were corrected for motion, eddy currents and field inhomogeneities prior to tractography. Motion artefacts and eddy current distortions were corrected with B-matrix rotation using the approach of ref. 28. A comparison of subject motion between the two groups is shown in SI, section 2.

Field inhomogeneities were corrected using the approach of29. DWIs were non-linearly warped to the T1-weighted image using the fractional anisotropy map from the DWIs as a reference. Warps were computed using Elastix30 using normalised mutual information cost function and constraining deformations to the phase-encoding direction. The corrected DWIs are therefore in the same (undistorted) space as the T1-weighted structural images.

DTI analysis

DTI fitting was performed using ExploreDTI v4.8.3. The corrected HARDI data from the b = 1200 s/mm2 shell were fitted to the diffusion tensor (DT) and corrected for CSF-partial volume effects31 was applied to the DTs. The b = 1200 s/mm2 shell was used as this is the domain in which the DT representation applies. The fractional anisotropy (FA), mean (MD), axial (AD) and radial (RD) diffusivities then computed from the DT. Intra-scan head motion was quantified and assessed for potential impact on subsequent statistics (see SI, section 2).

NODDI (neurite orientation dispersion and density imaging) analysis

NODDI12 was performed using both the NODDI toolbox (https://www.nitrc.org/projects/noddi_toolbox/) the b = 1200 s/mm2 and b = 2000 s/mm2 diffusion shells using the NODDI toolbox v0.9. NODDI yields three parameters that describe the microstructure of the tissue in each voxel, intracellular volume fraction (ICVF), isotropic fraction (ISOF) and orientation dispersion index (ODI).

Tractography

Whole-brain tractography was performed using the damped Richardson–Lucy algorithm using in-house MATLAB code32. This is a modified spherical deconvolution method which is more robust to spurious peaks in the fibre orientation distribution (FOD) than standard spherical deconvolution methods33. RESDORE34 was also applied to remove corrupted voxels from the FOD calculation. The tractography algorithm used is that of Basser et al.35, which uses a uniform step size. Seed points were arranged in a 3 × 3 × 3 mm grid in white matter, step size = 1 mm, angle threshold = 45°, length threshold = 20–500 mm, FOD threshold = 0.05, β = 1.77, λ = 0.0019, η = 0.04, number of iterations = 200 (See ref. 32 for full details of these parameters).

Additional anatomical constraints were introduced to ensure minimal contamination from spurious streamline trajectories through grey matter. A segmentation of the T1-weighted images was performed using FSL-FAST and was used to apply a mask to the streamlines, such that streamlines were forced to terminate when they entered grey matter. There were no explicit masking of CSF, however, the termination criteria used for the dRL algorithm, which is based on the amplitude of the FOD peak, ensures that no streamlines enter isotropic regions such as CSF.

Tract segmentation and shape analysis

Tract shape analyses was performed to quantify the shapes of white-matter pathways using in-house MATLAB code11. Subject whole-brain tractography results were affinely (registered to a standard MNI template (preserving shape but eliminating variance in position/orientation). Streamlines were then re-parameterised to 30 knot-points (spline interpolation), translated to the origin and reduced to a feature vector through coordinate concatenation. PCA was then applied to determine principal modes of feature vector variation and, by extension, variation of streamline shape. Following decomposition of the feature vectors onto the first seven PCA eigenvectors (representing a set of streamline shape basis functions encapsulating ~95% of observed shape variation), the resultant weight vectors were clustered (k-means, k = 800) and, for each subject, histograms of streamline cluster membership recorded.

Streamlines were segmented into the following 19 tract bundles specific pathways, based on the descriptors computed from previously generated statistical models: bilateral arcuate, uncinate, inferior longitudinal, superior longitudinal, fronto-occipital fasciculi, cingulum (dorsal and parahippocampal parts), fornix (left and right branches) and corpus callosum (splenium, body and genu). This yielded a total of 194 shape descriptors. In addition to shape, volume of each bundle was also computed by counting voxels traversed by streamlines in each bundle and normalising to voxel volume. All nine microstructural measurements (from DTI, relaxometry and NODDI) were registered to streamline points in each bundle and the median value taken for each bundle. This yields a total of 190 (19 bundles × 10 variables) microstructural variables.

CNV penetrance scores

To assess the contribution of genetic loading for each CNV for psychopathology, and to accommodate the small sample size of each individual CNV within the cohort, penetrance cores were used for statistical analysis. The CNV penetrance scores used in the current study correspond to the probability of manifesting a given phenotype in carriers of that CNV. Estimates of CNV penetrance for the development of schizophrenia and ID/DD/ASD were obtained from Kirov et al.10. These authors used CNV data from large cohorts of patients diagnosed with schizophrenia, ID/DD/ASD and non-psychiatric controls, to estimate the rate of specific neuropsychiatric CNVs among these disease and non-disease populations, and then estimated CNV penetrance as the probability of carrying a specific CNV given disease status (i.e. rate of the CNV in schizophrenia or ID/DD/ASD), divided by the probability of carrying the CNV in the general population (which includes both case and control populations). Full details on how CNV penetrance scores were calculated can be found in ref. 10.

Statistical analysis

To assess the contribution of genetic loading for each CNV for psychopathology, and to accommodate the small sample size of each individual CNV within the cohort, each participant was assigned a penetrance score for schizophrenia and developmental delay, as previously computed in Kirov et al.10.

A general linear model was applied to identify relationships between the penetrance scores and the imaging variable of interest. Age and gender were included as covariates. Furthermore, for volumetric measures (including tract volume), total brain volume was treated as a covariate and intra-scan head motion was included as a covariate for all diffusion-derived measures. The correlation between the two penetrance scores was also computed.

Multiple comparisons were corrected with permutation testing (5000 iterations) correcting across all observations within the same imaging variable. Permutation correction will also control for any distributions that are non-Gaussian. In the case of microstructural measurements, this is corrected across the 19 fibre bundles for each microstructural variable. For the shape descriptors, the correction is applied across the 194 shape descriptors. For macrostructural morphometry measures, correction was applied across the 40 atlas regions for each macrostructural measurement. To verify effects are due to penetrance and not simply for the presence of CNVs, equivalent analysis was performed using only a binary classification of CNV carriers vs controls. To assess contributions of individual CNVs which may be driving any significant associates identified, the statistical analysis was repeated with individual CNVs omitted and the change in effect size quantified (see SI, section 3).

To identify multivariate features in the data, a principal component analysis (PCA) of all imaging variables was performed. The components were subject to the same statistical analysis as the variables.

Results

Relation between penetrance scores

There was a significant correlation between the penetrance scores PSz and PDD (ρ = 0.77, p = 7.7 × 10–9).

White-matter morphology

Shape analysis revealed significant alteration of one component of the left cingulum bundle that was significantly correlated with both PSz (t = 4.195, p = 2.11 × 10–4, pcorr = 0.026) and PDD (t = 4.06, p = 3.1 × 10−5, pcorr = 0.035). This component reflects the curvature of the dorsal cingulum along the anterior-posterior axis (Fig. 2) with higher penetrance associated with greater curvature of the dorsal cingulum. The same effect was seen in the corresponding component of the right dorsal cingulum with PDD (t = −5.60, p = 1.81 × 10−6, pcorr = 8.00 × 10−4). A strong effect was also observed for PSz, but this did not survive permutation correction (t = −3.601, p = 0.001, pcorr = 0.11). In the binary comparison of CNV carriers vs controls, these effects were non-significant (left cingulum: t = 1.41, p = 0.17; right cingulum: t = −1.94, p = 0.062). Note that the sign of the descriptor from the analysis is arbitrary, so although the effect was in the opposite direction in the right compared to the left, the modes of variation also are also reversed between left and right. Therefore both left and right cingulum bundles are showing the same geometric variation—greater curvature associated with higher penetrance. No other tract shape descriptors showed any significant effect although there were marginal effects of PSz on shape in the right parahippocampal cingulum and cortico-spinal tract (see SI, section 4 for full results).

Fig. 2: Effects of penetrance on cingulum moephology.
figure 2

Scatterplots of the left (a) and right (d) homologous shape descriptors in cingulum bundles against PSz and PDD, with associated regression lines (note the sign of the shape descriptor in the right cingulum was flipped for clarity). Shape descriptors and examples of segmented tracts for left cingulum (b, c) and right cingulum (e, f). Shape descriptors (b, e) show the mode of variation within the maximum (left) and minimum (right) range observed. Example tracts (c, f) are shown for a patient (22q11.2 deletion) with high PDD (left) and a typical control (right)

Tract volumes show significant effects of PSz on the right cortico-spinal tract (t = −3.163, p = 0.0035, pcorr = 0.030) and a marginal effect on the right uncinate fasiculus (t = −2.881, p = 0.0072, pcorr = 0.057). PDD was associated with lower volumes of the right cingulum (t = −3.698, p = 8.4 × 10−4, pcorr = 0.0094) and the right uncinate fasiculus (t = −3.617, p = 0.0010, pcorr = 0.011). Right cortico-spinal tract (t = −2.605, p = 0.014) and right uncinate fasiculus (t = −2.093, p = 0.014) also show some effects using the binary comparison, indicating significant group differences.

White-matter microstructure

There were significant correlations with both penetrance scores with ICVF in the left cingulum (Sz: t = −3.258, p = 0.003, pcorr = 0.019; DD: t = −2.644, p = 0.013, pcorr = 0.064). There was also a significant association between PSz and FA (Sz: t = −3.290, p = 0.002, pcorr = 0.017). No significant associated of diffusivity measures were observed, but the general pattern is that penetrance scores show a negative relationship with FA and AD and positive relationship with MD and RD. These findings are summarised in Fig. 3. Other findings relating to microstructural metrics were a significant positive relationship between PSz and ODI in left and right inferior fronto-occipital fasciculi (left: t = 3.757, p = 7.1 × 10−4, pcorr = 0.0072; right: t = 4.034, p = 3.3 × 10−4, pcorr = 0.0042). No effects were observed for R1 (see SI section 4 for full results).

Fig. 3: Effects of penetrance on cingulum microstructure.
figure 3

Scatterplots of various microstructural measures in the left (1st and 2nd row) and right (3rd an 4th row) cingulum bundles against penetrance (Sz on 1st and 3rd row, DD on 2nd and 4th rows). Linear regression fit lines are shown where effects are found to be significant or close-to-significant

Other structural measures

Comparison of whole-brain volume showed no significant effects (Sz: t = 0.392, p = 0.697; DD: t = 0.722, p = 0.475). No other grey-matter morphometric or in gross brain morphometric measures showed any significant association (see SI, section 4 for full results).

Principal component analysis

The first 34 components were tested for effects of penetrance. Only the 8th largest component (PC8) was found to be significantly associated with PDD (t = −3.033, p = 0.005, pcorr = 0.027) and marginally significant for PSz (t = −2.560, p = 0.016, pcorr = 0.055). This component shows no significant effect when tested with the binary CNV model (t = −2.431, p = 0.021, pcorr = 0.324). This component is heavily weighted by variables relating to white-matter volume (Fig. 4). The two largest anatomical regions with the highest weighting on this component are the body and the splenium of the corpus callosum. These areas were weighted in opposite directions, which would be compatible with the altered curvature observed for the risk scores (see above, Fig. 2).

Fig. 4
figure 4

Effects of penetrance on principle component PC8. a Scatterplots of PC8 against PSz and PDD. b Weighting of each imaging variable in PC8, showing the top 25 weightings. Blue bars indicate positive weighting, red bars indicate negative weighting. c Volumetric change associated with white-matter structures strongly represented in PC8 rendered on the JHU atlas, with positive values corresponding to larger volumes for low penetrance (or smaller volume for higher penetrance) and negative values corresponding to smaller volumes for low penetrance (or larger volume for lower penetrance). d Relative volumes of 3 segments of corpus callosum for extreme cases of high and low component weight

Further examination of the corpus callosum volumes, show the body is relatively smaller in high-penetrance cases (low PC8 weight) compared to low-penetrance cases (high PC8 weight). A post-hoc test was performed on the ratio of volumes between the CC body and splenium that confirms negative correlation with both penetrance scores (PSz: t = −2.193, p = 0.036; PDD; t = −2.931, p = 0.006)

Discussion

In this study we investigated morphological and microstructural alterations associated with CNVs with high penetrance for schizophrenia and developmental delay. We utilised a novel approach to characterising morphology of white-matter fibre bundles in combination with more traditional metrics of brain morphology and microstructure to identify features and principal components of these features that characterise neuropsychiatric CNVs. We found altered morphology in the left and right cingulum. We also found a significant reduction in FA and ICVF in this structure, and our preliminary interpretation of this result is that it reflects a reduction in axon density in this pathway rather than myelin because no effects of relaxometry measures were observed (although it should be noted this may be due to comparatively lower sensitivity of R1 compared to DTI metrics36). There was also an overall positive trend of increased diffusivity with increased penetrance, which is also consistent with a decrease in fibre density in high-penetrance individuals.

A converging finding comes from the PCA analysis, in which PC8 shows a strong effect of penetrance. This component is heavily weighted by midline white-matter structures, in particular, the three components of the corpus callosum. Many of these volumes did not show effects when tested directly, which is likely due to discordant signs between the weighting of features within the component. Most prominently, the splenium has opposite weighting to the body and genu of the corpus callosum. This suggests that rather than there being absolute volumetric effects of penetrance, penetrance affects the volumetric interrelationships between these structures. The corpus callosum is of particular interest, because the altered interrelationships in the three segments suggest alteration of the arrangement of fibres along the AP axis. This is consistent with the increase in curvature of the cingulum along the AP axis, which wraps dorsally around the corpus callosum. We obtained these findings in the absence of any effects of gross brain morphology. There is no apparent relationship between the overall brain size or shape and penetrance. Therefore, it appears that these high-risk CNVs lead to neurodevelopmental alterations that are associated with penetrance, but manifest more subtly than in gross brain morphology. Also of note is that the results of the binary model show weak effects for these metrics, adding further credence to the view that these features are related to CNV penetrance for neuropsychiatric illness rather than simply due to the presence of CNVs.

Altered brain development appears to manifest in the following way: there is increase in forces37 applied parallel to the AP axis. The distribution of axons along the length of the corpus callosum is altered, which causes a change in the forces applied to the cingulum during development. Some findings from 22q11.2 deletion patients appear to corroborate this suggestion, reporting a larger area of the mid-sagittal section of the corpus callosum, in particular the posterior part37,38,39 which corroborates the pattern of fractional corpus callosum volumes observed in this study.

In terms of the mechanisms underlying the cingulum shape/microstructural abnormalities, the embryological literature indicates that the genes affected by the CNVs we have studied are expressed in the developing brain (e.g. for those in 22q11.2 deletion see40). It is therefore likely that alterations in brain morphology will start to occur at this very early stage of development, through direct effects on neurodevelopment or as secondary knock-on effects from abnormal development of the skull or interaction with environmental factors. Abnormal brain morphology can impact on mechanical processes during embryonic brain development, leading to altered structure of the cingulum and consequently morphological differences in the arrangement of fibres along the corpus callosum, as suggested by mechanical models of morphogenesis41. The early formation of neurons has been shown to guide the trajectory of axonal process42. Therefore, we can speculate that early disturbances of head shape can lead to further disruption of axonal processes, causing a reduction in axon density. There is evidence for altered neuronal migration during brain development in both humans and mouse models of 22q11.2 deletion43. Kates et al. even speculate that 22q11.2 deletion is a disorder of axonal guidance as indicated by differential developmental trajectories between 22q11.2 deletion carriers and controls in microstructural measurements44,45.

The changes in cingulum microstructure that we report are largely compatible with previous reports in schizophrenia46,47,48. Similar findings have been observed in 1st degree relatives of schizophrenia patients49 and (for right cingulum) for carriers of the common schizophrenia risk variant at ZNF840A50 and in a sample of adolescents and young adults with 22q11.2 deletion syndrome51, where the most robust finding was AD reduction (without corresponding increases in RD) in the anterior and dorsal cingulum. Although not significant, our results point to a similar trend in cingulum diffusivity measures.

Of note, these cingulum changes are observed in probands who are not affected by schizophrenia, although they do have a range of other psychiatric diagnoses (see Table S1 in SI). This suggests a putative common neurodevelopmental pathway, which is disrupted by a range of genetic variants through downstream effects on genes involved in prenatal brain development and has pleiotropic clinical effects. These features can be regarded as a hallmark of genetic vulnerability to schizophrenia, rather than of schizophrenia per se. We would also suggest that our sample, with its high rates of psychopathology and below-average IQ (see table S1 in SI), would show features of vulnerability rather than resilience. Ultimately a definitive answer to the question of mechanisms of resilience would require a much larger sample with a wider spread of levels of functioning and a larger number of highly functioning carriers without any psychopathology.

There is little discrimination between the effects of the two penetrance scores. All significant effects observed apply to both scores, with a few exceptions (for example, increased fibre dispersion in bilateral inferior fronto-occipital cortices associated with higher penetrance for schizophrenia). This is also evident in the high correlation between penetrance scores. This suggests that the neurodevelopmental indicators observed are not attributable to a particular psychopathology, but rather that these features reflect penetrance for neuropsychiatric illness more generally, a notion which has also been proposed for the clinical phenotype65. However, it should be stressed, due to the comparatively weak effects seen in the binary model, this effect is not simply due to the presence of a CNV.

One concern of the present study the small sample size. This is an unavoidable issue in deep phenotyping research of rare genetic variations. Thus, generalisability of the findings of the present study is somewhat limited, and needs to be confirmed by replication in future studies, ideally through multi-centre collaborations. We do however mitigate issues by taking advantage of the large effect sizes in these patients and by considering variation across a spread of CNVs rather than focusing on single variants. We also need to be conscious of drawbacks of this approach. One is that contributions of individual CNVs cannot be disentangled easily, and the effects observed can be mostly driven by the contribution of a small number of very high-penetrance CNVs (see supplementary analysis in SI, section 3). It should be noted that the highly heterogeneous findings in the literature (summarised in Table 1) suggest that this type of grouped analysis risks overlooking interesting CNV-specific mechanisms.

As noted previously, no effects of brain volume were observed, although previous CNV-specific studies have identified increased55,65 or decreased6163 brain volume. The changes in brain volume appear to reflect CNV-specific features, but ones not necessarily related to risk for schizophrenia or developmental disorders. In contrast, our findings relating to medial white-matter structures implicate a risk mechanism for these disorders that is common across a range of high-risk genetic variants.

In summary, we reveal significant morphological and microstructural features associated with penetrance for neuropsychiatric illness in a cohort of CNV carriers. The most pronounced of these features is in the medial white-matter structures: curvature of the cingulum bundle and volumetric interrelationships between difference segments of the corpus callosum. These features are consistent with a common neurodevelopmental trajectory, which does not manifest in gross brain morphological changes, but in more subtle alterations in the morphological interrelationships in midline white-matter structures. This can lead to downstream effects on cognitive and intellectual impairments that commonly arise as a result of these genetic variants.