Schizophrenia risk from complex variation of complement component 4

Journal name: Nature
Volume: 530
Pages: 177–183
DOI: 10.1038/nature16549

Abstract

Schizophrenia is a heritable brain illness with unknown pathogenic mechanisms. Schizophrenia’s strongest genetic association at a population level involves variation in the major histocompatibility complex (MHC) locus, but the genes and molecular mechanisms accounting for this have been challenging to identify. Here we show that this association arises in part from many structurally diverse alleles of the complement component 4 (C4) genes. We found that these alleles generated widely varying levels of C4A and C4B expression in the brain, with each common C4 allele associating with schizophrenia in proportion to its tendency to generate greater expression of C4A. Human C4 protein localized to neuronal synapses, dendrites, axons, and cell bodies. In mice, C4 mediated synapse elimination during postnatal development. These results implicate excessive complement activity in the development of schizophrenia and may help explain the reduced numbers of synapses in the brains of individuals with schizophrenia.

At a glance

Figures

  1. Figure 1: Structural variation of the complement component 4 (C4) gene.

    a, Location of the C4 genes within the major histocompatibility complex (MHC) locus on human chromosome 6. b, Human C4 exists as two paralogous genes (isotypes), C4A and C4B; the encoded proteins are distinguished at a key site that determines which molecular targets they bind (refs 19, 20). Both C4A and C4B also exist in both long (L) and short (S) forms distinguished by an endogenous retroviral (C4–HERV) sequence in intron 9. c, Structural forms of the C4 locus and their frequencies among a European-ancestry population sample (222 chromosomes from 111 genetically unrelated individuals, HapMap CEU), inferred as described in Extended Data Fig. 2. Asterisks indicate allele frequencies too low to be estimated accurately.

  2. Figure 2: Haplotypes formed by C4 structures and SNPs.

    SNP haplotype(s) on which common C4 structures were present. Each thin horizontal line represents the series of SNP alleles (haplotype) along a 250 kilobase (kb) chromosomal segment. Each column represents a SNP; grey and black indicate which allele is present on each haplotype. The SNP haplotypes are grouped into 13 sets of haplotypes associating with each of the four most common C4 structures. Three C4 structures (AL–BS, AL–BL, and AL–AL) each segregated on multiple SNP haplotypes (numbered at right).

  3. Figure 3: Brain RNA expression of C4A and C4B in relation to copy numbers of C4A, C4B, and the C4–HERV.

    a, b, mRNA expression of C4A (a) and C4B (b) was measured (by ddPCR) in brain tissue from 244 individuals. The copy numbers of C4A, C4B, and the C4–HERV were measured (by ddPCR) in genomic DNA from the brain donors. The results were consistent across 8 panels of brain tissue representing 5 brain regions and 3 distinct sets of donors (one set shown here, with data from 101 individuals; all panels in Extended Data Fig. 4; a few outlier points are beyond the range of these plots but are shown in Extended Data Fig. 4.) P values were obtained by a Spearman rank correlation test. In c, expression of C4A (per genomic copy) is normalized to expression of C4B (per genomic copy) to control for trans-acting influences shared by C4A and C4B.

  4. Figure 4: Association of schizophrenia to C4 and the extended MHC locus.

    Association of schizophrenia to 7,751 SNPs across the MHC locus and to genetically predicted expression levels of C4A and C4B in the brain (represented in the genomic location of the C4 gene). The data shown are based on analysis of 28,799 schizophrenia cases and 35,986 controls of European ancestry from the Psychiatric Genomics Consortium. The height of each point represents the statistical strength (−log₁₀(P)) of association with schizophrenia. a, b, Association of schizophrenia to SNPs in the MHC locus and to genetically predicted expression of C4A and C4B. In b, genetic variants are coloured by their levels of correlation to rs13194504 (upper panel) or by their levels of correlation to genetically predicted brain C4A expression levels (lower panel). c–f, Conditional association analysis. The red dashed line indicates the statistical threshold for genome-wide significance (P = 5 × 10⁻⁸). See also Extended Data Figs 5, 6, 7 for detailed association analyses involving C4 locus structures and HLA alleles.

  5. Figure 5: C4 structures, C4A expression, and schizophrenia risk.

    a, Schizophrenia risk associated with four common structural forms of C4 in analysis of 28,799 schizophrenia cases and 35,986 controls. b, Brain C4A RNA expression levels associated with four common structural forms of C4. β was calculated from fitting C4A RNA expression (in brain tissue) to the number of chromosomes (0, 1, or 2) carrying each C4 structure (across 120 individuals sampled). c, Schizophrenia risk associated with 13 combinations of C4 structural allele and MHC SNP haplotype. The numbers on the y axis adjacent to the C4 structures indicate the ‘haplogroup’, the MHC SNP haplotype background on which the C4 structure segregates, and correspond to Fig. 2. Statistical tests of heterogeneity yielded P = 0.55 for AL–AL alleles; P = 0.93 for AL–BL alleles; P = 0.06 for AL–BS alleles; and P = 5.7 × 10⁻⁵ across the overall allelic series. d, Expression levels of C4A RNA were directly measured (by RT-ddPCR) in post-mortem brain samples from 35 schizophrenia patients and 70 individuals not affected with schizophrenia. Measurements for all five brain regions analysed exhibited the same relationship (Extended Data Fig. 8). Horizontal lines show the median value for each group. P values were derived by a (non-parametric) one-sided Mann–Whitney test. Error bars shown in a–c represent 95% confidence intervals around the effect size estimate.

  6. Figure 6: C4 protein at neuronal cell bodies, processes and synapses.

    a, C4 protein localization in human brain tissue. Two representative confocal images (drawn from immunohistochemistry performed on samples from five individuals with schizophrenia and two unaffected individuals) within the hippocampal formation demonstrate localization of C4 in a subset of NeuN+ neurons. b, High-resolution structured illumination microscopy (SIM) imaging of tissue in the hippocampal formation reveals colocalization of C4 with the presynaptic terminal markers VGLUT1/2 and the postsynaptic marker PSD-95. c, Confocal images of primary human cortical neurons show colocalization of C4, MAP2, and neurofilament along neuronal processes. d, Confocal image of primary cortical neurons stained for C4, presynaptic marker synaptotagmin, and postsynaptic marker PSD-95. Scale bars, 25 μm (a, c, and d); 5 μm (b, left); and 1 μm (b, right). Extended Data Fig. 9 contains additional data on antibody specificity.

  7. Figure 7: C4 in retinogeniculate synaptic refinement.

    a, Representative confocal images of immunohistochemistry for C3 in the P5 dLGN showed reduced C3 deposition in the dLGN of C4−/− mice compared to wild-type (WT) littermates. b, Quantification confirmed reduced C3 immunoreactivity in the dLGN (n = 3 mice per group, P < 0.05, t-test; y axis: mean fluorescence intensity, normalized to wild type). c, Co-localization analysis revealed a reduction in the fraction of VGLUT2+ puncta that were C3+ in C4-deficient mice relative to their WT littermates (n = 3 mice per group, P = 0.0011, two-sided t-test). d, Synaptic refinement in mice with 0, 1, or 2 copies of C4. These images represent the segregation of ipsilateral and contralateral RGC projections to the dLGN; two analysis methods were used. Top, projections from the ipsilateral (green) and contralateral (red) eyes show minimal overlap (yellow) in wild-type mice. The overlapping area is significantly increased in C4−/− mice (n = 6 mice per group, P < 0.01, ANOVA with Bonferroni post-hoc tests). Bottom, threshold-independent analysis using the R value (ref. 50; R = log₁₀(F_ipsi/F_contra)). Pixels are pseudocoloured with an R value heat map (red indicates areas having only contralateral inputs; purple, only ipsilateral inputs). Compared to their wild-type littermates, C4-deficient mice exhibited lower R value variance, indicating defects in synaptic refinement (n = 6 mice per group, P < 0.001, ANOVA with Bonferroni post-hoc tests). Control experiments analysing total dLGN size, dLGN area receiving ipsilateral input, and number of RGCs are shown in Extended Data Fig. 10f–h, respectively. Error bars in b–d represent s.e.m.

  8. Extended Data Fig. 1: Association of schizophrenia to common variants in the MHC locus in individual case-control cohorts, and schematic of the repeat module containing C4.

    a–f, Data for several schizophrenia case-control cohorts that were genome-scanned before we began this work (a–d) exhibit peaks of association near chr6: 32 Mb (blue vertical line) on the human genome reference sequence (GRCh37/hg19). Note that association patterns vary from cohort to cohort, reflecting statistical sampling fluctuations and potentially fluctuations in allele frequencies of the (unknown) causal variants in different cohorts. Cohorts such as in b, e and f suggest the existence of effects at multiple loci within the MHC region. Even in the cohorts with simpler peaks (a, c, d), the pattern of association across the individual SNPs at chr6: 32 Mb does not correspond to the LD around any known variant. This motivated the focus in the current work on cryptic genetic influences in this region that could cause unconventional association signals that do not resemble the LD patterns of individual variants. g, A complex form of genome structural variation resides near chr6: 32 Mb. Shown here are three of the known alternative structural forms of this genomic region. The most prominent feature of this structural variation is the tandem duplication of a genomic segment that contains a C4 gene, 3′ fragments of the STK19 and TNXB genes, and a pseudogenized copy of the CYP21A2 gene. This cassette is present in 1–3 copies on the three alleles depicted above; the boundaries below each haplotype demarcate the sequence that is duplicated. Haplotypes with multiple copies of this module (middle and bottom) contain multiple functional copies of C4, whereas the additional gene fragments or copies denoted STK19P, CYP21A2P, and TNXA are typically pseudogenized. Rare haplotypes with a gain or loss of intact CYP21A2 have also been observed (ref. 18). Although C4A and C4B contain multiple sequence variants, they are defined based on the differences encoded by exon 26, which determine the relative affinities of C4A and C4B for distinct molecular targets (refs 19, 20; Fig. 1). Many additional forms of this locus appear to have arisen by non-allelic homologous recombination and gene conversion (ref. 18 and Fig. 1).

  9. Extended Data Fig. 2: Schematic of strategy for identifying the segregating structural forms of the C4 locus.

    a, Molecular assays for measuring copy number of the key, variable C4 structural features—the length polymorphism (HERV insertion) that distinguishes the long (L) from the short (S) genomic form of C4, and the C4A/C4B isotypic difference. Each primer–probe–primer assay is represented with the combination of arrows (primers) and asterisk (probe) in its approximate genomic location (though not to scale). b, Measurement of copy number of C4 gene types in the genomes of 162 individuals (from HapMap CEU sample). The absolute, integer copy number of each C4 gene type in each genome is precisely inferred from the resulting data. To ensure high accuracy, the data are further evaluated for a checksum relationship (A + B = L + S) and for concordance with earlier data from Southern blotting of 89 of the same HapMap individuals (ref. 51). c, To measure the copy number of compound structural forms of C4 (involving combinations of L/S and A/B), we perform long-range PCR followed by quantitative measurement of the A/B isotype-distinguishing sequences in droplets. d, Analysis of transmissions in father–mother–offspring trios enables inference of the C4 gene contents of individual copies (alleles) of chromosome 6. Three example trios are shown in this schematic. e, Examples of the inferred structural forms of the C4 locus (more shown in Fig. 1c). For the common C4 structures (AL–BL, AL–BS, AL–AL, and BS), genomic order of the C4 gene copies is known from earlier assemblies of sequence contigs in individuals homozygous for MHC haplotypes due to consanguinity (ref. 17) and other molecular analyses of the C4 locus (ref. 18). For the rarer C4 structures, the genomic order of C4 gene copies is hypothesized or provisional.

  10. Extended Data Fig. 3: Linkage disequilibrium relationships (r²) of MHC SNPs to forms of C4 structural variation.

    a, b, Correlations of SNPs in the MHC locus with copy number of C4 gene types (a) and larger-scale structural forms (haplotypes) (b) of the C4 locus. Dashed, vertical lines indicate the genomic location of the C4 locus. C4 structural forms show only partial correlation (r²) to the allelic states of nearby SNPs, reflecting the relationship shown in Fig. 2, in which a structural form of the C4 locus often segregates on multiple different SNP haplotypes.

  11. Extended Data Fig. 4: RNA expression of C4A and C4B in relation to copy number of C4A, C4B, and the C4–HERV (long form of C4), in eight panels of post-mortem brain tissue.

    Copy number of C4 structural features was measured by ddPCR; RNA expression levels were measured by RT-ddPCR. a–e, Data for tissues from the Stanley Medical Research Institute (SMRI) Array Consortium consisting of anterior cingulate cortex (a), cerebellum (b), corpus callosum (c), orbital frontal cortex (d), and parietal cortex (e). f, Data for the frontal cortex samples from the NHGRI Genes and Tissues Expression (GTEx) Project. g, h, Data for tissues from the SMRI Neuropathology Consortium (anterior cingulate cortex and cerebellum, respectively). These data were then used to inform (by linear regression) the derivation of a linear model for predicting each individual’s RNA expression of C4A and C4B as a function of the numbers of copies of AL, BL, AS, and BS. The derivation of this model, and the regression coefficients induced, are described in Supplementary Methods. In the rightmost plot of each panel, expression of C4A (per genomic copy) is normalized to expression of C4B (per genomic copy) to more specifically visualize the effect of the C4–HERV by controlling for genomic copy number and for any trans-acting influences shared by C4A and C4B; the inferred regression coefficients (Supplementary Methods) suggest that the observed effect is mostly due to increased expression of C4A.

  12. Extended Data Fig. 5: Detailed analysis of the association of schizophrenia to genetic variation at and around C4, in data from 28,799 schizophrenia cases and 35,986 controls.

    (Psychiatric Genomics Consortium, ref. 6.) SCZ, schizophrenia; β, estimated effect size per copy of the genomic feature or allele indicated; SE, standard error. Detailed association analyses of HLA alleles are in Extended Data Figs 6 and 7. The single asterisk (*) indicates that we specifically tested C4B-null status because a 1985 study (ref. 52) reported an analysis of 165 schizophrenia patients and 330 controls in which rare C4B-null status associated with elevated risk of schizophrenia, though two subsequent studies (refs 53, 54) found no association of schizophrenia to C4B-null genotype. We sought to evaluate this using the large data set in this study, finding no association to C4B-null status. The double asterisk (**) indicates total copy number of C4 is also strongly correlated to copy number of the CYP21A2P pseudogene, which is present on duplicated copies of the sequence shown in Extended Data Fig. 1g.

  13. Extended Data Fig. 6: Evaluation of the association of schizophrenia with HLA alleles and coding-sequence polymorphisms.

    a–e, Associations to HLA alleles and coding-sequence polymorphisms are shown in black; to provide the context of levels of association to nearby SNPs, associations to other SNPs are shown in grey. The series of conditional analyses shown in b–e parallels the analyses in Fig. 4. Further detail on the most strongly associating HLA alleles (including conditional association analysis) is provided in Extended Data Fig. 7.

  14. Extended Data Fig. 7: Detailed association analysis for the most strongly associating classical HLA alleles.

    The most strongly associating HLA loci were HLA-B (in primary analyses, Fig. 4a and Extended Data Fig. 6a) and HLA-DRB1 and HLA-DQB1 (in analyses controlling for the signal defined by rs13194504, Fig. 4c and Extended Data Fig. 6b). At these loci, the most strongly associating classical HLA alleles were HLA-B*0801, HLA-DRB1*0301, and HLA-DQB1*02, respectively. These HLA alleles are all in strong but partial LD with C4 BS, the most protective of the C4 alleles; they are also in partial LD with the low-risk allele at rs13194504, representing the distinct signal several megabases to the left (Fig. 4). In joint analyses with each of these HLA alleles, genetically predicted C4A expression and rs13194504 continued to associate strongly with schizophrenia, while the HLA alleles did not. In further joint analyses with rs13194504 and genetically predicted C4A expression, 0 of 2,514 tested HLA SNP, amino acid and classical-allele polymorphisms (from ref. 55, including all variants with minor allele frequency (MAF) >0.005) associated with schizophrenia as strongly as rs13194504 or predicted C4A expression did.

  15. Extended Data Fig. 8: Expression of C4A RNA in brain tissue (five brain regions) from 35 schizophrenia cases and 70 non-schizophrenia controls, from the Stanley Medical Research Institute Array Consortium.

    C4A RNA expression levels were measured by ddPCR. P values are derived from Mann–Whitney U-test.

  16. Extended Data Fig. 9: Secretion of C4, and specificity of the monoclonal anti-C4 antibody for C4 protein in human brain tissue and cultured primary cortical neurons.

    a, Brain tissue (from an individual affected with schizophrenia) was stained with a fluorescent secondary antibody, C4 antibody, or C4 antibody that was pre-adsorbed with purified C4 protein. Confocal images demonstrate the loss of immunoreactivity in the secondary-only and pre-adsorbed conditions. b, Primary human neurons were stained with a fluorescent secondary antibody, C4 antibody or C4 antibody that was pre-adsorbed with purified C4 protein. Confocal images demonstrate the loss of immunoreactivity in the secondary-only and pre-adsorbed conditions. Scale bars, 25 μm. c, Secretion of C4 protein by cultured primary neurons. Western blot for C4 protein analysis. (+) Purified human C4 protein. (–) Unconditioned medium, a negative control. HN-conditioned shows the same medium after conditioning by cultured human neurons at days 7 (d7) and 30 (d30). Details of western blot protocol, antibody catalogue numbers and concentrations used are in Supplementary Methods. C4 molecular weight, ~210 kDa.

  17. Extended Data Fig. 10: Mouse C4 genes and additional analyses of the dLGN eye segregation phenotype in C4 mutant mice and wild-type and heterozygous littermate controls.

    a, The functional specialization of C4 into C4A and C4B in humans does not have an analogy in mice. Although the mouse genome contains both a C4 gene and a C4-like gene (classically called Slp), and these genes are also present as a tandem duplication within the mouse MHC locus, analysis of the encoded protein sequences indicates a distinct specialization, as illustrated by the protein phylogenetic tree. Top, mouse Slp is indicated in grey to reflect its potential pseudogenization: Slp is already known to have mutations at a C1s cleavage site, which are thought to abrogate activation of the protein through the classical complement pathway (ref. 56); and the M. musculus reference genome sequence (mm10) at Slp shows a 1-bp deletion (relative to C4) within the coding region at chr17:34815158, which would be predicted to cause a premature termination of the encoded protein. In some genome data resources, mouse Slp and C4 have been annotated respectively as ‘C4a’ (for example, NM_011413.2) and ‘C4b’ (for example, NM_009780.2) based on synteny with the human C4A and C4B genes, but the above sequence analysis indicates that they are not orthologous to C4A and C4B. b, Sequence differences between C4A and C4B—which are otherwise 99.5% identical at an amino acid level—are concentrated at the ‘isotypic site’ where they shape each isotype’s relative affinity for different molecular targets (refs 19, 20). At the isotypic site, mouse C4 contains a combination of the residues present in human C4A and C4B. c, Expression of mouse C4 mRNA in whole retina and lateral geniculate nucleus (LGN) from P5 animals and in purified retinal ganglion cells (RGCs) from P5 and P15 animals. These time points were chosen as P5 is a time of more robust synaptic refinement in the retinogeniculate system compared to P15. The same assays detected no C4 RNA in control RNA isolated from C4−/− mice (not shown). n = 3 samples for P5 retina, LGN, and P15 RGCs, n = 4 samples for P5 RGCs; *P < 0.05 by ANOVA with post-hoc Tukey–Kramer multiple-comparisons test. d, Representative images of dLGN innervation by contralateral projections (red in bottom image), ipsilateral projections (green in bottom image), and their overlap (yellow in bottom image). Scale bar, 100 μm. e, Quantification of the percentage of total dLGN area receiving both contralateral and ipsilateral projections shows a significant increase in C4−/− compared to wild-type littermates (ANOVA, n = 5 mice per group, P < 0.01). These data are consistent with results using R value analysis as shown in Fig. 7. f, Quantification of total dLGN area showed no significant difference between wild-type and C4−/− mice (ANOVA, n = 5 per group, P > 0.05). g, Quantification of dLGN area receiving ipsilateral innervation showed a significant increase in ipsilateral territory in the C4−/− mice compared to wild-type littermates (ANOVA, n = 5 mice per group, P < 0.01). This result is consistent with defects in eye-specific segregation. Scale bar, 100 μm. h, The number of RGCs in the retina was estimated by counting the number of Brn3a+ cells in wild-type and C4−/− mice. No differences were observed between wild-type and C4−/− mice (t-test, n = 4 mice per group, P > 0.05). Scale bar, 100 μm.
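
The threshold-independent R value used in Fig. 7d and referred to in Extended Data Fig. 10e, R = log₁₀(F_ipsi/F_contra), can be illustrated with a minimal sketch. It assumes that per-pixel fluorescence intensities for the ipsilateral and contralateral channels are already background-subtracted and strictly positive, and it assumes population variance as the summary statistic; the function names and the toy intensities are hypothetical and are not data or code from the study.

```python
import math
from statistics import pvariance


def r_values(f_ipsi, f_contra):
    """Return the per-pixel R = log10(F_ipsi / F_contra)."""
    if len(f_ipsi) != len(f_contra):
        raise ValueError("channels must have the same number of pixels")
    # Assumption: intensities are strictly positive, so the log is defined.
    return [math.log10(i / c) for i, c in zip(f_ipsi, f_contra)]


def r_value_variance(f_ipsi, f_contra):
    """Population variance of the per-pixel R values (assumed definition)."""
    return pvariance(r_values(f_ipsi, f_contra))


# Toy intensities in arbitrary units (hypothetical, for illustration only).
# Well-segregated inputs: each pixel is dominated by one eye.
segregated = r_value_variance([100.0, 100.0, 1.0, 1.0],
                              [1.0, 1.0, 100.0, 100.0])
# Overlapping inputs: both eyes contribute similar signal to each pixel.
overlapping = r_value_variance([10.0, 10.0, 10.0, 10.0],
                               [10.0, 5.0, 20.0, 10.0])
print(segregated > overlapping)
```

In this toy input the well-segregated pixels give the larger variance, which matches the legend's reading of lower R value variance as less complete eye-specific segregation.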

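The checksum relationship described for Extended Data Fig. 2b (A + B = L + S) amounts to a consistency check on the four integer copy-number calls made for each genome: every C4 gene copy is either C4A or C4B, and either long or short. A minimal sketch follows; the `C4CopyNumbers` type, the function name, and the example calls are hypothetical and do not represent the authors' pipeline or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class C4CopyNumbers:
    """Integer copy-number calls for one diploid genome (hypothetical type)."""
    a: int           # copies of C4A
    b: int           # copies of C4B
    long_form: int   # copies carrying the C4-HERV insertion (L)
    short_form: int  # copies lacking the insertion (S)


def passes_checksum(cn: C4CopyNumbers) -> bool:
    """True when the calls are non-negative and satisfy A + B = L + S."""
    counts = (cn.a, cn.b, cn.long_form, cn.short_form)
    if any(c < 0 for c in counts):
        return False
    return cn.a + cn.b == cn.long_form + cn.short_form


# Illustrative calls only; sample names and values are made up.
calls = {
    "sample_1": C4CopyNumbers(a=2, b=2, long_form=3, short_form=1),
    "sample_2": C4CopyNumbers(a=3, b=1, long_form=2, short_form=1),
}
flagged = [name for name, cn in calls.items() if not passes_checksum(cn)]
print(flagged)
```

Genomes that fail the check would be candidates for re-measurement rather than being used for downstream inference.
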
References

  1. Cannon, T. D. et al. Cortex mapping reveals regionally specific patterns of genetic and disease-specific gray-matter deficits in twins discordant for schizophrenia. Proc. Natl Acad. Sci. USA 99, 3228–3233 (2002)
  2. Cannon, T. D. et al. Progressive reduction in cortical thickness as psychosis develops: a multisite longitudinal neuroimaging study of youth at elevated clinical risk. Biol. Psychiatry 77, 147–157 (2015)
  3. Garey, L. J. et al. Reduced dendritic spine density on cerebral cortical pyramidal neurons in schizophrenia. J. Neurol. Neurosurg. Psychiatry 65, 446–453 (1998)
  4. Glantz, L. A. & Lewis, D. A. Decreased dendritic spine density on prefrontal cortical pyramidal neurons in schizophrenia. Arch. Gen. Psychiatry 57, 65–73 (2000)
  5. Glausier, J. R. & Lewis, D. A. Dendritic spine pathology in schizophrenia. Neuroscience 251, 90–107 (2013)
  6. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014)
  7. Shi, J. et al. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature 460, 753–757 (2009)
  8. Stefansson, H. et al. Common variants conferring risk of schizophrenia. Nature 460, 744–747 (2009)
  9. International Schizophrenia Consortium et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009)
  10. Schizophrenia Psychiatric Genome-Wide Association Study Consortium. Genome-wide association study identifies five new schizophrenia loci. Nature Genet. 43, 969–976 (2011)
  11. Howson, J. M., Walker, N. M., Clayton, D. & Todd, J. A. Confirmation of HLA class II independent type 1 diabetes associations in the major histocompatibility complex including HLA-B and HLA-A. Diabetes Obes. Metab. 11 (Suppl 1), 31–45 (2009)
  12. Raychaudhuri, S. et al. Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis. Nature Genet. 44, 291–296 (2012)
  13. Escudero-Esparza, A., Kalchishkova, N., Kurbasic, E., Jiang, W. G. & Blom, A. M. The novel complement inhibitor human CUB and Sushi multiple domains 1 (CSMD1) protein promotes factor I-mediated degradation of C4b and C3b and inhibits the membrane attack complex assembly. FASEB J. 27, 5083–5093 (2013)
  14. Carroll, M. C., Campbell, R. D., Bentley, D. R. & Porter, R. R. A molecular map of the human major histocompatibility complex class III region linking complement genes C4, C2 and factor B. Nature 307, 237–241 (1984)
  15. Carroll, M. C., Belt, T., Palsdottir, A. & Porter, R. R. Structure and organization of the C4 genes. Phil. Trans. R. Soc. Lond. B 306, 379–388 (1984)
  16. Dangel, A. W. et al. The dichotomous size variation of human complement C4 genes is mediated by a novel family of endogenous retroviruses, which also establishes species-specific genomic patterns among Old World primates. Immunogenetics 40, 425–436 (1994)
  17. Horton, R. et al. Variation analysis and gene annotation of eight MHC haplotypes: the MHC Haplotype Project. Immunogenetics 60, 1–18 (2008)
  18. Bánlaki, Z., Doleschall, M., Rajczy, K., Fust, G. & Szilagyi, A. Fine-tuned characterization of RCCX copy number variants and their relationship with extended MHC haplotypes. Genes Immun. 13, 530–535 (2012)
  19. Law, S. K., Dodds, A. W. & Porter, R. R. A comparison of the properties of two classes, C4A and C4B, of the human complement component C4. EMBO J. 3, 1819–1823 (1984)
  20. Isenman, D. E. & Young, J. R. The molecular basis for the difference in immune hemolysis activity of the Chido and Rodgers isotypes of human complement component C4. J. Immunol. 132, 3019–3027 (1984)
  21. Illarionova, A. E., Vinogradova, T. V. & Sverdlov, E. D. Only those genes of the KIAA1245 gene subfamily that contain HERV(K) LTRs in their introns are transcriptionally active. Virology 358, 39–47 (2007)
  22. Nakamura, A., Okazaki, Y., Sugimoto, J., Oda, T. & Jinno, Y. Human endogenous retroviruses with transcriptional potential in the brain. J. Hum. Genet. 48, 575–581 (2003)
  23. Suntsova, M. et al. Human-specific endogenous retroviral insert serves as an enhancer for the schizophrenia-linked gene PRODH. Proc. Natl Acad. Sci. USA 110, 19472–19477 (2013)
  24. Yang, Y. et al. Diversity in intrinsic strengths of the human complement system: serum C4 protein concentrations correlate with C4 gene size and polygenic variations, hemolytic activities, and body mass index. J. Immunol. 171, 2734–2745 (2003)
  25. Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007)
  26. Iossifov, I. et al. The contribution of de novo coding mutations to autism spectrum disorder. Nature 515, 216–221 (2014)
  27. Mayilyan, K. R., Arnold, J. N., Presanis, J. S., Soghoyan, A. F. & Sim, R. B. Increased complement classical and mannan-binding lectin pathway activities in schizophrenia. Neurosci. Lett. 404, 336–341 (2006)
  28. Hakobyan, S., Boyajyan, A. & Sim, R. B. Classical pathway complement activity in schizophrenia. Neurosci. Lett. 374, 35–37 (2005)
  29. Stevens, B. et al. The classical complement cascade mediates CNS synapse elimination. Cell 131, 1164–1178 (2007)
  30. Schafer, D. P. et al. Microglia sculpt postnatal neural circuits in an activity and complement-dependent manner. Neuron 74, 691–705 (2012)
  31. Bialas, A. R. & Stevens, B. TGF-β signaling regulates neuronal C1q expression and developmental synaptic refinement. Nature Neurosci. 16, 1773–1782 (2013)
  32. Kaiser, T. & Feng, G. Modeling psychiatric disorders for developing effective treatments. Nature Med. 21, 979–988 (2015)
  33. Shatz, C. J. & Kirkwood, P. A. Prenatal development of functional connections in the cat’s retinogeniculate pathway. J. Neurosci. 4, 1378–1397 (1984)
  34. Sretavan, D. W. & Shatz, C. J. Prenatal development of retinal ganglion cell axons: segregation into eye-specific layers within the cat’s lateral geniculate nucleus. J. Neurosci. 6, 234–251 (1986)
  35. Chen, C. & Regehr, W. G. Developmental remodeling of the retinogeniculate synapse. Neuron 28, 955–966 (2000)
  36. Fischer, M. B. et al. Regulation of the B cell response to T-dependent antigens by classical pathway complement. J. Immunol. 157, 549–556 (1996)
  37. Huttenlocher, P. R. & Dabholkar, A. S. Regional differences in synaptogenesis in human cerebral cortex. J. Comp. Neurol. 387, 167–178 (1997)
  38. Huttenlocher, P. R. Synaptic density in human frontal cortex—developmental changes and effects of aging. Brain Res. 163, 195–205 (1979)
  39. Petanjek, Z. et al. Extraordinary neoteny of synaptic spines in the human prefrontal cortex. Proc. Natl Acad. Sci. USA 108, 13281–13286 (2011)
  40. Buckner, R. L. & Krienen, F. M. The evolution of distributed association networks in the human brain. Trends Cogn. Sci. 17, 648–665 (2013)
  41. Feinberg, I. Schizophrenia: caused by a fault in programmed synaptic elimination during adolescence? J. Psychiatr. Res. 17, 319–334 (1982–1983)
  42. Kirov, G. et al. De novo CNV analysis implicates specific abnormalities of postsynaptic signalling complexes in the pathogenesis of schizophrenia. Mol. Psychiatry 17, 142–153 (2012)
  43. Fromer, M. et al. De novo mutations in schizophrenia implicate synaptic networks. Nature 506, 179–184 (2014)
  44. Purcell, S. M. et al. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190 (2014)
  45. Datwani, A. et al. Classical MHCI molecules regulate retinogeniculate refinement and limit ocular dominance plasticity. Neuron 64, 463–470 (2009)
  46. Lee, H. et al. Synapse elimination and learning rules co-regulated by MHC class I H2-Db. Nature 509, 195–200 (2014)
  47. van den Elsen, J. M. et al. X-ray crystal structure of the C4d fragment of human complement component C4. J. Mol. Biol. 322, 1103–1115 (2002)
  48. Dodds, A. W., Ren, X. D., Willis, A. C. & Law, S. K. The reaction mechanism of the internal thioester in the human complement component C4. Nature 379, 177–179 (1996)
  49. Handsaker, R. E. et al. Large multiallelic copy number variations in humans. Nature Genet. 47, 296–303 (2015)
  50. Torborg, C. L. & Feller, M. B. Unbiased analysis of bulk axonal segregation patterns. J. Neurosci. Methods 135, 17–26 (2004)
  51. Fernando, M. M. et al. Assessment of complement C4 gene copy number using the paralog ratio test. Hum. Mutat. 31, 866–874 (2010)
  52. Rudduck, C., Beckman, L., Franzen, G., Jacobsson, L. & Lindstrom, L. Complement factor C4 in schizophrenia. Hum. Hered. 35, 223–226 (1985)
  53. Schroers, R. et al. Investigation of complement C4B deficiency in schizophrenia. Hum. Hered. 47, 279–282 (1997)
  54. Mayilyan, K. R., Dodds, A. W., Boyajyan, A. S., Soghoyan, A. F. & Sim, R. B. Complement C4B protein in schizophrenia. World J. Biol. Psychiatry 9, 225–230 (2008)
  55. Jia, X. et al. Imputing amino acid polymorphisms in human leukocyte antigens. PLoS ONE 8, e64683 (2013)
  56. Nonaka, M., Nakayama, K., Yeul, Y. D. & Takahashi, M. Complete nucleotide and derived amino acid sequences of sex-limited protein (Slp), nonfunctional isotype of the fourth component of mouse complement (C4). J. Immunol. 136, 2989–2993 (1986)

Author information

Affiliations

  1. Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Aswin Sekar,
    • Heather de Rivera,
    • Avery Davis,
    • Nolan Kamitaki,
    • Katherine Tooley,
    • Matthew Baum,
    • Vanessa Van Doren,
    • Giulio Genovese,
    • Robert E. Handsaker &
    • Steven A. McCarroll
  2. Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA

    • Aswin Sekar,
    • Heather de Rivera,
    • Avery Davis,
    • Nolan Kamitaki,
    • Katherine Tooley,
    • Matthew Baum,
    • Giulio Genovese,
    • Samuel A. Rose,
    • Robert E. Handsaker,
    • Mark J. Daly,
    • Beth Stevens &
    • Steven A. McCarroll
  3. MD-PhD Program, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Aswin Sekar &
    • Matthew Baum
  4. Department of Neurology, F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Allison R. Bialas,
    • Timothy R. Hammond,
    • Matthew Baum &
    • Beth Stevens
  5. Program in Cellular and Molecular Medicine, Boston Children’s Hospital, Boston, Massachusetts 02115, USA

    • Allison R. Bialas,
    • Jessy Presumey &
    • Michael C. Carroll
  6. Analytical and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA

    • Mark J. Daly

Consortia

  1. Schizophrenia Working Group of the Psychiatric Genomics Consortium

  2. Lists of participants and their affiliations appear in the Supplementary Information.

Contributions

S.A.M. and A.S. conceived the genetic studies. A.S. performed the laboratory experiments and computational analyses to understand the molecular and population genetics of the C4 locus (Figs 1 and 2). A.S., K.T., N.K., and V.V.D. analysed C4 expression variation in human brain (Figs 3 and 5b, d). G.G., R.E.H., and S.A.R. contributed to genetic analyses. A.S. and A.D. did the imputation and association analysis (Figs 4 and 5a, c). M.J.D. provided advice on the association analyses. Investigators in the Schizophrenia Working Group of the Psychiatric Genomics Consortium collected and phenotyped cohorts and contributed genotype data for analysis. B.S. and M.C.C. contributed expertise and reagents for experiments described in Figs 6 and 7. H.d.R. and T.R.H. performed the C4 immunocytochemistry and immunohistochemistry experiments, respectively, with advice from A.R.B. (Fig. 6). A.R.B. and J.P. analysed the role of C4 in synaptic refinement in the mouse visual system (Fig. 7). M.B. analysed C4 expression in mice. S.A.M. and A.S. wrote the manuscript with contributions from all authors.

Competing financial interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figures

  1. Extended Data Figure 1: Association of schizophrenia to common variants in the MHC locus in individual case-control cohorts, and schematic of the repeat module containing C4. (339 KB)

    a–f, Data for several schizophrenia case-control cohorts that were genome-scanned before we began this work (a–d) exhibit peaks of association near chr6: 32 Mb (blue vertical line) on the human genome reference sequence (GRCh37/hg19). Note that association patterns vary from cohort to cohort, reflecting statistical sampling fluctuations and potentially fluctuations in allele frequencies of the (unknown) causal variants in different cohorts. Cohorts such as in b, e and f suggest the existence of effects at multiple loci within the MHC region. Even in the cohorts with simpler peaks (a, c, d), the pattern of association across the individual SNPs at chr6: 32 Mb does not correspond to the LD around any known variant. This motivated the focus in the current work on cryptic genetic influences in this region that could cause unconventional association signals that do not resemble the LD patterns of individual variants. g, A complex form of genome structural variation resides near chr6: 32 Mb. Shown here are three of the known alternative structural forms of this genomic region. The most prominent feature of this structural variation is the tandem duplication of a genomic segment that contains a C4 gene, 3′ fragments of the STK19 and TNXB genes, and a pseudogenized copy of the CYP21A2 gene. This cassette is present in 1–3 copies on the three alleles depicted above; the boundaries below each haplotype demarcate the sequence that is duplicated. Haplotypes with multiple copies of this module (middle and bottom) contain multiple functional copies of C4, whereas the additional gene fragments or copies denoted STK19P, CYP21A2P, and TNXA are typically pseudogenized. Rare haplotypes with a gain or loss of intact CYP21A2 have also been observed (ref. 18). Although C4A and C4B contain multiple sequence variants, they are defined based on the differences encoded by exon 26, which determine the relative affinities of C4A and C4B for distinct molecular targets (refs 19, 20; Fig. 1). Many additional forms of this locus appear to have arisen by non-allelic homologous recombination and gene conversion (ref. 18 and Fig. 1).

  2. Extended Data Figure 2: Schematic of strategy for identifying the segregating structural forms of the C4 locus. (286 KB)

    a, Molecular assays for measuring copy number of the key, variable C4 structural features: the length polymorphism (HERV insertion) that distinguishes the long (L) from the short (S) genomic form of C4, and the C4A/C4B isotypic difference. Each primer–probe–primer assay is represented with the combination of arrows (primers) and asterisk (probe) in its approximate genomic location (though not to scale). b, Measurement of copy number of C4 gene types in the genomes of 162 individuals (from HapMap CEU sample). The absolute, integer copy number of each C4 gene type in each genome is precisely inferred from the resulting data. To ensure high accuracy, the data are further evaluated for a checksum relationship (A + B = L + S) and for concordance with earlier data from Southern blotting of 89 of the same HapMap individuals (ref. 51). c, To measure the copy number of compound structural forms of C4 (involving combinations of L/S and A/B), we perform long-range PCR followed by quantitative measurement of the A/B isotype-distinguishing sequences in droplets. d, Analysis of transmissions in father–mother–offspring trios enables inference of the C4 gene contents of individual copies (alleles) of chromosome 6. Three example trios are shown in this schematic. e, Examples of the inferred structural forms of the C4 locus (more shown in Fig. 1c). For the common C4 structures (AL–BL, AL–BS, AL–AL, and BS), genomic order of the C4 gene copies is known from earlier assemblies of sequence contigs in individuals homozygous for MHC haplotypes due to consanguinity (ref. 17) and other molecular analyses of the C4 locus (ref. 18). For the rarer C4 structures, the genomic order of C4 gene copies is hypothesized or provisional.
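
The checksum relationship in panel b (A + B = L + S) can be illustrated with a minimal sketch. It assumes only what the caption states: every C4 gene copy is either C4A or C4B and either long or short, so the two pairs of counts must give the same total. The class name, function name, and example genotypes are hypothetical and are not taken from the study's data or software.

```python
from typing import NamedTuple


class C4CopyNumbers(NamedTuple):
    """Per-genome integer copy numbers of the four measured C4 features."""
    c4a: int    # copies of the C4A isotype
    c4b: int    # copies of the C4B isotype
    long: int   # copies carrying the C4-HERV insertion (L)
    short: int  # copies lacking the insertion (S)


def passes_checksum(cn: C4CopyNumbers) -> bool:
    """Each C4 gene copy is counted once by isotype (A or B) and once by
    length (L or S), so both pairs must sum to the same total."""
    return cn.c4a + cn.c4b == cn.long + cn.short


# Hypothetical genotypes, for illustration only (not data from the study).
consistent = C4CopyNumbers(c4a=2, c4b=2, long=3, short=1)
inconsistent = C4CopyNumbers(c4a=2, c4b=2, long=3, short=2)

print(passes_checksum(consistent), passes_checksum(inconsistent))
```

A genome that fails this check would point to a miscalled copy number in at least one of the four assays, although the check alone cannot say which one.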

  3. Extended Data Figure 3: Linkage disequilibrium relationships (r²) of MHC SNPs to forms of C4 structural variation. (230 KB)

    a, b, Correlations of SNPs in the MHC locus with copy number of C4 gene types (a) and larger-scale structural forms (haplotypes) (b) of the C4 locus. Dashed, vertical lines indicate the genomic location of the C4 locus. C4 structural forms show only partial correlation (r²) to the allelic states of nearby SNPs, reflecting the relationship shown in Fig. 2, in which a structural form of the C4 locus often segregates on multiple different SNP haplotypes.

  4. Extended Data Figure 4: RNA expression of C4A and C4B in relation to copy number of C4A, C4B, and the C4–HERV (long form of C4), in eight panels of post-mortem brain tissue. (268 KB)

    Copy number of C4 structural features was measured by ddPCR; RNA expression levels were measured by RT-ddPCR. a–e, Data for tissues from the Stanley Medical Research Institute (SMRI) Array Consortium consisting of anterior cingulate cortex (a), cerebellum (b), corpus callosum (c), orbital frontal cortex (d), and parietal cortex (e). f, Data for the frontal cortex samples from the NHGRI Genes and Tissues Expression (GTEx) Project. g, h, Data for tissues from the SMRI Neuropathology Consortium (anterior cingulate cortex and cerebellum, respectively). These data were then used to derive, by linear regression, a linear model for predicting each individual’s RNA expression of C4A and C4B as a function of the numbers of copies of AL, BL, AS, and BS. The derivation of this model, and the resulting regression coefficients, are described in Supplementary Methods. In the rightmost plot of each panel, expression of C4A (per genomic copy) is normalized to expression of C4B (per genomic copy) to more specifically visualize the effect of the C4–HERV by controlling for genomic copy number and for any trans-acting influences shared by C4A and C4B; the inferred regression coefficients (Supplementary Methods) suggest that the observed effect is mostly due to increased expression of C4A.
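
The idea of predicting expression as a linear function of the copy numbers of AL, BL, AS, and BS can be sketched as an ordinary least-squares fit. This is an illustration under stated assumptions, not the study's model: the actual derivation and coefficients are in Supplementary Methods and are not reproduced here. The copy numbers, the coefficients in `toy_coefs`, the absence of an intercept, and the function names are all hypothetical.

```python
def fit_least_squares(X, y):
    """Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("design matrix is rank-deficient")
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict_expression(copy_counts, coefs):
    """Predicted expression = sum over gene types of (copies x per-copy effect)."""
    return sum(c * w for c, w in zip(copy_counts, coefs))


# Hypothetical copy numbers of (AL, BL, AS, BS) for six individuals.
copies = [
    [2, 2, 0, 0],
    [2, 1, 0, 1],
    [1, 0, 0, 2],
    [3, 1, 0, 0],
    [2, 0, 1, 1],
    [1, 1, 1, 1],
]

# Made-up per-copy effects, used only to generate noise-free toy expression values.
toy_coefs = [0.5, 0.2, 0.4, 0.1]
expression = [predict_expression(row, toy_coefs) for row in copies]

fitted = fit_least_squares(copies, expression)
print([round(b, 6) for b in fitted])
```

Because the toy expression values are generated without noise, the fit recovers the made-up coefficients; with real measurements the coefficients would carry uncertainty, and a separate fit would be needed for each of C4A and C4B.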

  5. Extended Data Figure 5: Detailed analysis of the association of schizophrenia to genetic variation at and around C4, in data from 28,799 schizophrenia cases and 35,986 controls. (376 KB)

    (Psychiatric Genomics Consortium, ref. 6.) SCZ, schizophrenia; β, estimated effect size per copy of the genomic feature or allele indicated; SE, standard error. Detailed association analyses of HLA alleles are in Extended Data Figs 6 and 7. The single asterisk (*) indicates that we specifically tested C4B-null status because a 1985 study (ref. 52) reported an analysis of 165 schizophrenia patients and 330 controls in which rare C4B-null status associated with elevated risk of schizophrenia, though two subsequent studies (refs 53, 54) found no association of schizophrenia to C4B-null genotype. We sought to evaluate this using the large data set in this study, finding no association to C4B-null status. The double asterisk (**) indicates total copy number of C4 is also strongly correlated to copy number of the CYP21A2P pseudogene, which is present on duplicated copies of the sequence shown in Extended Data Fig. 1g.

  6. Extended Data Figure 6: Evaluation of the association of schizophrenia with HLA alleles and coding-sequence polymorphisms. (386 KB)

    a–e, Associations to HLA alleles and coding-sequence polymorphisms are shown in black; to provide the context of levels of association to nearby SNPs, associations to other SNPs are shown in grey. The series of conditional analyses shown in b–e parallels the analyses in Fig. 4. Further detail on the most strongly associating HLA alleles (including conditional association analysis) is provided in Extended Data Fig. 7.

  7. Extended Data Figure 7: Detailed association analysis for the most strongly associating classical HLA alleles. (110 KB)

    The most strongly associating HLA loci were HLA-B (in primary analyses, Fig. 4a and Extended Data Fig. 6a) and HLA-DRB1 and HLA-DQB1 (in analyses controlling for the signal defined by rs13194504, Fig. 4c and Extended Data Fig. 6b). At these loci, the most strongly associating classical HLA alleles were HLA-B*0801, HLA-DRB1*0301, and HLA-DQB1*02, respectively. These HLA alleles are all in strong but partial LD with C4 BS, the most protective of the C4 alleles; they are also in partial LD with the low-risk allele at rs13194504, representing the distinct signal several megabases to the left (Fig. 4). In joint analyses with each of these HLA alleles, genetically predicted C4A expression and rs13194504 continued to associate strongly with schizophrenia, while the HLA alleles did not. In further joint analyses with rs13194504 and genetically predicted C4A expression, 0 of 2,514 tested HLA SNP, amino acid and classical-allele polymorphisms (from ref. 55, including all variants with minor allele frequency (MAF) >0.005) associated with schizophrenia as strongly as rs13194504 or predicted C4A expression did.

  8. Extended Data Figure 8: Expression of C4A RNA in brain tissue (five brain regions) from 35 schizophrenia cases and 70 non-schizophrenia controls, from the Stanley Medical Research Institute Array Consortium. (102 KB)

    C4A RNA expression levels were measured by ddPCR. P values are derived from Mann–Whitney U-test.

  9. Extended Data Figure 9: Secretion of C4, and specificity of the monoclonal anti-C4 antibody for C4 protein in human brain tissue and cultured primary cortical neurons. (552 KB)

    a, Brain tissue (from an individual affected with schizophrenia) was stained with a fluorescent secondary antibody, C4 antibody, or C4 antibody that was pre-adsorbed with purified C4 protein. Confocal images demonstrate the loss of immunoreactivity in the secondary-only and pre-adsorbed conditions. b, Primary human neurons were stained with a fluorescent secondary antibody, C4 antibody or C4 antibody that was pre-adsorbed with purified C4 protein. Confocal images demonstrate the loss of immunoreactivity in the secondary-only and pre-adsorbed conditions. Scale bars, 25 μm. c, Secretion of C4 protein by cultured primary neurons. Western blot for C4 protein analysis. (+) Purified human C4 protein. (–) Unconditioned medium, a negative control. HN-conditioned shows the same medium after conditioning by cultured human neurons at days 7 (d7) and 30 (d30). Details of western blot protocol, antibody catalogue numbers and concentrations used are in Supplementary Methods. C4 molecular weight, ~210 kDa.

  10. Extended Data Figure 10: Mouse C4 genes and additional analyses of the dLGN eye segregation phenotype in C4 mutant mice and wild-type and heterozygous littermate controls. (312 KB)

    a, The functional specialization of C4 into C4A and C4B in humans does not have an analogy in mice. Although the mouse genome contains both a C4 gene and a C4-like gene (classically called Slp), and these genes are also present as a tandem duplication within the mouse MHC locus, analysis of the encoded protein sequences indicates a distinct specialization, as illustrated by the protein phylogenetic tree. Top, mouse Slp is indicated in grey to reflect its potential pseudogenization: Slp is already known to have mutations at a C1s cleavage site, which are thought to abrogate activation of the protein through the classical complement pathway (ref. 56); and the M. musculus reference genome sequence (mm10) at Slp shows a 1-bp deletion (relative to C4) within the coding region at chr17:34815158, which would be predicted to cause a premature termination of the encoded protein. In some genome data resources, mouse Slp and C4 have been annotated respectively as ‘C4a’ (for example, NM_011413.2) and ‘C4b’ (for example, NM_009780.2) based on synteny with the human C4A and C4B genes, but the above sequence analysis indicates that they are not orthologous to C4A and C4B. b, Sequence differences between C4A and C4B, which are otherwise 99.5% identical at an amino acid level, are concentrated at the ‘isotypic site’ where they shape each isotype’s relative affinity for different molecular targets (refs 19, 20). At the isotypic site, mouse C4 contains a combination of the residues present in human C4A and C4B. c, Expression of mouse C4 mRNA in whole retina and lateral geniculate nucleus (LGN) from P5 animals and in purified retinal ganglion cells (RGCs) from P5 and P15 animals. These time points were chosen as P5 is a time of more robust synaptic refinement in the retinogeniculate system compared to P15. The same assays detected no C4 RNA in control RNA isolated from C4−/− mice (not shown). n = 3 samples for P5 retina, LGN, and P15 RGCs, n = 4 samples for P5 RGCs; *P < 0.05 by ANOVA with post-hoc Tukey–Kramer multiple-comparisons test. d, Representative images of dLGN innervation by contralateral projections (red in bottom image), ipsilateral projections (green in bottom image), and their overlap (yellow in bottom image). Scale bar, 100 μm. e, Quantification of the percentage of total dLGN area receiving both contralateral and ipsilateral projections shows a significant increase in C4−/− compared to wild-type littermates (ANOVA, n = 5 mice per group, P < 0.01). These data are consistent with results using R value analysis as shown in Fig. 7. f, Quantification of total dLGN area showed no significant difference between wild-type and C4−/− mice (ANOVA, n = 5 per group, P > 0.05). g, Quantification of dLGN area receiving ipsilateral innervation showed a significant increase in ipsilateral territory in the C4−/− mice compared to wild-type littermates (ANOVA, n = 5 mice per group, P < 0.01). This result is consistent with defects in eye-specific segregation. Scale bar, 100 μm. h, The number of RGCs in the retina was estimated by counting the number of Brn3a+ cells in wild-type and C4−/− mice. No differences were observed between wild-type and C4−/− mice (t-test, n = 4 mice per group, P > 0.05). Scale bar, 100 μm.

Supplementary information

PDF files

  1. Supplementary Information (2.1 MB)

    This file contains Supplementary Methods, Supplementary Tables 1-3, a full list of the collaborators from the PGC Schizophrenia Working Group and additional references. This file was replaced on 11 April 2016 to update affiliation 210.

Comments

  1. Comment #67589

    Ilya Grushevskiy said:

    Were any data collected on what fraction of these individuals suffered child abuse or neglect? The correlation between abuse and schizophrenia seems quite strong; even if causation is put down to a gene, surely this misses the main issue?
