Increased co-expression of genes harboring the damaging de novo mutations in Chinese schizophrenic patients during prenatal development

Schizophrenia is a heritable, heterogeneous common psychiatric disorder. In this study, we evaluated the hypothesis that de novo variants (DNVs) contribute to the pathogenesis of schizophrenia. We performed exome sequencing in Chinese patients (N = 45) with schizophrenia and their unaffected parents (N = 90). Forty genes were found to contain DNVs. These genes had enriched transcriptional co-expression profile in prenatal frontal cortex (Bonferroni corrected p < 9.1 × 10−3), and in prenatal temporal and parietal regions (Bonferroni corrected p < 0.03). Also, four prenatal anatomical subregions (VCF, MFC, OFC and ITC) have shown significant enrichment of connectedness in co-expression networks. Moreover, four genes (LRP1, MACF1, DICER1 and ABCA2) harboring the damaging de novo mutations are strongly prioritized as susceptibility genes by multiple evidences. Our findings in Chinese schizophrenic patients indicate the pathogenic role of DNVs, supporting the hypothesis that schizophrenia is a neurodevelopmental disease.


Digit span, and Block design) from Wechsler Adult Intelligence Scale -Revised in China 4 ;
Immediate and delayed logical memory test, and Immediate and delayed Visual Reproduction test from Wechsler Memory Scale -Revised in China 5 . In addition, Trail Making, Part A and Part B-M 6,7 , Verbal fluency test 8 , Modified Wisconsin Card Sorting Test-Modified 9 (WCST-M; Nelson, 1976) and Tower of Hanoi 10 are included in the battery. Handedness was assessed by the Annett handedness scale 11 for all participants. All neuropsychological tests we used here have been reported in our previous studies, which analyzed neurocognitive deficits in first-episode schizophrenic patients and their first-degree relatives 12,13 . The details of the complete procedure for each test are reported elsewhere 4,5,9,12,14,15 .

Pathogenic scores detections for DNVs
Since only part of DNVs to be pathogenic, we limited our functional analyses to DNVs defined by the following categories: (1) KGGSeq 16 was used to map the extracted DNVs systematically; (2) The variants were mapped into genes according to three gene definitions (RefGene, KnownGene and GEncode); (3) Alternative allele frequencies from 1000 Genome Project 17

and NHLBI Grand
Opportunity Exome Sequencing Project (https://esp.gs.washington.edu/drupal/) were used to annotate the de novo variants. A variant was regarded as a non-synonymous mutation as long as it was supported by one of gene definitions mentioned above; (4) The non-synonymous variant was annotated for its protein damming or deleteriousness potential by seven in silico prediction algorithms (SIFT 18

Probability, and Residual Variation Intolerance Scores (RVIS)
We used the gene-based probability of exhibiting haploinsufficiency from supplementary data set 2 generated by Huang et al 24 to evaluate the potential intolerance of genes with DNV in the mutational class groupings. We compared the distribution of haploinsufficiency probabilities for genes in the defined mutational categories to the rest of the genome using a two-sided Wilcoxon rank sum test. We found that genes with de novo nonsense mutations ranked significantly higher with greater probabilities of haploinsufficiency. The genes with nonsense DNVs were ranked with high probabilities of haploinsufficiency as showing in Table S3. We

Brain-critical exon scores
Under purifying selection, the genes harbouring deleterious DNVs are difficult to transmit to next generation because of high effect sizes. In order to escape the pressure of selection, high expression exon in brain cannot have high burden of rare missense mutations. So, the specific exons (brain-critical exons), in which there are inverse correlations between their expression level in brain and burden of rare missense mutations, could be deleterious 27 . Uddin et al. found that brain-critical exons under purifying selection should be prioritized in genotype-phenotype studies for autism spectrum disorder and related neurodevelopmental conditions, including schizophrenia. 3955 brain-critical exons in 1744 genes were detected in their study 28 .

Gene ontology (GO) annotation
In gene ontology annotation, we determined that 8 GOs were significantly overrepresented in the 40 genes with predicted damaging DNVs (Bonferroni corrected p ≤ 0.05, hypergeometric distribution test) (see Figure S7). Of these GO terms, GO:00051015 is related mainly to actin filament binding and functions interactively with other molecule to keep cytoskeleton wholeness during cell development and migration 29 . Moreover, Analysis by geneMANIA (http://www.genemania.org/) detected physical interaction between MACF1 and DISC1 with both of these 2 genes being involved in function of microtubule-based process (P < 9.04 × 10 -5 ).
Although other terms were not reported to be directly involved in schizophrenia per se, some of them are linked to cell growth and DNA synthesis, such as nucleotide binding (GO:0000166), helicase activity (GO:0004386) and ATP-binding (GO:0005524), others like cell-cell junction (GO:0005911) and establishment of polarity linked to cell migration (GO:0007163).
The GO terms significantly enriched with genes harbouring damaging DNVs detected in this study are largely related to biological function and process in the early developmental stage, which is consistent with the fact that disruption of neural growth in schizophrenia happens as early as in the prenatal stage. GO:0051015 bears some resemblance to the category found in a DNVs study (GO:0051017) 30 . Both pathways work on actin filament which is highly enriched in dendritic spine 31 and keep cytoskeleton wholeness during cell development and migration 29 . Moreover, the dendritic spine that is one of the most important cytoskeletal components in the neural circuits of cortex was shown to interact profoundly with actin filament and to be significantly disrupted in schizophrenia patients 31,32 . Dendritic spine has also been shown to be significantly defective in dorsal prefrontal context (DPFLC) of schizophrenic patients 32 . GO:00051015 is related mainly to actin filament binding while GO:00051017 is related to actin filament bundling. Both molecules function interactively to keep cytoskeleton wholeness during cell development and migration 29 .
Another evidence worth mentioning is that many by-far-known candidate genes of schizophrenia like DISC1 and dysbindin take effect based on interfacing with actin cytoskeleton or cytoskeletal proteins such as gird 33 . Furthermore, one other study detected interaction among MACF1, DISC1 and dysbindin, which indicated that possible parts of MACF1 mutants played on neruodevelopmental pathway of schizophrenia was likely materialized in structure and function of synapses by disruption of intracellular transportation and cytoskeletal stability 34 . Besides spatial feature, two studies have already identified the temporal relationship of expression profile of enriched GO term with neuron development.

Supplementary Tables
Note: a, the percentage indicates the top 5% of constrained genes; b, the percentage indicates the top 5% of RVIS.  The figure showed that the relatedness based on the called genotypes was consistent with the kinship, suggesting very good quality of the sequencing data. Comparison of average sequence reads at > 1X, and > 10X coverage amongst different sub-groups. There were no significant differences amongst trios with sporadic case and family history, and those with or without de novo SNVs.    Note: The p value for each pathway is indicated by the green bar and is expressed as -1 times the log of the p value. The yellow line and point represents that, in a given pathway, the ratio of the number of genes that meet the cutoff criteria divided by the total number of genes that makes up the pathway (50 times the number of genes carrying de novo damaging mutations in each pathway/ the total number of genes in a given pathway). Top 8 GOs remain significance statistically after Bonferroni correction (p < 0.05). The subjects with missing value for a cognitive measurement were ignored. Note: The pipeline describes the procedure for sequence data analysis and de novo determination, which includes: (1) BWA alignment reads of raw data to the reference human genome (hg19); (2) Using Picard to collect quality statistics and fix readgroup problem; (3) GATK for IN/DEL alignment by using dbsnp132 and 1000indel as known IN/DEL sites; (4) Fix mate information and mark duplication with Picard; (5) Using Samtools to filter out low-quality reads or GATK clipping reads of low quality; (6) Using GATK to perform SNP and INDEL calling; (7) KGGseq filter De novo variants and functional impact annotation; (8) Experimental validation by Sanger sequencing.