Altered spinogenesis in iPSC-derived cortical neurons from patients with autism carrying de novo SHANK3 mutations

The SHANK3 gene encodes a multidomain scaffold protein expressed at the postsynaptic density of neuronal excitatory synapses. We previously identified de novo SHANK3 mutations in patients with autism spectrum disorders (ASD) and showed that SHANK3 represents one of the major genes for ASD. Here, we analyzed pyramidal cortical neurons derived from induced pluripotent stem cells from four patients with ASD carrying de novo truncating SHANK3 mutations. At 40–45 days after the differentiation of neural stem cells, dendritic spines from pyramidal neurons presented variable morphologies (filopodia, thin, stubby and mushroom), as measured in 3D using GFP labeling and immunofluorescence. Compared with three controls, we observed a significant decrease in SHANK3 mRNA levels (less than 50% of controls), which correlated with a significant reduction in dendritic spine densities and in whole spine and spine head volumes. These results, obtained through the analysis of de novo SHANK3 mutations in the patients’ genomic background, provide further support for the presence of synaptic abnormalities in a subset of patients with ASD.

Figure S1: 3D structure of selected domains of the SHANK3 protein

Legend to Supplementary Figure S1: a. Left panel shows the delineation of the six ankyrin (ANK) domains, ranging from residues 148 to 345 within the SHANK protein amino acid sequence. Right panel shows the juxtaposed ANK domains, which were homology modeled according to a previously reported method 28,29, using the PDB entries 1YYH (Ehebauer et al, 2005), 3B7B (Collins et al, 2004) and 1WDY (Tanaka et al, 2004) as 3D templates, selected from HHPred detection (Söding et al, 2005). b. Illustration of the 3D structures of the SH3, PDZ and SAM domains, which are identified domains of full-length SHANK. All FASTA sequences were retrieved from the UniProtKB server at http://www.uniprot.org/uniprot/Q9BYB0#structure.

Figure comment:
The six ankyrin (ANK) domains are presented in their putative 3D structure to illustrate precisely their number and their complete juxtaposition with respect to ANK topology. Indeed, some contradictory results have been published regarding the topological characteristics of the ANK domains (Schuetz et al, 2004; Mameza et al, 2013). Coding-sequence variants within the ANK domains, although predicted to be benign, have been identified in patients with ASD 3. Other SHANK3 domains such as SH3, PDZ and SAM are also complex structures that are involved in the interaction of SHANK3 proteins with their intracellular partners (Mameza et al, 2013).

Supplementary References:
Collins, R.E., Northrop, J.P., Horton, J.R., Lee, D.Y., Zhang, X., Stallcup, M.R. & Cheng, X. The ankyrin repeats of G9a and GLP histone methyltransferases are mono- and dimethyllysine binding modules. Nat. Struct. Mol. Biol. 15, 245-250 (2004).

Figure S4: Neuritogenesis, a complex dynamic process combining neurite outgrowth and branching, was followed in 5 distinct cultures of control iPSC-derived neurons (15 days post NSC). At the end of culture, neurons were fixed and immunostained with an anti-MAP2 antibody. Fluorescence images were acquired with an inverted Axio Observer.Z1 (Carl Zeiss, Le Pecq, France) equipped with an AxioCam camera. Quantification of images was performed using the Acapella software (Perkin Elmer) as previously described 28. Briefly, neuronal nuclei were detected according to the DAPI staining and, for quantification, all parameters were measured under the same defined threshold conditions. Measured parameters loaded from the specific module "neurite detection" include the length of the longest neurite per neuron and the total numbers of root extremities and segments. Scatter dot plots show the median with the interquartile range. Each dot represents one neuron. Statistical analysis was performed using a Kruskal-Wallis test in GraphPad Prism Version 6 software (GraphPad, San Diego, California, USA). Outliers were identified using the ROUT method with a Q value of 1%. P values are directly indicated in the graphs. No statistical difference was found between the 5 cultures (C1 to C5).

Eisenberg et al. (2013). Analysis was performed on the genes within the two gene sets that were detected using the DESeq2 package as under-expressed (FDR < 5% and logFC < 0) or over-expressed (FDR < 5% and logFC > 0) between control and ASD neurons. A total of 15,665 genes were detected as expressed in the RNA-Seq data using a threshold of one count-per-million reads in at least three samples. The y-axis gives the percentage of genes from each gene set present in the population of genes detected as under-expressed and over-expressed. P-values of the Fisher tests are given for each gene set and each group of genes above each barplot.
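
The gene-set test described in this legend can be illustrated with a minimal sketch. It assumes a two-sided Fisher exact test on a 2x2 table built against the background of expressed genes; the legend does not state which alternative hypothesis was used, so this choice is an assumption. The function names and the toy gene identifiers are hypothetical and are not taken from the original analysis.

```python
from math import comb


def enrichment_table(gene_set, de_genes, background):
    """Build the 2x2 table (a, b, c, d) for one gene set and one DE group.

    a: in gene set and in DE group      b: in gene set, not in DE group
    c: not in gene set, in DE group     d: in neither
    Only genes present in the background (expressed genes) are counted.
    """
    background = set(background)
    gene_set = set(gene_set) & background
    de_genes = set(de_genes) & background
    a = len(gene_set & de_genes)
    b = len(gene_set - de_genes)
    c = len(de_genes - gene_set)
    d = len(background) - a - b - c
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed one.
    """
    total = a + b + c + d
    row1 = a + b          # size of the gene set
    col1 = a + c          # size of the DE group
    denom = comb(total, col1)

    def pmf(k):
        return comb(row1, k) * comb(total - row1, col1 - k) / denom

    p_obs = pmf(a)
    lo = max(0, col1 - (total - row1))
    hi = min(row1, col1)
    p_value = 0.0
    for k in range(lo, hi + 1):
        p_k = pmf(k)
        if p_k <= p_obs * (1 + 1e-7):  # tolerance for floating-point ties
            p_value += p_k
    return min(1.0, p_value)


# Toy example with hypothetical gene identifiers (not real data).
background = {f"g{i}" for i in range(1, 9)}
gene_set = {"g1", "g2", "g3", "g4"}
under_expressed = {"g1", "g2", "g3", "g5"}

table = enrichment_table(gene_set, under_expressed, background)
print(table)                                      # (3, 1, 1, 3)
print(round(fisher_exact_two_sided(*table), 4))   # 0.4857
```

In practice the background would be the 15,665 expressed genes, and the test would be repeated for each gene set and for each of the under-expressed and over-expressed groups.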

Method:
Library construction and RNA-seq
Stranded mRNA sequencing was performed at the Centre National de Recherche en Génomique Humaine (CNRGH, CEA). After complete quality control of the RNA in each sample (quantification in duplicate and RNA6000 Nano LabChip analysis on a Bioanalyzer 2100 from Agilent), libraries were prepared using the "TruSeq Stranded mRNA Library Prep Kit" from Illumina, with an input of 1 µg and with selection of poly(A) RNAs, following the manufacturer's instructions. Library quality was checked by Bioanalyzer 2100 analysis, and sample libraries were pooled before sequencing to reach the expected sequencing depth. Sequencing was performed on an Illumina HiSeq 4000 as paired-end 100 bp reads, using Illumina sequencing reagents and pooling 6 samples per lane (corresponding on average to 40 to 50 million sequenced fragments or 80 to 100 million total reads). The quality of the sequences was checked using an in-house CNRGH pipeline before transferring the FASTQ files to Institut Pasteur.
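
The relation between sequenced fragments and total reads quoted above follows from the paired-end design, in which each fragment yields two reads. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the base count assumes that every read is the full 100 bp, which ignores any trimming.

```python
READ_LENGTH_BP = 100      # paired-end 100 bp reads, as stated in the Method
READS_PER_FRAGMENT = 2    # paired-end: one forward and one reverse read


def depth_summary(fragments):
    """Return (total reads, total sequenced bases) for a fragment count."""
    reads = fragments * READS_PER_FRAGMENT
    bases = reads * READ_LENGTH_BP   # assumes untrimmed, full-length reads
    return reads, bases


for fragments in (40_000_000, 50_000_000):
    reads, bases = depth_summary(fragments)
    print(f"{fragments:,} fragments -> {reads:,} reads, {bases:,} bases")
```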

Bioinformatics analysis
The RNA-Seq reads were mapped to the GRCh37 v75 genome with the STAR aligner v2.5.3a (Dobin et al. 2013) in two-pass mode. STAR was also used to output uniquely mapped read counts at the gene level. The 15,665 genes with at least one count-per-million reads in at least three samples were selected for further analysis. Differential expression analysis between ASD and control samples was performed using DESeq2 v1.18.1 (Love et al., 2014). Plots were made in R using the package ggplot2 (Wickham 2009).

Figure S6: Each dot represents a single individual. The distance between two dots represents the genetic distance. As described in Materials and Methods, the stratification was performed using PLINK and 17K SNPs overlapping between the genotyping data of HapMap3 populations (Illumina Human1M and Affymetrix SNP 6.0) and those of the cohort of controls and patients (Illumina Infinium Omni1/2.5, 1M/2.5M SNPs; and Illumina Infinium Humancore24, 300K SNPs).

Primers were linked to the M13 adaptor sequence (red characters). Three sets of primers (duos B, C and D) were used for genomic DNA and cDNA amplification. Sets A and E were not used for cDNA since AF and ER were located in intronic regions. Sequencing was performed using M13 sequences as primers.
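
The expression filter described under Bioinformatics analysis (at least one count-per-million reads in at least three samples) can be sketched as follows. This is an illustration only: it assumes that the library size of a sample is the sum of its gene-level counts, which the text does not specify, and the function name, gene labels and counts are hypothetical.

```python
def filter_expressed(counts, min_cpm=1.0, min_samples=3):
    """Keep genes reaching min_cpm in at least min_samples samples.

    counts: dict mapping gene name -> list of raw read counts, one per
    sample, with samples in the same order for every gene.
    """
    n_samples = len(next(iter(counts.values())))
    # Assumption: library size = sum of gene-level counts in each sample.
    lib_sizes = [sum(row[j] for row in counts.values())
                 for j in range(n_samples)]
    kept = []
    for gene, row in counts.items():
        n_pass = sum(
            1
            for j, count in enumerate(row)
            if lib_sizes[j] > 0 and count / lib_sizes[j] * 1e6 >= min_cpm
        )
        if n_pass >= min_samples:
            kept.append(gene)
    return kept


# Toy counts for four samples, each with a library size of 2,000,000.
toy_counts = {
    "GENE_A": [1_999_989, 1_999_990, 1_999_990, 1_999_999],
    "GENE_B": [10, 10, 10, 0],   # 5 CPM in three samples: kept
    "GENE_C": [1, 0, 0, 1],      # 0.5 CPM at most: removed
}
print(filter_expressed(toy_counts))   # ['GENE_A', 'GENE_B']
```

A filter of this kind removes genes with too few reads to support a differential expression test before the counts are passed to DESeq2.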