Damaging coding variants within kainate receptor channel genes are enriched in individuals with schizophrenia, autism and intellectual disabilities

Schizophrenia (Scz), autism spectrum disorder (ASD) and intellectual disability are common complex neurodevelopmental disorders. Kainate receptors (KARs) are ionotropic glutamate ion channels involved in synaptic plasticity which are modulated by auxiliary NETO proteins. Using UK10K exome sequencing data, we interrogated the coding regions of KAR and NETO genes in individuals with Scz, ASD or intellectual disability and population controls; performed follow-up genetic replication studies; and, conducted in silico and in vitro functional studies. We found an excess of Loss-of-Function and missense variants in individuals with Scz compared with control individuals (p = 1.8 × 10−10), and identified a significant burden of functional variants for Scz (p < 1.6 × 10−11) and ASD (p = 6.9 × 10−18). Single allele associations for 6 damaging missense variants were significantly replicated (p < 5.0 × 10−15) and confirmed GRIK3 S310A as a protective genetic factor. Functional studies demonstrated that three missense variants located within GluK2 and GluK4, GluK2 (K525E) and GluK4 (Y555N, L825W), affect agonist sensitivity and current decay rates. These findings establish that genetic variation in KAR receptor ion channels confers risk for schizophrenia, autism and intellectual disability and provide new genetic and pharmacogenetic biomarkers for neurodevelopmental disease.


Results
Lof and rare missense variants are increased within individuals with neurodevelopmental disorders. The pipeline followed for the genetic analysis is presented in Fig. 1. In a first discovery phase, we identified 154 non-synonymous variants which included 4 LoF and 150 missense variants and 143 regulatory variants within GRIK1-5 and NETO1-2 genes (Table 1). 265 variants had a MAF < 1% and were classified as rare or ultra-rare. We postulated that genes, which in the general population, are characterised as having few LoF and missense variants would carry high numbers of LoF and damaging variants in individuals with neurodevelopmental disorders as these variants putatively contribute to disease risk. In the ExAC project database, GRIK2, GRIK3, GRIK5 and NETO1 are all classified as LoF intolerant genes (LoF pLI > 0.90) indicating that these genes www.nature.com/scientificreports www.nature.com/scientificreports/ are extremely intolerant of Loss-of-Function variation and hence few LoF mutations are present in the general population. As detailed in Table 1, we identified 4 LoF variants within affected individuals who had either ID or ASD or comorbid with Scz (GRIK1 L411X, Scz comorbid with ID; GRIK4 S98X, ID; GRIK5 Q848X, ASD; and GRIK5 19:42546908 splice acceptor, variant, ASD), and two of which were within the LoF intolerant gene GRIK5. No LoF variants were identified in the control cohort.
ExAC also categorizes GRIK2, GRIK3, GRIK4, GRIK5 and NETO1 as missense intolerant genes (missense z > 2.80) again suggesting that fewer missense mutations are present in the general population within these genes than expected. Of the 150 missense variants identified in the current study, 75 were considered as protein damaging, 40 as possibly damaging variants and 35 as benign. The number of rare and ultra-rare nonsynonymous variants within affected individuals was found to be higher than the number of rare nonsynonymous variants found within control individuals or shared despite the fact that the control population has twice the number of individuals (affected case frequency 0.58; control frequency 0.26; variants found in both cases and controls frequency 0.16) (Supplementary Tables S4-S7). As indicated in Fig. 2, the majority of LoF and predicted damaging missense variants were identified in individuals with Scz and ASD whilst most of the predicted benign missense variants were found either in controls or were shared between cases and controls subjects (Supplementary Table S8). In contrast to rare LoF and missense mutations, an equal frequency of common non-synonymous, synonymous and regulatory variants was found in both case and control individuals. The most common variants identified (MAF > 0.1) were synonymous variants (87.5%) and present in both case and control individuals.
We also found that the effect size of rare LoF and missense variants within LoF intolerant genes (LoF pLI > 0.90), was larger than the effect size (0.3 compared to −0.2) of rare LoF and missense variants within genes with a low LoF tolerance metric (Fig. 2d). In addition, as shown in Fig. 2 and Supplementary Fig. S1, rare LoF and missense variants were indicated to have a larger effect size compared to variants classified as regulatory and the effect size of rare regulatory variants did not differ between LoF intolerant and LoF tolerant genes.
Single coding alleles associated with risk or protection against developing disease. Two single coding variants were found to be significantly associated with disease (Table 1 and   www.nature.com/scientificreports www.nature.com/scientificreports/ GRIK3 S310A, located in the amino terminal domain (ATD), was found to be protective against a broader neurodevelopmental phenotype, e.g. conditions with psychosis and ASD/ID phenotypes (p = 1.01 × 10 −18 ; OR = 0.59, CI 0.52-0.66,). GRIK3 F586V, was associated with risk of developing ASD (p = 2.84 × 10 −7 ; OR = Inf). In addition, A895G located within the cytoplasmic protein domain of GRIK5, was found protective at a nominal level of significance against developing schizophrenia (p = 4.06 × 10 −5 ; OR = 44.83, CI 2.70-765). enrichment of coding variants within GRIK and NETO genes. We analysed the burden and accumulation rates of functional variants and found a significantly increased burden of LoF, missense and regulatory variants which included all allelic frequencies in the case population (p = 3.38 × 10 −20 ). Similarly, and as detailed in Table 2, we also found an increased burden of ultra-rare and rare LoF, missense and regulatory variants (p = 2.07 × 10 −15 ). When comparing a broader neurodevelopmental phenotype, e.g. ASD, psychosis and ID, or a narrower psychosis phenotype with control individuals, we observed a significantly increased burden of LoF and missense variants within all GRIK and NETO genes (broad neurodevelopmental, all allele frequencies, p = 2.97 × 10 −10 ; broad neurodevelopmental, ultra-rare and rare p = 6.02 × 10 −7 ; psychosis, all frequencies; p = 6.17 × 10 −7 ; psychosis, ultra-rare and rare p = 1.83 × 10 −7 ). We also found a significantly increased burden of LoF, missense and regulatory variants at all allele frequencies (p = 6.86 × 10 −18 ) and for ultra-rare and rare variants alone (p = 1.30 × 10 −9 ) for the combined intellectual disability/ASD cohorts. However, although we found a significant burden of common and rare variants (p = 3.15 × 10 −11 ) for ASD/ID, ultra-rare and rare variants alone did not reach genome-wide level of significance (p = 0.026).
GRIK and NETO genetic associations in two additional schizophrenia cohorts. In a second discovery phase, we investigated the exomes of two additional schizophrenia datasets, the UKSCZ (N = 553) and FSZNK (N = 285) cohorts. Unlike in the first discovery phase where LoF variants were identified in individuals with ID or ASD, we did not identify any LoF variants in this second phase. This may relate to the fact that all affected individuals had a diagnosis of Scz and not ID or ASD. However, we detected 197 coding variants of which 97 were missense variants. 58 missense variants were predicted damaging and 34 (59%) of these damaging missense variants were identified within affected individuals only (Supplementary Tables S12-S16). As before, we investigated GRIK and NETO single coding alleles for association with risk or protection against Scz (Figs. S1-S3). A splice variant within GRIK2 6:102337505, c.1525-10 C > T (p = 4.43 × 10 −8 ; OR = 42.26, CI 2.53-704) and a missense variant within GRIK3, R865G, (p = 6.8 × 10 −6 ; OR = 73, CI 4-1226) showed a significant nominal association with risk for schizophrenia (Table 1).
Consistent with our previous findings for a psychosis phenotype, we observed that GRIK and NETO genes had high accumulation rates of functional coding variants ( Table 2, Supplementary Table S17). For instance, we found a significantly increased burden of missense and regulatory variants at all allele frequencies (p = 1.26 × 10 −25 ), and a significantly increased burden of just missense variants (p = 4.39 × 10 −15 ). In addition, a higher burden of common and rare missense variants were found within GRIK3 (p = 5.24 × 10 −10 ) and both NETO1 and NETO2 genes had higher accumulation rates of common and rare regulatory variants (p = 1.60 × 10 −28 and p = 8.03 × 10 −10 ). We also observed a burden of rare regulatory variants within GRIK5 (p = 3.37 × 10 −5 ). evidence for the robustness of allele associations. To assess the robustness of significantly associated single alleles identified in discovery phases, we compared allelic frequencies of rare and common coding or splicing variants in the UK10K discovery phase affected cases with non-affected exomes from ExAC cohorts (N = 45,376). We identified 8 genome-wide significant associations (3 missense, 4 synonymous and 1 splice site) listed in Table 1 and Supplementary Table S18 Table S18.
Finally, using data from ExAC for well individuals (N = 45,376) and from the psychiatric disease arm of the ExAC study (N = 15,328) which includes individuals with additional neurological and psychiatric conditions, e.g. Tourette's syndrome, and hence relates to a yet broader neurodevelopmental phenotype, we compared allele frequencies for all damaging coding variants which we had previously identified in the earlier phases of the study. www.nature.com/scientificreports www.nature.com/scientificreports/ We observed nine significant associations (p < 2 × 10 −34 ); 5 missense variants, 3 synonymous and one splice variant, details of which are provided in Table 1 and Supplementary Table S19. The levels of significance (1 × 10 −6 to 1 × 10 −108 ) reflects the power to detect associations with very large sample numbers, e.g. N = 45,000, and is consistent with values reported for individual variants studied in large cohorts [44][45][46] .
Consistent with our previous findings, we again observed a significant difference in allele frequencies for the GRIK3 S310A variant (non-affected individuals MAF 0. 29

In silico and in vitro assays of rare variants within GluK2 and GluK4 TMD and LBD domains support a functional effect. Three predicted damaging missense mutations identified in individuals with
schizophrenia and located within 'key' ligand binding (GluK2 K525E) and transmembrane (GluK4 L825W; GluK4 Y555N) domains of GluK2 and GluK4 subunits, were examined using in silico modelling tools. We found that GluK4 Y555N disrupted a hydrogen bond and resulted in a significant destabilizing thermodynamic effect (∆∆G = 1.65). GluK4 L825W did not affect the formation of hydrogen bonds but did however, have a slightly destabilizing effect on the total energy (∆∆G = 0.755). The GluK2 LBD K525E mutation led to creation of a hydrogen bond but no predicted observable thermodynamic effect was predicted (∆∆G = 0.06), shown in Supplementary Figure S4. However, a decrease in predicted positive electrostatic potential was observed over the ligand binding domain area of the GluK2 K525E variant (Supplementary Figure S5) and which could influence cell surface expression 47 . Taken together, in silico protein modeling analysis suggests that these three predicted damaging mutations could affect protein conformation, structural relationships or electrostatic potential.
To further study functional changes, we expressed wild-type and mutated GluK2 homomers and GluK2/ GluK4 heteromers in Xenopus oocytes and measured their current responses to application of glutamate using a voltage-clamp. Currents rose to a peak then decayed to a steady state (Fig. 3b). Because of the temporal limitations of this system, currents represent a mixture of activation, desensitization and deactivation processes but presumably with activation dominating the rising phase and desensitization and deactivation dominating the decaying phase. We measured the agonist sensitivity of the peak current (EC 50 ) and the current decay rates (τ decay ) and although these current kinetics are not physiological, we compared changes in responses between wild-type and mutant receptors ( Fig. 3; Supplementary Table 20).  GluK4 L825W reduced glutamate potency of peak current generation for GluK2/GluK4 receptors by 42.6 -fold (p = 0.0001, n = 10-18). Similarly, the GluK4 Y555N mutant decreased glutamate potency for GluK2/GluK4 receptor peak currents by 15 -fold (p = 0.0001, n = 9-13). The GluK2 K525E LBD variant was not functional by itself but when co-expressed with GluK2, glutamate potency of GluK2 homomers was decreased by 4.5-fold (p < 0.05, n = 15-17) (Fig. 3a, b). These results imply decreased KAR channel activity.
The rate of current decay of GluK2/GluK4 L825W and GluK2/GluK4 Y555N heteromers was found to be 2.3-fold slower (p = 0.047 and p = 0.002 respectively) than for GluK2/GluK4 wild type receptors upon application of 0.1 mM glutamate (Fig. 3, Supplementary Table S20). This would suggest that both GluK4 mutations may result in mildly increased function through the channels potentially remaining open for a prolonged time. GluK2 K525E did not significantly affect the rate of current decay. Taken together, our observations support that there is likely an overall decreased function in all mutants.

Discussion
Our findings from an integrated analysis of ~4,580 genomes investigated in two discovery phases, supports the hypothesis that LoF and damaging variants within KAR subunit and NETO genes are enriched in individuals with schizophrenia, autism and ID. Our observations of a specific candidate gene set are congruent with recent www.nature.com/scientificreports www.nature.com/scientificreports/ large scale whole genome and exome studies of individuals with schizophrenia and schizophrenia with ID which report an increased burden of ultra-rare coding and common variants in genes characterised as missense and LoF variant depleted genes 9,48 . Likewise, our findings provide further support that, in addition to rare de novo variation as a strong causative factor for autism 5,49 , inherited LoF and damaging mutations can confer risk for autism and ID. We also confirm evidence that Scz, ASD and ID phenotypes share genetic predisposing factors and neuropathology [50][51][52] , and that variants with a spectrum of allele frequencies and effect size within GRIK and NETO genes contribute to these phenotypes.
We identified LoF and damaging missense variants across key protein domains of KAR subunits involved in specific functions. The majority of the significantly associated replicated alleles were found to be protective, e.g. GRIK3 S310A and are novel targets for future genetic studies. Our electrophysiological findings supported the idea that both LBD and TMD mutants altered channel gating behavior. However, variants may also impact upon KAR function by a number of alternative means. For instance, disruption of KAR and NETO interaction may affect the synaptic localisation of KARs 53,54 . Similarly, KAR CTD alterations could inhibit N-cadherin interaction and thereby influence synaptic compartmentalization and recruitment of KARs to the membrane 55 and mutations disrupting C terminal PDZ ligand binding might influence secretory pathway processes, feedback systems and neuronal activity 20 . These synaptic-population and subcellular specific processes highlight the importance of KAR subunit availability and how variants, whether coding or acting through epigenetic mechanisms, could affect the KAR spatio-temporal patterns which may cause downstream alterations in the glutamate neurotransmission.
As with many NGS studies, a limitation of this study was that genomic coverage was dependent on read depth and quality of sequencing. Several regulatory UTR variants had to be excluded as they were not consistently called across cohorts and hence our results may have missed important regulatory KAR and NETOs alleles contributing to disease risk. For instance, we excluded an indel within the 3′UTR of GRIK4 which we previously reported confers protection against developing bipolar disorder through altering GluK4 RNA and protein abundance 36,37 . Furthermore, intronic variants which were not assessed within GRIK4 and GRIK3 have also shown significantly association with response to antidepressant and antipsychotic medication 42,[56][57][58] .
Case-control GWAS studies of individuals with schizophrenia have recently indicated common alleles within transcripts of the C4 genes, which encodes a complement component 4 protein, and SNAP25, a vesicle fusion protein, contribute to risk of disease 59,60 . Both SNAP25 and members of a second complement cascade protein family (C1ql2 and C1ql3) are located at postsynaptic sites and bind to KAR subunits and thereby regulate KAR ion channel behaviour 61,62 . Further exploration of this emerging genetic risk pathway may aid in the development of new drugs to target neurodevelopmental conditions. Based on our findings, and with the need for translation from genetic risk factors to clear biomarkers of treatment response and disease prognosis, future studies using large population cohorts with collated phenotypic, genomic, medical and medication data (e.g. the UK biobank and USA ' All of Us research program') should involve the detailed characterisation of KAR and NETO risk alleles.

Methods and Materials
Individuals with a clinical diagnosis of neurodevelopmental disorders were exome sequenced as part of the neurodevelopmental collections in the UK10K sequencing project (further details are provided in Supplementary  Table S1). Ethical approval for genomic research and informed consent from all participants and/or their legal guardians was obtained previously by the UK10K consortium committee. The UK10K project was conducted in compliance with the Declaration of Helsinki statement of ethical principles. Access to the sequencing datasets was granted to Dr Knight under the UK10K Project access agreement ID5574. All individuals sequenced were of European ancestry.
In a first discovery phase, we examined approximately 846 individuals with schizophrenia or psychosis, 550 individuals with ASD, 124 individuals with intellectual disability, and individuals with a dual diagnosis of either ASD comorbid with ID (77) or psychosis comorbid with ID (175) (Fig. 1). The numbers assessed vary owing to exclusion of poor sequencing data within different KAR/NETO genes. In a second discovery phase, two additional schizophrenia cohorts, (the NEURO UKSCZ N = 553; and NEURO FSZNK N = 285) were investigated. Population controls came from the population control arms of the UK10K project (TwinsUK10K, Obesity UK10K; N = 2,095). In a follow-up phase of the study we attempted to replicate allelic associations by first comparing MAFs of variants identified in the case discovery cohorts with MAFs reported for the ExAC general population (N = 45,376). Subsequently, we compared alleles of interest in the psychiatric (N = 15,328) and non-psychiatric (N = 45,376) arms of the ExAC cohort.
Variant call files (VCF) were obtained from the European Genome-phenome Archive. VCF files for the non-psychiatric arm of ExAC was available from the ExAC website. The Genotype-Tissue Expression Project was used to identify primary transcripts expressed in brain and both brain-expressed and canonical transcripts were examined (Supplementary Table S2). A minimum of 12x read depth was accepted as a quality control. Functional annotation of coding variants was performed using snpEff, snpSift and dbNSFP. Variants were classified as: LoF variants (stop-gained, frameshift and splice-disrupting variants); missense; and regulatory (synonymous, non-damaging splicing site variants within 10 bp surrounding the exon and 3′UTR or 5′UTR variants). LoF annotation was conducted using the LoF Transcript Effect Estimator (LOFTEE, version 0.2). Mutation Taster, Panther DB, Align GVGD and PolyPhen2 were used to predict whether missense mutations were damaging. Splicing effects were assessed using the Human Splicing Finder (HSF 3.0). Minor allele frequencies (MAFs) were calculated for variants identified in the discovery phases and compared with MAFs derived from general population databases, e.g. GnoMAD. Variants were classified as common (MAF > 0.05), low frequency (MAF = 0.05-0.01), rare (MAF = 0.01-0.001) and ultra-rare (MAF = 0.001-0.0001).
Single allele association analysis was performed using either the Fisher's exact test or the chi square tests and were two tailed. P-values were adjusted for correction using the Holm-Bonferroni method and significance was set at two levels; a genome wide level (p < 5 × 10 −8 ) and a less conservative nominal level (p < 1 × 10 −6 ).
Odds Ratios (ORs) and confidence interval values were calculated using R software (v.3.4.1). Kernel methods SKAT and SKAT-O were implemented to identify genes carrying a significant burden of common, rare, and rare damaging variants 63 . Imputation of missing wild-type genotypes was conducted by using IMPUTE2 software 64 . Structural templates used for in silico protein modelling were acquired either from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank or generated from Uniprot amino acid sequences using RaptorX software 65 -further details are provided in Supplementary Table S3. In silico mutagenesis was performed using the mutagenesis function of Pymol (PyMOL Version 2.0, Schrödinger, LLC) and predicted hydrogen bonds within 8 A were examined in wild type and mutated structures. Free energy calculations (ΔG) were performed using FoldX v3.0 66 . The Adaptive Poisson-Boltzmann Solver (APBS) software package was used to study surface potential changes in the electrostatic surface potential in mutant proteins 67 .
Wild-type and mutant KARs (GluK2 K525E, GluK4 Y555N, and GluK4 L825W) were expressed in Xenopus oocytes using methods described previously 68 . Xenopus oocytes were supplied as ovarian lobes by the European Xenopus Resource Centre, University of Portsmouth, UK. Animal care and treatment were conducted in compliance with national and international laws and policies. The electrophysiology research protocol was performed in accordance with the University of Nottingham institutional guidelines and regulations. Human cDNA clones for GluK2 and GluK4 were obtained from GenScript (USA). Mutations were introduced into constructs using the QuikChange II Mutagenesis kit (Agilent Technologies) and cRNA generated using a mMessage mMachine kit (Invitrogen). Oocytes were injected with 50.6 nL cRNA (250-300 ng/μL) and incubated in GTP solution. Oocytes were perfused with frog Ringer solution at 10 mL/min and two-electrode voltage-clamped at −80 mV (Geneclamp 500, Axon Instruments). Glutamate was perfused at a flow rate of 10 mL/min at concentrations ranging between 10 −9 M and 10 −3 M for 10 s. Peak current amplitudes were normalised and plotted against glutamate concentration to determine EC 50 . Current decay time constants (τ) were estimated over the 10-s glutamate application by fitting exponential equations. Experiments were performed in multiple cells (n ≥ 5).

Data availability
The datasets generated during and/or analysed during the current study are not publicly available due access being granted to Dr Knight under the UK10K Project access agreement ID5574 but further information concerning identified variants is available from the corresponding author on reasonable request.