Introduction

Intellectual disability (ID) is a common neurodevelopmental disorder, affecting 2% of the general population, which poses a considerable burden on the quality of life of the families of affected individuals and has a significant socioeconomic impact on society and the health-care system.1 It is an extremely heterogeneous group of disorders: to date, over 700 genes have been implicated in syndromic and nonsyndromic ID.2 Despite the considerable progress in disease gene identification, especially after the introduction of next-generation sequencing, at least 50% of the estimated genetic causes of ID remain unknown.3 All Mendelian inheritance patterns have been reported with ID, and approximately half of them follow an autosomal recessive inheritance pattern (autosomal recessive intellectual disability).3

The study of consanguineous families has facilitated the discovery of pathogenic genetic variants in many autosomal recessive disorders.4,5,6 The frequency of autosomal recessive disorders is higher in populations where consanguineous marriages are frequent.7 Consanguinity results in higher risk for birth defects varying from 2.7% to 15.8%.8 The combination of homozygosity mapping and exome sequencing is a powerful and cost-effective tool for molecular diagnosis and discovery of novel genes in families with suspected autosomal recessive disorders.9,10,11,12,13

Using this approach, we studied two unrelated Pakistani consanguineous ID families, and identified two different homozygous missense variants in LINGO1 (leucine-rich repeat and immunoglobulin domain containing 1) that are likely to cause the intellectual disability.

Materials and methods

Patients

Family F162 was recruited at the Institute of Basic Medical Sciences, Khyber Medical University, Peshawar, Pakistan, and family PKMR65 was enrolled at Allama Iqbal Medical Research Center, Lahore, Pakistan. The study was approved by the Bioethics Committee of the University Hospitals of Geneva (protocol CER 11–036); the Institutional Review Board of the Centre of Excellence in Molecular Biology, University of the Punjab, Lahore, Pakistan; and the Medical Ethical Committee Arnhem-Nijmegen, The Netherlands. Written informed consent for the performed analyses was provided by all parents or legal representatives.

Exome sequencing

For family F162, exome capture was performed with the SureSelect Human All Exon kit v5 (Agilent Technologies, Santa Clara, CA). The exome-captured library was sequenced on an Illumina HiSeq2000 with 125-bp paired-end reads, yielding an average of 220× coverage per targeted base. Exome-sequencing data were analyzed using an in-house customized pipeline. The pipeline is based on published algorithms including the Burrows–Wheeler aligner tool (BWA),14 SAMtools,14 Pindel,15 and ANNOVAR.16 The pipeline applies these algorithms sequentially to map the reads (BWA), call variants (SAMtools), detect indels (Pindel), and annotate the calls (ANNOVAR). To calculate coverage and on-target reads for the entire coding sequence, human RefSeq17 coding genes were used as the reference.

To identify the causative variant in family PKMR65, exome sequencing was performed using reagents and a platform similar to those used for family F162, but with 100-bp paired-end reads. Approximately 95% of the reads were mapped uniquely to the reference sequence with a mean coverage of 125×. The reads were aligned to the reference human genome (GRCh37/hg19) using BWA.14 Polymerase chain reaction duplicates were identified using Picard (http://broadinstitute.github.io/picard).

Homozygosity mapping

Homozygosity mapping was performed in family F162 using a 720-K single-nucleotide polymorphism (SNP) array (HumanOmniExpress Bead Chip by Illumina, San Diego, CA) with an average SNP density of one SNP per 4 kb and a window of 50 consecutive homozygous SNPs, with a maximum of one mismatch allowed in each homozygous region. Runs of homozygosity (ROHs) were demarcated by the first heterozygous SNPs flanking each homozygous region. Potential target genomic regions were identified as the ROHs shared by the affected individuals, but not by their parents or their unaffected siblings.
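The ROH criterion described above (at least 50 consecutive SNPs, at most one heterozygous call tolerated inside a run) can be illustrated with a minimal sketch. This is not the implementation used in the study or in any array software; the function name `find_roh`, the boolean input `is_het`, and the greedy left-to-right scan are assumptions made for illustration only.

```python
from typing import List, Tuple


def find_roh(is_het: List[bool], min_snps: int = 50,
             max_mismatch: int = 1) -> List[Tuple[int, int]]:
    """Greedy scan for runs of homozygosity (hypothetical helper).

    is_het[i] is True when SNP i (ordered along one chromosome) is
    heterozygous. Returns inclusive (start, end) index pairs of runs that
    span at least `min_snps` SNPs and contain at most `max_mismatch`
    heterozygous calls. Runs start and end on homozygous SNPs, so each run
    is flanked by, rather than terminated with, a heterozygous SNP.
    """
    runs = []
    n = len(is_het)
    i = 0
    while i < n:
        if is_het[i]:
            i += 1          # a run cannot start on a heterozygous SNP
            continue
        start = i
        last_hom = i
        mismatches = 0
        j = i
        while j < n:
            if is_het[j]:
                if mismatches == max_mismatch:
                    break   # this heterozygous SNP demarcates the run
                mismatches += 1
            else:
                last_hom = j
            j += 1
        if last_hom - start + 1 >= min_snps:
            runs.append((start, last_hom))
        i = j               # resume the scan at the demarcating SNP
    return runs


# Toy example: 30 homozygous, 1 het, 30 homozygous, 1 het, 10 homozygous
calls = [False] * 30 + [True] + [False] * 30 + [True] + [False] * 10
print(find_roh(calls))
```

Because the scan is greedy, it can miss a longer run that begins after a tolerated mismatch; a production tool would evaluate all candidate windows. Candidate regions would then be the runs shared by affected individuals and absent from parents and unaffected siblings.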

Bioinformatics analysis

For family F162, the software CATCH18 was used to analyze consanguineous families by combining family pedigree information, ROH, and exome-sequencing data. This software automatically marks homozygous variants as “putative” if they are present in ROHs of patients but not in unaffected individuals of a nuclear family. Subsequently, the variants were filtered manually using the criteria described in previous studies.11 As an initial filter in the target region of each family, we included all homozygous exonic and splicing variants (±6 bp of the intron–exon boundary) and excluded all synonymous variants that are not in the splicing regions. We selected variants with a minor allele frequency <0.02 in the Exome Aggregation Consortium (ExAC) (http://exac.broadinstitute.org) and in our local database, and filtered out variants found within duplicated regions of the genome. After the initial filtering, all remaining variants were evaluated and ranked on the basis of conservation (by GERP++);19 predicted pathogenicity scores from tools such as SIFT,20 PolyPhen2,21 and Mutation Taster;22 and the presence of variants in the professional version of the Human Gene Mutation Database.23 Illumina GenomeStudio software (http://www.illumina.com/software/genomestudio_software.ilmn) and the in-house built program CoverageMaster (F. Santoni, unpublished data) were used to perform the copy-number variation (CNV) analysis using SNP-array and exome-sequencing coverage data, respectively. The final list of variants was further verified by Sanger sequencing in all family members whose DNA samples were available to determine whether they segregated with the disease phenotype.
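As a rough illustration, the initial filter described above can be expressed as a single predicate. The `Variant` record, its field names, and the string labels are hypothetical and do not correspond to the output format of CATCH or ANNOVAR; only the criteria themselves (homozygous, exonic or splicing, no synonymous variants outside splicing regions, minor allele frequency <0.02 in ExAC and the local database, not in a duplicated region) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    zygosity: str        # "hom" or "het"
    region: str          # "exonic", "splicing" (within +/-6 bp), "intronic", ...
    effect: str          # "missense", "synonymous", "nonsense", ...
    exac_maf: float      # minor allele frequency in ExAC (0.0 if absent)
    local_maf: float     # minor allele frequency in the local database
    in_duplication: bool # True if located in a duplicated genomic region


def passes_initial_filter(v: Variant, maf_cutoff: float = 0.02) -> bool:
    """Apply the initial filter criteria from the text to one variant."""
    if v.zygosity != "hom":
        return False
    if v.region not in ("exonic", "splicing"):
        return False
    # Synonymous variants are kept only when they lie in a splicing region.
    if v.effect == "synonymous" and v.region != "splicing":
        return False
    if v.exac_maf >= maf_cutoff or v.local_maf >= maf_cutoff:
        return False
    return not v.in_duplication


candidate = Variant(zygosity="hom", region="exonic", effect="missense",
                    exac_maf=0.0, local_maf=0.0, in_duplication=False)
print(passes_initial_filter(candidate))
```

Variants passing this predicate would still need the ranking by conservation and predicted pathogenicity, and the Sanger segregation check, described in the text.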

The exome sequencing data of family PKMR65 were analyzed by calculating the coverage and on-target reads for the entire coding sequence. CNV analysis on the exome data was performed following the method described previously.24 Next, the selection of variants was performed as described previously.25 In short, seven major steps were taken to select all high-quality potentially pathogenic variants: (i) inclusion of variants present in at least four reads and in ≥80% of all reads; (ii) exclusion of homozygous variants that are present in unaffected controls sequenced at the same time; (iii) exclusion of variants within intergenic, intronic (apart from splice site variants), and untranslated regions; (iv) exclusion of variants present in dbSNP142, 1000 Genomes, the National Heart, Lung, and Blood Institute Exome Variant Server database, or the ExAC database with a frequency ≥1%; (v) inclusion of loss-of-function variants (i.e., nonsense, frameshift, and splice site mutations) with a phyloP score ≥0, and of missense variants and in-frame deletions and duplications with a combined annotation dependent depletion score of ≥20; (vi) selection of variants in genes that are expressed in the brain based on their expressed sequence tag profile in the Unigene database (transcripts per million ≥5); and (vii) inclusion of variants that segregate with the disease in the respective pedigree as determined using Sanger sequencing.
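The numeric thresholds in the selection steps above (read support, population frequency, and phyloP or combined annotation dependent depletion scores) can be sketched as small predicates. All function names and effect labels below are hypothetical and chosen for illustration; the sketch covers only the threshold-based steps, not the control comparison, brain-expression, or segregation steps.

```python
from typing import Optional

# Effect labels are assumptions for this sketch, not a real annotation vocabulary.
LOF_EFFECTS = {"nonsense", "frameshift", "splice_site"}
INFRAME_EFFECTS = {"missense", "inframe_deletion", "inframe_duplication"}


def has_read_support(alt_reads: int, total_reads: int) -> bool:
    """Variant seen in at least four reads and in >=80% of all reads."""
    if total_reads <= 0:
        return False
    return alt_reads >= 4 and alt_reads / total_reads >= 0.80


def is_rare(max_population_freq: float) -> bool:
    """Exclude variants with a frequency >=1% in any population database."""
    return max_population_freq < 0.01


def passes_score_threshold(effect: str, phylop: Optional[float],
                           cadd: Optional[float]) -> bool:
    """phyloP >= 0 for loss-of-function variants; CADD >= 20 for missense
    variants and in-frame deletions or duplications."""
    if effect in LOF_EFFECTS:
        return phylop is not None and phylop >= 0
    if effect in INFRAME_EFFECTS:
        return cadd is not None and cadd >= 20
    return False


def keep_variant(effect: str, alt_reads: int, total_reads: int,
                 max_population_freq: float, phylop: Optional[float],
                 cadd: Optional[float]) -> bool:
    """Combine the three threshold checks for one variant."""
    return (has_read_support(alt_reads, total_reads)
            and is_rare(max_population_freq)
            and passes_score_threshold(effect, phylop, cadd))


print(keep_variant("missense", alt_reads=38, total_reads=40,
                   max_population_freq=0.0, phylop=None, cadd=25.0))
```

Keeping each criterion in its own function makes it straightforward to report how many variants each step removes.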

Molecular modeling analysis

The crystal structure of the LINGO1 protein tetramer, deposited in the Protein Data Bank (http://www.rcsb.org/) under the code 2ID5, was used for molecular modeling studies.25 The influence of the residues on structural stability was studied by computational alanine scanning using the FoldX4 software.26 The structural stability of the wild-type protein was computed and compared with that of the alanine mutants. The stability is estimated as the difference between the free energies of the unfolded and folded states of the protein. The change in stability upon mutation to alanine indicates the importance of each residue side chain for the structure of the LINGO1 protein.27 The protein was visualized with the University of California–San Francisco Chimera software.27
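The comparison described above can be written compactly. The notation below is introduced here for illustration and follows the sign convention stated in the text (stability as the unfolded minus the folded free energy); the output of the modeling software itself may use a different sign convention.

```latex
\Delta G_{\mathrm{stab}} = G_{\mathrm{unfolded}} - G_{\mathrm{folded}},
\qquad
\Delta\Delta G_{X \to \mathrm{Ala}}
  = \Delta G_{\mathrm{stab}}^{X \to \mathrm{Ala}} - \Delta G_{\mathrm{stab}}^{\mathrm{WT}}
```

Under this convention, a negative value of the second quantity means that removing the side chain of residue X lowers the stability of the folded protein, and a value close to zero means the side chain contributes little to stability.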

Results

Clinical evaluation

Family F162 (Figure 1a) has two affected individuals (IV:1 and IV:8) and six unaffected siblings (IV:3, IV:4, IV:5, IV:6, IV:9, and IV:10), whose parents are first cousins. One of the siblings (IV:2) died two days after birth of unknown cause, and one stillbirth was also reported in this family. Both affected individuals of this family were clinically evaluated. At the time of clinical evaluation, the affected female (IV:1) and male (IV:8) were 10.5 and 21 years old, respectively. They presented with severe ID, microcephaly, developmental delay, aggressive behavior, and slurred speech. Affected individuals IV:1 and IV:8 started walking at the age of 2 and 10 years, respectively. In addition to these phenotypes, one of the patients (IV:8) also had uncontrolled epilepsy. Brain magnetic resonance imaging of both affected individuals (IV:1 and IV:8) did not reveal any obvious myelination defects.

Figure 1: Pedigrees showing the segregation of found variants.

(a) F162, a consanguineous family from Pakistan in which two variants (M1 and M2) segregate with the disease. (b) A variant (M3) segregating in the second Pakistani consanguineous family (PKMR65).

Family PKMR65, with probable autosomal recessive intellectual disability, consisted of three affected female individuals (VI:2, VI:3, and VI:6) and three unaffected siblings (VI:1, VI:4, and VI:5) born to consanguineous parents. At the time of evaluation, affected individuals VI:2, VI:3, and VI:6 were aged 25, 20, and 7 years, respectively. Two affected individuals (VI:2 and VI:3) had severe and one affected individual (VI:6) had moderate intellectual disability. All childhood developmental milestones (gross motor, fine motor, speech, and social) were delayed. Affected individuals could not speak, and had poor social interaction and an aggressive, labile mood. Individuals VI:2 and VI:3, with severe ID, also could not eat or drink independently. Individual VI:2 had a history of left hemiparesis at 9 years of age that resolved over time; no medical records were available. There was no history of epilepsy.

Physical examination was remarkable for spastic hypertonia and exaggerated deep tendon reflexes without motor deficit in two individuals (VI:2 and VI:3). Two individuals (VI:3 and VI:6) were microcephalic. Except for a left dysplastic ear in individual VI:6, no other dysmorphic features were observed. Fundoscopic examination of all affected individuals was unremarkable. No brain magnetic resonance images were available for the affected individuals of family PKMR65. A considerable phenotypic concordance was found among all five affected individuals of both families, as summarized in Table 1.

Table 1 Comparison of phenotypes in F162 and PKMR65 families

Genetic analysis

SNP-array analysis was performed to genotype six individuals (III:4, III:5, IV:1, IV:3, IV:6, and IV:8) of family F162, including both affected individuals, two unaffected siblings, and both parents. ROHs were calculated using PLINK25. Family structure was verified by calculating identity by descent with PLINK25 to estimate the relatedness among individuals of the family. Subsequently, exome sequencing was performed in one of the affected individuals (IV:8) of family F162. A total of 21,739 high-quality exonic and 1,302 canonical splice site (within ±6 nucleotides from the intron–exon boundary) variants were found, with 97.2% of the target regions covered at least 10×. The CATCH18 analysis resulted in two candidate variants (NM_032808.6:c.869G>A:p.(Arg290His) in LINGO1 (MIM 609791) and NM_004998.3:c.1406C>T:p.(Thr469Met) in MYO1E (MIM 601479)), which were present in the same ROH and segregated with the disease phenotype. The segregation of these genotypes with the ID phenotype was confirmed by Sanger sequencing in all available family members in family F162 (Figure 1a). As mutations in MYO1E are known to cause autosomal recessive nephrotic syndrome (OMIM 601479)28 and the affected individuals did not show any symptoms of nephrotic syndrome, the variant found in MYO1E is unlikely to be pathogenic. To confirm this, we tested urine samples for albumin, as nephrotic syndrome causes the excretion of high levels of protein in the urine;29 no albumin was detected in the urine of either patient. Moreover, clinical examination of the affected individuals did not reveal edema or swelling of the hands or face, supporting the interpretation that the MYO1E variant is benign.

Data sharing with collaborating investigators identified another consanguineous family, PKMR65, segregating a likely pathogenic variant (NM_032808.6:c.863A>G:p.(Tyr288Cys)) in the LINGO1 gene. Exome sequencing of VI:2 and VI:3 of family PKMR65 identified candidate missense variants in DNAJC2, LINGO1, and VAPA.25 Sanger sequencing showed that only the c.863A>G (p.(Tyr288Cys)) variant in LINGO1 segregated with the ID phenotype in the family (Figure 1b), pointing to LINGO1 as the prime ID candidate gene in PKMR65. CNV analysis using SNP-array data of family F162 or exome-sequencing data of both families (F162 and PKMR65) did not reveal any likely pathogenic CNV.

Neither of the LINGO1 variants, NM_032808.6:c.869G>A:p.(Arg290His) and NM_032808.6:c.863A>G:p.(Tyr288Cys), found in families F162 and PKMR65, respectively, was present in the ethnically matched control cohort (n = 201) or in the South Asian population in the gnomAD database (Supplementary Table 1 online).

Molecular modeling analysis

In humans, at least 12 different isoforms of LINGO1 are expressed, which differ from each other in their 5′ untranslated regions. The coding exons of LINGO1 are identical for each isoform and encode a polypeptide of 620 amino acids that belongs to a large family of leucine-rich repeat and immunoglobulin domain-containing proteins. The extracellular amino-terminal region, known as the ectodomain, has a tandem array of multiple leucine-rich repeats and a single immunoglobulin-like domain, followed by a transmembrane domain and a short cytoplasmic tail.25 Both affected amino acid residues, Arg290 and Tyr288, are located in the leucine-rich repeat region of LINGO1 and are conserved down to zebrafish (Figure 2). The leucine-rich repeat region of LINGO1 is thought to be essential for protein–protein interactions. Therefore, it is possible that the substitutions will affect binding of LINGO1 to other proteins. Visual inspection of the crystal structure of the protein revealed that both mutated amino acid residues are close to a glycosylation site at position Asn264, suggesting that they may interfere with proper glycosylation (Figure 3). The estimated stability changes upon alanine mutation were nonsignificant for both residues: 0.1 kcal/mol in the case of Arg290 and 0.3 kcal/mol loss in the case of Tyr288, suggesting a negligible influence of both residues on protein stability. This is not surprising since both residues are located on the protein surface, pointing toward the solvent. Nevertheless, it cannot be excluded that the p.(Tyr288Cys) mutation influences the folding of LINGO1, as it introduces an additional free thiol group that can react during folding and form aberrant disulfide bridges.

Figure 2: Protein sequence alignment.

Both residues, Arg290 (affected by p.(Arg290His) in family F162) and Tyr288 (affected by p.(Tyr288Cys) in family PKMR65), are highly conserved in the LINGO1 protein sequence.

Figure 3: Molecular modeling of LINGO1 protein.

(a) Visual inspection of the structure suggests that both variants might interfere with the glycosylation process or with recognition of the glycosylation site. Yellow spheres represent p.Arg290 and p.Tyr288; the glycan chain is shown in red, and other glycan chains are also shown in sphere representation. (b) Tetrameric structure of LINGO1. The glycan on residue 264 is not shown. The protein was visualized with the University of California–San Francisco Chimera software.

Discussion

We have identified two homozygous missense variants in LINGO1, p.(Arg290His) and p.(Tyr288Cys), which fully segregated with the neurodevelopmental phenotype in family F162 and family PKMR65, respectively. All patients from both families presented with a similar phenotype consisting of moderate to severe intellectual disability, aggressive behavior, speech delay, and motor delay. Four of the five patients had microcephaly. One patient from family F162 presented with epilepsy. Brain magnetic resonance imaging of both affected individuals (IV:1 and IV:8) of family F162 did not detect structural abnormalities or myelination defects. The LINGO1 locus (15q24-26) has previously been associated with autism, schizophrenia, anxiety, and depression, and alterations in this region have been implicated as a susceptibility factor for psychiatric disorders,30 supporting a role for this gene in neurodevelopmental disorders. The two mutated amino acid residues in LINGO1, Arg290 and Tyr288, are conserved down to zebrafish (Figure 2).

It is interesting to note that LINGO1 may tolerate neither loss-of-function nor missense variants. Data from ExAC31 show that only half the expected number of missense variants have been detected (Z value of 4.00) and that the probability of loss-of-function intolerance (pLI) score is estimated to be 0.95, indicating considerable intolerance to loss-of-function variants. These metrics are usually found in genes that are involved in severe autosomal dominant disorders with reduced fecundity, such as ID. However, an autosomal recessive mode of inheritance may also be possible under the assumption that each of the identified variants is a hypomorph and does not fully abolish protein function. This suggests that de novo LINGO1 loss-of-function variants may be observed in the future in patients with dominant ID. Alternatively, these mutations might be incompatible with life.

LINGO1 is a transmembrane protein that is predominantly expressed in the central nervous system (CNS), especially in oligodendrocytes and neuronal cells. It has been demonstrated that LINGO1 is part of the LINGO1–RTN4R/NGFR receptor complex that negatively regulates myelination, oligodendrocyte differentiation, axon regeneration, and neuronal survival.32,33 Increased LINGO1 expression has been found in various animal models of CNS injury and in human CNS diseases.34 In transgenic mice, overexpression of LINGO1 reduces myelination, and less-differentiated oligodendrocytes were observed.35 Overexpression of LINGO1 is also observed in several neuronal disorders including multiple sclerosis, Parkinson disease, and essential tremor.36 Various association studies have also shown an association of LINGO1 variants with essential tremor and Parkinson disease.37 Moreover, inhibitory anti-LINGO1 antibodies promoted illegitimate oligodendrocyte differentiation and myelination of neurons by inhibiting the function of LINGO1. A similar outcome was observed in Lingo1 knockout mice and zebrafish lingo1b knockdown models, which revealed early improvement in neuronal myelination as compared to wild type.38,39 Taken together, these studies establish an essential role for LINGO1 in myelination, neuronal survival, and CNS repair in general.

In conclusion, we show that homozygous LINGO1 missense variants cause severe autosomal recessive intellectual disability, aggressive behavior, speech delay, motor delay, and microcephaly. Further functional studies are warranted to dissect the exact pathophysiological mechanism of this new autosomal recessive ID syndrome.