Introduction

Tay–Sachs disease (TSD) [MIM* 606869] is a rare autosomal recessive neurodegenerative lysosomal storage disorder, marked by the accumulation of GM2 gangliosides in the central nervous system (neuronal and spinal cells). In TSD, GM2 gangliosides are not metabolized into GM3 gangliosides owing to deficiency of the β-hexosaminidase-A (Hex-A) (HEXA; EC: 3.2.1.52) enzyme, which is caused by mutations in the HEXA gene [1, 2]. It is one of the most common lipid storage disorders in India [3].

The clinical phenotype associated with TSD varies from an infantile-onset form, presenting with progressive neuroregression, seizures, muscle weakness, hyperacusis, and a cherry red spot on the fundus, to an adult-onset form with progressive dementia and gait disturbance [1, 3]. The human HEXA gene (14 exons and 13 introns) is located on chromosome 15q23-q24 and spans 35.56 kb. As per the Human Gene Mutation Database (HGMD), nearly 190 disease-causing mutations in the HEXA gene have been reported to date. The mutations c.1421+1G>C, c.1274_1277dupTATC, and c.805G>A (p.G269S) are the most prevalent, reported in 98% of the Ashkenazi Jewish and 35% of the non-Ashkenazi Jewish population [4,5,6]. Other mutations in the HEXA gene have been described rarely in various populations across the globe [7]. Among these, some mutations of possible common origin (founder mutations) have been reported, including a 7.6 kb deletion in ~70% of the French-Canadian population [8]; the c.459+5A>G mutation in ~27% of Argentinean patients [6]; the c.571-1G>T mutation in ~80% of Japanese patients [9]; and the p.E462V mutation in ~22% of Indian patients [10]. In addition, the B1 variant form of TSD, characterized by the presence of the R178H mutation in the HEXA gene [11], and two pseudodeficiency alleles, c.739C>T (p.R247W) and c.745C>T (p.R249W), have also been reported in ~35% of the non-Ashkenazi Jewish population [6].

The present study is part of a National Task Force by the Government of India to identify the burden of various lysosomal storage disorders, their geographical distribution and prevalence, and the mutation spectrum across ethnic groups in the Indian population. Our earlier studies on 41 unrelated Indian patients with clinically suspected TSD, carried out over almost 6 years, revealed that this disease, though rare, seems to carry a significant burden in our population but remains unnoticed and underdiagnosed owing to its infantile form [10, 12, 13]. As a result, families planning the next pregnancy often end up having a similarly affected child because of the autosomal recessive inheritance pattern, which places a tremendous social, psychological, and financial burden on them. Our earlier study pointed towards a likely founder mutation in a certain community in Gujarat; the present study is an attempt to confirm our earlier observations, to identify additional novel variants that are prevalent in other Indian populations, and to report their clinical findings.

Materials and methods

This study was carried out from January 2015 to December 2017 as part of a multicentric task force project approved by the Indian Council of Medical Research (ICMR) and the Department of Health Research (DHR), Government of India. The study protocol and enrollment of the patients were approved by the institutional ethics committee, and written informed consent was obtained from the parents or guardians of all participating subjects as per the Declaration of Helsinki.

Patients

Thirty-four unrelated families affected by TSD, from different geographical backgrounds, were enrolled in this study between January 2015 and December 2017. The study comprised 21 males and 10 females in the age range of 12–36 months with a confirmed diagnosis of TSD, and three carrier parents who were investigated where the proband was not alive. The diagnosis in these children was confirmed by demonstrating reduced or absent Hex-A activity with normal total-Hex activity in leukocytes. All patients included in the study presented with developmental delay since infancy, seizures, exaggerated startle, hyperacusis, muscle weakness, a cherry red spot on the fundus, and mild or absent hepatomegaly. Table 1 provides the demographic distribution and clinical details of the patients.

Table 1 Clinical phenotypes and demographic distribution of TSD patients

Measurement of β-hexosaminidase activity

The enzyme activity of β-hexosaminidase-A (Hex-A) was determined in leukocytes by a fluorometric method using the sulfated substrate 4-methylumbelliferyl-N-acetyl-β-d-glucosamine-6-sulfate (MUGS), while for carrier analysis Hex A% and total-hexosaminidase activities were measured in leukocytes by a standard heat-inactivation fluorometric method using the artificial substrate 4-methylumbelliferyl-N-acetyl-β-d-glucosamine (MUG) [14]. Protein concentrations of the samples were measured using the Lowry method [15].
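The heat-inactivation calculation can be sketched as follows. This is a minimal illustration rather than the laboratory protocol: it assumes the usual principle that Hex-A is heat-labile, so the activity remaining after heating is attributed to the heat-stable isoenzymes and Hex A% is the fraction of total activity lost on heating. The function names, units, and argument names are hypothetical.

```python
def specific_activity(nmol_product, hours, mg_protein):
    """Activity normalised to incubation time and protein (nmol/h/mg protein)."""
    if hours <= 0 or mg_protein <= 0:
        raise ValueError("incubation time and protein amount must be positive")
    return nmol_product / (hours * mg_protein)


def hex_a_percent(total_hex, heat_stable_hex):
    """Hex A% from total (unheated) and heat-stable (heated) activities.

    Assumption: both activities are measured with the MUG substrate on the
    same sample and are expressed in the same units.
    """
    if total_hex <= 0:
        raise ValueError("total activity must be positive")
    if not 0 <= heat_stable_hex <= total_hex:
        raise ValueError("heat-stable activity must lie between 0 and total")
    return 100.0 * (total_hex - heat_stable_hex) / total_hex
```

Normalising to protein (here via the Lowry estimate) makes activities comparable between leukocyte preparations of different yield; the percentage itself is independent of that normalisation as long as both measurements use the same sample.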

Mutation analysis of HEXA gene

Genomic DNA was isolated by the salting-out method [16]. Bi-directional sequencing of the HEXA gene was carried out on an automated sequencer (ABI-3130 genetic analyzer) as previously reported [17]. The mutations identified were then looked up in public databases such as the Human Gene Mutation Database (http://www.hgmd.cf.ac.uk), the Genome Aggregation Database (http://gnomad.broadinstitute.org/), the dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/index.html), and the McGill University database (http://www.hexdb.mcgill.ca).

In silico analysis

In silico prediction tools (PolyPhen2, SIFT, PROVEAN, FATHMM, FATHMM-MKL, Mutation Taster, and Mutation Assessor) were used to assess the functional effect of the novel variants [18].

Structural studies

The structural study of novel missense mutant alleles was carried out using the crystallographic structure of Hex-A (PDB ID: 2GJX) as described in our earlier report [10] and the root mean square deviation (RMSD) of the mutant structures with respect to the wild-type structure was calculated [19].
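For readers unfamiliar with the metric, the RMSD between two structures is the square root of the mean squared distance between equivalent atoms, RMSD = sqrt((1/N) Σ |r_i − r'_i|²). The sketch below is only an illustration of that formula: it assumes the wild-type and mutant coordinates have already been superimposed and paired atom by atom (for example by a structure viewer), and the function name and input layout are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired coordinate lists.

    Each argument is a sequence of (x, y, z) tuples in angstroms; atom i of
    coords_a is assumed to correspond to atom i of coords_b, and the two
    structures are assumed to be superimposed already.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

An RMSD of zero means the paired atoms coincide; larger values indicate a greater average displacement of the mutant model from the wild-type structure.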

Results

In the present study the clinical assessment followed by biochemical and molecular investigations confirmed the diagnosis of 34 TSD patients. The mean age at the time of diagnosis was 15.8 (±5.45) months. Consanguinity was present in 10/34 (29.4%) families. The most common clinical indications observed in these patients were regression of milestones (cognitive and physical) (100%), cherry red spot on fundus (94.1%), generalized hypotonia (85.3%), hyperacusis (79.4%) followed by seizures (73.5%). The MRI study of the brain showed symmetrical abnormal hyperintense signal on T2-weighted images in bilateral basal ganglia with white matter changes (n = 20), with demyelination observed in three cases. Details of geographic background, age at diagnosis, and clinical findings are shown in Table 1.

Biochemical investigations revealed a significant deficiency of Hex-A with normal total-Hex activity in the leukocytes of all 31 patients; carrier detection was carried out in three couples (where the proband was not alive). The deficient Hex-A enzyme activity is consistent with our previous studies in infantile TSD patients [10, 12, 13].

Overall, bidirectional Sanger sequencing identified 25 mutant alleles in the 34 affected/carrier families (31 affected patients and 3 carrier couples). Fifteen of these were novel, found in 17 TSD patients, with a damaging effect on protein function. These include seven homozygous missense variants [p.V206L, p.Y213H, p.R252C, p.F257S, p.C328G, p.G454R, and p.P475R], found in one patient each; three patients were homozygous for novel nonsense variants, p.W420X in two and p.S9X in one; and three patients were compound heterozygous for the novel nonsense variants [p.E91X, p.W420X, and p.E482X] with the known 4 bp insertion [c.1278insTATC], missense mutation [p.D322Y], and intronic mutation [c.459+4A>G], respectively. Two homozygous small deletions [c.1349delC (p.A450Vfs*3) and c.52delG (p.G18Dfs*82)] were observed in two patients, and two homozygous splice-site variants, c.460-1G>A and c.347-1G>A, were observed in one patient each (Table 2). Chromatograms for the novel variants are shown in Fig. 1.

Table 2 Biochemical and Molecular analysis in patients with Tay-Sachs disease
Fig. 1 Sequence chromatograms of novel variants identified in the HEXA gene. a c.26C>A (p.S9X) (homozygous). b c.52delG (p.G18Dfs*82) (homozygous). c c.271G>T (p.E91X) (heterozygous). d c.347-1G>A (homozygous). e c.460-1G>A (homozygous). f c.616G>C (p.V206L) (homozygous). g c.637T>C (p.Y213H) (homozygous). h c.754C>T (p.R252C) (homozygous). i c.770T>C (p.F257S) (homozygous). j c.982T>G (p.C328G) (homozygous). k c.1259G>A (p.W420X) (homozygous). l c.1349delC (p.A450VfsX3) (homozygous). m c.1360G>C (p.G454R) (homozygous). n c.1424C>G (p.P475R) (homozygous)

The present study also identified 10 previously known mutations in 17 TSD patients. These include the homozygous 4 bp insertion [c.1278insTATC (p.Y427IfsX5)], identified in five patients. Four patients harbored the homozygous p.E462V mutation. Six patients carried homozygous missense mutations [p.M1T, p.R170Q, p.D322Y, p.D322N, and p.R499C], one each. One patient was compound heterozygous for p.E462V and c.1278insTATC, while another was compound heterozygous for the nonsense variant p.Q106X and the splice-site variant c.1073+1G>A.
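The variant names used above follow a small number of textual patterns, so their class can be read directly from the notation. The sketch below is a hypothetical helper written for illustration only; it is not part of the study's pipeline, it covers just the notation styles that appear in this paper (including the older `X` symbol for a stop codon), and its category labels are our own choice.

```python
import re


def classify(variant):
    """Return a coarse variant class from a p. or c. style variant name."""
    v = variant.strip()
    if v.startswith("p."):
        body = v[2:]
        if "fs" in body:
            return "frameshift"
        # Single-letter substitution such as V206L; X or * denotes a stop codon.
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z*])", body)
        if m:
            return "nonsense" if m.group(3) in ("X", "*") else "missense"
    elif v.startswith("c."):
        body = v[2:]
        # Intronic substitution such as 460-1G>A (offset from the nearest exon).
        m = re.fullmatch(r"\d+([+-])(\d+)[ACGT]>[ACGT]", body)
        if m:
            # Offsets of 1 or 2 fall on the canonical splice dinucleotides.
            return "canonical splice site" if int(m.group(2)) <= 2 else "intronic"
        if re.fullmatch(r"[\d_]+(del|ins|dup)[ACGT]*", body):
            return "small insertion/deletion"
        if re.fullmatch(r"\d+[ACGT]>[ACGT]", body):
            return "coding substitution"
    return "unrecognised"
```

For example, `classify("c.460-1G>A")` falls in the canonical splice-site class, whereas `classify("c.459+4A>G")` is labelled only as intronic because the change lies outside the two canonical positions.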

In silico prediction analyses found all novel variants of the HEXA gene to be probably damaging, with a deleterious effect on protein function. The predictions of PolyPhen2, SIFT, PROVEAN, FATHMM, FATHMM-MKL, Mutation Taster, and Mutation Assessor were in concordance with the observed pathogenicity in patients for all novel variants; the prediction scores are presented in Table 3. The root mean square deviation (RMSD) values for the modeled mutants were also found to be significant, supporting the pathogenicity of all missense mutations. In addition, Genome Aggregation Database allele frequencies in the general as well as South Asian populations are given in Table 3. A functional study was not conducted owing to the unavailability of patients after identification of the novel variants. However, we screened 500 normal healthy unrelated individuals for the novel variants, and none was found to be positive.

Table 3 In-silico predictions for novel variants detected in HEXA gene

Discussion

The present study reports a large series of children with TSD who presented with the infantile form of the disease, in concordance with its well-defined clinical presentation. The Hex-A enzyme activity observed in all children was in the range of 0–1.6%, with normal total-Hex activity, which is consistent with previously observed enzyme activity in infantile TSD cases [20]. In the three cases where the proband was not available for analysis, the parental study showed enzyme activity in the carrier range, and the subsequent molecular study confirmed the diagnosis of infantile TSD in the index case.

In the present study, 25 variants were identified, including 15 (60%) novel and 10 (40%) previously known mutations. Missense mutations were observed in 18 (52.9%) patients, followed by frameshift mutations in 9 (26.47%), nonsense mutations in 7 (20.59%), and splicing mutations in 4 (11.76%). The previously known deleterious Indian mutations (p.E462V, p.D322Y, p.D322N, and c.1278insTATC) were observed in 41.2% of patients in the present study.
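The proportions quoted above can be reproduced with simple arithmetic. The snippet below is a small check, assuming the denominator for the per-class figures is the 34 enrolled families and the denominator for novel versus known is the 25 variants; the dictionary and function names are ours. The class counts add up to more than 34, presumably because compound heterozygous patients carry variants from two classes.

```python
def percent(count, total, ndigits=2):
    """Percentage of count in total, rounded to ndigits decimal places."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, ndigits)


# Patients carrying each mutation class, as reported in the text.
patients_by_class = {"missense": 18, "frameshift": 9, "nonsense": 7, "splicing": 4}
TOTAL_PATIENTS = 34
TOTAL_VARIANTS = 25

class_percentages = {
    name: percent(n, TOTAL_PATIENTS) for name, n in patients_by_class.items()
}
novel_percentage = percent(15, TOTAL_VARIANTS)  # 15 novel of 25 variants
known_percentage = percent(10, TOTAL_VARIANTS)  # 10 known of 25 variants
# Note: 7/34 = 20.588...%, which is 20.59 when rounded to two decimals.
```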

Among the 25 variants identified here, 10 have been reported earlier in diverse populations, including the missense mutations p.D322Y, p.D322N, and p.E462V from Indian patients [10, 12, 13]. The other mutations were p.M1T from a non-Jewish English patient [21]; p.R170Q from Japanese [22] and Moroccan patients [23]; p.R499C from Irish, Italian, and Japanese patients [7]; the four base pair insertion c.1278insTATC from Indian, Ashkenazi Jewish, and non-Jewish patients [7, 10, 24]; the nonsense mutation p.Q106X from a Hispanic-German-Jewish patient [24, 25]; and the splice-site mutations c.459+4A>G from Indian patients and c.1073+1G>A from Cajun, non-Jewish, and western European patients [26,27,28]. The p.E462V mutation was again found in 5/34 (14.7%) patients, which reconfirms the founder effect of this mutation in Indian Gujarati patients. In addition, the c.1278insTATC mutation was also commonly seen, in 7/34 (20.59%) Indian patients with TSD.

In the present study, we also found seven novel missense variants (p.V206L, p.Y213H, p.R252C, p.F257S, p.C328G, p.G454R, and p.P475R). All of these mutations affect highly conserved domains and replace amino acid residues of the α-subunit of Hex-A [29]. In addition, the majority of the novel variants detected in the present study were reported neither in the Genome Aggregation Database nor in 1000 Genomes. Moreover, a structural protein analysis was performed using the crystallographic structure of HEXA (PDB ID: 2GJX) as a template [30]. The analysis showed that the variant allele p.V206L disrupts the β-sheet and causes overpacking of residues owing to an amino acid change from a relatively medium-sized side chain: the uncharged residue valine is replaced by the non-polar residue leucine, which has a larger side chain, at codon 206 in exon 6 (Fig. 2a). Similarly, in the p.Y213H variant, the aromatic, uncharged residue tyrosine is replaced by a positively charged histidine residue, which disrupts the β-sheet (Fig. 2b). Furthermore, all the variants identified so far in residues proximal to Val206 and Tyr213 (V200M, W203G, H204R, D207E, S210R, F211S) have been shown to cause severe TSD phenotypes [17, 28]. The novel missense variant p.R252C is highly likely to be a disease-causing mutation because different mutations in the same codon (R252H and R252L) were previously reported in Portuguese [31] and Japanese patients [32], respectively, and the bioinformatics prediction scores further support pathogenicity. The p.R252C variant disrupts the β-helix with a loss of hydrogen bonding (Fig. 2c), while p.F257S also seems to disrupt the β-sheet with a loss of hydrogen bonding (Fig. 2d).

Fig. 2 Superimposed native structure (blue) and mutant structures (brown) of the α subunit produced using UCSF Chimera. a p.V206L creates overpacking of residues and disruption of the β-sheet. b p.Y213H creates overpacking of residues and disruption of the β-sheet. c p.R252C disrupts the β-helix with loss of hydrogen bonding. d p.F257S disrupts the β-sheet with loss of hydrogen bonding. e p.C328G disrupts the charge distribution near the active site and alters its configuration. f p.G454R creates overpacking and disruption of the backbone and β-sheet. g p.P475R alters the enzyme interaction properties

The p.C328G variant likely disturbs the charge distribution near the active site and alters its configuration (Fig. 2e). The p.G454R variant disrupts the backbone and β-sheet, causing conformational changes in the protein (Fig. 2f). This variant lies in close proximity to the active residue αGlu462 and is hence likely to alter the function of this active site. Other substitutions at the same codon (G454S and G454D) have been previously reported in infantile TSD from non-Ashkenazi [27] and Turkish [33] patients, respectively. Another variant, p.P475R, most likely changes the enzyme interaction properties by increasing the hydrophobic potential on the surface and also destabilizes the protein structure (Fig. 2g).

The two novel splice-site variants (c.347-1G>A and c.460-1G>A) seem to affect mRNA splicing. The homozygous variants c.347-1G>A and c.460-1G>A, found in one patient each, are in concordance with an earlier report of an American Black patient carrying a mutation at the same position, c.460-1G>T, causing infantile TSD [33, 34]. Both novel splice-site variants were predicted to be disease causing or damaging by Mutation Taster and FATHMM-MKL (Table 3). Moreover, numerous homozygous or compound heterozygous splicing mutations have been reported in the HEXA gene, resulting in very low or undetectable mRNA levels on functional analysis. Both splicing variants found in this study are likely associated with the severe infantile phenotype of the disease, as previously reported [2, 7].

The novel nonsense variants p.S9X, p.E91X, p.W420X, and p.E482X, detected in a homozygous or compound heterozygous state, are highly likely to cause premature termination of the protein and mRNA reduction, with a deleterious effect on protein function.

A novel 1 bp deletion (c.1349delC) was observed in exon 12 in an infantile TSD patient. It causes a frameshift beginning at amino acid 450 of the resultant protein and introduces a downstream stop codon at amino acid 453 (p.A450Vfs*3), truncating the protein from its original length of 529 amino acids. Another 1 bp deletion, c.52delG, was observed in exon 2; it changes the reading frame at amino acid 18 and introduces a downstream stop codon at amino acid 100 (p.G18Dfs*82), also truncating the protein. Both novel 1 bp deletions were evaluated for pathogenicity using Mutation Taster and were found to be disease causing, as shown in Table 3. However, a further functional study was not conducted owing to the unavailability of patients after identification of the novel variants. Nevertheless, all patients with novel variants had significantly reduced Hex-A activity in leukocytes, which is likely to reflect the functional effect of the genotype.
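How a single-base deletion leads to a truncated protein can be illustrated with a toy example. The sequence below is invented for demonstration and is not the HEXA coding sequence; the helper names are hypothetical. The sketch translates a short open reading frame with the standard genetic code, deletes one base, and translates again: every codon downstream of the deletion is read in a shifted frame until a premature stop codon appears.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence from its first base up to the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(cds, index):
    """Return the sequence with the base at the given 0-based index removed."""
    return cds[:index] + cds[index + 1:]


# Hypothetical toy reading frame: ATG GCA TTG AAA GGC TTC TAA
WILD_TYPE = "ATGGCATTGAAAGGCTTCTAA"
MUTANT = delete_base(WILD_TYPE, 4)  # remove the C of the second codon

wild_protein = translate(WILD_TYPE)   # "MALKGF"
mutant_protein = translate(MUTANT)    # "MD": frame shifts, then TGA stops translation
```

In this toy case the second residue changes from alanine to aspartate and the very next codon in the shifted frame is a stop, so the product is far shorter than the wild-type peptide.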

A large number of mutations have been previously reported in the HEXA gene. The majority lie in the Hex α-subunit sequence and are generally associated with severe infantile TSD. We did not find any genotype associated with the late-onset, B1, or pseudodeficiency phenotype in the present study or in our earlier series reported from India. Nonetheless, mutations underlying the B1 variant have been reported in TSD patients [6].

Conclusion

The present study identifies 15 novel variants in Indian infantile TSD patients, providing new insight into the molecular pathology of TSD. Combining the present study with our earlier studies, we observed that 67% of the genotypes found in Indian TSD patients are novel and associated with severe infantile phenotypes, while the remaining 33% of genotypes in our cohort were previously reported in various populations. Additionally, p.E462V, p.D322Y, and c.1278insTATC are the most common hot-spot mutations in TSD patients from India; they can be used as a first-tier molecular screening test for carrier analysis of the large population and to develop therapeutic targets in future.