INTRODUCTION

Alternative splicing is a major contributor to transcriptome diversity [1]. This diversity is most apparent in the brain, which exhibits the highest messenger RNA (mRNA) isoform complexity of all tissues [2, 3]. Splicing patterns vary considerably throughout development and are orchestrated by RNA-binding proteins [1]. These splicing factors recognize specific RNA motifs and influence spliceosome assembly at nearby splice sites. Through selective inclusion or exclusion of exons, alternative splicing generates multiple mRNA and protein isoforms with different functional properties from single genes [1]. Splicing factors shape mouse neurodevelopment, directing neurogenesis, cell migration, and synaptogenesis [4]. They also regulate neuronal excitability and thereby play a pivotal role in brain homeostasis [4]. Despite these observations, few splicing factors have been shown to underlie specific Mendelian disease traits [5].

Nuclear speckles are punctate membraneless subnuclear organelles rich in splicing factors, including small nuclear ribonucleoproteins, spliceosomal subunits, and arginine/serine-rich splicing factors [6, 7]. One widely expressed, speckle-associated splicing factor is Nuclear Speckle Splicing Regulatory Protein 1 [8, 9]. In humans, it is encoded by the seven-exon NSRP1 gene, which maps to chromosome 17q11.2. Mice heterozygous for an Nsrp1 null allele had no overt phenotype, but Nsrp1−/− mice were not detected from heterozygous crosses as early as embryonic day 6.5, demonstrating null-associated embryonic lethality and suggesting an essential role for NSRP1 in development [8]. Yet NSRP1 has not previously been implicated in human Mendelian disorders, and no NSRP1 variant alleles have been investigated.

Here, we identify six individuals from three unrelated families with severe neurodevelopmental disorders (NDDs) with brain abnormalities and biallelic loss-of-function (LoF) variants in NSRP1, implicating Nuclear Speckle Splicing Regulatory Protein 1 as a key splicing factor in human brain development.

MATERIALS AND METHODS

All individuals in this study provided informed consent, including consent to publish photographs. Collaborators were connected through GeneMatcher [10]. Exome sequencing (ES) was performed either through research institutions or clinical diagnostic labs. Additional experimental details, including absence of heterozygosity (AOH), as a surrogate measure of runs of homozygosity (ROH) and genomic intervals identical by descent, and inbreeding coefficient calculations can be found in the Supplemental Material. All identified variants were deposited in ClinVar under SCV001547247, SCV001547247, and SCV001547249.

RESULTS

Clinical data

Using ES and family-based rare variant analysis, we identified six individuals from three unrelated families with distinct homozygous LoF variants in NSRP1 (Fig. 1a–c) [11]. After VCF file parsing/filtering, analysis, and testing Mendelian expectations for either dominant or recessive disease trait models, no other variants in known disease genes or novel genes could parsimoniously explain the clinical synopsis of observed rare phenotypes. Two families had a known family history of consanguinity. Inbreeding coefficients calculated from ES data were higher than expected for each family (Tables S1,S2), with family 3’s calculated inbreeding coefficients (F = 0.1093, 0.1257) approaching that of an uncle–niece marriage (F = 0.125) rather than the value expected based on their third cousin marriage (F = 0.003906) [12]. Similarly, AOH calculations demonstrated large AOH blocks (3.1 Mb, 7.7 Mb, and 6.9 Mb) surrounding each NSRP1 trait-associated variant consistent with clan genomics identity-by-descent (IBD) (Fig. 1a–c) [13, 14].
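The pedigree-based expectations quoted above (F = 0.125 for an uncle–niece union, F = 0.003906 for third cousins) follow from Wright's path-counting formula. The sketch below is illustrative only and is not the method used to compute the ES-derived coefficients, which is described in the Supplemental Material. The function names and the assumption that each union shares exactly two non-inbred common ancestors are ours.

```python
from fractions import Fraction


def inbreeding_coefficient(paths):
    """Wright's path formula: F = sum over common ancestors of (1/2)^(n1 + n2 + 1).

    `paths` holds one (n1, n2) pair per common ancestor, where n1 and n2 are the
    numbers of generations from each parent of the child back to that ancestor.
    Assumption: the common ancestors are not themselves inbred.
    """
    return sum(Fraction(1, 2) ** (n1 + n2 + 1) for n1, n2 in paths)


# Assumption: each union shares two common ancestors (one founding couple).
EXPECTED_F = {
    "uncle-niece": inbreeding_coefficient([(1, 2)] * 2),     # 1/8
    "first cousins": inbreeding_coefficient([(2, 2)] * 2),   # 1/16
    "second cousins": inbreeding_coefficient([(3, 3)] * 2),  # 1/64
    "third cousins": inbreeding_coefficient([(4, 4)] * 2),   # 1/256
}


def closest_relationship(observed_f):
    """Return the relationship whose expected F is nearest to the observed value."""
    return min(EXPECTED_F, key=lambda name: abs(float(EXPECTED_F[name]) - observed_f))


if __name__ == "__main__":
    for name, f in EXPECTED_F.items():
        print(f"{name}: F = {float(f):.6f}")
    # Observed coefficients reported for family 3
    for observed in (0.1093, 0.1257):
        print(observed, "is closest to", closest_relationship(observed))
```

Under these assumptions, both coefficients reported for family 3 sit nearest the uncle–niece expectation, not the third-cousin value implied by the reported pedigree.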

Deep phenotyping was performed, and detailed clinical data are available in the Supplemental Material (Table S3). Core clinical findings include developmental delay (DD, 6/6), epilepsy (6/6), hypotonia (6/6), appendicular spasticity (6/6), microcephaly (5/6, Z-scores −0.95 to −5.60), dysphagia (4/6), and dysmorphic facies (4/6) (Fig. 1d–h). Most individuals were nonverbal (5/6) and 3/6 were nonambulatory. Seizures began in infancy and were often drug-resistant (3/6). Brain abnormalities included underopercularization (3/4), simplified gyral pattern (3/4), superior and/or inferior cerebellar vermian hypoplasia (3/4), corpus callosum dysgenesis (1/4), and thin brainstem (1/4) (Fig. 1i–t). All patients had abnormal electroencephalography with epileptiform discharges being the most common finding (5/6) (Table S3, Fig. S1).

Fig. 1: Pedigrees, photographs, and brain imaging of individuals with biallelic NSRP1 loss-of-function (LoF) variants.

(a) Pedigree of family 1, a consanguineous family from Egypt. The proband, BAB13228, is indicated with a black arrow. B-allele frequency for BAB13228 calculated from exome variant data is shown below the pedigree and demonstrates a 3.1-Mb block of absence of heterozygosity (AOH) on chromosome 17 (gray) around NSRP1 variant (red line). (b) Pedigree of family 2, a consanguineous family from Iran. The proband, F799-004, is indicated with a black arrow. B-allele frequency for F799-004 calculated from exome variant data is shown below the pedigree and demonstrates a 7.7-Mb block of AOH on chromosome 17 (gray) around NSRP1 variant (red line). (c) Pedigree of family 3, a family from Pakistan. The proband, BAB14701, is indicated with a black arrow. B-allele frequency for affected sibling BAB14708 calculated from exome variant data demonstrates a 6.9-Mb block of AOH on chromosome 17 (gray) around NSRP1 variant (red line). (d–f) BAB13228 (family 1) at 1 year of age showing microcephaly, appendicular spasticity, clenched fists, long face with prominent chin, high forehead, prominent metopic suture, sparse hair and eyebrows, hypertelorism, broad depressed nasal bridge, upturned nose, long philtrum, V-shaped upper lip, large and low set ears, and overriding digits. (g) F799-003 at 9 years of age showing microcephaly, appendicular spasticity, high forehead, tented mouth, short philtrum, high arched palate, downslanting palpebral fissures, and simplified, prominent ears. (h) F799-004 at 3 years of age showing microcephaly, appendicular spasticity, clenched fists, high forehead, tented mouth, short philtrum, high arched palate, downslanting palpebral fissures, and simplified, prominent ears. (i–l) Representative brain magnetic resonance image (MRI) findings for BAB13228 (family 1) at four months of age. Representative axial T2 (i–k) and sagittal T2 (l) images are shown. Imaging features include underopercularization (red arrows), an immature, simplified gyral pattern, generous extra-axial cerebrospinal fluid (CSF) spaces, inferior cerebellar vermian hypoplasia (black arrowhead), dysgenesis of the corpus callosum, paucity of deep white matter, and thin brainstem. (m–p) Representative brain MRI findings for BAB14701 (family 3) at three months of age. Representative axial T2 (m–o) and sagittal T2 (p) images are shown. Imaging features include underopercularization (red arrows), simplified gyral pattern, generous extra-axial CSF spaces, and mild superior and inferior cerebellar vermian hypoplasia (white arrowheads). (q, r) Representative brain MRI findings for F799-003 (family 2) at ten years of age. Representative axial T2 (q) and sagittal T2 (r) images are shown. Underopercularization (red arrows), mild gyral simplification, generous extra-axial CSF spaces, and left posterior plagiocephaly can be seen. (s, t) Representative brain MRI findings for BAB14706 (family 3) at three years seven months. Representative axial T2/fluid-attenuated inversion recovery (FLAIR) (s) and sagittal T1 (t) are shown. The folia of the superior cerebellar vermis are prominent (white arrowhead).

Molecular findings

NSRP1 (NM_032141.4) contains 7 exons and encodes a 558 amino acid (AA) protein (Fig. 2a, b) [8]. The NSRP1 protein contains two RNA recognition motif (RRM) domains and an arginine-serine (RS)-like domain (Fig. 2b) [8]. There are two coiled-coil domains that overlap with the RRM and RS-like domains and are involved in self-oligomerization and splicing activity (Fig. 2b) [9]. Finally, a C-terminal nuclear localization signal (AA 531–540) lies within the RS-like domain and is required for nuclear localization and splicing activity (Fig. 2b). There is also a major alternate transcript, ENST00000612959.4, which lacks the second exon of NM_032141.4 (Fig. 2a).

Fig. 2: Variant location on NSRP1 schematic and impact on protein sequence.

(a) Schematic diagram of NSRP1 and the position of pathogenic variants identified in this study. 5’UTR and 3’UTR are labeled (white). Exons are indicated in gray. The structure of the main transcript NM_032141.4 (ENST00000247026.9) and the alternate transcript ENST00000612959.4 are provided. (b) Structure of Nuclear Speckle Splicing Regulatory Protein 1 and the position of pathogenic variants identified in this study. NSRP1 contains two RNA recognition motif (RRM) domains (blue), an RS-like domain (orange), two coiled-coil domains that overlap with the RRM and RS-like domains (black bars), and a C-terminal nuclear localization signal (black). The two predicted truncated proteins (p.Glu455AlafsTer20 and p.Lys425GlufsTer5) are depicted. Novel amino acids following the frameshifts are indicated in red. All diagrams are approximately to scale. (c) Model of splicing dysfunction caused by last exon frameshift variants. As previously demonstrated, wild type (WT) NSRP1 is found within the nucleus where it promotes exon inclusion or exclusion by antagonizing the activity of serine/arginine-rich splicing factors 1 and 2 (SRSF1, SRSF2) [8, 9]. Truncated variants of NSRP1 lacking the C-terminal nuclear localization signal (amino acids 531–540) show abnormal cytoplasmic localization and lack splicing activity.

All variants were orthogonally studied and segregation in accordance with Mendelian expectations was confirmed by Sanger sequencing (Fig. S2). Variant allele details are summarized in Fig. 2 and Table S4. All variants are ultrarare (minor allele frequency [MAF] < 1/10,000) [15]. Two are private variants (c.1359_1362delAAAG and c.52C>T) found in neither gnomAD nor our internal database of >13,000 exomes. The third variant, c.1272dupG, which occurs in a 13-base polypurine stretch after a GG dinucleotide, is found in gnomAD only once in the heterozygous state. Both frameshift variants occur in the last exon and are predicted to disrupt the critical nuclear localization signal required for NSRP1-mediated alternative splicing (Fig. 2b) [8]. c.52C>T (p.Gln18Ter) results in a premature termination codon (PTC) within the second exon of NM_032141.4 and has a CADD score (GRCh37-v1.6) of 37. This variant is noncoding (c.−49+1248C>T) in ENST00000612959.4 (Fig. 2a). Homozygous LoF variants are absent from gnomAD, and biallelic LoF variants are otherwise absent in our internal database.
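The predicted loss of the nuclear localization signal can be checked arithmetically from the protein-level names of the two frameshift alleles (p.Glu455AlafsTer20 and p.Lys425GlufsTer5, Fig. 2b). The sketch below assumes the usual HGVS reading of `fsTerN`, in which N counts from the first changed residue up to and including the new stop codon. The function names are hypothetical, and the parser handles only this one notation.

```python
import re

# Nuclear localization signal residues, as given in the text (AA 531-540)
NLS_START, NLS_END = 531, 540

# Matches three-letter HGVS frameshift names such as p.Glu455AlafsTer20
FRAMESHIFT = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})fsTer(\d+)$")


def truncated_length(hgvs_p):
    """Length in amino acids of the protein predicted by a frameshift name.

    Assumption: in `fsTerN`, the first changed residue is position 1 and the
    new stop codon is position N, so the stop sits at first_changed + N - 1.
    """
    match = FRAMESHIFT.match(hgvs_p)
    if match is None:
        raise ValueError(f"not a three-letter HGVS frameshift name: {hgvs_p}")
    first_changed = int(match.group(2))
    ter_offset = int(match.group(4))
    stop_position = first_changed + ter_offset - 1
    return stop_position - 1  # last translated residue precedes the stop


def retains_nls(hgvs_p):
    """True only if the truncated protein extends through the end of the NLS."""
    return truncated_length(hgvs_p) >= NLS_END


if __name__ == "__main__":
    for name in ("p.Glu455AlafsTer20", "p.Lys425GlufsTer5"):
        print(name, truncated_length(name), "AA; NLS retained:", retains_nls(name))
```

On this reading, the predicted proteins end at residues 473 and 428, both upstream of AA 531, so neither would carry any part of the signal.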

DISCUSSION

NSRP1 is a widely expressed nuclear speckle protein [8]. In splicing assays, human NSRP1 modulated splice site selection, resulting in exon inclusion or exclusion in a gene-specific fashion [8, 9]. NSRP1 interacts with splicing factors SRSF1 and SRSF2 and counteracts their alternative splicing activities [9]. Mice heterozygous for an Nsrp1 null allele had no obvious deficits, but homozygous Nsrp1 null mice exhibit embryonic lethality as early as embryonic day 6.5 [8]. Similarly, homozygosity for a null allele in the C. elegans ortholog ccdc-55 resulted in early larval arrest, whereas RNAi allowed larval development but resulted in abnormal distal tip cell migration [16]. Finally, homozygosity for a null allele in the D. melanogaster ortholog CG15747 (also known as nito) caused larval lethality [17]. While early lethality has limited investigation of NSRP1’s role in the brain, nito knockdown in D. melanogaster CCAP/bursicon neurons reduced axon outgrowth during the transition from larval to adult form [18].

Here, we report biallelic LoF variants in NSRP1 in six individuals from three unrelated families. The key phenotypic features defining this autosomal recessive (AR) disease trait are DD, epilepsy, hypotonia, appendicular spasticity, microcephaly, dysphagia, and dysmorphic facies. As half of the cohort has drug-resistant epilepsy, electrographic abnormalities, and DD, the NSRP1-associated disease trait fulfills clinical criteria for developmental and epileptic encephalopathies [19]. While dysmorphic features are common, a consistent clinically recognizable pattern was not discernible. The cause of early death, seen in two of the six patients and an additional affected sibling who was not genotyped, is unclear. Brain abnormalities included simplified gyral pattern, underopercularization, and superior and/or inferior vermian hypoplasia. These abnormalities were more striking in individuals who underwent imaging in infancy rather than later in childhood. This may represent an early delay in brain maturation that subsequently improves with age. Further identification of patients with biallelic pathogenic NSRP1 variants and serial imaging will clarify the associated imaging spectrum.

Only two families in this report are consanguineous based on historical report. However, analysis of AOH regions and inbreeding coefficient (F) calculation using unphased ES data demonstrated higher than expected degrees of consanguinity (Tables S1, S2). This was particularly striking for family 3, who would be regarded as nonconsanguineous based on clinical history (third cousins) yet exhibited calculated inbreeding coefficients (F = 0.1093, 0.1257) in excess of the clinical genetics definition for a consanguineous union (F = 0.0156, second cousins). Such discrepancies from expectations can result from high consanguinity rates in the underlying population or from unknown loops of consanguinity within consecutive generations. Indeed, the calculated inbreeding coefficients in a study of 1,020 healthy individuals from Pakistan, family 3’s country of origin, ranged from 0.029 to 0.091 [20]. Thus, it is critical to utilize ES-derived AOH data and inbreeding coefficients in personal genome analysis to assess the likelihood of a recessive disorder and to identify regions of homozygosity harboring pathogenic alleles regardless of known family structure recorded by clinical history.
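To illustrate how AOH intervals can be read from exome-derived B-allele frequencies, the sketch below scans position-sorted variants on one chromosome and reports runs with no heterozygous calls. It is a simplified illustration, not the pipeline used in this study (see the Supplemental Material). The function name, the heterozygous window of 0.2 to 0.8, the minimum block length, and the toy data are all assumptions chosen for the example.

```python
def find_aoh_blocks(variants, het_low=0.2, het_high=0.8,
                    min_length_bp=1_000_000, min_markers=3):
    """Return (start, end) intervals free of heterozygous calls.

    `variants` is an iterable of (position_bp, b_allele_frequency) pairs for a
    single chromosome, sorted by position. A call is treated as heterozygous
    when het_low <= BAF <= het_high (assumed thresholds). A run of consecutive
    non-heterozygous calls is reported only if it has at least `min_markers`
    variants and spans at least `min_length_bp`.
    """
    blocks = []
    run = []  # positions of consecutive non-heterozygous calls

    def close_run():
        if len(run) >= min_markers and run[-1] - run[0] >= min_length_bp:
            blocks.append((run[0], run[-1]))
        run.clear()

    for position, baf in variants:
        if het_low <= baf <= het_high:
            close_run()  # a heterozygous call ends the current run
        else:
            run.append(position)
    close_run()  # flush the final run at the chromosome end
    return blocks


if __name__ == "__main__":
    # Toy data, not patient data
    toy = [(1_000_000, 0.50), (2_000_000, 1.00), (3_000_000, 0.00),
           (4_000_000, 1.00), (5_100_000, 0.98), (6_000_000, 0.45),
           (6_500_000, 1.00), (6_600_000, 0.00)]
    print(find_aoh_blocks(toy))
```

Because exome variants are sparse and unevenly spaced, block boundaries found this way are approximate, bounded by the nearest flanking heterozygous calls.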

Three homozygous LoF NSRP1 variants were identified: c.1359_1362delAAAG (family 1), c.1272dupG (family 2), and c.52C>T (family 3) in families from Egypt, Iran, and Pakistan, respectively. While c.1359_1362delAAAG and c.1272dupG mRNA result in PTCs in the final exon and likely escape nonsense mediated decay (NMD), both are predicted to result in a mutant protein with loss of the nuclear localization signal (Fig. 2b) [8]. Prior studies of human NSRP1 demonstrated this signal’s deletion results in abnormal cytoplasmic localization and loss of alternative splicing activity (Fig. 2c) [8]. Both variants are therefore regarded as LoF variant alleles. c.52C>T results in a PTC in exon 2, and the mutant mRNA transcript is predicted to be unstable and subject to NMD. Thus, the association of human disease with biallelic LoF NSRP1 variants is congruent with studies in model organisms showing tolerance of haploinsufficiency but not biallelic null alleles [8, 16, 17]. As homozygous null NSRP1 alleles are incompatible with life in M. musculus, D. melanogaster, and C. elegans, the survival of these patients suggests these disease associated LoF variants may not be null alleles but rather hypomorphic LoF [8, 16, 17]. Alternatively, there may be reduced dependence on NSRP1 or greater redundancy during early embryonic development in humans.
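The contrast drawn above between last-exon PTCs and the exon 2 PTC rests on a widely used positional heuristic: a PTC in the final exon, or within roughly 50 to 55 nucleotides upstream of the last exon–exon junction, is expected to escape NMD. A minimal sketch of that heuristic follows. The function name, the 50-nt cutoff, and the exon lengths are hypothetical and are not the real NSRP1 transcript structure. Other reported modifiers of NMD efficiency are ignored.

```python
def escapes_nmd(ptc_position, exon_lengths, boundary_nt=50):
    """Apply the last-junction positional rule for NMD escape.

    ptc_position: 1-based transcript (mRNA) coordinate of the PTC's first base.
    exon_lengths: exon lengths in transcript order.
    boundary_nt:  assumed cutoff; PTCs no more than this many nucleotides
                  upstream of the last exon-exon junction are taken to escape.
    """
    if len(exon_lengths) < 2:
        return True  # a single-exon transcript has no exon-exon junction
    last_junction = sum(exon_lengths[:-1])  # last base before the final exon
    distance_upstream = last_junction - ptc_position  # negative in final exon
    return distance_upstream <= boundary_nt


if __name__ == "__main__":
    # Hypothetical seven-exon transcript; lengths are invented for illustration
    toy_exons = [200, 150, 180, 120, 160, 140, 900]
    for ptc in (300, 920, 1400):
        print(ptc, "escapes NMD:", escapes_nmd(ptc, toy_exons))
```

In this toy transcript, the early PTC at position 300 is predicted to trigger NMD, whereas PTCs at 920 (30 nt upstream of the last junction) and 1400 (final exon) are predicted to escape.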

It is important to acknowledge that NSRP1 itself undergoes alternative splicing. There are three major transcripts expressed within the adult human brain: ENST00000247026.9, ENST00000612959.4, and ENST00000394826.8 (Fig. S3, https://www.gtexportal.org/home/). ENST00000247026.9 is equivalent to NM_032141.4 and encodes the full-length 558-AA protein (Fig. 2a, S3). ENST00000612959.4 encodes a 504-AA protein that lacks the first 54 AA of the full-length protein. Finally, ENST00000394826.8 has an alternative second exon that introduces a PTC in exon 3 and is therefore predicted to undergo NMD. The relative abundance of ENST00000247026.9 and ENST00000612959.4 varies between brain regions, with ENST00000612959.4 being the dominant transcript in some areas (e.g., the substantia nigra) and ENST00000247026.9 in others (e.g., the cerebellar hemispheres) (Fig. S3). As the variant found in family 3 is coding in ENST00000247026.9 (c.52C>T) and noncoding in ENST00000612959.4, it is possible this variant does not cause complete LoF, with the important caveat that GTEx reflects the adult rather than developmental brain transcriptome. In fact, this may help explain the phenotypic spectrum of the cohort. While the phenotypes highly overlap and form a consistent syndrome, the phenotype of family 3 with the c.52C>T variant allele is slightly milder. For example, the three siblings have less severe epilepsy and achieved more developmental milestones than families 1 and 2 (Table S3). This is congruent with model organism data showing milder phenotypic manifestations from incomplete loss of NSRP1 in C. elegans and X. laevis [16, 21].

There is presently little evidence for a neurodegenerative course in NSRP1-related disease. Considering the static nature of the syndrome and the occurrence of spastic quadriparesis, NSRP1 could be potentially considered as a cerebral palsy (CP) gene. While CP was traditionally attributed to perinatal insults, many patients lack such history, and genetic etiologies are increasingly recognized [22]. In a large CP cohort, an enrichment of de novo variants (DNV) was detected [22]. This enrichment is consistent with the epidemiology of CP, with most cases occurring sporadically [22]. However, biallelic damaging variants in several hereditary spastic paraplegia genes were also identified, demonstrating both dominant and recessive disease traits contribute to CP genetics [22]. Network analysis revealed enrichment of damaging variants in genes involved in neuritogenesis including extracellular matrix, focal adhesions, cytoskeleton, and Rho GTPases [22]. Given NSRP1’s role in splicing regulation, it will be important to examine how genes downstream of NSRP1 intersect with genes previously associated with CP, intellectual disability (ID), or epilepsy.

Splicing dysfunction is a major cause of Mendelian disorders. The proportion of human pathogenic variants affecting cis-acting elements is estimated between 15% and 60% [23]. In contrast, few Mendelian genetic diseases result from pathogenic variation in splicing factors [5]. Until recently, Mendelian disorders involving spliceosomal components or splicing factors fit into two categories: craniofacial–skeletal disorders or isolated retinitis pigmentosa with rare overlap (Fig. S4) [5]. It is unclear why ubiquitously expressed splicing factors segregate into these disease categories, but this may reflect differential tissue sensitivities. Many splicing factors involved in mouse brain development (e.g., PTBP1, PTBP2, NOVA1) exhibit LoF intolerance (pLI 0.98–1) yet presently lack human disease associations, suggesting many Mendelian splicing factor disorders remain to be discovered [4]. Supporting this contention, of 28 novel candidate genes identified in a statistical analysis of 31,058 parent–offspring trios of individuals with developmental disorders, 3 encode splicing factors or spliceosomal components (SRRM2, U2AF2, and HNRNPD) [24]. However, splicing factors may also cause human disease through biallelic recessive trait mechanisms, as seen with NSRP1, and therefore should be prioritized in gene discovery efforts concentrated on consanguineous families.

In conclusion, these data establish that biallelic pathogenic variants in the splicing regulator NSRP1 cause an AR NDD trait characterized by DD, epilepsy, microcephaly, and spastic cerebral palsy and implicate NSRP1 as a key splicing factor in human brain development. Neuron-specific knockout of NSRP1 in model organisms, and studies of an NSRP1 allelic series during development, will provide further insight into its role. Such studies may also provide insights into the downstream gene conformers involved in mammalian brain development.