Splicing variants are less commonly reported than other variant types [1,2,3]. However, despite most being functional nulls, previous reports suggested that they are under-recognized [2]. Our recent experience of investigating a cohort of 38 individuals with a severe, genetically heterogeneous Mendelian phenotype shows that this continues to be a problem; three variants that affected splicing were initially “missed” because they were not detected by current splice site detection algorithms. Our concern is that splicing variants will continue to be overlooked in clinical laboratory settings because the quantity of data generated per person by “exomes” and “genomes” necessitates the use of splice site detection programs. Our cases highlight significant deficiencies in current standard programs, where variants at the U2 canonical AG (acceptor) and GT (donor) splice sites are reliably detected, but variants at other positions with more loosely defined consensus sequences, or U12 splice sites, are rarely detected [4].
We analyzed 38 sequentially ascertained samples from individuals who were born unable to feel pain within a UK NHS genetic service. We initially found that 28 of the 38 cases had bi-allelic variants that affected function in SCN9A (17), NTRK1 (13) and NGF (1), all causing autosomal recessive painless disorders. Given the specific phenotype and limited genotypes, we hand-curated the remaining cases. In three unrelated index individuals, rare variants within intronic regions were present on sequencing (ExAC frequencies of 2/113990, 0/61864, and 2/120742). Splice site prediction analysis of each variant was performed using Alamut (http://www.interactive-biosoftware.com/alamut-visual/), which incorporates five different programs, all of which were developed for U2 splice sites rather than U12 splice sites (see Table 1). Although none of the three variants was flagged, we considered that they could alter splice site function because of their proximity to known splice sites, and assessed each by a minigene splicing assay [5], as transcriptomic sequencing was not possible (see Supplement for methodology).
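The population frequencies quoted above can be turned into decimal allele frequencies with a few lines of Python. This is only an illustrative sketch: it assumes each quoted pair is an alternate-allele count over the total number of alleles genotyped, and the rarity cut-off `RARE_THRESHOLD` is a hypothetical value chosen for the example, not one used in this study.

```python
# Allele count / total alleles, as quoted in the text (ExAC).
exac_counts = [(2, 113990), (0, 61864), (2, 120742)]


def allele_frequency(alt_count, total_alleles):
    """Return the alternate-allele frequency as a float."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return alt_count / total_alleles


# Hypothetical cut-off for calling a variant "rare"; illustration only.
RARE_THRESHOLD = 1e-4

frequencies = [allele_frequency(alt, total) for alt, total in exac_counts]
is_rare = [freq < RARE_THRESHOLD for freq in frequencies]

for (alt, total), freq, rare in zip(exac_counts, frequencies, is_rare):
    print(f"{alt}/{total} = {freq:.2e} rare={rare}")
```

Under this assumed threshold all three variants would be classed as rare, which is consistent with their description in the text.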
Each variant was proven to alter splicing by comparing the results to those of normal wild-type splicing (see Fig. 1). The first case had a hereditary sensory and autonomic neuropathy type 4 (HSAN4) phenotype, but sequencing analysis had detected no variants. However, we noted a homozygous NTRK1 variant, c.575-19G>A (reference sequence NM_002529.3; exons are numbered as in the reference sequence NG_007493.1) [6]. Bioinformatics analysis suggested that this potentially created a new AG splice acceptor site, which was predicted to be superior to the existing site; this proved to be the case, with a 17 bp frame-shifting insertion introduced into the NTRK1 mRNA (Fig. 1b.ii). The second case also had an HSAN4 phenotype, but sequencing had revealed only one heterozygous NTRK1 variant proven to affect function, c.1550G>A [6]. On sequence inspection we noted a heterozygous splice donor site variant, c.717+4A>T. Although +4 can usually be any base, in NTRK1 this +4 position is invariant [7]. The minigene assay showed that the variant resulted in complete loss of NTRK1 exon 6 in the mature transcript (Fig. 1b.iii). The third case had a phenotype consistent with congenital insensitivity to pain (CIP): a lack of pain and smell perception with normal intelligence. Only a single heterozygous variant in SCN9A was detected, c.2686C>T (reference sequence NM_002977.3; exons are numbered as in the reference sequence NG_012798.1). On inspection of the sequencing data we noted a heterozygous variant, c.377+5C>T, in SCN9A occurring in a U12 splice site. The U12 donor site sequence is RTATCCTT, where +5 C is invariant [8], in contrast to the more ubiquitous U2 splice site, where +5 can vary [7]. The variant caused complete loss of exon 3 and aberrant splicing into a cryptic U2 acceptor site, resulting in a +1 frame shift (Fig. 1c.ii). All three variants were predicted to lead to nonsense-mediated decay and hence be nulls.
Early nonsense and frame shift variants have been identified in other cases of HSAN4 and CIP patients and hence are likely to explain the disease in our patients [9].
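Two of the observations above lend themselves to a short worked check: whether a donor sequence still matches the U12 consensus RTATCCTT after a +5 substitution, and whether a 17 bp insertion disrupts the reading frame. The Python sketch below is illustrative only. It assumes the eight-base consensus starts at intronic position +1 (which places C at +5, as stated in the text), and the wild-type string `hypothetical_wt` is a made-up consensus-matching sequence, not the actual SCN9A intron sequence.

```python
# IUPAC codes needed for the consensus quoted in the text (R = A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}

# U12 donor consensus from the text; assumed to cover positions +1 to +8.
U12_DONOR_CONSENSUS = "RTATCCTT"


def matches_consensus(seq, consensus=U12_DONOR_CONSENSUS):
    """Return True if every base of seq is allowed by the consensus."""
    if len(seq) != len(consensus):
        return False
    return all(base in IUPAC[code] for base, code in zip(seq.upper(), consensus))


def substitute(seq, intron_pos, alt):
    """Return seq with the base at 1-based intronic position replaced by alt."""
    index = intron_pos - 1
    return seq[:index] + alt + seq[index + 1:]


def shifts_frame(length_change_bp):
    """A change in transcript length shifts the frame unless divisible by 3."""
    return length_change_bp % 3 != 0


hypothetical_wt = "ATATCCTT"  # made-up example, NOT the real SCN9A sequence
mutant = substitute(hypothetical_wt, 5, "T")  # models a +5 C>T change

print(matches_consensus(hypothetical_wt))  # consensus intact
print(matches_consensus(mutant))           # invariant +5 C lost
print(shifts_frame(17))                    # 17 bp insertion, 17 % 3 == 2
```

A strict match/no-match test like this is deliberately crude; it only illustrates why a change at an invariant position is a plausible candidate for functional follow-up, and it is not a substitute for a splicing assay.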
In our cohort, three of 38 cases (about 8%) harbored missed variants that affected splicing. Their detection was considerably aided by the clear phenotype and the limited number of genes that required analysis. Had the phenotype been more variable or not resembled a single-gene disorder, or had the number of potentially causative genes been greater, these variants might well have gone undetected. These cases also illustrate the need to consider seeking a second variant in autosomal recessive phenotypes when only a heterozygous variant is found; generally the chance of there being a second variant is greater than the chance that the person is an incidental carrier.
As the volume of genetic data generated per person continues to increase (from exon-by-exon analysis, through gene panels, to exomes, and now whole genome sequencing), reliance on variant detection programs has inevitably grown. The limitations of these algorithms in detecting splicing variants, especially those occurring in U12 introns and less well defined consensus sequences, need to be better recognized and urgently remedied (for instance, by the use of the Spliceman program, see Table 1); otherwise, the full potential of genetic testing will be limited [10]. Until then, researchers in clinical laboratories should be vigilant in seeking splicing variants and perhaps should hand-curate for rare variations occurring beyond the −1, −2, +1 and +2 sites. If splicing variants that affect function are missed by splicing prediction programs, or by a conservatism intended to prevent the identification of too many variants of unclear significance in clinical laboratories, then this has two important consequences. Firstly, it decreases the utility of exome/genome scale sequencing, and secondly, it increases the risk that other variations may be erroneously regarded as disease-causing.
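The suggestion to hand-curate variants beyond the canonical ±1 and ±2 positions could be supported by a simple triage step that flags intronic substitutions lying close to, but outside, the canonical dinucleotides. The Python sketch below is one hypothetical way to do this from an HGVS-style coding name. The function names, the returned labels, and the default `window` of 20 intronic bases are all assumptions made for illustration; the parser handles only simple intronic substitutions and tolerates the spaces and dash variants sometimes seen in typeset variant names.

```python
import re

# c.<exonic anchor><+ or -><intronic offset><ref>><alt>, e.g. c.717+4A>T
_INTRONIC_SUBSTITUTION = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def parse_intronic(hgvs):
    """Parse a simple intronic substitution; return None if it does not fit."""
    # Normalize en dash and minus sign to a hyphen, then drop whitespace.
    compact = hgvs.replace("\u2013", "-").replace("\u2212", "-")
    compact = re.sub(r"\s+", "", compact)
    match = _INTRONIC_SUBSTITUTION.match(compact)
    if match is None:
        return None
    anchor, sign, offset, ref, alt = match.groups()
    signed_offset = int(offset) if sign == "+" else -int(offset)
    return {"anchor": int(anchor), "offset": signed_offset, "ref": ref, "alt": alt}


def triage(hgvs, window=20):
    """Label a variant by its distance from the nearest exon boundary.

    `window` is a hypothetical review distance, not a validated threshold.
    """
    variant = parse_intronic(hgvs)
    if variant is None:
        return "not_intronic_substitution"
    distance = abs(variant["offset"])
    if distance <= 2:
        return "canonical_splice_site"
    if distance <= window:
        return "review_by_hand"
    return "deep_intronic"


for name in ["c.575-19G>A", "c.717+4A>T", "c.377+5C>T", "c.2686C>T"]:
    print(name, triage(name))
```

With the assumed window, the three intronic variants reported here would all be labelled for manual review, whereas the exonic substitution is passed over by this particular filter. In practice such a step would be combined with a population-frequency filter and would complement, not replace, splice prediction tools and functional assays.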
References
Yang Y, Muzny DM, Reid JG, et al. Clinical whole-exome sequencing for the diagnosis of mendelian disorders. N Engl J Med. 2013;369:1502–11.
Baralle D, Baralle M. Splicing in action: assessing disease causing sequence changes. J Med Genet. 2005;42:737–48.
Lynch M. Rate, molecular spectrum, and consequences of human mutation. Proc Natl Acad Sci USA. 2010;107:961–8.
Grozeva D, Carss K, Spasic-Boskovic O, et al. Targeted next-generation sequencing analysis of 1000 individuals with intellectual disability. Hum Mutat. 2015;36:1197–204.
Spickett C, Hysi P, Hammond CJ, et al. Deep intronic sequence variants in COL2A1 affect the alternative splicing efficiency of exon 2, and may confer a risk for rhegmatogenous retinal detachment. Hum Mutat. 2016;37:1085–96.
Shaikh SS, Chen YC, Halsall SA, et al. A comprehensive functional analysis of NTRK1 missense mutations causing hereditary sensory and autonomic neuropathy type IV (HSAN IV). Hum Mutat. 2017;38:55–63.
Zhang MQ. Statistical features of human exons and their flanking regions. Hum Mol Genet. 1998;7:919–32.
Clark F, Thanaraj TA. Categorization and characterization of transcript-confirmed constitutively and alternatively spliced introns and exons from human. Hum Mol Genet. 2002;11:451–64.
Indo Y. Molecular basis of congenital insensitivity to pain with anhidrosis (CIPA): mutations and polymorphisms in TRKA (NTRK1) gene encoding the receptor tyrosine kinase for nerve growth factor. Hum Mutat. 2001;18:462–71.
Lewandowska MA. The missing puzzle piece: splicing mutations. Int J Clin Exp Pathol. 2013;6:2675–82.
Acknowledgements
SS acknowledges the support of an MRC CASE studentship MR/K017551/1, and MSN the support of a Wellcome Trust Collaborative award 200183/Z/15/Z.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Shaikh, S.S., Nahorski, M.S., Rai, H. et al. Before progressing from “exomes” to “genomes”… don’t forget splicing variants. Eur J Hum Genet 26, 1559–1562 (2018). https://doi.org/10.1038/s41431-018-0214-3
DOI: https://doi.org/10.1038/s41431-018-0214-3